Monitoring With Zabbix 2.2

Posted: 15th November 2013 by Josh in Uncategorized

I recently saw that Zabbix 2.2 had been released while I was looking for something new to use for monitoring. Some of its features were things I needed out of the box: primarily monitoring client web applications, external web portals and some other scenarios (that word is important, and I will touch on it later).

Some folks use things like SiteScope or HP’s BPM monitors, or (like I was doing) write curl scripts that look for expected output. What’s nice about the scenarios feature is that Zabbix lets you set up calls that check for a response code or even an expected string of text.

On top of that, you can build a scenario by passing POST data (and the data you pass can be variables that you set up per scenario). For me that was great: I need to monitor hundreds of client instances that generally follow the same patterns, where only the URL, username and password change.
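For comparison, the curl-script approach mentioned above boils down to grepping a page body for an expected string. Here's a minimal sketch; the URL, POST fields and "Welcome" string in the usage comment are placeholders, not anything from a real portal:

```shell
#!/bin/bash
# Sketch of the curl-style check that Zabbix web scenarios replace.

# check_page reads a page body on stdin and looks for an expected string
check_page() {
    local expected="$1"
    if grep -q "$expected"; then
        echo "OK"
    else
        echo "FAIL"
    fi
}

# Real usage would pipe curl output in, e.g.:
#   curl -s -d "user=demo&pass=demo" https://portal.example.com/login | check_page "Welcome"
```

Zabbix scenarios give you the same response-code and required-string checks without maintaining a pile of these per client.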

It can also work with an agent to send over network traffic, CPU, disk and memory usage. I think this will be helpful because our system is rather complex and distributed across many servers. Being able to see patterns from usage will be huge for us.
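The agent side is just a small config file pointed at the server. A minimal sketch of a zabbix_agentd.conf, with made-up server and host names:

```
# /etc/zabbix/zabbix_agentd.conf (illustrative values only)
# Server: which Zabbix server may poll this agent.
# ServerActive: where the agent sends active checks.
# Hostname must match the host as configured in the Zabbix frontend.
Server=zabbix.example.com
ServerActive=zabbix.example.com
Hostname=app-node-01
```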

Expect a post to follow this on how to use something like the TamperData plugin to see the post data so you can plug it into your Zabbix scenarios.

My current obsession: Logging

Posted: 16th October 2013 by Josh in Uncategorized

We have something like 150 tomcat apps running at work. Each app is in a two-way cluster. That’s 300 individual application logs. Toss in the accompanying logs and it’s 600.

Keeping the applications running smoothly is a part of our “DevOps” (I know it’s not a position/team but a way of thinking, forgive me for making it simple) team’s responsibilities. When a client reports an issue that we were not proactive about there are usually the following questions:

  • What node were they on when the issue occurred? Just one? Both?
  • What time did the issue occur? Would it be in one of the “rolled” logs or the current log?

It’s pretty bad if the issue hit both nodes. Our logs aren’t in a totally standard format, so we can’t use Apache Chainsaw to analyze them.

I’ve been looking at a Logstash toolchain to solve the issue. Essentially you have a service that spools and ships your logs to a central server. They’re received via Redis, indexed by another Logstash process, and then pushed to Elasticsearch. The front end (Kibana) gives you the datavis and searching tools. Everything goes to one interface, and you can choose which servers you see data from, which paths, and which timeframes.

The current obstacles:

  • Our logs aren’t in a standard format. I’ll configure the logger to write a SECOND log in JSON format for easy ingestion.
  • I have to make this change in 300+ places in production. That will come after I handle the integration test, QA and user acceptance environments. Call it a conservative 600 applications in total.
  • These configs are in the older style rather than log4j.xml. I might as well fix that while I’m at it.
  • Validate that these configs are in one of the legacy version control systems (CVS or SVN), then migrate them to Git (and adjust the deployment accordingly).

I solved a similar problem in our batch environment last year. Our business users needed to be able to research their own issues, but I wasn’t about to set up 100+ user accounts on our Linux systems just to view data-loading logs. I ended up building a system that lets our users pick their customer, then a date, and view the logs there directly. It’s a decent solution for what they need, and it allowed me to write the tool that securely builds the read-only symlinks in Ruby. It’s dynamic and maintains itself as far as adding/removing clients from the interface.

For the more technical users (DevOps, Development Support) I wanted them to have the full power of the logs, datavis, trending, and then be able to use those things for capacity planning and research. So rather than rebuild something I’d already done, I wanted to put something powerful together that was more flexible…and selfishly because I wanted to use these tools to add to my experience and learning goals.

One of the virtual machines in my home lab has been sitting there calling out to me…”Power me up, learn more puppet….leaaaarrrn!” So, tonight, I obliged.

Until now I’ve been doing my work in the Learn Puppet VM that Puppet Labs provides for free, going through their exercises over lunches and in the evenings. With my new environment at home being more conducive to bringing VMs up and tearing them down, I figured it was time to get this going. My wife is now running Linux too, so I can manage that box as well.

Not having done the setup on the VM myself, I hadn’t really considered what would be necessary to get it going from scratch. I figured I’d do my tests with the Master and Agent on my server itself initially and then go from there. That was the plan, at least.

I just went ahead and installed (DON’T DO THIS):

$sudo apt-get install puppet puppetmaster

And that’s where things started to go sideways. The problems weren’t apparent until I wanted to make the agent’s SSL certificate friendly with the master’s, which were actually on the same box.

I was getting many errors about the certificate being incorrect, plus issues with my FQDN (I’ll address those in a separate post). That’s because there’s some special setup you need to go through to get them running together. Even though the Puppet VM runs Puppet Enterprise, I took a lot of my cues from there.

We’ll start this over from the beginning rather than a “fix a broken setup” perspective.

Step 1. Install Puppet Master:

This is in the default Ubuntu 13.04 (Raring) repos:
$sudo apt-get install puppetmaster

But then go ahead and shut it down right away and verify that it’s not running:

$sudo /etc/init.d/puppetmaster stop
$ps -ef | grep puppet

Step 2. Blow away existing SSL configuration

At the time of this writing, the default puppet SSL configuration is in /var/lib/puppet/ssl so we can just blow away that SSL directory. Yes, it’s safe to do this. Puppet will recreate this structure when we restart it.

$sudo rm -rf /var/lib/puppet/ssl

Step 3. Generate proper certificate names in puppet.conf

puppet.conf is the config file for puppet. Mine was already populated with some attributes under [main] and [master].

The two values that matter from an SSL perspective and need to be set are


I’m not running a local domain at home (yet) so I just have the hostname to worry about. These two config values should match.
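The two settings in question are typically certname and server. A sketch of the relevant puppet.conf lines, assuming a machine simply named “mybox” (the hostname is made up):

```
[main]
certname = mybox
server = mybox
```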

Step 4. Start puppet master back up via init.d & check the cert it generated (in the newly re-generated SSL directory)

$sudo /etc/init.d/puppetmaster start && ls -l /var/lib/puppet/ssl/certs/

Step 5. Install Puppet agent

$sudo apt-get install puppet

Step 6. Use puppet agent to test connect to puppet master

Because the certs have the same name (we ensured that in step 3), this should all be “OK” already.

$sudo puppet agent --no-daemonize --onetime --verbose

Nothing will actually be applied (because you haven’t written any classes/manifests yet), but you could do something easy like creating a file or ensuring NTP is configured.
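For example, a trivial first manifest could ensure a file exists. A sketch, assuming the default manifest path of the open source Ubuntu packages (the node block and file name are just examples):

```
# /etc/puppet/manifests/site.pp
node default {
  file { '/tmp/puppet_was_here':
    ensure  => file,
    content => "managed by puppet\n",
  }
}
```

Run the agent again after saving this and the file should appear.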

Commands that were helpful in researching this issue

I didn’t actually need all of these, but they were useful along the way:

puppet cert --list
puppet cert sign
sudo openssl verify -CAfile /etc/puppet/ssl/certs/ca.pem /etc/puppet/ssl/certs/

Puppet’s troubleshooting documentation on certificates:

Building the home virtualization lab

Posted: 4th September 2013 by Josh in Uncategorized

When I first got started in IT about 10 years ago, virtualization wasn’t very big. I had several lower-powered machines and that was my lab. Education, marriage, mortgage, kids and cars are all expensive, and I don’t have the space or money to maintain a bunch of boxes.

I’m still playing mad scientist at home though, so what’s a geek to do? Virtualization my friends, virtualization.

I’m a Linux guy, so this is themed towards Linux. That means VMware’s ESXi is out (I’d need a Windows machine to use their management software, since not all features are in the web UI yet). Hyper-V is also out. That leaves KVM derivatives and Xen.

Since KVM is right there in the kernel and well supported, that’s what I went with. I’m going to be using the Proxmox setup, which offers a nice web UI on top of KVM.

I ordered an 8-core AMD processor, 16gb of RAM, an SSD for the boot drive and a compatible motherboard. I’ll be recycling the case and some 500gb HDDs.


Things that I’ll definitely be doing:

1. Replacing my media server: it’s outdated, runs on a single-core processor with 1.5gb of RAM, and barely handles transcoding on the fly.

2. Setting up a 3-4 node learning environment. Logstash and Puppet primarily for now.

3. Getting a little deeper with Nagios/Icinga.


Parts show up tomorrow!



Physical Tools for IT

Posted: 30th August 2013 by Josh in Uncategorized

Someone was asking what physical tools they should get, having been allocated some budget for them. Thinking beyond software, physical tools have a place in IT as well. This is what I sent the person who was asking; I think it’s a good list if you have ~$1,000 (US) to spend. Obviously it’s not all needed at once, but I feel it’s a great all-around set. None of the links are affiliate, they’re just standard amazon links.

When I was a field tech, mostly working with structured cabling and networking equipment, I took the following along with me. Some of these things are no longer needed with VOIP phones though (the telephone-specific stuff):

  • Klein Scissors, great for working with cabling.
  • Klein 11-in-1 screwdriver (I actually had two)
  • Harris punchdown tool (I think Fluke took them over, though). The little flip-out hook is really useful when doing 66 blocks or panels – something like this (you can get cheaper ones, but I burned through the Ideal ones pretty fast; the Harris ones seemed to last a while).
  • Multimeter (Fluke makes some nice ones)
  • Cable Tester – this will do phone, coax, cat5/6 and isn’t crazy priced. You can buy more remotes for it too:
  • Tone and probe kit (sends a signal over a cable and you can use the wand to figure out what cable it is. Killer for unmarked cables. You’ll probably want to take some jacks and make up your own adapters too) –
  • Banjo adapter is nice if you don’t want to make your own:
  • If you have traditional phone lines to deal with, a buttset (or linemans set) isn’t a bad thing to have either.
  • Some coax tools (crimper/stripper)
  • A drywall saw
  • A decent tool bag. I used a generic husky one from lowes.
  • A drill (keep the charger in the bag) and a decent bit set
  • A box of random screws and anchors
  • Bigass roll of velcro
  • Double sided tape
  • ALL the connectors (in bulk)
  • Electrical tape
  • Pull string – anytime you fish a conduit or a pain-in-the-ass space, leave a string.
  • Various standard hand tools; spackle isn’t bad to have, plus a putty knife. I had a cheap soldering iron and some basic supplies for it too. Sharpies (a whole pack of them). A level, a tape measure.
  • A label maker! Brother makes a bunch. Label everything!
  • Flashlight. I carried two. One small penlight on myself (streamlight) and one that shared the batteries with my drill.
  • Leatherman

Obviously this all doesn’t go in one bag, but I had mine in my car/truck all the time. It’d be a lot easier in an office.

A year or so back a coworker had a couple Asus EeePC-900 netbooks that he no longer had a use for and asked if I could use them. They’re basically as powerful as a VM on my home server, but with screens so I figured I could find a use for them.

I toyed around with one of them using Puppy Linux attached to the living room TV… it wasn’t really powerful enough for media streaming, though. Lately I’ve been using what free time I can find to work on creating a home lab again, and these netbooks work pretty well in that role.

I wanted to brush up on Nagios as I hadn’t used it recently, so I took the machine with the weaker specs of the two and put Debian on it. For my use I didn’t need X, so I left it as a bare install (basically the common webserver-type build) and put it on the shelf in my rack with an ethernet connection.

SSH over, do some installation and configuration, and it’s up and running nicely. I’ve got it checking my home server’s http, mysql and ssh services plus load, and some of my VMs as well.
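Checks like those boil down to a handful of object definitions. A sketch of what the host and HTTP service might look like (the hostname and address are made up; check_http and the templates come with the stock sample configs):

```
define host {
    use        linux-server
    host_name  homeserver
    address    192.168.1.10
}

define service {
    use                 generic-service
    host_name           homeserver
    service_description HTTP
    check_command       check_http
}
```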

I might take a look at icinga though, some folks were saying it’s nagios with some extra polish. The specs of the machine can handle the minor load, so it makes a good homelab monitoring solution.

EDC Bag Dump

Posted: 31st May 2013 by Josh in Uncategorized

This is a popular thing on a lot of the “lifehack” or IT forums I visit. People do “pocket dumps” or “bag dumps” or show their “Every Day Carry” – I snapped this pic for one of those threads.

  • iPad 2
  • Gunnar Prescription Glasses
  • Aviator Sunglasses
  • Work Blackberry
  • HTC Thunderbolt (I’m just using it as an MP3 player; I needed something with wifi/bluetooth)
  • Wallet
  • Mio Knockoff – I prefer this flavor to the real thing
  • Small grooming kit (nail clippers/scissors, file etc)
  • MotoROKR bluetooth headphones/headset
  • Skullcandy Titanium earbuds
  • Nintendo Controller Shaped Mint tin that has a basic med kit:
    • Bandaids
    • Tape (used to use this a lot during JiuJitsu)
    • Neosporin
    • Tylenol, Aleve, Sudafed, Excedrin, Motrin, Imodium – single-use packs
    • Q-tips
  • GridIt Organizer. It’s essentially a web of elastic bands for holding random things.
    • Cables: iOS, USB micro
    • Memory sticks – USB/SD cards. Bootable linux systems/installers, quick sneakernet stuff, etc.
    • iPhone headphones (I keep these around because they have a mic built in, in case my BT ones run out of battery)
    • USB Battery Pack
  • Fisher Space Pen
  • Small Pen Light (I’d really like a better one of these…maybe for fathers day!)
  • Keys – normal stuff, I keep a metal cased USB drive for resume/contacts etc on here
  • Leatherman Fuse – The IT guys’ lightsaber (some days at least)

So that’s the daily carry; though the keys, wallet, glasses, phone (always on call!) and leatherman are usually on me. The rest of the stuff sits nicely in a swiss army laptop bag (and laptop is in there too).

I was working on a script that I wanted to behave one way when run by hand on the server in question, and another if it was being called non-interactively (like through an automation suite, or remote ssh etc).

There were a lot of interesting ways to do this, but I settled on this one as it worked in the majority of environments I tested it in (CentOS, Ubuntu, Mac OS):

isInteractive=$(tty -s; echo $?)

In the case I mentioned above, I wanted some extra info on the screen when interactive. In a script it would look like this:

[[ $isInteractive -eq 0 ]] && echo "This is interactive" || echo "This is not interactive"

#non shorthand
if [ $isInteractive -eq 0 ]; then
    echo "This is interactive"
else
    echo "This is not interactive"
fi

I’m not going to touch the shorthand vs. non-shorthand debate here; there’s a time and a place for both. In practice, if it’s non-interactive the value will be “1” (but “not zero” suffices). So when I run something by hand I get a format that lends itself well to a terminal, and when my web app makes an SSH call and runs the same script, it gets something different (since the full output can be improved with CSS, etc.).

Another good use I found: in an interactive session I may not want to redirect to /dev/null, so this effectively works as a debug mode as well.
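The debug-mode idea can be wrapped in a tiny helper built on the same tty check. A sketch (the debug function and its messages are my own illustration, not from the original script):

```shell
#!/bin/bash
# Sketch: a debug() helper that only prints on a real terminal.
# Under cron, autosys, or a non-tty SSH call, it stays silent.
isInteractive=$(tty -s; echo $?)

debug() {
    if [[ $isInteractive -eq 0 ]]; then
        echo "DEBUG: $*"
    fi
}

debug "connecting to database"   # shown only in an interactive session
echo "doing real work"           # always shown
```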

Pretty useful!

Part reminder to myself to build some of this at home to play around with, part for friends who are already doing this stuff. I’ll need to be somewhat generic here for obvious reasons.

I’ve been working with the development team to get all of the operations scripts (mostly bash but also some python, perl, ruby and then some PL/SQL as well) into a version control system. We decided to go with Git and because we’re already using Jira, Confluence, Crowd and a number of other Atlassian tools, we decided to go with Atlassian Stash.

Stash is kind of like GitHub or Bitbucket for companies that don’t want their code hosted externally. It lets you create your own projects, with permissions at the project and repo level. With yesterday’s 2.4 release of Stash it became a little easier thanks to the addition of repo-level permissions (2.3 and earlier only had them at the project level).

The goal for the actual environment is to start with some basic configuration files, get them into an organized structure, and then symlink to them. They’ll be owned by a new configuration user, and the only way to write changes to them will be to clone the Git repository and go through the pull-request workflow, which one of the senior members (such as myself) will approve. Then we’ll use a tool called Jenkins (you’ll sometimes see it under its older name, Hudson) to actually do the deployment. This gives us an audit trail and a way to control the execution. In time, all of our shell scripts will work in this manner.
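The symlink scheme is simple enough to sketch as a toy example. All paths here are made up; in reality the checkout would be a Git working copy owned by the dedicated configuration user and written to only by the Jenkins deploy job:

```shell
#!/bin/bash
# Toy sketch of the symlinked-config layout (placeholder paths).
mkdir -p /tmp/configrepo/app1 /tmp/app1-etc
echo "port=8080" > /tmp/configrepo/app1/server.properties

# The application reads through the symlink; changes only ever land in
# the repo checkout via the pull-request + deploy workflow.
ln -sf /tmp/configrepo/app1/server.properties /tmp/app1-etc/server.properties

cat /tmp/app1-etc/server.properties
```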

The other project that I’m working on is log aggregation, but I’m still doing some testing/research there. The solution I like the most so far is Logstash along with some of the tools people most often use with it. In a nutshell, it aggregates all your logs (which helps with disk space on servers too) onto a single server, where they are consumed and then made available through an interface like Kibana or Graylog so you can do analytics on them. We’re hoping to do this with our application logs.

I really wanted to use/learn Nagios as part of this project, but since I don’t have admin/root access it’s tough to push that through another group while it’s still just R&D.

A friend asked me about an easier way to send all the output from a bash script to a log file. He was sick of doing something like:

echo "Some message" >> $logfile
grep pattern file >> $logfile

Those are simple examples, but having to put >> $logfile on each line can be a pain, and you might miss output from commands you weren’t redirecting; this is especially relevant for workload automation (cron jobs, AutoSys).

Bash has a builtin named ‘exec’ that can lessen this load for you (among other things):

exec >$logfile 2>&1

To explain: with no command given, exec applies the redirections to the current shell, sending its stdout and stderr to this file. From then on, any command or function the script runs sends its output there.
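Put together, a minimal script using this looks like the following ("script.log" is a placeholder path):

```shell
#!/bin/bash
# Minimal demo of exec-based logging.
logfile=script.log

exec >"$logfile" 2>&1   # from here on, stdout and stderr both land in the log

echo "normal message"        # goes to the log, not the terminal
echo "error message" >&2     # stderr is captured too
```

Every line after the exec is logged without any per-command redirection.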

Edit: I thought I should mention that I use 2>&1 above because I often end up stuck on systems with older versions of bash. In modern bash, >$logfile 2>&1 can be shortened to &>$logfile.