ALL ART BURNS

It does, you know. You just have to get it hot enough.

Thursday, March 2, 2006

I have an unhealthy fascination with metrics and graphs

For the past few years I’ve been working for a high tech company doing various tasks related to provisioning service on clients that make a regular connection to home base. That’s a rather wordy way of saying, “I make people pay us every month and shut them down when they don’t.”

About a year into my job I inherited the code that does all this and took the opportunity to rewrite it from the ground up. One of my first self-assigned tasks was code instrumentation: generating log messages that would reveal how often the code did what and how long it took to do it. I’ve done this enough times that I’ve learned to make the output easy to graph or massage with other programs. (Most of my career seems to be discovering something needs to be done, deciding to do it, then having to make graphs to justify the time spent on the task.) I had no idea what the code did or how often it did it, so there was no way for me to tell someone how many servers we’d need to add to support a given number of new clients. Logs and graphs seemed the obvious solution, with Perl and gnuplot being the obvious tools.
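As a rough illustration of what graph-friendly instrumentation can look like, here is a minimal sketch in Python rather than Perl. The `timed` helper, the operation name, and the tab-separated layout are all assumptions made for the example, not the code described in this post.

```python
import sys
import time
from contextlib import contextmanager


@contextmanager
def timed(operation, out=sys.stdout):
    """Write one tab-separated line per operation:
    start time (epoch seconds), operation name, elapsed seconds."""
    started_at = time.time()        # wall-clock time, for the x axis
    start = time.perf_counter()     # monotonic clock, for the duration
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        out.write(f"{started_at:.0f}\t{operation}\t{elapsed:.6f}\n")


# Hypothetical usage: wrap whatever piece of work should be counted and timed.
with timed("check_account"):
    sum(range(1000))  # stand-in for real work
```

One line per event with fixed columns means the log can be filtered, counted per hour, or handed to a plotting tool without custom parsing.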

Now — several years later — I can call up massive graphs of server activity and point to various events in our product’s history: There’s where we launched a new optional feature in version N, and here’s where we made it a built-in feature a few versions later. There’s Christmas. Well, it’s not Christmas, really, it’s the first weekend after Christmas, because most customers don’t bother using it on Christmas day. There’s the day the power went out in the server room, there’s the day the router died and the failover didn’t, there’s the day, well, you get the idea.

One of my favorite things to look at is how the curves flatten over time. Some events only happen once a month, others once a week, others once a year and others only when the client is first activated. Due to random communication issues, not every client reports to the service every day, and over time the partial harmonics of the initial event (say, the first weekend after Christmas) slowly flatten out and turn into the fundamental of the daily connection. A spike of activity corresponds to a spike in server load, and spikes in server load mean we either have idle equipment when there isn’t a spike (too many servers) or overloaded equipment when there is (not enough servers). We’d like to avoid overloaded servers as much as possible and not have equipment sitting around idle, so there’s a reason for me to pay attention to the graphs. And hey, this is something I like doing in the first place.
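The flattening can be reproduced with a toy model. The sketch below assumes every client starts a 30-day recurring event on the same day, that each client reaches the service on any given day with a fixed probability, and that the next event is scheduled from the day the last one actually happened. All of the numbers and the `simulate` function are invented for illustration; none of them come from the real service.

```python
import random


def simulate(clients=2000, days=360, period=30, p_connect=0.8, seed=1):
    """Return the number of recurring events handled on each day."""
    rng = random.Random(seed)
    next_due = [0] * clients          # every client is due on day 0: one big spike
    events_per_day = [0] * days
    for day in range(days):
        for c in range(clients):
            # The event only happens if it is due AND the client connects today.
            if day >= next_due[c] and rng.random() < p_connect:
                events_per_day[day] += 1
                next_due[c] = day + period   # missed days push the schedule back
    return events_per_day


events = simulate()
first_peak = max(events[:30])    # busiest day of the first cycle
last_peak = max(events[-30:])    # busiest day of the final cycle
print("first-cycle peak:", first_peak, "last-cycle peak:", last_peak)
```

Each missed connection nudges a client's schedule a little later, so after enough cycles the single sharp spike has smeared across several days and the busiest day is much less busy than it was at the start.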

A few days ago I came up with a way to flatten the spikes within a day or two instead of within months or even years. It’s a painfully obvious solution and something we should have been doing all along to smooth out load on the servers. We didn’t suffer any problems, but that’s like saying I didn’t need to wear my seatbelt today because I didn’t get into an accident.

There’s just one problem with the fix: my beautiful graphs will turn into efficient, flat, boring, and completely uninteresting lines. A life unlived is not worth living, a life without interesting graphs is a life not worth graphing.
I need graphing in my life. I need to find something to graph soon lest I go into withdrawal.

Current options include:

  • Get a power meter for my bicycle trainer and start graphing that against my exercise routine and caloric intake. Also get a heart rate monitor and use that once it warms up and I’m riding on the streets again, then compare those graphs to the power meter graphs and my daily weight.
  • Put a weather station on the roof and compare it to our gas and electricity usage. This would require devising an optical recognition system that could read our ancient gas meter every minute and transmit the data over wireless. (Bonus points if I make it out of Lego Mindstorms, double points if it survives the winter.)
  • Put wattmeters on all my wall outlets and figure out why my electric bill is so high.
  • Sit down and type in the ~5 years’ worth of data I have on my truck and make some graphs. I’ve written down all maintenance, gas usage, and abnormal driving patterns since the day I bought it, so I might as well do something with the data.
  • Write a system monitoring app that compares how much time I waste waiting on apps and browsing the InterWeb to how much time I actually work, then have it display realtime, hourly, daily, and weekly summaries on a big screen over my desk.


posted by jet at 23:02  

2 Comments »

  1. Plug-in wattmeters? Child’s play!

    Comment by jwz — 2006/03/03 @ 00:17

  2. Exactly!

    Comment by jet — 2006/03/03 @ 10:18
