Skip to main content

I Love Data

·3 mins

At work, we recently started to graph the time it takes to access a branch in Launchpad via SSH, both on our staging and production servers.

I love this so much. Having this data is liberating. It’s like turning on a light in a dark room: suddenly I can go from uncertain, careful, hesitant steps to bold, confident strides.

In fact, the metaphor stretches further. This new light has revealed objects of interest (that is, spikes in the time taken) that I wasn’t aware of before. I need to know how long these spikes last, how they correlate with load on the system and so forth. Soon there’ll be more graphs, and I’ll be able to correlate them and analyze them and suck on their delicious, numerical marrow (the light metaphor long abandoned).

Regardless of whether it’s liberating illumination or nourishing, savoury meat, this new graph makes me wonder why I don’t chart other things I care about. Having such graphs would help me see the impact of my actions. I could chart things like my bank balance, my waistline or the number of Latin words I learned this week. Then I could answer questions like “do I spend more on Tuesdays?” or “can I afford this iPod/cake/gerund?”. This is important, because when I am tempted with a sleek/delicious/perplexing iPod/cake/gerund, I fall back on my own judgment. I think it’s time to confess, dear reader, that my own judgment isn’t very good. And yet I continue to trust it.

The reason I don’t have such graphs is that they are inconvenient to maintain. A graph of branch access times is easy. All you need to do is describe how to get the measurement, and then do a bit of once-off set up. If you want to graph your body weight, you need to get on the scales at a fixed time and then look at the dial and adjust for parallax and then write a number down somewhere and maybe note down whether this is before or after a meal and then take the number and then add it to a spreadsheet. Most online banking sites I’ve seen are even less convenient than this.

Getting data on life is hard, but for programming it’s easy. What things do you care about on your project? Features, user experience, how fast bugs get fixed? Can you quantify these things? Can you make a pretty picture out of it? If so, do it now! Post your answers here, set up something like Cricket and then profit!

P.S. I can’t believe I got to the end of this post without saying how awesome the Canonical sysadmins are. Let me correct that now. The Canonical sysadmins are heck of awesome. They could simply walk into Mortor.