Data scientist (Shron LLC, ex-OkCupid)
Who are you, and what do you do?
I run a data science consulting firm in New York. Organizations with too much data come to me to figure out what they should do with it.
For the most part, people know me because of my previous job as the Data Scientist at OkCupid. My job was to munge and model the massive amount of information that OkCupid collected. Mostly it was helping Christian Rudder with his work on OkTrends and doing more traditional business analysis kinds of things.
What hardware do you use?
I’ve just slimmed down to one of the new 13” MacBook Airs. My apartment is typical tiny New York, and I got tired of having a giant screen taking up my kitchen table. I carry an iPhone 4. I often need a pen and paper, so I try to keep G-Tec-C pens and a Moleskine pretty much everywhere I go.
When I’m doing analysis, I spend most of my time connected to powerful BSD and Linux boxes. Typically they have upwards of 16 cores and gobs of RAM. SSDs make everything better.
To keep myself pain-free I’ve got a Fellowes wrist-rest, an Evoluent vertical mouse, and a HÅG Capisco chair. The chair makes a huge difference; instead of sinking back into a blob all day, I spend probably half of the day leaning slightly forward with my legs engaged. My desk is sloped backwards a few degrees to help keep my wrists straight.
And what software?
Once you get past iTerm2 (with the light Solarized color scheme) I’m fairly old school. Everything happens in the shell except for browsing the web or playing games (may I recommend Machinarium if you haven’t played it?). Screen, vim, awk, sed, and lots and lots of Python. Python, Python, Python. We tend to have a small number of powerful computers rather than a big cluster, so I write a lot of map-reduce kind of things by hand, using the subset of Python that is basically a functional language, along with awk and sort.
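As a rough illustration of the hand-rolled map-reduce style described above, here is a minimal sketch in the functional subset of Python. The word-count task and the names `map_phase`, `reduce_phase`, and `word_count` are assumptions made for the example, not code from the interview.

```python
from functools import reduce
from itertools import groupby
from operator import itemgetter


def map_phase(lines):
    """Emit a (word, 1) pair for every whitespace-separated token."""
    return ((word.lower(), 1) for line in lines for word in line.split())


def reduce_phase(pairs):
    """Sort pairs by key, then sum the counts within each run of equal keys.

    The sort stands in for the `sort` step of a shell pipeline: groupby
    only merges adjacent items, so the input has to be ordered first.
    """
    ordered = sorted(pairs, key=itemgetter(0))
    return {
        key: reduce(lambda total, pair: total + pair[1], group, 0)
        for key, group in groupby(ordered, key=itemgetter(0))
    }


def word_count(lines):
    """Hypothetical end-to-end job: map, then sort and reduce."""
    return reduce_phase(map_phase(lines))


if __name__ == "__main__":
    print(word_count(["a b a", "B c"]))
```

On a single large machine the same shape can be spread across cores by running the map step on separate chunks of input and merging the per-chunk results, though how that is done in practice is not something the interview specifies.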
NumPy and the SciKits (which is probably a great name for a fictional band) free my brain up to focus on applications instead of bookkeeping. Ditto for pandas, a beautiful way to get R’s DataFrame-like functionality in Python. IPython is my other shell, though as my projects have gotten more complex I’ve been using it less.
I realized recently, when I got my new computer at home, that almost everything I own is now stored on other people’s servers or on a portable hard drive. GitHub, Dropbox, Gmail, Steam, and a high-speed internet connection disincentivize actually storing anything locally in a meaningful way.
What would be your dream setup?
Someday perhaps I will go around carrying only a book, a change of clothes, a pen, a water bottle, a folding umbrella, and a little capsule that turns into my livelihood when opened. Rollable hi-res screen and keyboard, tiny computer the size of a cell phone or smaller but as light as a pen, with high-speed satellite connectivity anywhere on the globe. In this world, my sleeping bag, pad, and windproof hammock weigh only a pound put together. For half of the year I travel the world, alone and with companions, with a small bag slung over my shoulder like Kwai Chang Caine. We sleep outdoors, travel on trains, and a few days a week sit someplace cozy and create beautiful software or solve interesting problems that improve the world.