Be Yourself: The Data Scientists You See In Public Are Not Representative

It needs to be said regularly: it’s OK to be nothing like them

Photo: Randy Au

Hi, person who is, or wants to become, a Data Scientist!

Think of any data scientist you admire. Are they the founder of one or multiple startups and foundations? Do they have fancy titles and degrees like Senior Staff Chief Something, PhD at FAANG? Do they have a gajillion Twitter followers and have the best shitposts? Do they blog/newsletter/write books faster than you can read? Is their GitHub chart greener than the plastic plants on your desk? Do they have more Kaggle prize money than your annual salary?

Do you look at yourself and find yourself speechless at the gap between where you sit and them “over there”? I’ve got something important to say to you.

You don’t have to be ANYTHING like those people. At all. They’re not normal. It doesn’t take nearly that much to work within the data sciences.

While these people are definitely awesome and can sometimes be an inspiration, they’re also standing atop unknowable intersections of luck, opportunity, and survivorship bias. Fame is rare by definition. If a bunch of data scientists resembled them, even just a little bit, they wouldn’t stand out any more. We’d have different heroes.

As data scientists, we’re supposed to be experts at understanding things like sampling, and hidden bias lurking in our data sets. Cutting through that chaff to get to the delicious grains of knowledge within is what we’re about. But thanks to human nature, it can be surprisingly difficult to turn that thinking inward.

The vast majority of people who work in data don’t live and breathe the stuff constantly. They have families to take care of, non-data hobbies to enjoy, and friends to play with. They don’t enjoy programming so much that they do it for fun at home. They don’t compulsively stalk arXiv to read the latest Machine Learning papers. They don’t go on speaking tours or constantly attend/run Meetups.

But despite not doing ANY of these things, these people will provide useful insights to their organizations, engineer robust systems, run experiments, and do important work. They just do it quietly, unseen by the world at large. Even the unicorns don’t do all those activities simultaneously!

Do what you want

In my own case, I write semi-regularly because it’s cathartic to get stuff written down. But I spend the majority of my non-work time cooking a tasty dinner, fixing things I break at home, and taking care of the baby. Work stays at the office, and none of my side projects has anything to do with data.

While we often hear about imposter syndrome, where the very people we admire doubt that they’re worthy of the admiration they get, we hear less about just where the bar is before it’s acceptable to consider yourself part of this great community.

I’m here to say that the bar is, and should be, low. If you work with data, you’re doing stuff that is within the scope of the data sciences. The job title of “Data Scientist” is merely given to people who meet an arbitrary set of skills and experience as dictated by some hiring manager. It varies too much to be very sensible. It’s just noise.

Don’t get intimidated by all the people you admire. They’re high above folk like you and me exactly because they’re awe-inspiring unicorns.

Also, gate-keeping jerks can sod off.

Have a good holiday season, everyone.