Attention: As of January 2024, we have moved to counting-stuff.com. Subscribe there, not here on Substack, if you want to receive weekly posts.
Yeesh, July 1st seemed like the implosion of social media as we know it. So here’s that old crowdsourced spreadsheet of “places data people hang out”. Also feel free to find me on Mastodon and BlueSky.
Also, recovery is going quite well. But this is still a somewhat shorter post than usual this week so I can rest a bit more.
Many years ago, while I was on my 2nd(?) layoff and doing the whole messy job search thing, I somehow wound up speaking to some people at one of the many ed-tech startups offering coding and data science bootcamps and classes that were popping up like weeds at the time. Mostly we were talking about my views on being a data analyst and doing data science and product work, and whether any of that would be interesting to work into whatever data curriculum they were developing.
Nothing really came of the conversation, but one offhand comment from the person I had been talking to stuck with me: “you make it sound like anyone could be an analyst”. They said that because I was describing how I generally stick to using very simple, easy to understand methodologies and focus most of my energy on the domain knowledge of understanding the data itself. The fundamentals of doing analysis are a combination of domain knowledge and the ability to carefully reason about the numbers being found.
At the time (we’re talking around 2016 here), I would have largely agreed with the comment; I did think that most people could become analysts. But now I think there’s a lot of nuance that the me of 6+ years ago failed to consider.
If I had to restate my beliefs now, I’d say that everyone can learn to analyze, but only a smaller (though still pretty large) subset would make good analysts.
The distinction is that there’s a step between “person who can analyze a situation” and “person who can analyze many different situations”.
Most people can analyze
At a fundamental level, analysis is taking data, reasoning about it, and coming to a conclusion. Most people are capable of doing this because it’s practically necessary for living in society. A lot of what we do in life is take information in, reason about it, then come to a conclusion or decision.
In a professional setting, I’ve talked to tons of people across all sorts of backgrounds and levels of data-savviness, and most of them can at least take information in, work with it a bit, and then put that information to use. Usually, these people aren’t familiar enough with the methods that data analysts use to pull and make sense of data to do the data work themselves. But every time someone follows along with a chain of reasoning you explain to them, there’s a good chance they could have come up with that reasoning themselves if they knew the methodology and tools to do so.
The implication is that people are very likely to be capable of analyzing the data of their specific domains of expertise. They already have a solid understanding of the basic processes that underlie the data they’re looking at; all they need is help learning when and how to apply the right methodologies to the problem at hand.
Consider the basic numerical tools that we use:
Counting
Ratios and percentages
Descriptive statistics
Sampling techniques: randomized, stratified, bootstrapping, etc.
Experiments
Quasi-experiments
etc.
We regularly teach many of these things in single “Methods for Researchers” courses to first year graduate students. Those courses won’t impart mastery, but they provide more than enough knowledge for someone to become dangerous. Combine that basic knowledge with years of domain knowledge and you have a person who can probably analyze a situation and come to a pretty reasonable conclusion with some effort and time.
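To give a sense of how little machinery the first few tools on that list need, here is a rough sketch in Python using only the standard library. The support-ticket data is invented purely for illustration, and `bootstrap_mean_interval` is a hypothetical helper name, not something from a real project; treat it as one possible way to do this rather than the way.

```python
import random
import statistics
from collections import Counter

# Hypothetical toy data, invented for illustration only:
# one category label per support ticket.
tickets = ["billing", "login", "billing", "bug",
           "login", "billing", "bug", "billing"]
# Hypothetical resolution times (hours) for the same eight tickets.
hours = [2.0, 1.0, 3.5, 8.0, 0.5, 2.5, 6.0, 4.0]

# Counting: how many tickets fall into each category.
counts = Counter(tickets)

# Ratios and percentages: share of all tickets that are about billing.
billing_share = counts["billing"] / len(tickets)

# Descriptive statistics: center and spread of resolution time.
mean_hours = statistics.mean(hours)
median_hours = statistics.median(hours)
stdev_hours = statistics.stdev(hours)  # sample standard deviation


def bootstrap_mean_interval(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so reruns give the same answer
    # Resample with replacement, same size as the original data each time.
    means = sorted(
        statistics.mean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


low, high = bootstrap_mean_interval(hours)

print(counts.most_common())
print(f"billing share: {billing_share:.0%}")
print(f"mean={mean_hours:.2f}h median={median_hours:.2f}h stdev={stdev_hours:.2f}h")
print(f"95% bootstrap interval for the mean: {low:.2f}h to {high:.2f}h")
```

None of this tells you whether a four-hour resolution time is good or bad, or why billing dominates the queue. That part still comes from knowing the domain.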
I would not be surprised if MOST data science folk actually come into the field this way. People are more likely to become excited about or immersed in some topic first. They then learn the data analysis tools to make sense of the data they come across. A much smaller group of us probably entered with a burning interest in statistics and methodology and only later found outside domains to apply it to.
But not everyone’s an analyst
While I think many people enjoy understanding their domain just a little bit more to help them accomplish their goals, I don’t think everyone is as interested in being required to switch to other, often unrelated, problems over time. Being a data analyst means that you take your knowledge of data science tools and methodology and apply it to different domains as needs arise.
In a sense, data analysts are generalists. Analysts who have to travel between projects and problem spaces won’t have the same amount of domain knowledge as a resident expert, but we make up for it by having seen similar situations with similar data and leveraging that knowledge to be effective. The transformations we apply to the data are still the same, even if the interpretation changes around us. Being good at the job means learning how to balance the confidence we derive from our methods and tools with the humility needed to know that we don’t actually know anything about what these numbers “mean” without help.
I don’t know about you, but I don’t encounter too many people who are… mercurial… enough to fully embrace the “I am here to help you with math and computers!!” way of life while also remaining humble enough to actually listen to domain experts. You sometimes come across examples of people who don’t get that balance right, and disasters result.
My own career followed the “become an analyst” trajectory because I love bouncing between fields. It’s a fundamental part of how my mind works and latches onto information. But if you were to ask me to teach others to do the same, I’d struggle to even envision what such a course would look like. Perhaps somewhere out there, there’s a way to instill a similar kind of curiosity and pattern-finding across domains.
If you know how, please let me in on the secret!
Standing offer: If you created something and would like me to review or share it w/ the data community — my mailbox and Twitter DMs are open.
Guest posts: If you’re interested in writing a data-related post to show off work or share an experience, or if you need help coming up with a topic, please contact me. You don’t need any special credentials or credibility to do so.
About this newsletter
I’m Randy Au, Quantitative UX researcher, former data analyst, and general-purpose data and tech nerd. Counting Stuff is a weekly newsletter about the less-than-sexy aspects of data science, UX research and tech. With some excursions into other fun topics.
All photos/drawings used are taken/created by Randy unless otherwise credited.
randyau.com: Curated archive of evergreen posts.
Approaching Significance Discord: where data folk hang out and can talk a bit about data, and a bit about everything else. Randy moderates the Discord.