At home, I’m the primary cook because I’ve been treating cooking as a hobby since before I met my spouse. As some readers might know, I have a tendency to take my hobbies to absurd lengths, and somewhere along the journey of reading various techniques and styles of cookery, I wound up absorbing a lot of details about food preparation and food science.
The wonderful part about having access to that sort of knowledge is that I’m able to follow along when other advanced cooks discuss recipes. A common task like tossing garlic in hot oil right before stir-frying is quickly described with the word “Bao” (爆). It’s quick to describe how to cut things by referring to whether it’s a mince, dice, chiffonade, julienne, or rolling cut. Complicated but fundamental details like how to check for doneness can often be skipped over because you can assume the listener will know how to tailor doneness to their liking. It all only makes sense to someone who’s spent a solid chunk of time working with food.
But on the flip side, this same knowledge causes trouble when my wife finds some random Chinese recipe on the internet and is unable to translate the exact details about what needs to be done. Sometimes it’s because she doesn’t know how to translate the terminology from the language the recipe is written in, or more often, even the recipe writer doesn’t use (or maybe know) any comparable terminology. A lot of time is wasted flipping back and forth through videos and pictures trying to figure out what needs to be done. It’s a poor experience all around.
This is the reality of jargon
To me, this story about my frustrations with certain recipe writers is a perfect picture of the power that jargon and professional terminology have in making communication work. It’s downright painful at times when someone tries to communicate a technical topic, doesn’t use the jargon, and on top of that doesn’t make an attempt to communicate clearly without it. It’s often possible to communicate without jargon, but it takes more time and effort to do so.
These days, discussions about jargon very often touch upon the topic of gatekeeping, where people use inscrutable technical language as a weapon to exclude people from joining a community. “You don’t know why you should normalize your feature vectors while bootstrapping your hyperparameter search of heteroscedastic tensors in Julia? You’re not a data scientist.”
Let’s be clear that gatekeeping continues to be a problem in our field and those particular asshats who use language as an exclusionary weapon can [graphic content omitted]. We’re lucky that a large and visible part of the data community is very welcoming and actively invites new people to join.
As in all other technical fields, jargon is used for both good and evil. In the past decade or so, a lot of rightful attention has been placed upon jargon for its many negative uses as the general population slowly learns to have conversations about the complex issues around power, language, inclusion, and exclusion. But today I want to highlight another important aspect of jargon.
Data science is too young to have much jargon of our own, but we borrow from many other fields, and will continue to do so
Think about all the highly technical language that is used in data science today. There’s all the statistics-related jargon that pops up all the time. All our AI/ML jargon essentially comes straight from the respective CS and math fields. There’s the computer science and software engineering jargon when we start talking about databases, pipelines, agile methodology, and tech. Then there’s all the business-related jargon like “ARR” (Annual Recurring Revenue), LTV, etc. On top of it all, there’s the domain-specific jargon that’s needed to work in any particular industry, from medicine to app stores to finance.
I honestly can’t think of anything that’s particularly unique to data science, where we use a word in a very specific sense that few people outside the field do. As a field, we’ve barely had an identity for 15 years. Forget about special terms; we can barely agree on what to call ourselves yet (am I DS? ML eng? Data eng?).
Even while we don’t exactly have our “own” bit of jargon, we sit at a nexus of other people’s jargon. Being put in a position of working with ideas and people from a huge range of fields, we must walk a very complicated line, constantly code-switching between speaking with and without jargon, and even between different flavors of jargon depending on who we’re talking to.
The communication aspect of data science is one of the most important parts of the job, to the point where some weird parts of the internet have apparently invented a job title I hadn’t heard of, analytics translator (from 2018 O_o?). I’m happy that, a few years later, that title seems not to have gained any traction, because it indicates that the work is being done and there’s no organizational gap. Hopefully it’s us data scientists taking up the work.
In a cross-discipline position, we work better with other functions when we have a deep understanding of the teams we communicate with. Being effective means we inevitably need to absorb and make use of their technical language. We all know that we shouldn’t go deep into details about Bayesian statistics with a designer; speaking in clear, plain terms that help the designer understand what we mean is an accepted part of the job. But equally important, we need to learn how designers, or marketers, or policymakers, use their own technical terms. Those terms are a reflection of how they view and organize the world.
That’s what it means to “translate” between fields. It’s not as simple as converting everything into a single universal language, because such a thing doesn’t exist. You’re always going to adjust how you explain things to the audience by using words and metaphors they understand. Like what Hilary Mason did explaining AI to 5 different people in this video.
If the inventory folk see their world in terms of shipping dates, shipping containers, SKUs, and lead times, our analyses and models need to express the world in comparable terms too. When you’re providing finance with data they need to close the books by a certain date, you need to recognize they need the numbers to be as accurate as possible and not estimations.
We should embrace other people’s jargon
The only way we’re going to learn the jargon of other domains is to dive in and absorb as much as possible. I always argue that we live and die by our command of domain knowledge, and learning local stakeholder jargon is an important aspect of that. That keeps things simple for us: learn enough about the fields of your partners so that you can understand, communicate, and work with them, and that will give you the jargon you need. All we then need to do is use the terms that we learned.
It’s an act of inclusion to use people’s own familiar words to communicate, and an act of exclusion when you use words they don’t know (unless you intend on explaining/teaching). So any discussion about how we use technical terminology needs to be grounded in the context of how it’s wielded. Against that backdrop, do the good thing.
About this newsletter
I’m Randy Au, currently a Quantitative UX researcher, former data analyst, and general-purpose data and tech nerd. The Counting Stuff newsletter is a weekly data/tech blog about the less-than-sexy aspects of data science, UX research, and tech, with occasional excursions into other fun topics.
All photos/drawings used are taken/created by Randy unless otherwise noted.