Attention: As of January 2024, we have moved to counting-stuff.com. Subscribe there, not here on Substack, if you want to receive weekly posts.
Last week Benn wrote about the gap between what data products promise to do, what they actually do, and what users actually use them for. In it, while eloquently describing how data products promise these all-in-one data paradises, he notes that we users are stubbornly skeptical of the promise and thus don't really adopt the tech. This forms a pessimistic feedback loop: products promise such experiences and we largely ignore them. The lack of adoption often leads to the product failing, which further justifies to data folk why they shouldn't bother with such tools.
That whole piece struck a chord with me. I couldn’t quite shake off an unsettling tension in my mind.
The image that stuck uncomfortably in my brain was my microwave. Mostly, I thought about how the most used button on all microwaves in the world is the “+30sec” button. Even extremely cheap microwaves these days come with all sorts of extra functionality, like turntables, sensor cooking, a digital keypad instead of a knob, and all sorts of other features that are supposed to help users cook better and more easily. But almost everyone chooses to ignore all that extra functionality and press one simple “go!” button multiple times until they get the desired result.
The microwave represents a ubiquitous consumer product that’s hit a local maximum of innovation. It does the job of heating stuff well enough that most people only need to use a single button on it. To distinguish a microwave from its competitors nowadays, I’m sure a bunch of industrial engineers and usability researchers in that industry are looking into all sorts of new features and improvements. Despite all the energy invested into making a newer, better microwave, most consumers won’t care.
This pattern of making a product that fulfills a need, and then constantly trying to innovate on top of that product to make it ever easier and better for the user… very often leads to the strategic plan of answering the rhetorical question of “well, what if you could do EVERYTHING in the microwave?”. Since the core functionality of the microwave has been essentially solved, the only way to stay on the eternal treadmill of capitalism is to keep bolting on new features and lowering costs.
This image worries me because I see a lot of it in my work helping build software products over many years. It’s insidious, inexorable, and I’m honestly not sure if it’s a good or bad thing.
For example, let’s start with a hypothetical product, IrisFixer, with a core value proposition of “analyze your iris data and get results within a few hours!”. Customers surprisingly like it and buy it. They’re even clamoring for improvements. Given that it’s making a lot of money for us, we start listening to what the customers want. Some want a better UI. Some want to import their data in something besides CSV. Others want to export data to .xlsx. Still others complain that the product isn’t performant enough for their high-speed Iris Identification as a Service product.
All of these requests sound reasonable, and with the help of UX researchers and data analysis, you figure out which features to prioritize based on all the metrics and focus groups the company can afford. Eventually, customers start complaining about how they love your product, but their workflows are awkward and they want you to help. Wouldn’t it be nice if IrisFixer could import this other flower dataset? It’d be great if it could connect to normal databases and write updates. It’d be great if the data could be easily visualized without bringing in a third-party library.
But if you do any of that… you’ve already started walking down the path of building an all-in-one product. You can get here simply by listening to what your users want and giving it to them. You layer on convenience feature after convenience feature, all of which felt “obvious” and grounded in careful user research. Before you realize it, you have the makings of an all-in-one product — and since you’re so close to launching all those features anyways, let’s just turn it into an official product strategy. It’s time to not just focus on working with iris data! Let’s be bold and try to work with all plant data! We’ll become the one program that every plant dataset enthusiast will ever need!
The product strategy seems sound(?); it’s the logical conclusion of all the past investments you’ve made in the product. But then you start asking users what they’re actually doing with your product. Customers are opening up IrisFixer to download the Iris dataset and then saving it to disk before closing the program. A smaller percentage are at least using the visualization functions. Almost no one uses the OpenOffice export functionality that you so proudly announced at IrisCon-2022.
You realize to your horror that you’ve created an Iris microwave. You’re THE preferred tool for downloading the iris dataset, but only a tiny fraction of users do anything else with your software. Do you continue working on those feature improvements?
You also realize that, as the data scientist working on the product, you’ve done countless rational studies and analyses throughout to push this process along. Every incremental improvement was considered the best of the choices available, and yet we wind up with a product where most users will use only a single feature.
A path paved by the removal of friction
Throughout the story above, I kept asking myself whether it is possible to avoid the fate of wanting to become an all-in-one product. I had a huge amount of trouble imagining it because avoiding those features just felt so weird and unnatural. But I think my answer is that products would need to be comfortable with certain kinds of friction existing.
It’s natural to want to remove friction in products, because friction tends to frustrate users until they eventually stop trying to use the product. For example, it was really, really hard to use Hadoop back when you had to essentially write all the MapReduce code manually and submit the job via the command line. People made it easier by first making simple UIs to submit and manage jobs, and later created full-on SQL-compatible query interfaces to the data. Everyone celebrated the removal of this friction and we all got so much more work done.
But once you’ve made running queries easy, the next thing people spent a lot of time on was taking the query results and analyzing them in a notebook or in R. That meant getting the files off of HDFS and into your analysis software somehow. Wouldn’t it be better for everyone if we got rid of that friction by loading up your notebook for you and automatically downloading the data to a good location?
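That manual workflow looked roughly like this sketch (the HDFS path and column names here are hypothetical stand-ins, and the file fetch is simulated in memory; in practice you’d first run something like `hdfs dfs -get` to pull the query output down):

```python
# Sketch of the pre-integration workflow: a SQL-on-Hadoop job has dumped
# results to a CSV somewhere, and the analyst re-parses that intermediate
# file by hand in a notebook before any real analysis starts.
import csv
import io

# Stand-in for the file a query engine would leave behind, e.g. after
# `hdfs dfs -get /user/me/query_results.csv .` (hypothetical path).
raw = "species,petal_length\nsetosa,1.4\nvirginica,5.5\n"

rows = list(csv.DictReader(io.StringIO(raw)))
lengths = [float(r["petal_length"]) for r in rows]
print(len(rows), max(lengths))  # row count and the largest petal length
```

Every hop in this chain (query, export, fetch, re-parse) is a spot where an integrated product could plausibly step in and "help."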
Well, in order to simplify that, we might have to start absorbing functionality from one tool into another — specifically, out of your familiar tool and into our product. Suddenly the product gains a SQL console as well as an internal notebook interface. The product can smoothly move data from one to the other without generating intermediate CSV files. It’s faster! It’s easier! It’s… met with suspicion from all the users because it’s disrupting their comfortable workflow. The drive to lower friction for all users winds up leaving a bunch behind.
You could argue that data folk aren’t adopting the new highly integrated features as much because they’re just distrustful of issues like product lock-in. But I’m more inclined to believe that there’s something more going on — the removal of friction in one place might be creating a different kind of friction in another.
Friction from removing friction
Going back to cooking device stories for a moment: for the past two months I’ve been forced to use my Instant Pot electric pressure cooker as my primary cooking device due to ongoing home repairs. That forced time has made me hate most of the functions on the thing. The designers clearly intended it to be used by people who have very basic cooking skills. The thing has all sorts of beeps to tell users when the pot is hot enough to brown and sauté something, and more beeps for when to take food out.
To make a device this “smart”, the designers had to err on the side of food safety, and it overcooks things if you blindly follow the prompts without accounting for its timing quirks. I’ve actually finished cooking items like eggs in the time it takes for it to heat up and then beep to “add food”. I’ve now learned to just turn on various functions, ignore all the vapid beeping, and cook as if it were an awkward hot plate. In removing a major point of friction for beginning cooks — knowing when to add food and when to take it out — it has created even more friction for me making dinner.
When it comes to working with data tooling, we’re all generally experienced cooks. We know what we want to do with our data already, and that use is often so unique to our specific problem that we don’t expect any product anywhere to provide the functionality off the shelf. Even if the functionality were offered, we think we know better or are otherwise required to do something unique. Integrated solutions that are supposed to be easy to use become problematic because we don’t agree with whatever design decisions the product made. So we export our work out to CSV and go back to our unique custom workflows.
Take this behavior and average it across every user and you’ll wind up seeing the familiar pattern — a handful of core activities are used by most people, then a rapidly falling long tail.
So, all-in-one data products are in a weird position of needing to accommodate a bunch of opinionated users who constantly think they know better and want to fork their work off into some custom thing. Unless the all-in-one manages to encompass enough of those use cases, most users are going to want to find a way to escape to their own workflows. This sounds fairly unlikely, so everyone just does their own thing: the startup making a new tool can’t retain enough of a customer base to keep the lights on, the tech gets acquired by another company, and the cycle begins anew.
Innovation is hard
A lot of product building work is incremental. We take something that people like, and improve upon it by adding features or making other changes. This includes all the work we do in data science involving building products. This means we’re all participating in the steady march towards making the product do everything our users could ever want.
But as people who are influencing important strategic and design decisions with data, it’s important for us to pay attention and call out the fact that maybe we’re scraping the boundary of a local maximum.
Standing offer: If you created something and would like me to review or share it w/ the data community — just email me by replying to the newsletter emails.
Guest posts: If you’re interested in writing a data-related post to either show off work, share an experience, or get help coming up with a topic, please contact me. You don’t need any special credentials or credibility to do so.
About this newsletter
I’m Randy Au, Quantitative UX researcher, former data analyst, and general-purpose data and tech nerd. Counting Stuff is a weekly newsletter about the less-than-sexy aspects of data science, UX research and tech. With some excursions into other fun topics.
All photos/drawings used are taken/created by Randy unless otherwise credited.
randyau.com — Curated archive of evergreen posts.
Approaching Significance Discord — where data folk hang out and can talk a bit about data, and a bit about everything else. Randy moderates the discord. We keep a chill vibe.
Support the newsletter:
This newsletter is free and will continue to stay that way every Tuesday, so share it with your friends without guilt! But if you like the content and want to send some love, here are some options:
Share posts with other people
Consider a paid Substack subscription or a small one-time Ko-fi donation
Get merch! If shirts and stickers are more your style — There’s a survivorship bias shirt!
Love the comparison to microwaves. I've reduced my own microwave usage to a single button: +30 seconds. Sometimes it results in more button presses, sometimes fewer. But it's a single function that's always guaranteed to work, no matter my situation.
You're right that the data community tends to be very opinionated in how they build their solutions. It sounds like you take the perspective that most data tools first solve one thing then try to expand, offering too many functionalities that feel half-baked.
But what if the one thing the tool solves initially is the ability to build and launch custom solutions (a.k.a code)? I still hold the belief that data orchestration is the ultimate all-in-one tool. Its sole functionality is repeatable code execution. The more teams that figure out it can be used to run any solution under one roof (ETL pipelines, machine learning models, AI processes, etc.), the more it's going to gain in popularity and usage.
When I was a product manager we started trying to find things to get rid of. Our inspiration was Steve Jobs (and I am not an Apple person) when he said “We’re trying to make great products for people, and we have at least the courage of our convictions to say we don’t think this is part of what makes a great product, we’re going to leave it out.”