Many of you are probably familiar with this situation. You go through a lengthy process to instrument and measure something. After months of work with engineering, you proudly report that 10% of users that look at a certain page wind up clicking the important sequence of buttons to make a purchase.
Then, the inevitable question comes: “Is that good or bad?”
There’s a beat of silence as you think about your possible answers. The safest answer is probably “I don’t know, I’ll get back to you”, because all the data you’ve collected so far only allows you to make descriptive statements about the world; you’ve done no work toward making prescriptive statements. I’ve mentioned this before: quantitative methods are really good at saying what the state of the world is right now, but they’re much less suited to saying what the world should be like, because you can’t directly measure the future.
Normally I would just wave my hands and say that such prescriptive statements are primarily in the hands of either business stakeholders (who can say things like “we need the number to be 15% or it’s not profitable”) or researchers who can do a bunch of research to figure out what things should be. It’s generally a topic out of scope for whatever it is I’m writing about at the moment.
Today, I’m going to write a bit about some of the things we can do to come up with prescriptive recommendations.
Broadly speaking, there’s an easier way assuming you have access to domain knowledge, and a harder way, which effectively builds that domain knowledge.
The easier, faster way — compare to others
I don’t know what YOUR specific response rate to the customer survey you just sent should be a priori, but I do know that almost every customer survey that I’ve ever emailed out (and didn’t forcibly inject into some kind of flow) generally had single digit response rates. Without any information about who’s receiving the survey and whether they’d be hyper-engaged, I’d default to guessing a 1% response rate and be pleasantly surprised if it’s actually 3-5%. If you’re seeing 0.1% there’s likely a problem somewhere.
Where did I pull those numbers out of? My past domain experience. I’ve seen enough similar situations to have a rough idea of where to expect things to be, at least to within an order of magnitude. I’ve also seen enough similar situations to know that if some things change, like if the survey sample is targeted very well, or the survey is forced into a user flow, etc., then the response rates could be significantly different.
Obviously, we can’t experience every possible situation, so instead we can borrow from the experience of others. If we have a history of sending out previous surveys, we can use that as a reference point. We can ask other people who have done surveys about their experiences and response rates.
At this point, you’re essentially doing a form of research, benchmarking your survey response rate to others.
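If it helps to make that concrete, here’s a minimal sketch of what that benchmarking looks like in code. The counts and benchmark ranges are illustrative placeholders (loosely based on the “single digits, be happy with 3-5%” intuition above), not published figures; you’d swap in your own historical surveys or carefully vetted sources.

```python
# A minimal benchmarking sketch. All numbers below are illustrative guesses,
# not real published benchmarks -- replace them with your own reference points.

# Hypothetical results from the survey you just sent
invites_sent = 12_000
responses = 140
response_rate = responses / invites_sent  # ~1.2%

# Reference points: (low, high) response-rate ranges per delivery channel
benchmarks = {
    "cold email blast": (0.01, 0.05),      # roughly single digits
    "well-targeted email": (0.03, 0.10),
    "in-product intercept": (0.10, 0.30),  # forced into a user flow
}

channel = "cold email blast"  # pick the benchmark that matches your situation
low, high = benchmarks[channel]

if response_rate < low:
    verdict = "below the typical range -- check for delivery or targeting problems"
elif response_rate > high:
    verdict = "above the typical range -- nice, but double-check it's apples to apples"
else:
    verdict = "within the typical range for this kind of survey"

print(f"{response_rate:.1%} vs {channel} benchmark {low:.0%}-{high:.0%}: {verdict}")
```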
But, as with all research, you need to be extremely careful to make sure you compare apples to apples. You can very easily ask your favorite search engine what a good survey response rate is… and all sorts of vendors and other survey-industry sites will happily tell you any number you want.
If you’re not paying attention, it’s very easy to find an answer that either gives you false doubt, or false confidence, about whatever number you’re getting. So you need to be very thoughtful about your benchmarks, which in turn means it takes a serious amount of work to do it well.
But after doing a bunch of such work, you might actually learn that no one has published anything that resembles your situation. Without reference points, you’re going to have to do some work to try to figure out what’s going on.
Plan B: Figure out if “the metric can be something else”
First, the thorough method
When you don’t have a reference point for how good a metric is, but have an idea of which direction you want it to go (e.g. higher is better), there’s a fallback strategy to try — see where you could potentially do better.
The basic premise behind this line of thinking is simple:
we know where we want the number to go (up or down)
we usually know that there’s a theoretical limit (e.g. not everyone who enters a physical store will spend >$50, and you can’t have a >100% response rate to a survey, etc.)
we just don’t know where that limit is, nor how close we can get to it, but we’re very invested in trying to do better
so we map out our process, figure out how we’re doing along that process, analyze all the pieces, and see what can/can’t be improved
Hopefully you can see how this process often leads to a bit of recursive researching.
Let’s say you’re analyzing a funnel for your paid newsletter. You have a bunch of anonymous readers visiting your newsletter, then some portion of them sign up for the free version, and some smaller portion of people sign up for the fully paid version. At the end, you care about what percentage of readers who aren’t already paying convert to the paid version. Currently that number is 1%, though you’d love for it to be higher. Where should it be?
Well, the high-level funnel is simple: readers -> free signup -> paid signup, just two big steps ending in a single metric. So you need to look into each step to see where it stands and whether it can be improved.
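As a tiny sketch with made-up counts (chosen so the overall number works out to the 1% from above), the decomposition looks like this — the headline metric splits into per-step conversion rates, and each step is what you actually investigate:

```python
# A minimal funnel decomposition with illustrative counts.
funnel = [
    ("anonymous readers", 100_000),
    ("free signups", 8_000),
    ("paid signups", 1_000),
]

# Conversion rate for each step of the funnel
for (prev_name, prev_n), (name, n) in zip(funnel, funnel[1:]):
    print(f"{prev_name} -> {name}: {n / prev_n:.1%}")

# The single headline metric everyone asks about
overall = funnel[-1][1] / funnel[0][1]
print(f"overall readers -> paid: {overall:.1%}")
```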
The first step is going from random readers to free signups. There are lots of little things you need to know in order to understand and decide what can/can’t be improved. Who are the people coming to read? Where are they coming from? Are some people more or less likely to sign up? Is there something stopping them from signing up? Are they seeing the signup buttons? Are they hesitant about signing up?
Some of those questions can be answered with data. For example, you can figure out details like traffic sources and their respective conversion rates. Sometimes you’ll need to do tests and experiments to figure out why things are working as they are.
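For the “answerable with data” questions, the work is often just slicing the first funnel step by whatever dimensions you have. Here’s a small sketch, assuming you have event-level data with hypothetical columns like “source” and “signed_up_free” — any analytics export with one row per visitor would do:

```python
# Conversion rate of the reader -> free signup step, sliced by traffic source.
# Column names and rows are hypothetical stand-ins for your own data.
import pandas as pd

visits = pd.DataFrame({
    "source":         ["twitter", "twitter", "search", "search", "referral", "direct"],
    "signed_up_free": [1,          0,         0,        0,        1,          0],
})

by_source = (
    visits.groupby("source")["signed_up_free"]
          .agg(visitors="count", signups="sum")
          .assign(conversion_rate=lambda d: d["signups"] / d["visitors"])
          .sort_values("conversion_rate", ascending=False)
)
print(by_source)
```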
Other questions might require you to do qualitative research. Why do people hesitate to sign up? Can you convince them otherwise? Why do some traffic sources do better?
Eventually, after a ton of work and time, you can build a map of the first step and understand if it can be improved. You can then make the business decision on whether making those improvements is worth the effort (sometimes it is, sometimes it’s not).
Then you get to repeat the process for the second step.
While all this work might take a huge amount of time, you’ll eventually map out your entire process and be able to change the question you’ve been trying to answer all this time. (Hopefully you haven’t forgotten why you’ve embarked on a massive research program.) Instead of just asking “can this number be different” we’re now able to say “here’s the list of places we can make changes to try to improve things, which should we do?”
This would take way too long! Is there a shorter way?
Since very few of us have the time to execute a giant research program that could span months (or more) of work to answer a single “simple question”, we need some heuristics to narrow things down. Essentially, there is a giant causal map that could model your problem, and you need to find all the important critical nodes while ignoring the nodes that don’t matter. Since we’re in industry and not searching for Truth in science, we can cut some corners.
Focus on places where you can make meaningful changes, “levers to pull” - It doesn’t help you improve your signups to find out that your extended family are MUCH more likely to sign up than strangers. There’s a finite number of them.
Identify the most likely friction points, the places where people tend to give up and leave. Usually this involves asking people to type information in or pull out a credit card, but also UI elements like large, complicated forms. Can they be eliminated or made easier?
Use your prior experience to help prune the hypothesis space. Sure, it’s possible that the specific font in your logo image might somehow have an effect on your conversion rate… but, really, how much could the effect size be relative to making sure the “BUY NOW” button isn’t at the very bottom corner of the page?
Through the power of domain knowledge, you engage in the funnel mapping process from earlier, but in a fraction of the time. It’s up to your skill and experience as a researcher to identify what is and isn’t worth looking into.
The hardest part is balancing the risk of accidentally pruning out a very important hypothesis against wasting time. In industry we’re trained to be pretty aggressive about pruning hypotheses to save time: if there are +10 percentage point improvements out there to be found and acted on, we’re willing to ignore a large number of the +0.5pp improvements (assuming we factor in implementation costs).
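One back-of-the-envelope way to do that pruning is to score each candidate change by the conversions you’d plausibly gain versus the cost of building it, then only look seriously at the top of the list. Every number in this sketch is an illustrative guess — the point is the ranking, not the values:

```python
# Crude prioritization: expected extra signups per unit of implementation effort.
# All reach, lift, and effort figures below are made-up guesses for illustration.

readers_per_month = 100_000

candidates = [
    # (description, reachable share of readers, guessed conversion lift, effort in person-weeks)
    ("move signup button above the fold",    1.00,   0.010,  1),
    ("simplify the signup form",             0.60,   0.005,  2),
    ("tweak the logo font",                  1.00,   0.0001, 1),
    ("personal outreach to extended family", 0.0005, 0.50,   4),
]

def score(c):
    _, reach, lift, effort = c
    expected_new_signups = readers_per_month * reach * lift
    return expected_new_signups / effort  # "signups per person-week", very roughly

for desc, reach, lift, effort in sorted(candidates, key=score, reverse=True):
    gain = readers_per_month * reach * lift
    print(f"{desc}: ~{gain:.0f} extra signups/month for {effort} wk of work")
```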
Ugh, why is this so hard?
Because while it’s easy to have a personal opinion about data (just make up whatever opinion pleases you), it’s very difficult to build a convincing argument for why a number should be anything other than what it is right now.
Even our research-intensive method for “answering” the question is merely sidestepping the question. We’re still not sure where the number should be, but we think we can move it to a higher/lower point until it’s just not worth our energy to move any more.
Honestly, having an external business reason for the numbers is so much easier. Just declare, by fiat, that a number needs to be at a certain point. Maybe it’s not realistic, maybe it’s flat-out impossible, but you can declare it so. If it turns out to be impossible, you’ll find that out later and make a business decision about whether to keep working on the project or not.
Very often, such declarations wind up spurring a ton of intense research to figure out how to improve things, so in the long run we might not save time. But the big benefit of doing it this way is that all that work has implicit approval from leadership. You’re not embarking on a huge journey based on an offhand question in a meeting.
That’s something… right?