Some Gamedev and Shoddy Data Arguments

With sketchy charts!

Apr 14, 2020

It’s surprisingly hard to think of topics to write about that aren’t The Virus when it pops up literally everywhere you look in every waking moment. But here we go!

A really fluffy alpaca in Hokkaido, Japan

This week we’re going to talk about a part of the video game market, doing data science within a company, and having to make some pretty weak data arguments for a job.

Last week, Valve, the privately-owned video game company that created Steam, the biggest online PC game distribution platform on the planet, put out this blog post (with an associated research appendix). The post is trying to show how new games being sold on the platform are doing, ostensibly to give the message that it is good to publish on the Steam platform because more games than ever are hitting revenue milestones.

This TL;DR is from the post itself.

If you peruse the comments on the post, you’ll see that the overall tone is one of anger, skepticism, and frustration. Many, myself included, called BS on the analysis. The big tell that something was up was that the entire post (and the appendix) used graphs showing raw counts of games hitting certain thresholds of revenue earned within two weeks.

The blog post rests most of its argument on the claim that the number of games making $X has gone up over time. They also used a weird linear trendline from 2014 to argue that “if we didn’t open up our platform, all the games above the line wouldn’t have been accepted and had a chance to make that money”.

But the community quickly saw through the argument.

Game Volume Has Increased, A Lot

Over the past few years, Steam has experimented with letting more and more developers publish games on the platform. Starting in 2012, indie developers could submit games to Steam’s Greenlight program. Steam users voted on the submissions, and if a game accumulated enough votes, it was allowed onto the main store and the treasures within.

Greenlight allowed a steady trickle of smaller games onto Steam, but in 2017, the entire program was scrapped in favor of Steam Direct. Under the new program, any developer can pay a relatively small fee ($100/game that could be recouped with sales) to publish their game on Steam with fairly minimal review. This meant that, since 2017, there has been an explosion of games put on the platform.

The newer process did have its own issues. There has been a flood of “shovelware” and low-effort, low-quality games appearing on the platform. That led to Valve needing staff at Steam dedicated to removing/blocking low-quality games. Even with this new level of curation, the total number of games released on Steam has exploded over the years. Over 8000 were released in 2019 alone, an average of 22/day, compared to 1651 in 2014.

Under this literal “throw everything at the wall and see what sticks” content strategy, any competent data analyst would instantly want to know what percentage of released games hit a certain level of revenue. That would be an approximate answer to the question of “how likely am I to succeed if I get on?” If the platform were being good to all (or even some) game developers, then that percentage should be rising. Instead we get a weak argument that only shows a numerator.
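To make the numerator-versus-denominator point concrete, here is a minimal Python sketch. The release counts are the ones quoted above (1651 in 2014, and 8000 as a round stand-in for “over 8000” in 2019). The counts of games clearing the revenue threshold are placeholders I made up for illustration, not Valve’s figures, and the function name is hypothetical.

```python
def hit_rate(games_over_threshold: int, games_released: int) -> float:
    """Share of released games that cleared a revenue threshold."""
    if games_released <= 0:
        raise ValueError("games_released must be positive")
    return games_over_threshold / games_released


# Release counts quoted in this post; 8000 stands in for "over 8000".
released = {2014: 1651, 2019: 8000}

# HYPOTHETICAL numerators, invented purely for illustration.
over_threshold = {2014: 800, 2019: 1600}

rates = {year: hit_rate(over_threshold[year], released[year]) for year in released}

for year in sorted(rates):
    print(f"{year}: {over_threshold[year]} of {released[year]} games = {rates[year]:.1%}")
```

With these invented inputs, the raw count of “successful” games doubles while the share of games that succeed falls by more than half. A chart that only plots the count can’t show that.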

Luckily, there are ways to figure out what the denominator is. Game launch data is publicly scrapeable information, and a platform called Steam Spy has the data in its records (along with sales projections based off sampling users and seeing what they own).

Ars Technica did the legwork that most of us didn’t have the energy to do, and combined the two data sets. They came to the somewhat unsurprising conclusion that ~80% of games on Steam earn less than $5k in the first two weeks (the article has graphs showing the percentage over time too, check it out!). That definitely does not sound like good news to anyone thinking about making a living publishing games on Steam.

Why Even Make This Post?

There were business reasons that likely led to Valve wanting to make such a post. For years there have been increasingly loud complaints from smaller independent game developers that Steam has been doing worse for them.

Some devs reported a drop in year-over-year revenue and sales. Many indies have blamed Steam’s recent(ish) changes to their massive recommendation system for prioritizing already-popular big name “AAA” titles with multi-million dollar budgets, to the detriment of smaller indies. In my own personal experience, the small game publisher that I work with has also seen significant declines in sales on Steam starting exactly on the week when various algorithm changes went out. Traffic to the store pages of small indie games has massively tanked, as has revenue.

It used to be that if your game was accepted onto the Steam store, you’d be able to reap the benefits of the millions of active gamers (95 million monthly active users in 2019) willing to spend money on games. In that environment, even converting a tiny fraction of a percent means tens of thousands of dollars in revenue. That could be enough to keep a small team of indie developers funded enough to work on their next project.
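Here is the back-of-the-envelope version of that claim as a small Python sketch. Only the 95 million figure comes from this post. The conversion rate, the price, and the variable names are assumptions I picked to illustrate the arithmetic, and the 30% cut is Steam’s standard revenue share.

```python
MONTHLY_ACTIVE_USERS = 95_000_000  # figure quoted above for 2019

# HYPOTHETICAL inputs, chosen only to illustrate the arithmetic.
BUYERS_PER_100K_USERS = 5  # i.e. a 0.005% conversion rate
PRICE_USD = 10             # assumed price per copy
STEAM_CUT_PCT = 30         # standard cut of gross revenue

# Integer arithmetic keeps the dollar figures exact.
buyers = MONTHLY_ACTIVE_USERS * BUYERS_PER_100K_USERS // 100_000
gross_usd = buyers * PRICE_USD
net_usd = gross_usd * (100 - STEAM_CUT_PCT) // 100

print(f"buyers: {buyers:,}")
print(f"gross:  ${gross_usd:,}")
print(f"net:    ${net_usd:,}")
```

Under those assumptions, 4,750 buyers produce $47,500 gross and $33,250 after the platform cut, which is the “tens of thousands of dollars” range a small team could plausibly live on for a while.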

Now, the recommendation algorithm is apparently giving greater weight to games that have higher CTR and conversion rates. The algorithm creates a feedback loop where games that do well are given more traffic via prominent placement in various parts of the store. Indies wind up suffering because their niche, lower budget games with no advertising/press budget don’t have the same broad appeal as the big names. Subsequent algorithm updates that were supposed to improve game discoverability haven’t helped much.
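To see how quickly that kind of loop can run away, here is a toy simulation in Python. To be clear, this is not Steam’s actual algorithm, which isn’t public. It is a deliberately crude model I’m assuming for illustration, where each round’s store traffic is split in proportion to the previous round’s sales. The function name and the two conversion rates are hypothetical.

```python
def simulate_traffic_share(conversion_rates, rounds, visits_per_round=1000.0):
    """Toy model: each round's traffic is split in proportion to last round's sales."""
    n = len(conversion_rates)
    shares = [1.0 / n] * n  # every game starts with equal exposure
    history = [shares]
    for _ in range(rounds):
        # Sales this round = visits sent to the game * its conversion rate.
        sales = [visits_per_round * s * c for s, c in zip(shares, conversion_rates)]
        total = sum(sales)
        # Next round's traffic share mirrors this round's share of sales.
        shares = [x / total for x in sales]
        history.append(shares)
    return history


# HYPOTHETICAL conversion rates: a broad-appeal title vs. a niche indie.
history = simulate_traffic_share([0.05, 0.02], rounds=10)

for round_no, (big, indie) in enumerate(history):
    print(f"round {round_no:2d}: big title {big:.1%}, indie {indie:.1%}")
```

Both games start with half the traffic, but because the better-converting game earns a slightly bigger slice every round, the niche game’s share shrinks toward zero within a handful of rounds. Real systems have far more inputs, but the compounding is the part that matters.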

With the situation changing, developers are increasingly questioning whether Steam is worth the 30% cut of gross revenue they take.

To be fair to Steam, people recognize that the 30% also pays for tons of other features like credit card processing fees, network gaming protocols, DRM, relevant sales tax handling, servers, bandwidth costs, etc. But 30% is still a pretty hefty margin of profit to give away. It stings even more when, if your game sells a ton, Steam is willing to take merely a 25% cut (for sales between $10M and $50M), or even a 20% cut (for sales after $50M). Clearly there is some margin available to give away.
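The tiers above are easy to turn into a quick calculator. This Python sketch assumes each rate applies only to the slice of sales that falls inside its tier (like marginal tax brackets), which is how I read the “between” and “after” wording. The function and constant names are my own.

```python
# (upper bound of the tier in USD, cut in percent); None means no upper bound.
# ASSUMPTION: rates are marginal, applied only to the sales inside each tier.
TIERS = [(10_000_000, 30), (50_000_000, 25), (None, 20)]


def platform_cut(gross_usd: int) -> int:
    """Platform's cut in whole dollars for a given amount of gross sales."""
    cut = 0
    lower = 0
    for upper, pct in TIERS:
        if gross_usd <= lower:
            break  # no sales reach this tier
        top = gross_usd if upper is None else min(gross_usd, upper)
        cut += (top - lower) * pct // 100
        lower = top
    return cut


for gross in (5_000_000, 100_000_000):
    cut = platform_cut(gross)
    print(f"gross ${gross:,}: cut ${cut:,} ({cut / gross:.0%} effective)")
```

Under that reading, a game grossing $5M pays the full 30% ($1.5M), while a game grossing $100M pays $23M, an effective rate of 23%. The discount only ever reaches the titles that need it least.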

As Steam has become less of a ticket to financial success, developers have responded by hedging their bets. There’s no exclusivity involved in publishing a game on Steam, so devs also publish on alternative distribution platforms; Green Man Gaming, GOG (Good Old Games), and Itch are popular choices. These platforms attract a different sort of gamer audience and/or give a better cut of revenue. Just like Valve is throwing tons of games up to see what sells, devs are putting games on all sorts of stores to see where things sell.

Some developers are even selling their games on their own sites, reaping 100% of the revenue in exchange for the trouble of running their own e-commerce systems.

Meanwhile, on the AAA side with big game developers, there have also been competitors coming up. Among the most recent notable entries, the developer Epic Games launched their own Epic Games Store, arguing that Steam’s 30% cut was too much. While EGS has had a lot of criticism directed at it for aggressively paying developers large sums to launch titles exclusively (for a time) on their store, it does represent a threat to Steam’s fat margins. EGS’s revenue share, provided you’re allowed onto their store, is merely 12%.

So, About Those Ethics

From all that business-side stuff going on, you can see why someone within Valve may have felt a need to write a post that could reassure developers that it is still worthwhile publishing on Steam. Maybe it started out as a developer outreach piece, or maybe it originated from a marketing team request, who knows.

What we do know is that Valve hires data people:

They clearly look for economists and statisticians, people who definitely know how to pull meaning out of numbers and analyze complicated systems.

While Valve’s “Data Science - Other” job posting leaves a lot to be desired, I think it’s safe to assume that whoever did the analysis for this post and avoided mentioning a massive denominator change was aware of what they were doing. The person knew enough about data to exclude things like non-game releases, and use medians to analyze revenue gain across groups. Those are not the fingerprints of a novice analyst.

The data person involved in the process was likely asked to analyze the question of “are new games doing better on Steam?” with the explicit request that it would be a public blog post. Then, when that person did their analysis and found the unsatisfying result that ~80% of new games make very little money, the authors decided to write the most positive truth available to support their argument—“We’re seeing more games making >$5k, $10k, etc than ever!”

This is the kind of borderline ethical hair-splitting situation that we’re occasionally placed in when working in industry. It’s not outright lying, but there is a definite bias to the numbers.

Many Views of the Same Elephant

What remains a mystery is what metrics Valve is choosing to optimize for on their platform. I have no doubt that the current recommendation system generates more overall revenue for Valve than the older one, and perhaps pure revenue is a major component. It should be fairly obvious within their data that there is a rich-get-richer dynamic going on. Given how big, popular, AAA titles sell more and have higher price tags than indies, the feedback loop is very easy to trigger.

Their analysis argues that prior to 2012, many games wouldn’t have been allowed on Steam to begin with, so it’s possible that in their view, any success stories are icing on their publishing cake. I’m sure they have internal metrics that define some notion of what a good game is using things like game play time, % of games played to completion (via achievements), etc. An actual good game would sell more and trigger the feedback loop, because their system optimizes for that.

But for developers who are depending on sales to make a living, what may sound to staff like a rational, obvious conclusion in line with internal metrics and thinking comes off as contrived and out of touch. There’s a misalignment between the values of the people viewing the same data.

Putting Forth An Argument

So what’s a data person to do when there’s a disagreement on views of reality? Well, given that the person likely had little choice but to publish something, they worked under the constraints given to them, created a “technically correct” analysis that didn’t tell the full story, and let the expected backlash come.

The actual debate about whether games are really doing better, and how to define, measure, and optimize for an end result, is a much larger internal project that is way beyond the scope of a blog post.

For the task at hand, the companies that hire us have a vested interest in published results putting them in a favorable light. Out of all the analyses that I assume had been run in the background building up to the published blog post, this was the most favorable they could come out with.

Models and analyses have a range of values that can be output given a certain set of reasonable assumptions. That analytic leeway, to choose model parameters that lean one way or another, to show evidence in favor of one outcome over others, is the ethical tightrope that we walk.

Depending on what agenda we’ve been hired to provide evidence for, we may reasonably choose to use the upper bound, lower bound, or some value in between. For example, if we were projecting future revenue, we may choose to send a conservative estimate to the budgeting team to lower the risk of overspending, while providing a more optimistic estimate for the sales team to encourage more aggressive sales targets. Both models are valid, but they differ based on the intended use and viewpoint.
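As a rough illustration of that leeway, here is a Python sketch that turns one revenue history into a lower bound, a point estimate, and an upper bound. The monthly figures are invented, the function name is hypothetical, and the normal-approximation interval (z = 1.96) is just one reasonable choice among many, which is rather the point.

```python
import statistics


def revenue_range(monthly_revenue, z=1.96):
    """Return (lower, point, upper) for mean monthly revenue, normal approximation."""
    point = statistics.mean(monthly_revenue)
    # Standard error of the mean, from the sample standard deviation.
    std_err = statistics.stdev(monthly_revenue) / len(monthly_revenue) ** 0.5
    return point - z * std_err, point, point + z * std_err


# HYPOTHETICAL monthly revenue history in USD, invented for illustration.
history = [42_000, 38_500, 45_200, 40_100, 47_300, 39_800]

lower, point, upper = revenue_range(history)

budget_estimate = lower  # conservative number for the budgeting team
sales_target = upper     # optimistic number for the sales team

print(f"point estimate:  ${point:,.0f}")
print(f"budget estimate: ${budget_estimate:,.0f}")
print(f"sales target:    ${sales_target:,.0f}")
```

Same data, same model, two different numbers handed to two different teams. Neither is a lie, and picking z, the window of history, or the model itself shifts both of them.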

The situation reminds me of what lawyers are asked to do. Part of their job is to put forth the best legal case for the clients that they represent, and that sometimes means using pretty bad arguments. The philosophy is that when both sides come up with the best evidence and arguments in their favor, the actual truth has a chance to come out. That said, there’s a difference between making a weak or bad argument and making a knowingly false one, which would be against the ABA’s rules of professional conduct governing legal arguments and preserving the integrity of the adjudicative process.

I’m not advocating that data scientists view themselves as mercenaries who merely write the truth as dictated by our employers. Our world needs more truth and honesty, not less.

We don’t operate in an adversarial court. There’s no “other side” to present counter-arguments. We don’t operate under peer review. There are no guard rails around our ethics besides our personal integrity. Our field currently has no shared code of ethics at all. We don’t even have a professional body that declares what our code of ethics should be.

So at the very minimum, we should strive to do no worse than lawyers when in a position of having to advocate for a given argument.

“Try to do no worse than lawyers”… Now that’s a sentence I didn’t expect to ever write in my life…

Counting Stuff