Attention: As of January 2024, we have moved to counting-stuff.com. Subscribe there, not here on Substack, if you want to receive weekly posts.
Normconf is coming on December 15th, 2022! It's free and everyone should register. I'm not speaking or otherwise doing anything for it, I’m just a fan and if somehow you haven’t heard of it yet, go and press the register button! The speaker list is awesome, the organizers are great, the price can’t be beat.
As a quick joke, I tweeted a fake normconf proposal (I was too busy over the summer and couldn’t think of a talk proposal when it actually mattered, okay?) about how all dashboards should self destruct unless someone uses the dashboard within a certain period of time. It was meant as a snarky commentary on the somewhat… rocky… relationship that many data scientists have with dashboards.
The vast majority of data scientists will make dashboards as some part of their work. Many of those dashboards will wind up being maintenance nightmares, ignored, deprecated, overlooked, and only once in a very blue moon do we wind up creating one that’s actually used regularly. The general advice and best practice is for us to resist the urge to constantly make new dashboards for stuff because each and every one is a bit of tech debt added to our own credit card, and we’ll personally be paying that cost in the future.
Exercising self restraint in terms of making dashboards sounds like the obvious responsible thing to do, much like just not using your credit cards frivolously. But every day, there are people with good financial habits who are unexpectedly forced to take on unhealthy credit card debt to stay afloat. For dashboards, we see that when powerful figures like executives or leads start demanding dashboards, and data scientists wind up making them to avoid an extended debate they don’t have the energy for.
So, since we have lots of technical solutions to make building dashboards easier, let’s think about whether we need a technical solution for the organized destruction of dashboards. Along the way, we’ll also explore what value, if any, we get from having dying dashboards lying around all over the place.
Most dashboards self-destruct anyways
If we’re going to be honest about it, the whole reason why every single dashboard becomes a tech debt problem is that they are in a perpetual state of decay. Every dashboard will inevitably break due to infrastructure changes, business logic changes, new bugs, etc. It’s a burden that only grows with time. If we commit to making the dashboard, we are implicitly committing to keeping that same dashboard functioning over time so long as people keep using it.
So in a very real sense, if we decide to just not do the maintenance, a dashboard would wind up in an inoperable state anyways. It’s a de facto self-destruction. Anyone who eventually looks at the decayed dashboard would find a bunch of database/SQL/whatever errors instead of useful information.
So when I’m proposing a potential self-destruct function, I’m talking about a more aggressive solution. Imagine if a dashboard has a setting that says “delete this dashboard if no one views it within 30 days”. Imagine if there’s a little countdown clock somewhere in the corner that reads “This dashboard will self-delete in 30 days (2022-10-06) unless someone opens it”. Essentially, putting it on the users to either keep using it, or it gets deleted.
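To make the idea a bit more concrete, here is a minimal Python sketch of that rule. Everything in it is an assumption for illustration: the names `TTL_DAYS`, `deletion_date`, `countdown_banner`, and `is_expired` are made up, and a real dashboarding tool would need to supply the last-viewed date from its own view logs.

```python
from datetime import date, timedelta

TTL_DAYS = 30  # hypothetical setting, matching the "30 days" example above


def deletion_date(last_viewed: date, ttl_days: int = TTL_DAYS) -> date:
    """Date on which a dashboard nobody has opened would be deleted."""
    return last_viewed + timedelta(days=ttl_days)


def is_expired(last_viewed: date, today: date, ttl_days: int = TTL_DAYS) -> bool:
    """True once the full time-to-live has passed without a view."""
    return today >= deletion_date(last_viewed, ttl_days)


def countdown_banner(last_viewed: date, today: date, ttl_days: int = TTL_DAYS) -> str:
    """Text for the little countdown clock in the corner of the dashboard."""
    due = deletion_date(last_viewed, ttl_days)
    days_left = (due - today).days
    if days_left <= 0:
        return f"This dashboard expired on {due.isoformat()} and is due for deletion"
    unit = "day" if days_left == 1 else "days"
    return (
        f"This dashboard will self-delete in {days_left} {unit} "
        f"({due.isoformat()}) unless someone opens it"
    )
```

In this sketch, opening the dashboard would simply reset `last_viewed` to today, which pushes the deletion date out by another full 30 days.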
Would this be any better, or worse, than what we have now?
Post publish update: Readers reminded me that many dashboarding systems automatically have functionality built in where any given dashboard’s data won’t update unless it has some recent views/activity. It’s a cost-saving feature since you won’t spend resources on queries no one cares to look at. What I’m considering here is more drastic than this level of deprecation.
Normal users likely won’t notice their dashboards are deleted
Once, I was working at a place where the team had maybe a hundred dashboards registered in the system that were related to their work. They ran the full gamut from feature-specific dashboards to overall product-wide status dashboards. When I first joined the team, I was asked to go through all the existing dashboards, pluck out the most useful metrics from them, and consolidate things. I made a big list and slowly worked my way through every dashboard one by one.
Over 2/3rds of the dashboards I visited were broken in various ways.
A large number were flat out unable to pull data because the underlying tables had either been deprecated or changed so much that the SQL queries simply didn’t work any more. Others had functioning queries, but the business logic had changed so much that the metrics shown were nonsense numbers like constantly reading zero.
But the most surprising thing to me at the time was the fact that everyone was continuing to do their work just fine, even with all the dashboards broken — because no one was using the dashboards anyway.
The dashboards had worked for the days, weeks, maybe even months that the team needed them to, and then the team moved on and the dashboards were simply allowed to rot in place.
So the act of deleting a dashboard is surprisingly not as disruptive as it would initially seem. There might be a bit of variance in how long a given dashboard’s useful life is, but I’m willing to bet that the median is pretty low.
Deleted dashboards clean the air
Probably the most annoying thing about having hundreds, maybe even thousands of dashboards floating around is that they ruin the process of finding the dashboards that actually ARE maintained and useful.
The situation almost always winds up becoming a system where users individually curate their own sources of truth for their own purposes in personal browser bookmarks. There are documents and spreadsheets of “useful dashboards” that are passed amongst groups. This kind of tacit knowledge rarely spreads very far so groups of people start forming islands of dashboard usage. It very often makes a mockery of any search/browse functionality built into the dashboarding tools. Deleting dashboards regularly gets rid of this problem, since only the freshest, most viewed things would remain to be seen.
It also means that all teams are LESS likely to want to create their own dashboard for some common metric because they’ll be able to more easily see if someone else has implemented their own thing already. Why would someone ask for their own custom definition of product adoption when there’s a metric and dashboard that everyone else is already referring to? A self-cleaning function also has a nice side effect where there will eventually be less cleaning that needs to be done in the first place.
But deleted dashboards destroy history, which is bad
Dead dashboards provide one interesting bit of information that only dead dashboards can provide — they’re historical artifacts. Once in a long time, I’m asked a question that requires me to understand how things like revenue were reported 5 years ago, because a modern report of the same data using a newly created query just somehow doesn’t match against historical records. Sometimes, the only way to understand what happened 5 years ago is to look at what people were doing back then. That includes (and sometimes completely depends on) looking at dashboards that were created by analysts in the period.
Data archaeology is a bit of an underappreciated art of nose diving into dusty broken relics of the past to try to recreate what exactly happened in a dataset. But work at a place with a long enough history of data, and it becomes a critical skill. The risk otherwise is to pull data to do very long trend analysis and come up with ridiculous conclusions because everything in the past is fake and broken.
Having dashboards self-delete would obviously get in the way of this critical activity. The obvious solution would be to just “archive” dead dashboards so that a determined person can always refer back to them. Just make it so that archived entries are hidden away from people in normal usage.
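As a rough sketch of what “archive instead of delete” could look like, the Python below marks stale dashboards as archived and hides them from the normal listing, while still letting a determined person ask for them. The `Dashboard` record, the `archive_stale` and `visible` functions, and the 30-day default are all hypothetical, not any real tool's API.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class Dashboard:
    name: str
    last_viewed: date
    archived_on: Optional[date] = None  # None means the dashboard is still live


def archive_stale(dashboards: List[Dashboard], today: date, ttl_days: int = 30) -> List[Dashboard]:
    """Archive live dashboards nobody has viewed in ttl_days; return the newly archived ones."""
    cutoff = today - timedelta(days=ttl_days)
    newly_archived = []
    for dash in dashboards:
        if dash.archived_on is None and dash.last_viewed <= cutoff:
            dash.archived_on = today  # keep the record, just flag it
            newly_archived.append(dash)
    return newly_archived


def visible(dashboards: List[Dashboard], include_archived: bool = False) -> List[Dashboard]:
    """What search/browse shows: live dashboards only, unless archives are requested."""
    return [d for d in dashboards if include_archived or d.archived_on is None]
```

Nothing is ever removed here. The queries and definitions stay around for data archaeology, and only the default view gets cleaner.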
Yes, in a perfect world, no one would have to rely on the fragmented remains of dead dashboards to try to figure out what was going on in a certain period of history. But I’ve honestly had extremely limited luck going back into the engineering source code and tracking down a series of changes that would result in data artifacts. Engineers typically don’t comment their code changes in a way that easily clicks with how data folk work with data, because any code change can have unexpectedly outsized effects on how things are represented in data logs.
There surely must be some dashboards that are worth keeping around forever, right?
Yes, and by getting rid of everything else we’ll figure out what those are. It’s honestly never clear a priori what dashboards will stand the test of time. So just by making sure the time-to-live on dashboards is long enough, we should be fine.
That said, we should probably prepare for some edge cases. There ought to be a backup plan for unlikely situations, like everyone going on vacation for a month or our clocks all somehow being reset to 0. Those situations could easily have EVERYTHING get deleted, and we’ll need a way to restore functionality. So long as there’s a way to drag things out of auto-archive/delete, it actually won’t be that bad as far as I can see.
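One way to guard against those edge cases, sketched in Python under some loud assumptions: the sweep refuses to run if the clock reads earlier than the most recent recorded view, or if it would retire more than some fraction of all dashboards at once. The names `safe_sweep` and `restore` and the 50% threshold are invented for illustration, and a real system would need to tune them.

```python
from datetime import date, timedelta
from typing import Dict, List, Set


def safe_sweep(
    last_viewed: Dict[str, date],
    today: date,
    ttl_days: int = 30,
    max_fraction: float = 0.5,  # hypothetical threshold for "this looks wrong"
) -> List[str]:
    """Return dashboard names to retire, or raise if the sweep looks suspicious."""
    if not last_viewed:
        return []
    # A clock that reads earlier than the newest recorded view is not to be trusted.
    if today < max(last_viewed.values()):
        raise RuntimeError("clock is behind the latest recorded view; skipping sweep")
    stale = [
        name for name, seen in last_viewed.items()
        if today - seen >= timedelta(days=ttl_days)
    ]
    # Everyone on vacation, or bad timestamps: do not retire most of the catalog at once.
    if len(stale) > max_fraction * len(last_viewed):
        raise RuntimeError(f"sweep would retire {len(stale)} of {len(last_viewed)} dashboards")
    return sorted(stale)


def restore(archived: Set[str], last_viewed: Dict[str, date], name: str, today: date) -> None:
    """Drag a dashboard back out of the archive and restart its clock."""
    archived.discard(name)
    last_viewed[name] = today
```

A failed check here only means that a human gets to look before anything is retired, which seems like the right default for a feature whose whole job is removing things.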
So, maybe, it’s worth a try?
There’s a bunch of technical hurdles to overcome to implement an auto-retire feature in whatever dashboarding/visualization system you’re probably using. At my current employer, I haven’t the slightest idea who to talk to to propose such an idea, let alone get it implemented. So I ask that someone out there who has less infrastructure to deal with try this and tell me what it’s like.
Because I’m actually really curious if it’ll actually work out or not. I suspect it will, but you can never know without trying it.
Standing offer: If you created something and would like me to review or share it w/ the data community — my mailbox and Twitter DMs are open.
About this newsletter
I’m Randy Au, Quantitative UX researcher, former data analyst, and general-purpose data and tech nerd. Counting Stuff is a weekly newsletter about the less-than-sexy aspects of data science, UX research and tech. With excursions into other fun topics.
Curated archive of evergreen posts can be found at randyau.com.
Join the Approaching Significance Discord, where data folk hang out and can talk a bit about data, and a bit about everything else.
All photos/drawings used are taken/created by Randy unless otherwise noted.
Supporting this newsletter:
This newsletter is free, share it with your friends without guilt! But if you like the content and want to send some love, here’s some options:
Tweet me - Comments and questions are always welcome, they often inspire new posts
A small one-time donation at Ko-fi - Thanks to everyone who’s sent a small donation! I read every single note!
If shirts and swag are more your style there’s some here - There’s a plane w/ dots shirt available!