3 Comments

As someone really late to the party (I entered the professional workforce 3 years ago) - this post hits really close to home. I work as a data analyst, but I have ambitions of becoming a data engineer, and my strategy has always been to slowly eke out more and more technical work for myself in my current position. These shadows and gaps between products are where I live and work.

I am lucky to work in a place where there are a lot of gaps that I can fill, but every time a new system is introduced, I get really anxious that this one is going to fill all the needs, and that I will be instructed to go back to ad-hoc reports, Jupyter, Kibana dashboards and talking to business people.

But that's not here yet. I spent at least 3 hours last week poking around in GitHub Actions to make sure that when a specific config for a specific system is changed, an email is shot out to a team which is downstream-dependent. That team has a weird release cycle where they rarely do e2e tests, so if we change a lot, they will not find out until much later down the release train. It even seems to work.

Also, I once built a periodic reporting system (if you want more engineering lingo - "last-mile ETL") entirely based on Jenkins, just because our team did not know about the existence of Airflow and thought that it would be too complex. Apparently, using Jenkins as a "glorified crontab" is widespread?


Tooling isn’t always a distraction. It can rub your nose deep in the data, engraving patterns and anomalies in your mind while some read-transform-send-on process is designed, tested and debugged. Before CAD emerged as a standard tool in the 80s, field geologists at the U.S. Geological Survey would spend much of the off season bent over drafting tables, wielding colored pencils and patiently coloring maps. This was not an obvious task for Ph.D. scientists.

What long experience had shown, however, is that those hours were among the most valuable in the report preparation process, because they gave time to absorb all the details and to keep thinking about them in the back of the mind. That time spent can yield rich opportunities to detect the kinds of contradictions between data and theory that lead to paradigm shifts.

Nobody, except maybe Brian Kernighan, remembers what purpose motivated Ken Thompson, Dennis Ritchie and their buddies at Bell Labs to tool up UNIX and C. It wasn’t in their job descriptions, no committee approved it and no project manager oversaw it.
