The Many Ways of Learning Git
And other stuff about teaching stuff
Shinichi here looks about as confused as I feel
Starting this week at work, I’m going to be teaching a lot of fellow UX coworkers how to build something using The Cloud. By the time this publishes, we will have had our kickoff meeting. As one of the few people with any semblance of a data engineering background (even though it’s not all that much) within my local group, I was always helping put confusing user feedback into a clearer context. But it was becoming clear that we could all benefit from learning about using The Cloud in a more general and holistic way.
With the exception of UX Engineers (people who actually code features/prototypes with a focus on usability and accessibility) most UXers do not have software engineering backgrounds. They’re writers, researchers, designers coming from a very wide set of disciplines, very few of which involve sitting at a computer and writing code.
It’s always a good idea for people involved in user experience to have a hands-on perspective about the things that they are working on, but not everything is easily accessible. In a common example, sometimes UXers will experiment with using screen readers to get a rough sense of how someone with a visual disability might interact with their web site. They won’t be able to know every nuance that an actual real user would know, but at the very least they can understand the frustration when buttons don’t have the proper label and so only get read as “button” instead of “submit button”. Even a tiny bit of user empathy can take things from “unusable” to “usable”.
The challenge ahead of us is that The Cloud is complicated tech that’s still changing rapidly. Think about how much technical knowledge is needed to understand something as simple as “hosting my blog in the cloud”. There are multiple ways to do it, from fully managed stacks to running everything custom on a VM.
Why might we need blob storage? Why would we need caching or a CDN or clusters? How does IP networking work? What’s a web server, or a SQL server? How does it all “work together” to host a blog? Things that seem obvious and fundamental are actually embedded within a deep web of technical concepts that all lean on each other.
So I’ve gotten the OK from layers of managers to go and take a bunch of volunteer UXers on a dogfooding journey. We’re going to build something fairly complex in The Cloud as a group and learn first hand how utterly painful the process can be. My job is to make sure no one gets too stuck and to fill any engineering knowledge gaps they need to complete this journey.
I’m not really sure how I’m going to give everyone a decent taste of all this without overloading them. But we’re going to find out over the next few months. I won’t blast everyone here with the gory details, but expect a lot of data/tech topics sprinkled with pedagogy articles in the coming weeks as my brain is occupied with explaining tech.
[Last minute update] We did have the kickoff meeting today. People seem excited, but who knows how it will actually go down. No turning back now! Hopefully I’ll have interesting stories to tell within a couple of weeks. No plan survives contact with the enemy, after all, and I did a ton of planning…
Step One: Setup and Git is Hard
The first order of business, and what I spent well over a week doing, was a lot of prep and setup so that everyone joining our little project would be able to set up a dev environment. This included navigating corporate IT provisioning and security requirements to make sure we weren’t breaking any rules. Luckily, we won’t be coming anywhere near any grey areas.
Next, we’re going to hit our first major teaching moment very soon. We’re going to have to write code, or at the least, have configuration files for various systems that need to be checked in to a versioning system. That means everyone is going to have to learn a modicum of git.
Overall git’s got a pretty shoddy UX. I use it on a semi-regular basis and still manage to shoot myself in the foot every so often. It’s also a very common thing people have to learn to get started in modern programming and data science, so I knew that someone out there must have written a good tutorial to give to complete beginners.
My core complaint with many online git tutorials is that they often wind up being essentially lists of “Follow this workflow, if you want to do Action X, use this incantation”. Git is ultimately a very complicated piece of software for keeping a graph representation of changes in sync across a bunch of distributed machines. It’s a complex tool because it’s solving a complex problem.
Explaining Git is Hard
Many “for beginners” explanations of git are very surface level and often make a few critical errors. First, in an effort to make things “beginner friendly”, they hide the graph-based nature of git from the reader. Instead they simplify things down to a variation of git clone, add, commit, push. Maybe branching might be covered.
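To make the “graph” idea a bit more concrete, here’s a toy sketch in Python. To be clear, this is not how git is actually implemented, and every name in it (Commit, history, the commit messages) is made up for illustration. The only point is the mental model: commits are nodes that point back at their parents, a merge is a commit with two parents, and a branch is nothing more than a label pointing at one commit.

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    """A toy commit: a message plus links back to its parent commits."""
    message: str
    parents: list = field(default_factory=list)


def history(tip):
    """Walk from a commit back through all its ancestors, visiting each once."""
    seen, stack, out = set(), [tip], []
    while stack:
        commit = stack.pop()
        if id(commit) in seen:
            continue
        seen.add(id(commit))
        out.append(commit.message)
        stack.extend(commit.parents)
    return out


# Hypothetical history: two lines of work diverge from "first", then merge.
first = Commit("first")
fix_typo = Commit("fix typo", [first])        # made on "main"
add_config = Commit("add config", [first])    # made on "feature"
merge = Commit("merge feature", [fix_typo, add_config])

# Branches are just names pointing at commits.
branches = {"main": merge, "feature": add_config}

print(history(branches["feature"]))       # only what "feature" can reach
print(sorted(history(branches["main"])))  # everything, thanks to the merge
```

Seen this way, clone, commit, and push stop being incantations: they copy, extend, and share pieces of this graph.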
Many tutorials also make the critical assumption that the reader has some sort of engineering background. There’s a lot of background terminology, and knowledge (why do you want to version files? what is branch/merge? what’s a diff?) that is sometimes assumed. Even concepts like BLOBs and file formats can be passed over as self-explanatory when they’re not for a non-eng audience.
Finally, most git tutorials involve using Github, the de facto git repository host to the world. That’s fine for many people, but since I’m not using Github for this, all those bits are somewhat irrelevant. Git itself handles remotes in much the same way no matter where they’re hosted, but the very detailed workflows around setting up SSH keys, forking and making pull requests, etc., don’t carry over very well.
So instead of searching and wading through literally thousands of very similar tutorials online, that had many of those flaws (even for ones targeted at non-programmers)…
I asked Twitter for examples that taught good mental models for git.
And Twitter delivered in spades! The question hit a sweet intersection between awesome data folk and awesome engineer folk, all of whom are teaching people new to coding how to code or otherwise learning themselves. Here are some of the favorites that I plucked out of the things people sent me.
Git for Humans
First is Git for Humans, by @alicebartlett which doesn’t show any git commands at all. Instead it focuses on explaining the various terms and the story of why you’re using git in the first place—to version files, time travel while checking out, merging changes and distributing to remotes. Having this vocabulary means that someone who reads a more in-depth tutorial that does show actual commands will (hopefully) not be too overwhelmed.
Next up, I really liked the thoroughness of Atlassian Bitbucket’s git tutorial. The explanations are very well written, with an eye on being technically correct but clear. The one issue is that it is LONG. There’s a lot of content there, as befitting a complicated piece of software. It’s probably too long to tell a newbie to look at, but pointing to very specific sections seems doable.
I especially like the detailed command explanations in the “Getting Started” section, which takes the time to explain the difference between git’s three states of ‘working directory’, ‘staging’, and ‘commit’ when discussing something as “simple” as git add. The section on comparing workflows is also especially nice, even for experienced git users who may only be used to a single workflow.
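The three states are easier to feel than to describe, so here’s a rough toy model in Python. This is my own simplification and not code from the Atlassian tutorial or from git itself; ToyRepo and its methods are hypothetical names. It shows the one behavior that tends to surprise beginners: staging a file captures its content at that moment, so edits made afterwards are not part of the next commit unless you add the file again.

```python
class ToyRepo:
    """Toy model of git's three areas. Not real git; names are illustrative."""

    def __init__(self):
        self.working = {}   # files as they currently sit on disk
        self.staging = {}   # snapshots chosen for the next commit
        self.commits = []   # list of (message, snapshot) pairs

    def edit(self, path, content):
        self.working[path] = content

    def add(self, path):
        # Staging copies the file's content as it is right now.
        self.staging[path] = self.working[path]

    def commit(self, message):
        # A commit records what is staged, not what is on disk.
        self.commits.append((message, dict(self.staging)))


repo = ToyRepo()
repo.edit("notes.txt", "draft 1")
repo.add("notes.txt")
repo.edit("notes.txt", "draft 2")   # edited again after staging
repo.commit("save notes")

print(repo.commits[-1])   # the commit holds "draft 1"
print(repo.working)       # the working directory holds "draft 2"
```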
Happy git and GitHub for the useR
Next, there’s Happy git and GitHub for the useR, by Jenny Bryan. In contrast to the above, it doesn’t go into great depth about the whats and whys of git (that seems intentional). Instead it provides step by step guidance on getting git to work with RStudio, and chooses to teach by showing example workflows and how they can be adapted to a user’s needs.
What separates this resource from the thousands of “enter these commands” is that 1) there’s explanation on why something is done (or not done) and 2) it answers lots of questions along the way, like “how do I undo what I did? why didn’t we push yet?”
Learn Git Branching
Next is a completely different beast, git as a visualized game! Learn Git Branching will try to teach you commands one at a time for branching operations. It’s got a cute visualization of commits and branches on the side so you can see what the effects are.
Git from MIT’s Missing Semester lectures
The “Missing Semester of Your CS Education” series is a wonderful set of lectures about a bunch of topics that are not normally taught in a typical CS program, but are useful in a programming career. The lecture on version control (git) is a bit CS-y in how it discusses the topic, using terms like objects and references. Its main goal is to show what the git data model is to a CS student, so that they can more easily grasp what the commands are doing. While not suitable for my non-tech audience, it’s a good reference for those who do have a CS background.
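For the CS-inclined, the “objects” part can be poked at directly. Below is a small Python sketch of how git names the contents of a file (a “blob” object): it hashes a short header plus the raw bytes. This assumes a repository using git’s default SHA-1 object format, and git_blob_id is just a name I picked for the helper. It is not taken from the lecture.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the ID git gives a file's contents (assumes SHA-1 repos)."""
    # Git hashes the header "blob <size>\0" followed by the raw bytes.
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# The ID depends only on the content, never on the file name,
# so identical files are stored once.
same = git_blob_id(b"test content\n") == git_blob_id(b"test content\n")
different = git_blob_id(b"test content\n") != git_blob_id(b"other content\n")

print(git_blob_id(b"test content\n"))
print(same, different)
```

If git is installed, the first printed value should match what `git hash-object` reports for a file with the same contents. References, the other half of the model, are then just human-friendly names (like a branch name) that point at one of these IDs.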
Oh Shit, Git!?! / Dangit, Git!?!
Swearing/no swearing versions of the same content. There are a huge number of ways to mess up while using git, and it is very non-obvious how to fix such issues. Most people (myself included) will resort to using git reset --hard to try to fix things, and if that fails, we’ll flat out delete and re-clone the whole repository.
Obviously, there are better ways to fix certain common issues. That’s where these sites come in, showing some of the more common failure modes and how to fix them.
Also, I wanted to say thanks to everyone subscribed to this newsletter. With everyone’s support, we’ve managed to cross over 200 people subscribed. =O Hopefully I continue to live up to everyone’s expectations and earn your continued support.