The Basic Naming Failure in git

Before you begin your developer journey, you need to understand a common stumbling block with git.

The version control system (VCS) known as git suffers from a problem that is common across software development: lousy naming. This perennial problem is the subject of a wide array of jokes, and one popular version goes:

There are only two hard things in (computer science/programming/etc.): cache invalidation, off-by-one errors, and naming things.

The original quote (I think?) comes from Phil Karlton, a man who is conspicuously absent from Wikipedia other than as an occasional name drop. His version lists only cache invalidation and naming things; the off-by-one joke spun off from it and spread quickly. Either way, the joke ties together a few common frustrations in the dev world.

Naming things has always been a problem for engineers, and even more so for software engineers. The often-abstract nature of the informational world can make naming things problematic when multiple layers of abstraction interact. Upon reflection, many devs will flash back to horrific moments in their past when they encountered something named factorySchedulerServiceBeanFactoryBeanService.

Look... naming stuff is hard, OK? And one thing that often trips people up when they get introduced to the ubiquitous VCS called git is the confusing naming and functionality of push and pull.

So if you need a basic explanation of git because it makes no sense to you—especially if you've been using git for a while and still don't fully understand it—read on.

git basics

The best way to approach the issue of git's naming problem is by describing what git is and what it does.

The basic unit of information that git revolves around is the repository. This is a collection of many smaller units of information:

  1. A local database of commits given unique identifying values. These values are hash IDs and make no meaningful sense to our eyes and brains. Each commit is a read-only snapshot of some file structure and its contents.
  2. A local database of names, which includes branch names. Each name points to a commit, which makes the hash IDs manageable for humans to work with. You only need to know that the branch is named development, not that it currently points to ae78100fcf75410dda090010fea4501ccfc5116e.
  3. A working tree, typically a directory structure within your file system that the commits relate to. Remember, a commit is a snapshot frozen in time. Some commits are earlier snapshots and some are later. Some might represent the working tree in its current state.
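
To make those three parts concrete, here is a minimal Python sketch of a repository as plain data. Everything in it is an assumption made for illustration: the Repo class, its field names, and the way IDs are computed are a toy model, not how git is implemented. Real git hashes more than this (author, date, message) and stores its data very differently.

```python
import hashlib


class Repo:
    """Toy model (hypothetical): commits, names, and the files on disk."""

    def __init__(self):
        self.commits = {}       # hash ID -> (parent hash ID, snapshot of files)
        self.branches = {}      # branch name -> hash ID
        self.working_tree = {}  # file path -> contents; the only editable part

    def commit(self, branch):
        """Freeze the current files into a commit and move the branch name to it."""
        parent = self.branches.get(branch)
        snapshot = dict(self.working_tree)  # a copy, so later edits cannot change it
        # Simplification: the ID is a SHA-1 of the parent ID plus the file contents.
        payload = repr((parent, sorted(snapshot.items()))).encode()
        commit_id = hashlib.sha1(payload).hexdigest()
        self.commits[commit_id] = (parent, snapshot)
        self.branches[branch] = commit_id
        return commit_id


repo = Repo()
repo.working_tree["README"] = "hello"
first = repo.commit("development")
repo.working_tree["README"] = "hello, world"
second = repo.commit("development")

print(len(first))                              # 40 hex characters, like a real hash ID
print(repo.branches["development"] == second)  # True: the name follows the newest commit
print(repo.commits[first][1]["README"])        # hello: the older snapshot is unchanged
```

The point of the sketch is the division of labor: the IDs are for the machine, the names are for you, and the files on disk are the only part you edit directly.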

There is more about git to know—especially surrounding repositories—but conceptually, this is how it should be understood. Now we can examine the naming failure that breaks so many people's understanding of git.

The opposite of git push is not git pull

Push and pull. Yin and yang. Good and evil. Give and take. Human language often has opposites that can be used to describe things.

So git push is the opposite of git pull, right? That's how language works!

That might be how language works, but that is not how naming things works for developers. People are people and make people-mistakes. Once software becomes popular and is used everywhere, something interesting happens: suboptimal design choices made years ago become too entrenched to change.

It's why JavaScript and CSS are nightmares to work with. They became popular, and poor design choices from their past became entrenched globally within the code of every website. This is not a unique problem. It is quite literally everywhere, in nearly all software titles and systems in some manner or variation.

One of the most common commands to use in git is git push. This command calls up another computer somewhere (or your own computer, possibly) and says, "Hey, I have this list of commits. Here, take it." After that, the developer is done with the git push process.

What happens to the commits after that is up to the computer at the other end. Sometimes, the computer waits for someone else to command it to do something, like merge two branches. At other times, it might do something in an automated fashion. The important thing here is that git push tells another system about a series of commits and sends those commits, along with their contents, to that system.
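
Here is a rough Python sketch of that one-way handoff, under heavy assumptions. The Repo class and the push function are hypothetical, the short commit IDs are made up, and real git adds checks that are left out (the receiver can refuse a push, for example).

```python
class Repo:
    """Toy model (hypothetical): just commits and branch names."""

    def __init__(self):
        self.commits = {}   # hash ID -> parent hash ID (snapshots omitted)
        self.branches = {}  # branch name -> hash ID


def push(local, remote, branch):
    """Hand our commits to the other repository and ask it to record them."""
    # "Hey, I have this list of commits. Here, take it."
    commit_id = local.branches[branch]
    while commit_id is not None and commit_id not in remote.commits:
        remote.commits[commit_id] = local.commits[commit_id]
        commit_id = local.commits[commit_id]  # walk back to the parent
    # Ask the receiver to point its branch name at the newest commit.
    # In this toy the receiver always agrees.
    remote.branches[branch] = local.branches[branch]


local, remote = Repo(), Repo()
local.commits = {"a1": None, "b2": "a1"}  # made-up short IDs; b2 is the child of a1
local.branches = {"development": "b2"}

push(local, remote, "development")
print(sorted(remote.commits))  # ['a1', 'b2']
print(remote.branches)         # {'development': 'b2'}
```

Notice what push does not do in this sketch: it changes nothing on the sending side, and it merges nothing on the receiving side.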

There is another typical command used in git, which is git pull. This command consists of two different parts:

  1. Run git fetch
  2. Run another command, usually git merge or git rebase, to integrate the commits received from git fetch into your current branch

So, git pull is a combination of commands, and the only fixed part is git fetch. You'll note that git push has no built-in counterpart to the second part of git pull. After a push, integrating the commits on the other end requires intervention, either from humans or an automated system.

So what is git fetch? git fetch is the opposite of git push!

git fetch reaches out to another computer (or your own computer, if used on local file systems) and asks for a list of commits and associated information. Whereas git push reaches out to another system and tells that system what commits exist, git fetch reaches out to another system and asks that system what commits exist. In the same way that push only consists of one action, fetch only consists of one action.
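
A toy Python model can put fetch and pull side by side. This is a sketch, not git's implementation: Repo, fetch, and pull are hypothetical names, the commit IDs are made up, and the second step of pull is reduced to the simplest possible merge (a fast-forward, which assumes you have no local commits of your own). One detail not covered above: git records what it fetched under remote-tracking names such as origin/development, and the sketch mimics that.

```python
class Repo:
    """Toy model (hypothetical): commits, branch names, remote-tracking names."""

    def __init__(self):
        self.commits = {}       # hash ID -> parent hash ID (snapshots omitted)
        self.branches = {}      # our own branch names -> hash ID
        self.remote_names = {}  # e.g. "origin/development" -> hash ID


def fetch(local, remote, remote_name="origin"):
    """Ask the other repository what commits it has and copy the missing ones."""
    for commit_id, parent in remote.commits.items():
        local.commits.setdefault(commit_id, parent)
    # Remember where the other side's branches are, without moving our own.
    for branch, commit_id in remote.branches.items():
        local.remote_names[f"{remote_name}/{branch}"] = commit_id


def pull(local, remote, branch, remote_name="origin"):
    """fetch plus a second step; here that step is a bare fast-forward."""
    fetch(local, remote, remote_name)
    # Assumption: no local-only commits, so moving the name forward is enough.
    local.branches[branch] = local.remote_names[f"{remote_name}/{branch}"]


remote = Repo()
remote.commits = {"a1": None, "b2": "a1"}  # made-up short IDs
remote.branches = {"development": "b2"}

local = Repo()
local.commits = {"a1": None}
local.branches = {"development": "a1"}

fetch(local, remote)
print(local.branches["development"])             # a1: fetch alone moves nothing of ours
print(local.remote_names["origin/development"])  # b2: but we now know where they are

pull(local, remote, "development")
print(local.branches["development"])             # b2: the second step moved our branch
```

Like push in the article's description, fetch here is a single transfer. Only pull goes on to change your own branch.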

There is a lot more to this whole system. What is happening under the hood is more complicated, and properly working with git requires some more nuanced understanding if you want to become a power user. There are other naming irregularities found when using git as well—for instance, why is it so often called a pull request and not a merge request?

But the big thing to note in all of this nomenclature nonsense is that git push and git pull are not the opposites their everyday-language names suggest. This naming issue has caused many headaches for people just learning the VCS, and understanding it will go a long way toward helping you work with one of the most popular pieces of software for developers ever created.