Deep dive into git: Git refs

In the previous post, we saw how Git stores objects internally, what makes it fast and flexible, how objects in Git are identified by a hash, and how commits, trees and the objects they point to relate to one another. In this post, we'll talk about Git refs.

Refs

Since we want to manipulate objects quite often in Git, it’s important to know their hashes. You could run all your Git commands referencing each object’s hash, like git show ceb40ab, but that would require you to remember the hash of every object you want to manipulate.

To save you from having to memorize these hashes, Git has references, or refs. A reference is simply a file stored somewhere in .git/refs, containing the hash of a commit object.

Refs are stored as normal text files in the .git/refs directory. To explore the refs in one of your repositories, navigate to .git/refs. You should see the following structure, but it will contain different files depending on what branches, tags, and remotes you have in your repo:

      $ ls -F1 .git/refs
      heads/
      remotes/
      tags/
      $ ls -F1 .git/refs/heads
      master
      $ ls -F1 .git/refs/tags
      v0.3

The heads directory defines all of the local branches in your repository. Each filename matches the name of the corresponding branch, and inside the file you’ll find a commit hash. This commit hash is the location of the tip of the branch. To verify this, try running the following two commands from the root of the Git repository:

    $ cat .git/refs/heads/master
    ceb40ab150d96b1204a8fd98b4440a097cf16f34
    $ git log -1 master
    commit ceb40ab150d96b1204a8fd98b4440a097cf16f34
    Author: Mohammed Aboullaite <aboullaite.mohammed@gmail.com>

As you might have noticed, branches are just references. To change the location of the master branch, all Git has to do is change the contents of the refs/heads/master file. Similarly, creating a new branch is simply a matter of writing a commit hash to a new file.
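To make this concrete, here is a minimal sketch that creates one branch by writing a file by hand and another with git update-ref, then checks that both resolve to the same commit. It assumes a repository that stores refs as loose files (the default), and it works in a throwaway directory. The branch names by-hand and by-plumbing and the demo identity are placeholders chosen for this example.

```shell
# Throwaway repository so the experiment cannot touch real work.
repo=$(mktemp -d)
git init -q "$repo"
# Make sure the first branch is called master, whatever the local default is.
git -C "$repo" symbolic-ref HEAD refs/heads/master

# Placeholder identity, used only for the demo commit.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git -C "$repo" commit -q --allow-empty -m "commit empty"

tip=$(git -C "$repo" rev-parse master)

# Create a branch by hand: write the commit hash to a new file under refs/heads.
echo "$tip" > "$repo/.git/refs/heads/by-hand"

# The supported way to do the same thing.
git -C "$repo" update-ref refs/heads/by-plumbing "$tip"

by_hand=$(git -C "$repo" rev-parse by-hand)
by_plumbing=$(git -C "$repo" rev-parse by-plumbing)

# All three branches now point at the same commit.
git -C "$repo" branch --list
```

In day-to-day work, git branch or git update-ref is the safer choice, since writing the file directly skips Git's own checks and locking.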

Of course, it’s possible to simplify this process. Git can tell us which commit a reference is pointing to with the show and rev-parse commands.

      $ git show --oneline master
      ceb40ab commit empty
      $ git rev-parse master
      ceb40ab150d96b1204a8fd98b4440a097cf16f34

The tags directory works the exact same way, but it contains tags instead of branches. The remotes directory lists all remote repositories that you created with git remote as separate subdirectories. Inside each one, you’ll find all the remote branches that have been fetched into your repository.
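Rather than walking the directories yourself, you can ask Git to list every ref along with the hash it points to. The sketch below uses git for-each-ref in a throwaway repository with one branch and one tag; the demo identity is a placeholder. It also shows one caveat, assuming the default ref storage: after git pack-refs, refs may live in the single file .git/packed-refs instead of individual files, yet Git still reports them the same way.

```shell
# Throwaway repository with one branch (master) and one tag (v0.3).
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" symbolic-ref HEAD refs/heads/master

# Placeholder identity, used only for the demo commit.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git -C "$repo" commit -q --allow-empty -m "commit empty"
git -C "$repo" tag v0.3

# One line per ref: the full ref name, then the hash it points to.
refs=$(git -C "$repo" for-each-ref --format='%(refname) %(objectname)')
echo "$refs"

# Pack all refs into .git/packed-refs; the listing stays the same.
git -C "$repo" pack-refs --all
packed=$(git -C "$repo" for-each-ref --format='%(refname) %(objectname)')
echo "$packed"
```

This is why reading files under .git/refs is fine for exploring, but commands such as git for-each-ref or git rev-parse are more reliable in scripts.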

Special Refs

Git also has a special reference, HEAD. This is a symbolic reference: instead of holding a commit hash, it normally holds the name of the current branch, and so it follows the tip of that branch. If we inspect HEAD, we see that it simply points to refs/heads/master.

  $ cat .git/HEAD  
  ref: refs/heads/master

It is actually possible for HEAD to point directly to a commit object. When this happens, Git will tell you that you are in a detached HEAD state, which basically means that you’re not currently on a branch.
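The difference between the two states is easy to see by reading .git/HEAD before and after detaching. The following is a small sketch in a throwaway repository (the demo identity is a placeholder); it uses git checkout --detach to detach HEAD at the current commit and git symbolic-ref to test which state we are in.

```shell
# Throwaway repository with a single commit on master.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" symbolic-ref HEAD refs/heads/master

# Placeholder identity, used only for the demo commit.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git -C "$repo" commit -q --allow-empty -m "commit empty"

# On a branch: HEAD is a symbolic ref that names the branch.
attached=$(cat "$repo/.git/HEAD")
echo "$attached"    # ref: refs/heads/master

# Detach HEAD at the current commit: the file now holds a raw commit hash.
git -C "$repo" checkout -q --detach
detached=$(cat "$repo/.git/HEAD")
echo "$detached"

# symbolic-ref exits with a non-zero status when HEAD is detached.
if git -C "$repo" symbolic-ref -q HEAD >/dev/null; then
  state=attached
else
  state=detached
fi
echo "$state"

# Checking out the branch by name re-attaches HEAD.
git -C "$repo" checkout -q master
reattached=$(cat "$repo/.git/HEAD")
```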

In addition to HEAD, there are a few special refs that reside in the top-level .git directory. They are listed below:

FETCH_HEAD: The most recently fetched branch from a remote repo.
ORIG_HEAD: A backup reference to HEAD before drastic changes to it.
MERGE_HEAD: The commit(s) that you’re merging into the current branch with git merge.
CHERRY_PICK_HEAD: The commit that you’re cherry-picking.
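As an illustration of one of these, the sketch below shows ORIG_HEAD acting as a backup after a git reset --hard. It runs in a throwaway repository with two empty commits; the commit messages and the demo identity are placeholders for this example.

```shell
# Throwaway repository with two commits on master.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" symbolic-ref HEAD refs/heads/master

# Placeholder identity, used only for the demo commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git -C "$repo" commit -q --allow-empty -m "first"
git -C "$repo" commit -q --allow-empty -m "second"

before=$(git -C "$repo" rev-parse HEAD)

# A drastic change: move the branch back by one commit.
git -C "$repo" reset -q --hard HEAD~1

# Git saved the previous position of HEAD in ORIG_HEAD.
orig=$(git -C "$repo" rev-parse ORIG_HEAD)
echo "HEAD was at $orig before the reset"

# Undo the reset by going back to the saved position.
git -C "$repo" reset -q --hard ORIG_HEAD
restored=$(git -C "$repo" rev-parse HEAD)
```

Note that the second reset overwrites ORIG_HEAD again, so it only ever remembers the most recent drastic change.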

Reflog

The reflog is Git’s safety net. It records almost every change you make in your repository. You can think of it as a chronological history of everything you’ve done in your local repo.

A Git reflog is a list of hashes recording where a ref has pointed over time. Each time a branch is updated to point to a new commit, an entry is written in the reflog to say where it was. Since the branch is updated whenever you commit, the reflog has the nice effect of storing your local development history.

Furthermore, each reflog entry points to a commit object, which in turn points to a tree object representing a directory-like structure of folders and files. So as long as an entry is still in the reflog, you can go back and see what changes you made, and even recover specific files from previous commits. With Git, you rarely lose committed work: even if you've run filter-branch to rewrite history, you're only a reflog entry away from getting it all back. Keep in mind that the reflog is local to your repository and that entries eventually expire (by default after 90 days, or 30 days for entries no longer reachable from the current branch tip).

To see what the reflog is all about, run git reflog from an active Git repository. It might look something like this:

  $ git reflog
  ceb40ab HEAD@{0}: commit: commit empty
  2937af1 HEAD@{1}: commit: Commit empty file
  e3c3a05 HEAD@{2}: commit (initial): add file 1

The first column is simply the abbreviated hash of the commit HEAD pointed to after the change was made. Even though the entries don't represent linear history, they are the sequence of actions taken on the local repository, listed with the most recent first.

The second column is the position of HEAD, along with how many changes ago that was. In this case, we have HEAD@{0}, which means where HEAD is now; HEAD@{1} is where HEAD was previously, and so on.

The final part is the type of operation (for example, a commit or an amended commit), followed by the commit subject. This is often helpful for remembering where the code was, especially if the commit isn't part of the linear history. The reflog also records other operations, such as checkout, merge and reset.
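To see the safety net in action, here is a minimal sketch that throws away two commits with git reset --hard and then gets them back through the reflog. It runs in a throwaway repository; the branch name rescued, the commit messages and the demo identity are placeholders chosen for this example.

```shell
# Throwaway repository with three commits on master.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" symbolic-ref HEAD refs/heads/master

# Placeholder identity, used only for the demo commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git -C "$repo" commit -q --allow-empty -m "add file 1"
git -C "$repo" commit -q --allow-empty -m "Commit empty file"
git -C "$repo" commit -q --allow-empty -m "commit empty"

lost=$(git -C "$repo" rev-parse HEAD)

# "Lose" the last two commits: master no longer reaches them.
git -C "$repo" reset -q --hard HEAD~2

# The reset itself is now HEAD@{0}; the commit we left behind is HEAD@{1}.
git -C "$repo" reflog
previous=$(git -C "$repo" rev-parse 'HEAD@{1}')

# Put a branch on the old position so the commits are reachable again.
git -C "$repo" branch rescued 'HEAD@{1}'
rescued=$(git -C "$repo" rev-parse rescued)
```

Creating a branch at the old position is a gentle way to recover, since it leaves the current branch untouched; git reset --hard 'HEAD@{1}' would instead move the current branch back.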

That's all! I hope you found it useful :)

Resources: