Introduction
- A Git project stores information in four separate storage areas.
-
The first of the four areas is the project directory on your file system, your working area. It's the place where you keep your project's files and folders.
-
The second area is the all-important repository. This is arguably the main reason that you use Git in the first place. The repository contains the entire history of the project. When you commit stuff, it goes here.
-
In between these two areas, there is another intermediate area called the index. It's the place where you put your files before a commit.
-
Finally, the fourth area sits a bit to the side. It's a temporary storage area called the stash. It's not nearly as important as the other three, but it's useful.
- If you want to really understand a Git command, then for most commands, you should ask two important questions.
-
The first question is: how does this command move data across the four areas? Does it copy data from the index to the repository, for example, or from the repository to the working area? Does it delete data from any of the areas?
-
The second question is: what does this command do to the repository specifically? The repository is the most important of the four areas, so how does the command change the data in it? Does it create new commits, move branches, move the HEAD reference, and so on?
-
The first of the four areas is your working area, that is the project's directory on your file system.
-
This is where you work: you edit your files, test your code, and the like.
-
All of these changes, such as editing a file or moving it, happen in the working area. So the working area is very important to you.
-
However, Git doesn't care as much about it. For Git, the working area is a very temporary place. Git generally tries not to destroy data in the working area, but don't assume that your data is safe until you have committed it. Once you commit your data, Git stores it in what it considers the really important area, the repository.
-
The repository lives in the `.git` folder. The most important data is in the `objects` directory, which is called the object database.
-
There are a few different kinds of objects in the database:
- Some objects represent the content of a file at some point in the project's history. These objects are called blobs.
- There are other objects called trees that represent folders in the project.
- There are commits. Whenever you run `git commit`, Git creates a commit object.
-
All of these objects are immutable. They can be created and deleted, but they can never be changed.
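One way to see why objects cannot be changed is that an object's name is derived from its content. The sketch below reproduces how a blob ID is computed, assuming the default SHA-1 object format; the helper name `blob_id` is made up for this example and is not part of Git.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Return the ID a blob with this content gets under the SHA-1 object format."""
    # Git hashes a small header ("blob <size>\0") followed by the raw content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


original = blob_id(b"hello\n")
edited = blob_id(b"hello!\n")

print(original)
print(edited)
# Different content always yields a different ID, so "changing" a blob
# really means creating a new object next to the old one.
print(original != edited)
```

If Git is installed, the result for a given file can be compared with the output of `git hash-object <file>`.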
-
These objects are linked together in a structure that represents your project's history.
-
Each commit points to a graph of blobs and trees that represents your files and folders at the moment of that commit. Different commits point to different graphs.
-
So each commit is like a snapshot of your project at a certain point in time. Also, two commits can share the same objects. This means that those objects haven't changed between the two commits, and that's how Git stores changes to your files and directories without duplicating unchanged content.
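The sharing of unchanged objects can be illustrated with a toy content-addressed store. This is a simplified model, not Git's real format: the names `store` and `snapshot` are hypothetical, and a flat dictionary stands in for Git's trees.

```python
import hashlib


def store(db: dict, content: bytes) -> str:
    """Save content under a key derived from the content itself."""
    key = hashlib.sha1(content).hexdigest()
    db[key] = content  # storing identical content twice reuses the same entry
    return key


def snapshot(db: dict, files: dict) -> dict:
    """Record every file of a 'commit' and return a path -> object key mapping."""
    return {path: store(db, content) for path, content in files.items()}


db = {}
first = snapshot(db, {"README": b"docs v1", "app.py": b"print(1)"})
second = snapshot(db, {"README": b"docs v1", "app.py": b"print(2)"})

# README did not change, so both snapshots point at the very same object.
shared = [path for path in first if first[path] == second[path]]
print(shared)   # ['README']
print(len(db))  # 3 objects stored for 4 file versions
```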
-
Each commit also points to its parent commits in the history. Since each commit is a snapshot, a freeze frame of your project so to speak, all of these commits taken together form a sequence of snapshots, that is, a slice of your project's history.
-
The repository also contains references to commits, and the most important of these are called branches. That's what a branch is: a reference to a commit. Because it references a commit, and commits are linked together to form a history, a branch is basically the entry point to a history of commits. The same commit can belong to multiple branches.
-
Finally, there is a special pointer called HEAD. There can be only one HEAD. It usually points to a branch, and that's the current branch. The branch in turn points to a commit, so HEAD indirectly points to a commit, and that's the current commit.
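The two-step lookup from HEAD to a branch to a commit can be sketched as follows. The branch names and the short commit names `c2` and `c3` are invented for illustration, and `resolve_head` is a hypothetical helper; the `ref: refs/heads/...` form mirrors what Git writes into the `.git/HEAD` file when HEAD points to a branch.

```python
# Branches are just names that map to a commit.
refs = {
    "refs/heads/main": "c3",
    "refs/heads/feature": "c2",
}

# HEAD usually holds the name of a branch rather than a commit.
head = "ref: refs/heads/main"


def resolve_head(head: str, refs: dict) -> str:
    """Follow HEAD to the commit it points at, directly or through a branch."""
    prefix = "ref: "
    if head.startswith(prefix):
        return refs[head[len(prefix):]]  # HEAD -> branch -> commit
    return head  # HEAD names a commit directly


print(resolve_head(head, refs))  # c3

# Committing moves the current branch; HEAD follows without changing itself.
refs["refs/heads/main"] = "c4"
print(resolve_head(head, refs))  # c4
```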
-
Sometimes you perform operations that leave commits unreachable from any branch. For example, if you delete a branch, the commits that only that branch led to become unreachable: no branch points at them, either directly or indirectly. They are not part of any history anymore, so Git will eventually delete them, or garbage-collect them, if you will.
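Reachability is a plain graph walk from the branch tips along the parent links. The sketch below uses a made-up four-commit history and a hypothetical `reachable` helper; it models the idea only and ignores other things that can keep commits alive in a real repository.

```python
# Each commit lists its parents; c3 and c4 both build on c2.
parents = {"c1": [], "c2": ["c1"], "c3": ["c2"], "c4": ["c2"]}
branches = {"main": "c3", "feature": "c4"}


def reachable(parents: dict, tips) -> set:
    """Collect every commit that can be reached from the given tips."""
    seen = set()
    stack = list(tips)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents[commit])
    return seen


print(sorted(set(parents) - reachable(parents, branches.values())))  # []

# Deleting the branch removes only the reference, not the commit itself.
del branches["feature"]
orphans = set(parents) - reachable(parents, branches.values())
print(sorted(orphans))  # ['c4'] is now a candidate for garbage collection
```

Note that `c1` and `c2` stay reachable after the deletion because `main` still leads to them, which is also why the same commit can belong to several branches.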
-
You can visualize the index as something that stands between the working area and the repository. You generally don't move the data from the working area to the repository directly. You go through the index. That's why the index is also called the staging area. You stage your changes by adding them from the working area to the index and then you commit the changes from the index to the repository.
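The flow from working area to index to repository can be mimicked with three plain data structures. This is a toy model under loud assumptions: the functions `add` and `commit` only imitate the data movement of the corresponding Git commands, and the file names are invented.

```python
working = {"notes.txt": "draft 1"}  # files as they are on disk
index = {}                          # what has been staged
repository = []                     # list of committed snapshots


def add(path: str) -> None:
    """Stage a file: copy its current content from the working area to the index."""
    index[path] = working[path]


def commit() -> None:
    """Commit: record a snapshot of the index, not of the working area."""
    repository.append(dict(index))


add("notes.txt")
commit()

working["notes.txt"] = "draft 2"   # edited, but not staged
working["todo.txt"] = "buy milk"   # new file
add("todo.txt")                    # only this change is staged
commit()

# The second snapshot has the new file but still the old notes.txt,
# because the edit never went through the index.
print(repository[-1])
```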
-
You probably think of the index as a transition area, a launch pad of sorts. In this mental model, the index is normally empty; you add files from the working area to the index, you launch them into the repository by committing, and then the index is empty again. That model is convenient, but it is not literally how the index is implemented: the index keeps an entry for every tracked file, so right after a commit it matches the current commit rather than being empty.