Finally understand how git works, stop using it blindly, and learn the most important commands to become productive right away.
Have you ever found yourself in this situation?
You have contributed a few changes to a repository that somebody told you to clone, and now comes the time to pull and merge remote modifications, before you can push your code to the world. (If you don't understand this git babbling, please bear with me: you will very soon enjoy confusing others with such sentences as well.)
But now you start sweating. Are you going to execute this pull command? Your hand shakes as you're about to hit the return key.
I've been there. That was about fifteen years ago, but I still remember this feeling.
At the same time, we can't do without a version control system, can we?
And git is simply the most widely used and the best of all. I don't want to start a flame war, it's just a fact. Why, you might ask?
Well, for at least four reasons:
But git is also a rather complex tool, and it can be quite difficult to learn. Especially if you follow the official documentation and try to learn all of git right away. Or if you rightly bail out of the documentation and try to just learn a few commands, hoping it will be enough.
In this article, I will do my best to teach you git the right way.
You will learn:
For now, we will focus on a local git repo that we're going to create from scratch.
And in the next post, we will see how to use remote repositories to save our work and collaborate with other people.
To get started, you just need access to a terminal on a computer with git installed. This tutorial is written for *nix users, but I'm sure you can figure it out if you're using Windows (or get yourself a proper computer ;-) )
I encourage you to follow the instructions and type all commands by yourself, instead of just reading this article. It's going to stick better.
You don't need to clone a remote repository to use git.
In fact, you can initialize a brand new local git repository in any directory to start tracking your changes. And that's something I actually do very often, because I just feel uncomfortable when I edit files in a directory that is not under version control. The risk is too big: if I lose my work, I will have to redo it, and I hate that.
So let's create a test repository, which we're going to use throughout the tutorial.
mkdir test_repo
cd test_repo
git init
ls -a
You get:
. .. .git
The `git init` command initialized git in the `test_repo` directory, and created the hidden `.git` directory.
It's always nice to look into hidden directories:
ls -l .git
total 24
-rw-r--r-- 1 cbernet staff 23 Mar 31 22:03 HEAD
-rw-r--r-- 1 cbernet staff 137 Mar 31 22:03 config
-rw-r--r-- 1 cbernet staff 73 Mar 31 22:03 description
drwxr-xr-x 14 cbernet staff 448 Mar 31 22:03 hooks
drwxr-xr-x 3 cbernet staff 96 Mar 31 22:03 info
drwxr-xr-x 4 cbernet staff 128 Mar 31 22:03 objects
drwxr-xr-x 4 cbernet staff 128 Mar 31 22:03 refs
All the information about your repo will be stored here. There is no remote server involved, just this simple directory.
An interesting consequence is that if you delete `.git` or its parent, `test_repo`, you will lose your entire history! The solution to this potentially dramatic and rather probable event is to push your changes to a remote repository. We will come back to that in the next article.
For now, let's start using our repo. First, let's see what the status of the repo is:
git status
On branch master
No commits yet
nothing to commit (create/copy files and use "git add" to track)
That's not too interesting for now. Still, note that git always tries to help you by giving you hints about what to do next. Often, these hints are enough. Let's do what the hint says and create a file for ...
We start by creating a simple file with a single line. This is easily done from the command line:
echo 'hello world' > file.txt
cat file.txt
hello world
We check the status again:
git status
On branch master
No commits yet
Untracked files:
(use "git add <file>..." to include in what will be committed)
file.txt
nothing added to commit but untracked files present (use "git add" to track)
The file appears as "untracked", which means that git sees it in the directory but does not record its changes yet.
We keep following the instructions and add the file to git:
git add file.txt
git status
On branch master
No commits yet
Changes to be committed:
(use "git rm --cached <file>..." to unstage)
new file: file.txt
Aha! The file is now staged, meaning that it is set to be committed. As explained, we could unstage it by just following the hint. But don't bother remembering this: you'll get the hint when you need it.
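If you are curious about what unstaging does, here is a minimal sketch that stages a file and then unstages it with the command from the hint. It runs in a throwaway directory created with `mktemp` (an assumption of this example, so that your `test_repo` is left alone), and it uses `git status --porcelain`, a compact script-friendly form of `git status`.

```shell
set -eu
repo=$(mktemp -d)            # throwaway repo, not the tutorial's test_repo
cd "$repo"
git init -q
echo 'hello world' > file.txt

git add file.txt
staged=$(git status --porcelain)      # "A  file.txt": staged for the next commit

git rm -q --cached file.txt           # the command suggested by the hint
unstaged=$(git status --porcelain)    # "?? file.txt": untracked again

echo "after git add:         $staged"
echo "after git rm --cached: $unstaged"
```

Note that the file itself is untouched: only the staging area changes.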
So now we can commit, without forgetting to provide a useful commit message:
git commit -m 'initial version'
[master (root-commit) ab86e28] initial version
1 file changed, 1 insertion(+)
create mode 100644 file.txt
If this is the first time you're using git on this machine, you get a message like this:
[master (root-commit) 7e58b70] test
Committer: Colin Bernet <cbernet@lyocms23.lan>
Your name and email address were configured automatically based
on your username and hostname. Please check that they are accurate.
You can suppress this message by setting them explicitly. Run the
following command and follow the instructions in your editor to edit
your configuration file:
git config --global --edit
After doing this, you may fix the identity used for this commit with:
git commit --amend --reset-author
1 file changed, 1 insertion(+)
create mode 100644 file.txt
No worries, but please follow these instructions now, before going any further.
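If you prefer not to go through an editor, the identity can also be set directly from the command line with `git config`. The sketch below sets it for a single throwaway repository only; the name and email are placeholders, and adding `--global` to the two `git config` calls would store them in your personal configuration instead.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repo for this example
cd "$repo"
git init -q

git config user.name "Your Name"           # placeholder name
git config user.email "you@example.com"    # placeholder email

echo 'hello world' > file.txt
git add file.txt
git commit -q -m 'initial version'

# The commit now carries the identity we configured
author=$(git log -1 --format='%an <%ae>')
echo "$author"
```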
And we check the status again with `git status`, which gives:
On branch master
nothing to commit, working tree clean
Here is a summary about what we did, and a bit of terminology:
To get information about your last commit, do:
git show
commit ab86e28862528ffdda5737f94242989cd9ef1f51 (HEAD -> master)
Author: Colin Bernet <colin.bernet@cern.ch>
Date: Wed Mar 31 22:19:15 2021 +0200
initial version
diff --git a/file.txt b/file.txt
new file mode 100644
index 0000000..3b18e51
--- /dev/null
+++ b/file.txt
@@ -0,0 +1 @@
+hello world
There is a lot of information here:
- `commit ab86e28862528ffdda5737f94242989cd9ef1f51`. This is the commit ID. The commit ID uniquely identifies this commit. When referring to a commit, you don't need to type in the whole number, just the first few characters. Git will recognize it anyway.
- `HEAD -> master`. This means that we are on the branch `master` (don't worry, we will discuss branches just a bit later).
- The block from `diff --git` down to `+hello world`. This last part is a "diff", similar to what you would get with the `diff` command.
Now let's focus on the last part of this printout, which shows the changes introduced by the commit.
This part has a very specific format: it's a patch.
You know, sometimes when you upgrade a piece of software on your computer, you download and apply a patch. This patch is simply the difference between the new version and the old version of the software. If you apply the patch to the old version, you get the new version.
Patches are very convenient, because they make it possible to upgrade software without having to download the full new version, only the difference between the new version and the one you have, which represents a much smaller amount of data.
If you really want to understand git, you need to understand patching.
So let's create and apply a patch manually.
First, we create a new file:
echo "hello $USER" > user.txt
cat user.txt
Then create a patch file:
diff -u file.txt user.txt > patch.txt
cat patch.txt
We see that `patch.txt` now contains the changes between `file.txt` and `user.txt`, in the same format as in a git commit:
--- file.txt 2021-03-31 22:17:44.000000000 +0200
+++ user.txt 2021-04-02 08:20:05.000000000 +0200
@@ -1 +1 @@
-hello world
+hello cbernet
We can now apply this patch to `file.txt` to change its contents to what is in `user.txt`:
patch -u file.txt patch.txt
cat file.txt
Now `file.txt` contains the line `hello cbernet`, and no longer `hello world`.
We can patch a single file as we did above, and we can also patch a whole directory in one go.
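Here is a minimal sketch of patching a whole directory. The directory names `old` and `new` and the files inside them are made up for this example; `diff -ru` compares the two directories recursively, and `patch -p1` strips the leading directory name from the paths stored in the patch.

```shell
set -eu
work=$(mktemp -d)                    # scratch area for this example
cd "$work"
mkdir old new
echo 'hello world' > old/a.txt
echo 'unchanged'   > old/b.txt
echo 'hello there' > new/a.txt
echo 'unchanged'   > new/b.txt

# diff exits with status 1 when it finds differences, hence "|| true"
diff -ru old new > dir.patch || true

# Apply the patch from inside old/; -p1 removes "old/" and "new/" from the paths
(cd old && patch -s -p1 < ../dir.patch)

result=$(cat old/a.txt)
echo "$result"
```

Only `a.txt` appears in the patch, since `b.txt` is identical on both sides.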
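In git, you can think of a commit as a patch plus a reference to its parent commits. (Internally, git stores a full snapshot of your files for each commit and computes the patch on demand, but the patch picture is the one to keep in mind here.)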
For example, just consider this sequence of three commits:
When you decide to go to the version corresponding to commit 3, here is what happens:
At this point, your working directory is in sync with the version of commit 3.
A git branch is a simple pointer to a commit, nothing more.
Let's check the current status of our repository with `git status`:
On branch master
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: file.txt
Untracked files:
(use "git add <file>..." to include in what will be committed)
patch.txt
user.txt
We have modified `file.txt` with our manual patch above.
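To see exactly what changed in a tracked file before committing, `git diff` prints the pending changes as a patch. The sketch below is an illustration in a throwaway repository with a placeholder identity and made-up file contents. It also shows a detail worth knowing: plain `git diff` compares the working directory with the staging area, so it becomes empty once the change is staged, while `git diff --staged` shows what is staged.

```shell
set -eu
repo=$(mktemp -d)                      # throwaway repo for this example
cd "$repo"
git init -q
git config user.name "Your Name"       # placeholder identity
git config user.email "you@example.com"

echo 'hello world' > file.txt
git add file.txt
git commit -q -m 'initial version'

echo 'hello again' > file.txt          # modify the tracked file, without staging
changes=$(git diff)                    # working directory vs. staging area
echo "$changes"

git add file.txt
after_add=$(git diff)                  # nothing left unstaged: empty output
staged=$(git diff --staged)            # the same patch, now seen as staged
```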
Now, check the commit history with `git log`:
commit ab86e28862528ffdda5737f94242989cd9ef1f51 (HEAD -> master)
Author: Colin Bernet <colin.bernet@cern.ch>
Date: Wed Mar 31 22:19:15 2021 +0200
initial version
Currently, our history only contains one commit. Let's commit our modifications to `file.txt`:
git commit -am 'greeting user'
In this command, we have used two options:
- `-m` for the commit message
- `-a` which means "add all modified files to the staging area before committing". It only applies to files that git already tracks, so untracked files are left alone. It is just a convenient shortcut that saves us a call to `git add`.
Now check the history again:
commit cfc738b3bea930d4f14d6d1b93ff3af9d4880a37 (HEAD -> master)
Author: Colin Bernet <colin.bernet@cern.ch>
Date: Fri Apr 2 09:20:24 2021 +0200
greeting user
commit ab86e28862528ffdda5737f94242989cd9ef1f51
Author: Colin Bernet <colin.bernet@cern.ch>
Date: Wed Mar 31 22:19:15 2021 +0200
initial version
We see that a new commit has indeed appeared.
Most importantly, note that the branch `master` has moved from commit `ab86e28862` to commit `cfc738b3`.
Here is what happened during the commit:
- the commit `cfc738b` was created
- `master`, which is just a pointer, was moved to this new commit.
And what is `HEAD`? Just a reference to the branch currently in use.
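You can check this yourself: in the default on-disk layout, `HEAD` is a small text file in `.git` that names the current branch. The sketch below is an illustration in a throwaway repository with a placeholder identity; it reads the branch name with `git symbolic-ref` rather than assuming it is `master`, since the default branch name depends on your git configuration.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repo for this example
cd "$repo"
git init -q
git config user.name "Your Name"           # placeholder identity
git config user.email "you@example.com"
echo 'hello world' > file.txt
git add file.txt
git commit -q -m 'initial version'

head_file=$(cat .git/HEAD)                 # e.g. "ref: refs/heads/master"
current=$(git symbolic-ref --short HEAD)   # the branch HEAD refers to
echo "$head_file"
echo "current branch: $current"

git checkout -q -b tmp                     # move to a new branch...
after=$(cat .git/HEAD)                     # ...and HEAD now names that branch
echo "$after"
```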
Since `master` is just a pointer, we can delete it without destroying any commit. It really doesn't play any special role despite its glorious name.
I'd like to show you that, but we can't delete the branch we're currently sitting on.
So we start by creating and moving to a new branch:
git checkout -b tmp
Then we check the history again with `git log`:
commit 1aa0c80b03e172526f5dfc1b9e89c2265810cc3d (HEAD -> tmp, master)
Author: Colin Bernet <colin.bernet@cern.ch>
Date: Fri Apr 2 09:58:28 2021 +0200
greeting user
commit ab86e28862528ffdda5737f94242989cd9ef1f51
Author: Colin Bernet <colin.bernet@cern.ch>
Date: Wed Mar 31 22:19:15 2021 +0200
initial version
We see that the new `tmp` branch points to the same commit as `master`, and that we are now sitting on `tmp`, since `HEAD` points to `tmp`.
Another way to look at your branches is to use:
git branch
This gives the list of all branches and indicates the current branch with a star:
master
* tmp
Now you can delete master with:
git branch -d master
Check that `master` was indeed deleted with `git branch` or `git log`, and then recreate and move back to `master`:
git checkout -b master
See? Nothing bad happened.
To conclude this section, let's see how git keeps track of branches.
They are stored in `.git/refs/heads`:
ls .git/refs/heads
master tmp
There is a file for each branch in this directory.
If we look into the `master` file, we see that it only contains the commit ID that `master` is pointing to:
1aa0c80b03e172526f5dfc1b9e89c2265810cc3d
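The same check can be scripted. This is a small sketch in a throwaway repository with a placeholder identity, assuming a freshly created repo where each branch is still stored as its own file: it compares the content of the branch file with the commit ID that git reports, and shows that a second branch created on the same commit gets a file with the same content.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repo for this example
cd "$repo"
git init -q
git config user.name "Your Name"           # placeholder identity
git config user.email "you@example.com"
echo 'hello world' > file.txt
git add file.txt
git commit -q -m 'initial version'

branch=$(git symbolic-ref --short HEAD)    # current branch name
in_file=$(cat ".git/refs/heads/$branch")   # what the branch file contains
resolved=$(git rev-parse HEAD)             # the commit ID according to git

git branch tmp                             # a second pointer to the same commit
tmp_file=$(cat .git/refs/heads/tmp)

echo "$branch -> $in_file"
echo "tmp -> $tmp_file"
```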
So far, we have only used basic git commands. That served our needs, but most commands can be very much improved with a few options.
Fifteen years ago, my colleague at CERN Giulio Eulisse passed me down his git config.
It has been immensely useful and now, it is my turn to pass it down to you. I hope you'll make good use of it.
Your git configuration is stored in your home directory, in the file `~/.gitconfig`.
Here is a portion of mine:
[core]
excludesfile = /Users/cbernet/.gitignore_global
editor = nano
[user]
name = Colin Bernet
email = colin.bernet@cern.ch
github = cbernet
[color]
ui = true
[color "status"]
added = yellow
changed = green
untracked = cyan
[alias]
co = checkout
b = branch -vv
l = log --graph --all --abbrev-commit --date=relative --format=format:'%C(bold blue)%h%C(reset) %C(green)%ar %C(blue)%an%C(bold blue)%d%C(reset)%n %C(white)%s%n'
lt = log --graph --abbrev-commit --date=relative --format=format:'%C(bold blue)%h%C(reset) %C(green)%ar %C(blue)%an%C(bold blue)%d%C(reset)%n %C(white)%s%n'
l1 = log --pretty=oneline --decorate
s = status
There are several sections in this file. The `[alias]` section defines the following shortcuts:
- `co`: just a shortcut for `git checkout`, which is used very often and takes too long to type
- `b`: shortcut to `git branch`, with more information
- `l`: nice history log with a tree view, for the whole repo
- `lt`: same, but only for the current branch
- `l1`: one-liner log
- `s`: shortcut for status
Now, I suggest you merge this config with yours and try the aliases. The command `git l`, in particular, will make it much easier for you to understand what comes next.
If you like these aliases, feel free to go and thank Giulio on twitter!
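If you'd rather not edit `~/.gitconfig` by hand, aliases can also be defined from the command line with `git config`. A minimal sketch, run in a throwaway repository with a placeholder identity: here the aliases are stored in the configuration of that repository only, and adding `--global` to each `git config` call would write them to `~/.gitconfig` instead.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repo for this example
cd "$repo"
git init -q
git config user.name "Your Name"           # placeholder identity
git config user.email "you@example.com"

# Define three of the aliases, for this repository only
git config alias.co checkout
git config alias.s status
git config alias.l1 'log --pretty=oneline --decorate'

echo 'hello world' > file.txt
git add file.txt
git commit -q -m 'initial version'

git co -q -b tmp                           # same as: git checkout -q -b tmp
current=$(git symbolic-ref --short HEAD)
lines=$(git l1 | wc -l | tr -d ' ')        # one line per commit
echo "on branch $current, $lines commit(s) in the log"
```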
In my git config, I specify a global gitignore file for all my git repositories. It contains the following lines:
*.pyc
*~
.idea
secrets
This file tells git to ignore all files and directories matching one of these patterns:
You could define a global .gitignore that suits your needs, or put a .gitignore file in each repository. I typically do both.
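To see these patterns at work, here is a small sketch using a per-repository `.gitignore` with the same four patterns. The file names are made up for the example. `git status --porcelain` lists only what git still sees, and `git check-ignore -v` tells you which pattern is responsible for ignoring a given file.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repo for this example
cd "$repo"
git init -q

# Same patterns as above, in a .gitignore local to this repository
printf '%s\n' '*.pyc' '*~' '.idea' 'secrets' > .gitignore

# Hypothetical files: three of them match a pattern, main.py does not
touch module.pyc 'notes.txt~' main.py
mkdir secrets
touch secrets/token

visible=$(git status --porcelain)          # ignored files do not show up here
echo "$visible"

# Prints the file, line and pattern that cause module.pyc to be ignored
git check-ignore -v module.pyc
```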
Until now, we built a rather simple tree with a single branch of only two commits.
Real-life projects are more complicated, with different features being developed at the same time on different branches.
In this section, we're going to make things a bit more complex.
First, I would like to remind you of the current state of our repository (`git l`):
* 1aa0c80 8 hours ago Colin Bernet (HEAD -> master, tmp)
| greeting user
|
* ab86e28 2 days ago Colin Bernet
initial version
Let's assume that you want to start developing a new feature based on the first commit. We start by creating a new branch on this commit:
git co -b new_feature ab86e28
As expected, you're now sitting on branch `new_feature`, which points to the commit on which you want to base your development:
* 1aa0c80 8 hours ago Colin Bernet (tmp, master)
| greeting user
|
* ab86e28 2 days ago Colin Bernet (HEAD -> new_feature)
initial version
Now let's do some development. We simply add a new line to `file.txt`:
echo 'hello Joe' >> file.txt
cat file.txt
This gives:
hello world
hello Joe
Check the status of your repository with `git s`, and commit with:
git commit -am 'greeting Joe'
Then look at your tree with `git l`:
* 824db2c 3 seconds ago Colin Bernet (HEAD -> new_feature)
| greeting Joe
|
| * 1aa0c80 8 hours ago Colin Bernet (tmp, master)
|/ greeting user
|
* ab86e28 2 days ago Colin Bernet
initial version
With Giulio's alias, we get a very clear view of what's going on.
The commits are ordered by commit time, and we see that the `new_feature` and `master` branches have diverged.
Now, do the following:
- switch to the `master` branch with `git co master`, and check the contents of `file.txt`
- switch back to `new_feature`, and check the contents of this file again.
Don't be afraid to use branches:
After you've developed your new feature, you will want to integrate your changes into `master`. Again, this branch is just a normal branch. But by convention, we take it as the "official" development path.
Let's do this.
To integrate the changes of `new_feature` into `master`, we need to merge `new_feature` into `master`. To do this:
git co master
git merge new_feature
Argh! We get a conflict! :-)
Auto-merging file.txt
CONFLICT (content): Merge conflict in file.txt
Automatic merge failed; fix conflicts and then commit the result.
Don't worry, conflicts are perfectly normal. Let's see how to solve this one. The first step is to check the status with `git s`:
On branch master
You have unmerged paths.
(fix conflicts and run "git commit")
(use "git merge --abort" to abort the merge)
Unmerged paths:
(use "git add <file>..." to mark resolution)
both modified: file.txt
Untracked files:
(use "git add <file>..." to include in what will be committed)
patch.txt
user.txt
no changes added to commit (use "git add" and/or "git commit -a")
Again, git gives us fairly precise instructions. We could abort the merge, or resolve the conflict by hand. The files in conflict are indicated (only `file.txt` in this case, but you could get conflicts in several files).
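If you ever want to back out, aborting is safe. The sketch below rebuilds a similar conflict in a throwaway repository (placeholder identity, and `hello user` standing in for the greeting), lists the conflicted files with `git diff --name-only --diff-filter=U`, then aborts the merge and checks that the working directory is back to its state before the merge.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repo for this example
cd "$repo"
git init -q
git config user.name "Your Name"           # placeholder identity
git config user.email "you@example.com"
echo 'hello world' > file.txt
git add file.txt
git commit -q -m 'initial version'
main_branch=$(git symbolic-ref --short HEAD)

git checkout -q -b new_feature             # first line of development
echo 'hello Joe' >> file.txt
git commit -q -am 'greeting Joe'

git checkout -q "$main_branch"             # second line, touching the same spot
echo 'hello user' > file.txt
git commit -q -am 'greeting user'

# The merge stops on the conflict and returns a non-zero exit status
git merge new_feature > /dev/null 2>&1 || echo "merge stopped on a conflict"
conflicted=$(git diff --name-only --diff-filter=U)   # files still in conflict
echo "in conflict: $conflicted"

git merge --abort                          # give up and restore the previous state
after_abort=$(git status --porcelain)      # empty: the working tree is clean
```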
We're going to resolve the conflict. For this, we open the file in conflict with our favorite editor, and we see:
<<<<<<< HEAD
hello cbernet
=======
hello world
hello Joe
>>>>>>> new_feature
Above the `=======` separator, we see what we have on `HEAD`, meaning the branch we're sitting on, that is `master`. Under the separator, we see what's on the `new_feature` branch.
Now it's up to you. You decide what you want to keep based on what you know. In any case, you need to remove the conflict markers. For example, you could edit the file to:
hello cbernet
hello Joe
That's it, you resolved the conflict. Then, just follow the instructions provided by git:
git add file.txt
git s
On branch master
All conflicts fixed but you are still merging.
(use "git commit" to conclude merge)
Changes to be committed:
modified: file.txt
Untracked files:
(use "git add <file>..." to include in what will be committed)
patch.txt
user.txt
git commit
And you're done!
Conflicts are normal. Just keep cool.
Finally, we can check the history with `git l`:
* f1ca7e2 3 seconds ago Colin Bernet (HEAD -> master)
|\ Merge branch 'new_feature'
| |
| * 824db2c 2 hours ago Colin Bernet (new_feature)
| | greeting Joe
| |
* | 1aa0c80 9 hours ago Colin Bernet (tmp)
|/ greeting user
|
* ab86e28 2 days ago Colin Bernet
initial version
The new merge commit (f1ca7e2) has two parents, and its history contains all commits from both lines.
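You can ask git for the parents of a commit with the `%P` format placeholder. Here is a sketch in a throwaway repository with a placeholder identity; the file names are made up, and the two branches touch different files so that the merge goes through without a conflict.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repo for this example
cd "$repo"
git init -q
git config user.name "Your Name"           # placeholder identity
git config user.email "you@example.com"
echo 'hello world' > file.txt
git add file.txt
git commit -q -m 'initial version'
main_branch=$(git symbolic-ref --short HEAD)

git checkout -q -b new_feature
echo 'feature' > feature.txt               # hypothetical file on the feature branch
git add feature.txt
git commit -q -m 'add feature'

git checkout -q "$main_branch"
echo 'other' > other.txt                   # hypothetical file on the main branch
git add other.txt
git commit -q -m 'add other'

git merge -q --no-edit new_feature         # the branches diverged: a merge commit is created

parents=$(git show -s --format=%P HEAD)    # parent commit IDs, separated by spaces
count=$(echo "$parents" | wc -w | tr -d ' ')
echo "the merge commit has $count parents"
```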
As we said before, a branch is simply a pointer to a commit. So if you delete the branch, the commit still exists.
Now that we have merged, we can get rid of obsolete branches, which is good practice.
git b -d new_feature tmp
git b
* master f1ca7e2 Merge branch 'new_feature'
A git tag is a pointer to a commit, just like a git branch.
But unlike branches, tags don't move.
They are a way to bookmark important states in the development of a project. People use tags to mark the commit corresponding to a release of the software.
Here is what our repository tree currently looks like:
* f1ca7e2 14 hours ago Colin Bernet (HEAD -> master)
|\ Merge branch 'new_feature'
| |
| * 824db2c 16 hours ago Colin Bernet
| | greeting Joe
| |
* | 1aa0c80 24 hours ago Colin Bernet
|/ greeting user
|
* ab86e28 3 days ago Colin Bernet
initial version
Let's assume that you want to release your software at the version corresponding to commit `ab86e28`. You could tag it like this (using any name you want):
git tag v0.0.1 ab86e28
git l
* f1ca7e2 15 hours ago Colin Bernet (HEAD -> master)
|\ Merge branch 'new_feature'
| |
| * 824db2c 16 hours ago Colin Bernet
| | greeting Joe
| |
* | 1aa0c80 24 hours ago Colin Bernet
|/ greeting user
|
* ab86e28 3 days ago Colin Bernet (tag: v0.0.1)
initial version
Just like for branches, git keeps track of tags with simple files containing only the corresponding commit ID. You can find these files in `.git/refs/tags/`.
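To convince yourself that tags stay put while branches move, here is a small sketch in a throwaway repository with a placeholder identity. It tags the first commit, adds a second commit, and then compares where the tag and the branch point. It uses a simple tag as in this article; tags created with `git tag -a` are stored a bit differently.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repo for this example
cd "$repo"
git init -q
git config user.name "Your Name"           # placeholder identity
git config user.email "you@example.com"
echo 'hello world' > file.txt
git add file.txt
git commit -q -m 'initial version'

first=$(git rev-parse HEAD)                # ID of the first commit
git tag v0.0.1                             # tag the current commit

echo 'hello again' >> file.txt
git commit -q -am 'second version'         # the branch moves to the new commit

tag_target=$(git rev-parse v0.0.1)         # still the first commit
branch_target=$(git rev-parse HEAD)        # the second commit
tag_file=$(cat .git/refs/tags/v0.0.1)      # the tag file holds the commit ID

echo "tag:    $tag_target"
echo "branch: $branch_target"
```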
To remove a tag, do:
git tag -d v0.0.1
Congratulations, you're at the end of the first part of my git tutorial.
So far, you have learnt how to work with a local git repository.
In particular, you learnt about these essential commands. Make sure to remember them:
Basics
- `git init`: initialize a git repository
- `git status` or `git s`: look at the current status of your repository
- `git diff`: check the diffs between your working directory and your repository
- `git add <file>`: add a file to the staging area, for the next commit
- `git commit -m <message>`: commit
- `git show`: look at the current commit
- `git log`: print the commit history
- `git l`: get a nice tree view of the history
- `git lt`: same, but only for the current branch
- `git l1`: history with a single line per commit
Branches & tags
- `git checkout -b <branch>`: create a branch and move to this branch
- `git branch` or `git b`: print all branches and show the current branch
- `git branch -d <branch>`: delete a branch
- `git merge <branch_to_merge>`: merge a branch into the current branch
- `git tag <tag> [commit]`: create a tag at the current commit, or at the specified commit
In the next posts, we will discuss:
Please let me know what you think in the comments! I’ll try and answer all questions.
And if you liked this article, you can subscribe to my mailing list to be notified of new posts (no more than one mail per week I promise.)