Using git bisect to find the first bad commit

Today, I am going to introduce a technique for quickly finding the first bad commit in your git repository. With this technique, you can save a lot of time when debugging a problem introduced by a particular commit. Let's get started!

First, assume that we have the following commits in our project, listed newest first ("Initial commit" is the first commit):

e1b2cef Yes, Another change
307bb41 Another change
ccdc070 Add footer
f6ce740 Add container
f258f1c Add styles
c1417d3 Initial commit

Let's say that the bad commit in this case is f6ce740 Add container. It includes an annoying bug that is causing trouble in our project, but we don't know when the bug was introduced. The only thing we know for sure is that the bug is not present in the first few commits. Imagine that your HEAD is e1b2cef Yes, Another change. If you want to find the bad commit, here are a couple of options:

  • Look at the commit history and check the modified files in each commit: This is the simplest but also the most time-consuming method. In the example above, there are only a few commits. But think of a big project with thousands of commits. It is really hard (or impossible) to check each one. Even if you limit the history to the commits touching a specific file, there can still be a large number of them, and most are unrelated to the bug.
  • Use git bisect: awesome!

As you may have learned at university or in a programming course, there is a quick and powerful algorithm for finding an element in a sorted array. It is called "binary search", and git bisect is based on the same idea (did you notice the prefix bi- in the command?). Here is an illustration of our commit log:

---------+++++++++*

Here, - is a normal (good) commit, + is a commit containing bad code and * is the current HEAD. Our job is to find the first + in our commit log. This involves two steps:

  1. Identify a good commit that comes before the first bad commit. Then identify a commit that includes the bad code, as close as possible to where you think the first bad commit is. This reduces the number of commits we need to search (fewer commits means a faster search). If you cannot guess where the bad commit is, just choose any commit that includes the bad code. It still works fine.
  2. Run git bisect to start searching
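
To make the binary search idea concrete, here is a minimal sketch in plain shell. It is not how git implements bisect internally. The function name first_bad_index and the history string are made up for illustration. The string uses the same notation as the illustration above (oldest commit first, without the * marker), and the sketch assumes the first commit is good and the last one is bad.

```shell
# Hypothetical helper, not part of git: binary-search a history string.
# The string lists commits oldest first; '-' is good and '+' is bad.
# Assumes the first commit is good and the last one is bad.
first_bad_index() (
  history=$1
  lo=1                # 1-based position of a known-good commit
  hi=${#history}      # 1-based position of a known-bad commit
  while [ $((hi - lo)) -gt 1 ]; do
    mid=$(( (lo + hi) / 2 ))
    state=$(printf '%s' "$history" | cut -c "$mid")
    if [ "$state" = "+" ]; then
      hi=$mid         # mid is bad, so the first bad commit is mid or older
    else
      lo=$mid         # mid is good, so the first bad commit is newer than mid
    fi
  done
  echo "$hi"          # position of the first bad commit
)

first_bad_index "---------+++++++++"   # prints 10
```

Each pass through the loop tests one commit and throws away about half of the remaining range, which is exactly the question git bisect asks you to answer at each step.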

All right, let's go back to our example. We know that Initial commit is a good commit, and we know that the current commit, Yes, Another change, is a bad one. Here is how I run the commands to find the first bad commit, step by step.

First, we need to let git know that we are going to start the bisect procedure:

git bisect start

Next, we let git know which commit is bad. If the current commit is the bad one, you don't have to specify the hash:

git bisect bad

Then, we let git know which commit is definitely good. In this case, it is the Initial commit, so we have to specify its hash:

git bisect good c1417d3

At this point, git starts bisecting. It tells you roughly how many commits are left to test and which commit it has checked out (git picks a commit near the middle of the selected range):

Bisecting: 2 revisions left to test after this (roughly 1 step)
[f6ce7408ad0826436c00eb94b7973a965362bc4c] Add container

Yeah, git automatically checks out the commit Add container. Our job is to run our code and let git know whether this commit contains the bad code or not. In our case, this is a bad commit, so we tell git that it is bad:

git bisect bad

(Of course, if this commit were good, we would run git bisect good instead.)

Now git knows this commit is bad, so it no longer needs to test any of the later commits: the first bad commit must be this one or an earlier one. You have just cut the search range in half (binary search). Git then moves on to the next candidate:

Bisecting: 0 revisions left to test after this (roughly 0 steps)
[f258f1ce329e4a9dc29bc28e5b03b94838f50333] Add styles

Cool, it seems we are very close to knowing which commit is bad, because after this step there will be no commits left to test. We are now at the Add styles commit (again, this commit is in the middle of the current search range). Let's say that this commit is good, so we run:

git bisect good

Tada! git is now able to conclude which commit is the first bad one:

f6ce7408ad0826436c00eb94b7973a965362bc4c is the first bad commit
commit f6ce7408ad0826436c00eb94b7973a965362bc4c
Author: xuanchien <chien.study@gmail.com>
Date:   Tue Apr 7 10:06:05 2015 +0900

    Add container

:100644 100644 181eaba787ba21f0ba37f6e1f59d2b65f1484275 43c84923dba77ddd3697b5606a211e7aa166f68a M	index.html
:100644 100644 797174569262aaecbeb627eecc7076cdd762e2c8 74f2a3be79fb260233d18a64b24273dcd3baf658 M	style3.css
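
One thing to remember: after git reports the first bad commit, the repository is still in bisect mode, and git bisect reset takes you back to the commit you were on before starting. The sketch below replays the whole session in a throwaway repository. The repository, the file contents, the BUG marker and the grep check are all made up for this demo. The git subcommands themselves are real: git bisect start accepts the bad commit followed by the good commit, and git bisect run marks each commit for you based on a command's exit code (0 means good, 1 means bad).

```shell
# Sketch only: the repository, file contents and BUG marker are made up.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

# Six commits, oldest first; the third one ("Add container") adds the bug.
n=0
for msg in "Initial commit" "Add styles" "Add container" \
           "Add footer" "Another change" "Yes, Another change"; do
  n=$((n + 1))
  echo "change $n" >> index.html
  if [ "$n" -eq 3 ]; then echo "BUG" >> index.html; fi
  git add index.html
  git commit -q -m "$msg"
done

good=$(git rev-list --max-parents=0 HEAD)   # hash of the root commit
git bisect start HEAD "$good"               # bad commit first, then good commit

# The check exits 0 (good) when BUG is absent and 1 (bad) when it is present.
result=$(git bisect run sh -c '! grep -q BUG index.html')
echo "$result" | grep "is the first bad commit"

# While still in bisect mode, refs/bisect/bad points at the first bad commit.
first_bad_subject=$(git log -1 --format=%s refs/bisect/bad)
echo "$first_bad_subject"

git bisect reset                            # leave bisect mode, back to HEAD
```

In a real project you would replace the grep check with your own test command, as long as it exits with 0 for a good commit and a non-zero code such as 1 for a bad one.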

Do you see how cool git bisect is? It saves you lots of time. Because every step cuts the remaining range roughly in half, it should usually take no more than about 20 steps to find your bad commit, even with around a million commits in the range.
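
To see where a number like 20 comes from, here is a rough sketch of the arithmetic. The function bisect_steps is a hypothetical helper, not a git command, and it only models "halve the range on every step". The step count git actually reports can differ slightly.

```shell
# Hypothetical helper, not part of git: rough number of halving steps
# needed to narrow N candidate commits down to one (about log2 of N).
bisect_steps() (
  remaining=$1
  steps=0
  while [ "$remaining" -gt 1 ]; do
    remaining=$(( (remaining + 1) / 2 ))   # each answer discards about half
    steps=$((steps + 1))
  done
  echo "$steps"
)

bisect_steps 1000      # prints 10
bisect_steps 1000000   # prints 20
```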

Now, it is time for you to enjoy coding instead of banging your head against the wall. Let me know what you think in the comments. Happy coding!