Debugging code is traditionally more of an art than a science. Watching a master debugger work is like watching a master chef. There is so much that goes on in their brain that it seems like magic when they finally figure out the issue.
One of my core motivations for this blog was to document my debugging techniques. Being able to isolate and elaborate my techniques has made me a better debugger in general. I’ve found that explaining how I debug has made me consider debugging a science rather than an art.
I looked around in the recent commits for a few hours, but found nothing that pointed me towards the source of the issue. I kept looking and looking without any results. The fact that it happened recently told me there must have been a change somewhere. If I had kept on this path, I might eventually have found the issue, but only after countless hours.
Instead of spinning my wheels and burning time, I decided to take a scientific approach to debugging. Traditional debugging involves looking at and comparing code. Profiling, analysis, and breaking on exceptions are all code-oriented debugging techniques. There are several implementations and opinions on how to debug, but they all focus on the code itself.
This means that there is a change in the code between Jan 1st, 2014 and today that is causing this issue. If I isolated my search to just those changes, it would be around 2,000 commits. That is still too much for me to look at, so I decided to refine my search again. I checked out my repos as they stood on Feb 1st, 2014. If the bug happened in this version, I would know that there was a change sometime during the month of January that caused this bug. If the bug did not happen, then I could ignore the January changes as the source of this bug. The bug did not happen in the Feb version, so I was able to prune my commit set down to 1,500 commits.
I repeated this technique, narrowing my dates each time, until I identified the exact commit. The bug did not occur in the April 23rd checkout but did in the April 24th one, so the change landed sometime on the 23rd. Within that day, it did not occur at noon but did at 3 PM. This narrowed my search down to one commit, which was ultimately the culprit.
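The narrowing described above is a binary search over time. Here is a minimal sketch of that loop, working in Unix epoch seconds. The function bug_present_at is a hypothetical stand-in for the manual step (check out every repository at that time, build, and see whether the bug appears); it is simulated here with a made-up timestamp so the sketch can run on its own.

```shell
#!/bin/sh
# Made-up timestamp at which the simulated bug "appears" (assumption,
# used only so this sketch is runnable without a real project).
INTRODUCED_AT=1398276000

# Hypothetical stand-in for the manual check: succeeds if the bug is
# present in the code as of time $1 (epoch seconds).
bug_present_at() {
  [ "$1" -ge "$INTRODUCED_AT" ]
}

# Narrow the window between a known-good time ($1) and a known-bad
# time ($2) until it is at most $3 seconds wide.
# Prints "<last good> <first bad>".
bisect_by_time() {
  good=$1
  bad=$2
  tolerance=$3
  while [ $((bad - good)) -gt "$tolerance" ]; do
    mid=$((good + (bad - good) / 2))
    if bug_present_at "$mid"; then
      bad=$mid    # bug already present: the change is at or before mid
    else
      good=$mid   # bug absent: the change is after mid
    fi
  done
  echo "$good $bad"
}
```

Each pass halves the window, so about 2,000 commits need roughly eleven checks rather than 2,000. For a single repository, Git's built-in git bisect performs the same kind of search over commits instead of dates.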
The code base I’m using is a set of Git repositories. Git is very helpful for this kind of debugging, as it offers an easy way to check out the repository as it was at a specific date. This is the command I use to do the date checkout:
git checkout `git rev-list -n 1 --before="YYYY-MM-DD HH:MM" BRANCH`
The project I’m working on is composed of a dozen different Git repositories, all tied together through Maven dependencies. Checking out only one repository at a specific date may not work, because the other repositories carry different expectations and could rely on newer features that the old version lacks.
Making sure that I check out each repository at a specific date is the only way to get the project close to a working state. When doing time-oriented debugging, I create a shell script that goes into each repository, stashes the changes, runs the command, then repeats the process for the next repository. I also have a date variable at the top of the script to make the process of binary search just a little bit easier.
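A script along these lines could look like the following sketch. It is an illustration under stated assumptions, not the exact script described above: the workspace path, repository names, and branch are placeholders, and the function names are made up for this example.

```shell
#!/bin/sh
# Date to check out; edit this between runs of the binary search.
CHECKOUT_DATE="2014-02-01 00:00"

# Placeholder values (assumptions): adjust to the real project layout.
BRANCH="master"
WORKSPACE="$HOME/projects/example"
REPOS="repo-a repo-b repo-c"

# Stash local changes in one repository, then check out the last
# commit on the branch made before the given date.
checkout_repo_at() {
  repo_dir=$1
  when=$2
  branch=$3
  (
    cd "$repo_dir" || exit 1
    git stash
    commit=$(git rev-list -n 1 --before="$when" "$branch")
    if [ -z "$commit" ]; then
      echo "No commit before $when in $repo_dir" >&2
      exit 1
    fi
    git checkout "$commit"
  )
}

# Repeat the process for every repository in the workspace.
checkout_all_at() {
  for repo in $REPOS; do
    checkout_repo_at "$WORKSPACE/$repo" "$CHECKOUT_DATE" "$BRANCH" || return 1
  done
}

# Call checkout_all_at here once the placeholders above are filled in.
```

The checkout leaves each repository in a detached HEAD state, and any stashed changes stay in the stash until they are restored with git stash pop after returning to the branch.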
— Zach Gardner
This post originally appeared on May 7th, 2014 on zgardner.us.