A careful commit history will make code review easier

Written by Adrian Kjær Dalhaug at 2018-12-08 22:22:49 UTC.

I've noticed that a single commit with five lines of code will get a lot of comments and suggestions during code review. If you try to get 1000 lines reviewed, you'll probably get a green light. This phenomenon is called bike-shedding.

I would like to argue that the commit history presented during code review could tip the scale in the direction of bike-shedding or not. Too many times have I pushed the following series of commit messages:

First take on feature X.
Further improvements.
Added tests.
Fixed failing test.
Clean up.

There would often be another commit at the end in order to improve something brought up during code review.

A number of problems arises. The first one being that apart from the very first commit message, the rest of them doesn't tell a story outside of the review. When it's merged to master, it'll be a series of unrelated commits.

Another problem would be that all commits to master should pass the test suite -- at least in an ideal world. Having lots of commits that fail tests are not good.

The last problem is the intended audience. Who are those commit messages written for? What purpose do they serve for others? They are clearly written to accommodate the author, and not the reviewer or your fellow teammate from the future.

Those are the problems -- now let's look at the solution.

It isn't wrong to write such commit messages during development. The harm is done when pushing those changes as-is. I've never done a rebase in git and I'd like that to stop by trying a new tactic when doing changes over the next couple of weeks to make reviewing my changes a lot easier.

Before pushing my topic branches I'll from now on do a git rebase -i --autosquash HEAD~X, where X is the number of commits on this topic branch. This will open the $EDITOR and let you rewrite history to make it coherent and reviewable.

The trick is to separate the work into logical units. There are a number of tools at your disposal during an interactive rebase. You could reorder commits by moving the lines in such a way that each unit is together. Another smart move would be that each unit has exactly one commit by squashing them, and splitting unrelated work in the same commit into multiple commits by editing them.

Use autosquash if you discover a mistake with already committed code. By using the flag --squash <sha of original commit> during a new commit, it'll be marked as ready for squash when doing the rebase later on -- thus saving some time.

The last part of my new workflow would be how I'll incorporate changes as a result of a code review. I will rebase the changes into the corresponding commits and do a push. This would give me an error message which could be avoided with --force-with-lease. Using this flag over plain old --force is to make sure you'll not overwrite some history other might have pushed. It's safer that way.

To summarize:

One commit is one logical unit of work. Fixes later on are squashed into the one commit.
Write good commit messages aiding the reviewer.
Rebase to make the history coherent and easy to review.
Use --force-with-lease instead of --force.

I'll hopefully post a follow-up post in the future with the results of this workflow.

Comments? Mail them to: adrian@enpo.no.

Generated at 2018-12-08 22:22:49 UTC.