The second in our series of “What you always wanted to know…” blogs. Today Philip Armour, our newest Technical Consultant, shares his thoughts on Git commit messages.
What’s one of the ubiquitous features of a Source Code Management (SCM) tool?
It is, it seems, the existence of a ticket associated with each atomic set of changes (changes that are accepted in their entirety or not at all) made to a software asset over time.
In addition to recording how the software evolves, each one also includes a human-readable message which summarises why the change was needed. In short: they’re important.
Some years ago my main SCM tool was IBM Rational Synergy, and the ticket for this tool is called a task; in Perforce it is called a changelist. Incidentally, these were normally created prior to any files being changed.
Now that I have become a learner and user of Git, I know that the ticket in Git is the commit, and that I need to create one after I have finished some updates. This is also when the human-readable commit message is written.
Why it matters
Regardless of the SCM tool, good commit messages are important, perhaps just as important as writing meaningful comments within your source code.
Git sits fairly low down in the tool chain, and thanks to its distributed nature, if you’ve done a git fetch origin recently all this information should be in your local repository and very easy to access.
Updates to code normally span multiple source files, and this means that commit messages have a more ‘aerial view’ perspective than the ‘ground-level’ comments in source code. They tell the story of how the software has evolved; they can help greatly when your boss asks if bugfix ABC went into customer branch DEF (for example by using git log --grep="ABC"); and they represent a form of passive ‘to whom it may concern’ collaboration which results in hard-to-quantify but tangible and definite long-term benefits.
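As a rough sketch of the "did bugfix ABC go into customer branch DEF?" question, the commands below build a throwaway repository and then search it. The bug ID ABC and the branch name DEF are the placeholders used above; the user name, email address, and commit subjects are made up for illustration.

```shell
# Throwaway repository so the commands can be tried safely
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"        # illustrative identity
git config user.email "dev@example.com"

git commit -q --allow-empty -m "Initial commit"
git checkout -q -b DEF
git commit -q --allow-empty -m "Fix ABC: handle empty input"

# List commits reachable from DEF whose message mentions ABC
matches=$(git log --oneline --grep="ABC" DEF)
echo "$matches"

# Once a matching commit is known, ask which local branches contain it
fix=$(git log -1 --format=%H --grep="ABC" DEF)
branches=$(git branch --contains "$fix")
echo "$branches"
```

The search is only as good as the messages: it works here because the commit subject names the bug ID.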
So how do we get the most out of our Git commit messages?
When learning Git and messing around with our personal ‘sandbox’ repositories from the command line it is understandable to create quick commits like:
git commit -m "change to some code"
But this is probably not a great habit to keep hold of when modifying real production code.
It is better to simply use git commit and let Git invoke our default editor for us. When we do this, Git brings up a default message with some advisory comments, which we can then tailor to our specific requirements using the commit.template variable in our git config.
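A minimal sketch of setting up such a template follows. The file name .gitmessage and the wording of the comment lines are assumptions, and the setting is applied to a throwaway repository here; adding --global to the git config call would apply it to every repository for the current user.

```shell
# Throwaway repository so the setting does not touch real projects
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical template: lines starting with '#' are advisory comments
# that Git strips from the final message when the editor is used
cat > "$repo/.gitmessage" <<'EOF'

# Summary line above: 50 characters or less, imperative mood
# Leave one blank line, then explain why the change was needed
# Issue ID:
EOF

# Point Git at the template (absolute path)
git config commit.template "$repo/.gitmessage"
git config --get commit.template
```

From then on, a plain git commit in that repository opens the editor pre-filled with the template.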
A template for good messages
According to the Pro Git reference (written by Scott Chacon and Ben Straub), a way to get good commit messages is to follow the template suggested by Tim Pope:
Short (50 chars or less) summary of changes

More detailed explanatory text, if necessary. Wrap it to
about 72 characters or so. In some contexts, the first
line is treated as the subject of an email and the rest of
the text as the body. The blank line separating the
summary from the body is critical (unless you omit the body
entirely); tools like rebase can get confused if you run
the two together.

Further paragraphs come after blank lines.

  - Bullet points are okay, too

  - Typically a hyphen or asterisk is used for the bullet,
    preceded by a single space, with blank lines in
    between, but conventions vary here
A key feature of the above is the one-line summary, separated from the body by a blank line. Many writers on this topic also advise using the imperative present rather than the past, so “Fix ABC issue” rather than “Fixed ABC issue”.
Consistent good practice with regard to the commit message can improve the efficiency of our investigations and searches involving the commit messages, and will also encourage us to avoid SCM ‘sins’ like bundling logically-unrelated changes together in a single commit.
Commit messages and our ‘other’ tools
A very important topic is the integration of work in Git with other systems higher up in the tool stack. An obvious use case is where commits are associated with issues being tracked in a tool like Atlassian JIRA. It is then likely that part of our commit message policy is to include the ID of a relevant JIRA issue.
Atlassian has explored this area further with the concept of Atlassian Smart Commits. This makes it possible to embed commands into commit messages which can, for example, automatically transition a JIRA item when the commit is processed in the Atlassian tool suite.
We can greatly enhance the way we collaborate and document an evolving software asset by using code review tools like Gerrit Code Review or Atlassian Stash. In the case of Gerrit Code Review there is a 1:1 relationship between commits and review tickets, and the commit message itself forms part of the work to be reviewed.
A further good way of enforcing/automating policies with regard to commit messages is to use Git ‘hooks’ (small scripts which are triggered to run when specific events happen). These can be used to scan commit messages and if necessary block the commit process on the client and/or block the pushing of commits to a remote repo on the server side.
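To make the client-side case concrete, here is a sketch of a commit-msg hook installed in a throwaway repository. The two policies it enforces (a subject of at most 50 characters, and an issue key of the form PROJ-123 somewhere in the message) are assumptions chosen for illustration, as are the identity and commit subjects.

```shell
# Throwaway repository so the hook can be tried safely
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"        # illustrative identity
git config user.email "dev@example.com"

# Git runs .git/hooks/commit-msg with the path of the file holding
# the proposed message as its first argument
mkdir -p .git/hooks
cat > .git/hooks/commit-msg <<'EOF'
#!/bin/sh
subject=$(head -n 1 "$1")

# Assumed policy 1: subject of 50 characters or fewer
if [ "${#subject}" -gt 50 ]; then
    echo "commit-msg: subject is longer than 50 characters" >&2
    exit 1
fi

# Assumed policy 2: the message mentions an issue key such as PROJ-123
if ! grep -Eq '[A-Z]+-[0-9]+' "$1"; then
    echo "commit-msg: no issue key (e.g. PROJ-123) found" >&2
    exit 1
fi
exit 0
EOF
chmod +x .git/hooks/commit-msg

# Blocked: no issue key, so the hook exits non-zero and no commit is made
git commit -q --allow-empty -m "misc changes" || echo "rejected by hook"

# Accepted: short subject containing an issue key
git commit -q --allow-empty -m "PROJ-123 Fix null check in parser"
git log --oneline
```

A client-side hook like this lives in each developer's own clone and can be bypassed, which is one reason to pair it with a server-side check on push.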
Be careful what you commit…
Having good policies in place for the content of commit messages is particularly important when you consider that Git commit objects, including the message, become part of the immutable DAG (Directed Acyclic Graph) of Git’s database.
This means that rectifying a situation where historical commit messages have undesired content is far from trivial, and requires careful use of tools like git filter-branch to completely rebuild the DAG.
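As an illustration of what such a rewrite can look like, the sketch below uses the --msg-filter option of git filter-branch in a throwaway repository. The commit subjects, the fake password, and the sed expression are invented for the example; a real clean-up would need a pattern matched to the actual sensitive content.

```shell
# Throwaway repository: never experiment with history rewriting on real work
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"        # illustrative identity
git config user.email "dev@example.com"

git commit -q --allow-empty -m "Add login (password: hunter2)"
git commit -q --allow-empty -m "Fix ABC issue"

# Pass every commit message on every ref through sed.
# The environment variable only silences the advisory that newer Git
# versions print before running filter-branch.
FILTER_BRANCH_SQUELCH_WARNING=1 \
git filter-branch -f --msg-filter \
    'sed "s/password: [^)]*/password: REDACTED/"' -- --all

git log --format=%s
```

Every rewritten commit gets a new hash, so anyone with an existing clone has to re-synchronise. The original commits also stay reachable through the refs/original backup that filter-branch keeps, until that backup is deleted.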
There is an interesting discussion of this topic in the GitMinutes episode with Roberto Tyley. The guest interviewee talks about cases where the Git history of large in-house projects has had to be rebuilt to strip out sensitive information (emails, passwords) from all commit messages prior to the project becoming open-source and publicly available, along with discussing the challenges inherent in this operation.
So given this potential difficulty of changing old commit messages, what happens if my commit and message for bugfix ABC turns out to be incorrect in some way, and I want that knowledge to be visible to future developers?
The Git ‘sticky note’…
Luckily, there is a lightweight solution. In cases where you need to associate some extra information with an old commit, without changing the commit itself, you can turn to git notes.
I can use this feature to add further information to a historical commit by typing:
git notes add
which will open the default editor. If I wish to add something else later I can use:
git notes append
I can also attribute different categories to my notes and configure Git to automatically push and fetch notes information to and from remote repositories.
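The sketch below walks through these steps in a throwaway repository. It uses -m to supply the note text directly rather than opening the editor; the note wording, the identity, and the category name "review" are made up for illustration.

```shell
# Throwaway repository so the commands can be tried safely
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"        # illustrative identity
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Fix ABC issue"

# Attach a note to an existing commit; the commit hash does not change
git notes add -m "Fix is incomplete, see follow-up work" HEAD
git notes append -m "Follow-up merged" HEAD
git notes show HEAD

# A separate notes ref acts as a category ('review' is a hypothetical name)
git notes --ref=review add -m "Reviewed by QA" HEAD
git log -1 --notes=review

# Notes are not transferred by default. With a remote named origin they
# could be shared explicitly, for example:
#   git push origin 'refs/notes/*'
#   git config --add remote.origin.fetch '+refs/notes/*:refs/notes/*'
```

Notes in the default category show up beneath the commit in git log, so the correction is visible to anyone who reads the history later.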
A final thought
There we have our brief exploration of the Git commit message. There’s a lot more to the humble commit message than is immediately apparent – it’s definitely something that deserves some consideration.
So next time you find yourself typing:
git commit -m "misc changes to a bunch of files"
Just remember that future developers may have a frown on their face when they see it!