On good git pull requests
I love git for version control. Yeah it's a little crazy and the command line is a little intimidating.
Once you know what you're doing, though, there's no better way! There is so much flexibility and power there.
But, believe it or not, learning the git command line is the easy part.
Then comes the real challenge of satisfying the people reviewing your code.
And the first thing you may wonder is, "Why should I care if my code makes someone happy?"
You should always try to make it easier for someone to review your code. This applies to your teammates and to those you work with in the open source community. Making life easier is the whole point of working together!
But think for a second what will happen if they DON'T like reviewing your code..
If everyone hates reviewing your code, then your pull requests may sit and languish for days. With every day that passes, your eventual merge back into master will be more difficult.
While you are waiting for someone to review your pull request, you are also working in a new branch, on another feature.
Your co-workers are also working in their own branches. You're all working on the same files. Welcome to merge hell! You don't have any plans tonight do you?
If your code were a pleasure to review, though, your code could get in master quickly. You could avoid all this waiting and rotting and merging and conflicts. And your co-workers would like you better.
But it's easy to feel like a ping pong ball when others review your code. All developers have an opinion on the one true way to do anything. It's only natural that they think it the best and only way to do things.
I've had people complain that my PRs:
- had too many commits
- didn't have enough commits
- that each PR should have one commit
- that my commit messages were too long
- that my commit messages were too short
- that they never read commit messages anyway
- that I didn't have enough tests
- that my tests were pointless
- that I didn't lint
- that I didn't rebase enough
- that I shouldn't have rebased to preserve history
- that I didn't comment enough
- that comments are useless
Every single person thought they were right in the absolute sense. There is a right and a wrong and I was doing it wrong.
I felt like Schrodinger's developer, right and wrong about everything at the same time!
I am writing this post, to give some small advice on how to create an effective pull request. Follow my advice, and you will avoid some amount of pain during your journey.
How PRs are reviewed on GitHub
For this article, I will make the assumption that code review will take place on GitHub.
There are two types of developers during code review:
1. Show me the commits
2. Show me the changed files
GitHub Review Commits
GitHub Review Files Changed
Commits make more sense and the argument is pretty straightforward.
One day in 2005, Linus Torvalds sat high on a mountain. He thought that commits would be a swell way to document software development.
Each commit is a unit of work associated with a few files and a comment explaining the purpose of the commit.
To me, this is using git as intended. To review code any other way, you're kind of going against the design of git itself.
You won't go to prison, it's not a capital crime, but that is how I see it.
Then there are those who like a long list of changed files, with no rhyme or reason to them. This is not a high crime or anything. But as we will see, it is inferior to reviewing commits.
If you are new to git, then you will likely prefer the changed files way. I know that I did, when I started working with git.
And there was a reason I preferred it this way. I didn't know enough about git.
Back then, to me, git was an esoteric ritual for neckbeard magicians.
Once I committed my changes, I had no idea how to change that commit. In fact, I would have been afraid to change anything concerning git commits.
But did you know that you can delete commits? Change commit messages? Re-order commits, combine them, or split them out into more commits?
You can do all this by something called rebasing. Many people prefer reviewing commits from the
Changed Files tab, because they don't know how to rebase. We'll get into the specifics in a bit, but for now think of it as re-writing your commit history, because that's exactly what it is.
I am the type of developer that once I get something working, I like to commit my changes before I break it. When things go south and all hell breaks loose, I can reset my local changes back to the commit that was working:
git reset --hard
Then I can try again from that working copy. But when you don't know how to manage your git history, these tiny commits become problematic.
Think of this situation. You write some code and add a few files including
XYZ.b. You're happy with that so you commit it.
Later, you wake up in a cold sweat. You realize that XYZ was the dumbest thing you've ever done. Instead, you decide to create
LOL.b and the need for
XYZ.b is gone.
So, in a new commit, you delete
XYZ.b and add
But now you have a problem. You're working on a feature branch (or at least you better be to avoid the pits of hell). And now you have two commits that reference
XYZ.b which is now dead code.
You should rebase this code out of existence and we will talk about doing that in a minute.
There is an argument for leaving that code in your history. That you should want to see the history and the thought process of everything done.
I get that, but in this case it will make your pull request harder to read. It's not like we created
XYZ.b in March and decided to delete it in October.
We created this file in this branch and removed it in this branch. At this point, it's noise for the reviewer. They should never see this and have to do mental gymnastics around what happened.
And this is exactly why I preferred the
Files changed tab in GitHub when reviewing code. It prevented the viewer from seeing all this cruft.
The Files changed tab takes a diff between the end result and the branch you are merging into. And since
XYZ.b does not exist in master and does not exist in the end result, they never see it.
We had group code reviews with the whole team. When I started on this team, I explained my code from the commits tab. There would be all this code where I would have to say, "Oh but this file doesn't exist anymore. Or ignore this, it all changed in a later commit."
It made the code so much harder to follow for the rest of the team. They complained and I listened. They said, "You should rebase your commits before making a pull request."
So, being a good citizen I did what any decent person would do. I googled "git rebase" and scratched my head for a while.
Why is it called rebasing?
It makes me think of freebasing or about how git has 50 million commands with confusing names.
If they had named it git rewrite instead, then everyone would use it!
So let's say that we have a bunch of commits that are hard to follow. We created files. We removed files. We wrote code and then changed our minds. We have commit messages like
WIP. And all this nonsense is in our commit history.
To simplify things, let's imagine we have 4 commits. They're at the top of our history and they are ridiculous.
We want to squash them into one commit with a clear and concise message. We should rebase, it was born for this!
- Select the commit that's below the 4 commits that we hate
- Right-click on Rebase children interactively
SourceTree Interactive Rebase
Then you will see a dialog with the 4 bad commits. In this dialog you can re-order the commits, squash them, delete them and amend commit messages. So, let's squash.
SourceTree Rebase Dialog
- Select the top commit
- Click squash with previous 3 times
- Click Edit message
- Enter an inspirational commit message
- Click Ok in the message dialog
- Click Ok in the rebase dialog
Wallah! 4 confusing commits become 1 concise commit. All you have to do now is force push to your feature branch. Yes, with rebase you do have to force push. Don't do this to
master unless you know what you're doing..Actually, don't do this to
master at all..
In the rebase dialog, you could have also used the arrows to move the commits up and down in the history.
Here's an example of why you may do this.
You commit to file
abc.js. Later, you commit to
- Start a rebase that contains these two commits.
- Drag the last commit for
abc.jsto directly above the first commit for
- Squash with previous
- Edit message
- Update your commit message
SourceTree Edit Commit Message
- Click Ok
- Click Ok
- Force push
And be sure to clean up your commit messages. git will add a bunch of text about squashed commits and old commit messages, but I like to get rid of that stuff.
Deleting a commit is simple as well.
In the main SourceTree window with the list of commits:
- Click on the commit below the one you wish to delete
- Right-click on rebase children interactively
In the dialog:
- Click the commit you want to delete
- Click the delete button
- Click Ok
- Force push to master
When writing your commit messages, follow this pattern:
A short description of the commit (less than 80 chars) <-- empty newline here - Details about the commmit which can be as long as you wish - More details going on about every last edge case you tested
The Command Line
You can also rebase from the command line. If you do, then you will need to understand how to exit
vim (git uses vim to edit commit messages). You better have a PhD in applied cryptography if you're going to exit vim without frying your hard drive.
Here are the general steps for anyone who cares:
git rebase -i HEAD~4 # where 4 is the number of commits you want to rebase
This will bring up a screen in vim where you can do the same things. Re-order, squash, delete, etc.
Git Interactive Rebase Command Line
And when you're all done, vim will ask you for a new commit message. To do that:
- Press i for interactive mode
- Edit the message to your heart's content
- Hit Escape
:wq# that's colon wq as in write quit
- Force push that mess
Preparing your PR
Rebasing is also good for bringing your feature branch up to date with
Say you are ready to create a pull request.
Then do this:
git co master git pull git co feat/branch-name git rebase master
If there are no conflicts, then you're money. You've pulled master into your feature branch. You've rebased master onto your branch, or played your commits on top of master.
It's all semantics and no one understands this shit except Linus anyway. Go with it.
If there are no conflicts then force push to your branch.
If there are conflicts then let's deal with those.
Find the first conflict and resolve it.
git add -u git rebase --continue
Keep doing that until it stops complaining. I have an alias
git next to do both those steps at once. Which is nice.
Once there are no more merge conflicts, force push your branch.
Point of order though. Sometimes the rebasing gods are not kind. At times a rebase does not turn out as intended. In fact, a rebase can turn into the stuff of nightmares.
If this happens to you and you're in a bad state then do:
git reset --hard git rebase --abort
And poof everything is back to normal.
Another point of order. Sometimes you try to rebase master onto your branch a couple times and it's a no-go. In this case, don't waste a lot of time. Abort the rebase and
merge master into your branch instead. This may not keep the history as nice as a rebase, but merging is generally easier.
And now for the grand
My entire feature branch workflow in a nutshell.
git co master git pull git co -b feat/best-branch-ever
Lots of hectic committing happens
At this point, the history is a mess, and I have lots of commit messages like
Work In Progress. But, on the bright side, we have conquered the feature.
Now, I will squash all the commits for this feature, into a single commit with a message like
Work In Progress
Then I will force push to my branch (making a backup of my work)
git reset HEAD~
Be careful with that command, you can lose work.
What this command will do is uncommit everything in the last commit. All your changes for the feature will be pending and ready to re-commit.
But now you will have the benefit of hindsight. You can group the files into the perfect logical groups for each commit with a detailed message on the why.
Once everything is re-committed, push everything to your branch.
Now you are ready to create that pull request. And this time, someone may actually read it.
The reviewer won't need to go to the
Changed Files tab anymore.
Instead, they can go to the
Commits tab. They can see the files in each logically grouped commit. With stunning commit messages and clarity.
The way God and Linus Torvalds intended.