Stacked pull requests to make large code changes reviewable

Stacked Pull Requests

Stacked pull requests allow you to split large pull requests into smaller more digestible pieces. This way, the reviewers can easier follow the code and give better feedback.

But what exactly are stacked pull requests, why do we need them, and how do we create stacked pull requests?

Code reviews are hard but worth it. However, this equation changes when the code change becomes too large. Research confirms this: the more code to review, the less effective the review.

So, we should aim to review small code changes.

In this blog post, I will show you a technique, also referred to as stacked pull requests, that I teach in my code review workshops. This technique helps engineers split code changes into smaller and more coherent pieces.

In return, reviewers have an easier time understanding the code under change and therefore give better feedback in a shorter time.

So, are you ready? Good!

Let’s talk about stacked pull requests

Stacked code reviews work with most of the common tools that are built on Git and allow you to review code, like GitHub, GitLab, and Bitbucket.

Let’s start by imagining we are working on a new feature. We want to add the functionality to add and delete items to and from a shopping card.

Let’s also imagine that, normally, we use a workflow based on feature branches for development. This means that we branch off from the main branch to do our work in isolation on a dedicated feature branch. After we finish, we merge this work back into the main branch.

Note that stacked pull requests are also useful during trunk-based development. Always if we have made too many changes (more than, say, 400 lines of code) and we want to ease the burden on the reviewers.

But back to the example. Let’s say we start development on the feature branch. We commit frequently, and after some time, we open a pull request comprising all our changes.

Working with feature branches.

Working with feature branches.

But, as you can imagine, this is a rather large code change. Most likely, the change will comprise more than 400 lines of code. And 400 lines are known to be an upper bound for the maximum amount of code review change size.

So, if we open a pull request for this change, the only option the reviewer has is to use our commit history to review it in smaller chunks.

Using the commits to review the code, the reviewer can follow our development process and also reviews smaller changes. But commits reflect the developer’s perspective and, most of the time, aren’t a great way to help the reviewer understand the code unless the code author is very strict and skilled in making coherent, self-contained, and good-to-follow commits.

So, we are not going to do that. Instead, we will use stacked pull requests. You will find a comparison to commit-based reviews at the end.

So, what are stacked pull requests?

When we use stacked pull requests, we create several individual pull requests based on chunks of the work that make for great reviewable units.

So, instead of coupling our code reviews to our commits, we take on the perspective of the reviewer. To do so, we first think about how we could break up our larger feature into more coherent and easier-to-understand work items.

After we have identified how to split the feature into several work items, we can then create dedicated branches for each of the work items and stack them on top of each other. Similarly, our pull requests will only highlight the changes that happened on each of the stacked branches.

This way, the reviewer reviews smaller changes and can easily identify how the work items are built upon each other. This approach also helps to keep track of the bigger picture, i.e., the whole feature.

Let’s look at an example:

To split our features “adding and removing items from and to a shopping card” into several work items, we can, for example, decide to go with an approach reflecting our system’s architecture.

Based on this decision, we start by making changes to the database layer. Then, we implement the business logic. And finally, we made the necessary changes to the UI.

Working with Stacked Pull Requests

We begin our work by creating a feature branch for the first set of changes (let’s call it fb-database, for indicating that we work on this branch on the database layer for the feature).

The git command we need for this is the git command to create a branch and check it out:

git checkout -b fb-database

git checkout -b fb-database

As you can see from the screenshot, we are on the development branch before we create the *fb-database *branch. With this command, we now create the fb-database branch and also switch to it.

Then, we do our work and make several frequent commits until we reach a state where we think the database layer is in a good shape.

git add [somefilesthatchanged anotherfile yetanotherfile]

Now, we push create a remote branch for fb-database and push our changes to it.

git push --set-upstream origin fb-database

Then, we open a pull request just for those coherent database-changes.

Opening the first pull request for this review stack.

Opening the first pull request for this review stack.

Tip: An even better approach is to open a “Work in progress” pull request as soon as we create the branch. This way, our colleagues can follow our progress and get already a heads-up on our work.

Opening this pull request will look exactly like we would do for any pull request. We open a pull request comparing the code of branch fb-database to the code on the development branch.

This means that reviewers see the changes that we made in comparison to what happened in the development branch. Right now, there is no difference from the example before (except that there are fewer changes that happened in this pull request).

stacked pull request pr1 1

First stacked pull request – similar to feature branch-based workflow

Creating the first stacked pull request

After we finish our first work item, we continue to work on the next work item for our feature. But instead of creating another feature branch based on the development branch, we create a branch that is based on the previous branch fb-database.

git checkout command

This means that all the changes we made on the branch fb-database, are already part of the code on this new feature branch, i.e. fb-businesslogic. It also means that we can continue working without being blocked, while others have the time to review our previous changes.

So, now, we are working on the changes to the business logic. And again, we open a pull request.

But now, this pull request isn’t based on the development branch. Instead, it is based on the branch fb-database.

Opening the second pull request command

Opening the second pull request – this one is based in fb-database

This highlights that those changes are connected, and, as a result, the reviewer sees what changed between the fb-database and the fb-businesslogic branch.

stacked pull request business


Two stacked pull requests based on two consecutive branched (fb-database and fb-businesslogic)

So, now we have implemented the UI changes. And once again, we branch off our previous feature branch (fb-businesslogic) to make those changes. We also open another pull request. Again, the PR points to the previous feature branch fb-businesslogic, instead of to the development branch.

Final stacked pull requests database business ui

Three stacked pull requests based on three consecutive branched (*fb-database, fb-businesslogic, fb-ui*)

Reviewing Stacked Pull Requests

So, how do we review those stacked pull requests?

The idea is that the reviewer starts with the pull request that was opened first. Then, they compare the code on the development branch with the code on the fb-database branch. These are also the first changes we made, and all other changes build upon those.

As a next step, the reviewer reviews the pull request containing the business logic. And finally, the reviewer reviews the pull request with the UI changes.

So the review direction is: fb-database -> fb-businesslogic -> fb-ui

Merging stacked Pull Requests

It’s important that reviewers do not merge code changes right after they review them. All stacked pull requests should be treated as Work In Progress pull requests until all stacked pull requests have been reviewed. This is because, we must start merging from the top, i.e., the last opened pull request. Doing otherwise will result in a state where we cannot merge our pull requests anymore.

In our example, once the reviewers reviewed all pull requests, we would merge the UI changes into the business logic changes. Then, we merge the business logic changes into the database changes. Finally, we merge the changes on the branch fb-database (which now comprises all changes) into the development branch.

So, our merge direction is: fb-ui -> fb-businesslogic -> fb- database

Merge directions of stacked pull requests

The merge direction of stacked pull requests follows the orange arrows.

Handling Feedback and Keeping Branches in Sync

One drawback of the stacked pull request approach is the overhead of branches that we need to keep in sync.

For example, if a reviewer comments on our pull request handling with the database changes, we would make the requested changes on branch fb-database. But this means that we have to update and sync the fb-businesslogic and the fb-ui branch.

Let’s imagine we changed the fb-database branch after receiving some feedback. Now, we have to update the fb-businesslogic branch. We have at least two options to sync our branches:

  1. merge the two branches (resulting in 1 merge commit on branch fb-businesslogic)
  2. rebasing the fb-businesslogic branch on the changes of branch fb-database

For both options, we first have to commit and push the changes to the fb-database branch. Then, we switch to the fb-businesslogic branch, which we want to “know” of the changes on fb-database.

git checkout fb-businesslogic

Sync using git merge:

On the fb-bussinesslogic branch, we use the merge command. Then, we push the merge commit to the remote fb-database branch so it’s visible in the pull request:

git merge fb-database 
git push

That’s it. fb-database and fb-businesslogic are in sync again. Now, we can do the same for fb-businesslogic and fb-ui. If we have merge conflicts, we have to resolve them first and then commit and push our changes.

Sync using the git rebase:

We can achieve a similar result by using the git rebase command. So, after we applied our changes to fb-database and switch to the branch fb-businesslogic, we use rebase to sync the changes.

If there is no merge conflict, we are fine. Otherwise, we have to resolve the merge conflict, and then continue the rebase using the command:

git rebase fb-database 
[git rebase --continue] #optional if you have merge conflicts
git push

All commands to apply changes

Complete sync commands starting from applying changes to fb-database

Summary

We looked at a technique called stacked pull requests that can help code reviewers understand larger code changes more easily. Stackedpull requests allow you to break up large code changes and to create dependent pull requests that improve the code review process for the reviewer.

Even if the process looks tedious or scary to you, I hope you take the time to try it out on your own. Only that way you can see whether this technique makes sense for you.

Maybe you need to do it a couple of times (2-3 pull requests) before you get the hang of it. But after that, you will be used to this workflow and have integrated it into your daily routine.

Now, before I let you go, let’s compare stacked pull requests to commit-based code reviews.

Comparison to commit-based code reviews

Most tools, such as GitHub, GitLab, and Bitbucket, provide you with a list of commits that happened within a pull request.

This is handy because reviewers can click on each of the commits and see what changed just within that commit.

Review tree

In theory, reviewing the code changes based on commits means that the reviewers review smaller code changes and also that they can follow the progress of the developer.

The problem of commit-based reviews

Unfortunately, most of the time commits represent the developer’s progress and aren’t meant to ease the review. This means that in some cases, the reviewer might look at changes that are undone in a subsequent commit. Also, often commits tangle together several unrelated changes.

Some commits might not even compile, or they aren’t tested. Well, from experience, I can say that many developers have a hard time committing coherent, self-contained, and working commits. Commits often just represent progress points of the developer (like breakpoints before leaving the office or going for lunch).

Still, reviewing based on commits is frequently done because when we face large pull requests, we might clutch any straw we can get.

But even if our commits represent great reviewable items, like containing all the changes we made on each individual branch, commit-based reviews still have a main drawback to the stacked pull-request approach:

Reviewers cannot approve, reject, or ask for changes on the level of individual commits. Even commenting on code changes of a singular commit is handled differently from commenting on the whole pull request. Leaving comments on a particular commit might be hard to trace for the code author and reviewer, as they do not show up in the pull request discussion.

Related work

Well, I hope I have given you a good summary of the stacked pull request approach. This should serve as another mental model and workflow to deal with code reviews within GitHub, GitLab, or Bitbucket.

Finally, I want to highlight some other blog posts that discuss the topic of stacked pull requests:
Unfortunately, the post of Grayson Koonce is not available anymore. Yet, here is a short comparison to the approach I explained: Grayson highlights a workflow in which the developer isn’t working on several branches as they do the work.

Instead, they create the stacked pull requests in retrospective by unstaging all committed changes from a single branch and using git add --patch command to divide the work.

Tip: When you have large code reviews, it can be a great practice exercise that I also sometimes use in my workshops to break up large code changes with this technique.
Note, though, that it is an error-prone and daunting task to manually separate the changes. Still, you can learn a lot about single code coherence, the single responsibility principle, and your habit of mingling and tangling unrelated changes together.

Finally, git add patch is a wonderful tool when you are dealing with relatively small and local changes that you want to separate. I use it on a regular basis.

A very similar blog post describes this technique when working with the Phabricator code review tool.

On StackOverflow, several people discuss how to generate dependent pull requests, and there are a couple of good suggestions on how to overcome this limitation in GitHub.

Over the last few months, also tools appeared that automate or assist with the creation of dedicated review branches. For example, have a look at ghStack.


BTW, next to my code review workshops, I also run a newsletter providing you with awesome code review tips every other week. If that sounds interesting, get your bi-weekly dose of code review best practices below!

This article first appeared on https://awesomecodereviews.com. Last updated: January 17, 2023

Profile picture of Michaela Greiler

Written by Dr. Michaela Greiler who is obsessed with making code reviews your superpower. Learn more about her workshops.