Mono vs multi repo

I’m at the start of a large, long-running project and am wondering what the best approach to storing the code would be. Based on my past experience with a monolithic repo I lean towards a multi repo approach, and I feel that some of the monolith issues I faced would show up again in a mono repo.
I noticed that Aurelia vNext is moving to a mono repo setup, and more and more people seem to be taking the same approach. What is the motivation for Aurelia to move to a mono repo?

The biggest drawbacks that I see when going mono repo are:

  • Large clone containing code you don’t need for development or at least a challenge to only get subsets of code
  • Eventual performance on merges, fetch, pull
  • Looking at project history due to amount of commits, trying to find something specific
  • Challenge to maintain CI&CD flow, need for lots of filters to prevent running tests on untouched (unrelated) code.
  • Inability to do fine grained access control

The drawbacks I see in a multi repo setup are:

  • Keeping shared deps up to date like coding standards (seems to be fixable with tooling)
  • Finding all repos needed to start developing (meta repo seems to solve this issue)
  • Debugging could be challenging when working with published (minified) packages
  • Releasing dependent code to production (keeping track of what has to go first)

What do you think? Why would you choose one over the other?

Let me share some of my findings in working with vNext monorepo for the past 7 months or so.

Your concerns

  • Large clone containing code you don’t need for development or at least a challenge to only get subsets of code
  • Eventual performance on merges, fetch, pull

This doesn’t apply to vNext (we don’t have the commit traffic that a big commercial project would have), but it should generally be a non-issue unless the dev team works on a virtual desktop infrastructure with disposable dev environments that suffer from bandwidth and load problems. I regularly clone and update some very busy repos, such as Microsoft’s TypeScript, and even if I have to wait a few seconds, it doesn’t really bother me.

  • Looking at project history due to amount of commits, trying to find something specific

It does get hard to find stuff, especially if you don’t know exactly what you’re looking for, but that is partly because GitHub doesn’t have a very good UI for browsing commit history.
It also depends on your workflow. Generally it’s easier to find things via issues/PRs and grab the specific commit from those.

  • Challenge to maintain CI&CD flow, need for lots of filters to prevent running tests on untouched (unrelated) code.

Doing proper CI/CD always requires someone on your team who knows this stuff well. I like to think I do, and I set up a fairly advanced CI/CD workflow with CircleCI.
Filters were the least of my concerns, because we simply run all unit/integration tests on every commit. That is partly because we want to know the total combined coverage, but also because you can never predict with absolute certainty that a change in one package does not affect any of the others. The automation is there to do that verification for you, so you don’t need to think or worry about it.

The very expensive tests (e2e) are simply filtered to run only after a merge to master.
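To illustrate why path filters alone are risky, here is a minimal sketch of what a "run only affected tests" filter would have to compute. It is not what the vNext CI does (we run everything); the `packages/<name>/` layout, the function names and the example graph are all assumptions made up for this example.

```typescript
// Hypothetical workspace graph: package name -> workspace packages it depends on.
type DependencyGraph = Record<string, string[]>;

// Assumes a "packages/<name>/..." layout; files elsewhere map to no package.
function packageOf(filePath: string): string | undefined {
  const match = /^packages\/([^/]+)\//.exec(filePath);
  return match ? match[1] : undefined;
}

// Returns the changed packages plus everything that depends on them,
// directly or transitively, sorted by name.
function affectedPackages(changedFiles: string[], graph: DependencyGraph): string[] {
  const affected = new Set<string>();
  const queue: string[] = [];

  // Seed the queue with the packages that own the changed files.
  for (const file of changedFiles) {
    const pkg = packageOf(file);
    if (
      pkg !== undefined &&
      Object.prototype.hasOwnProperty.call(graph, pkg) &&
      !affected.has(pkg)
    ) {
      affected.add(pkg);
      queue.push(pkg);
    }
  }

  // Walk "upwards": any package that depends on an affected one is affected too.
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const name of Object.keys(graph)) {
      const deps = graph[name] || [];
      if (deps.indexOf(current) !== -1 && !affected.has(name)) {
        affected.add(name);
        queue.push(name);
      }
    }
  }

  return Array.from(affected).sort();
}

// Made-up example graph, not the real vNext package structure.
const exampleGraph: DependencyGraph = {
  core: [],
  ui: ["core"],
  router: ["core"],
  app: ["ui", "router"],
  docs: [],
};

console.log(affectedPackages(["packages/ui/src/button.ts"], exampleGraph));
console.log(affectedPackages(["packages/core/src/index.ts"], exampleGraph));
```

In this made-up graph a change in `ui` also pulls in `app`, and a change in `core` pulls in everything except `docs`. A filter that only looks at the touched directory would miss those dependents, and even this sketch only knows about dependencies that are declared in the graph.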

  • Inability to do fine grained access control

If you need this, then that’s a valid hard blocker: GitHub grants access per repository, so there’s no way to restrict access to only part of one.

My concerns

  • The tooling is not very mature. TypeScript has only supported this setup “natively” for a few months, and there have been a significant number of small issues hampering productivity. The most common one is the language service bugging out and not resolving things correctly, which frequently forces a full clean+rebuild and a restart of VS Code. I’m sure it will be solved at some point, but so far it has been quite annoying to hit this several times a day during a large refactor.

  • Overall complexity. Although we have a pretty solid setup for vNext right now, including automated nightly releases and all that sort of goodness, it cost me many, many weekends of tedious trial and error to get everything just right. Every project is different. You will run into issues that just wouldn’t happen with a multi repo setup: package reference discrepancies, relative references being resolved wrongly, subtle problems in multi-level tsconfig/tslint hierarchies, and so on. I could name dozens more. Everything can be solved, but it will be a time sink to get it all working smoothly.
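As one example of the kind of check that catches package reference discrepancies early, here is a rough sketch that reports dependencies declared with different version ranges across packages. The `Manifest` shape, the function name and the sample data are assumptions for illustration only; in a real repo you would read the manifests from each package’s `package.json`.

```typescript
// Simplified, hypothetical view of a package.json; only the fields used here.
interface Manifest {
  name: string;
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

// Returns every dependency that is declared with more than one distinct
// version range across the given manifests, with the ranges sorted.
function findVersionMismatches(manifests: Manifest[]): Record<string, string[]> {
  const rangesByDep = new Map<string, Set<string>>();

  // Collect the declared range of each dependency in one dependency section.
  const record = (deps: Record<string, string> | undefined): void => {
    if (deps === undefined) return;
    for (const dep of Object.keys(deps)) {
      const range = deps[dep];
      if (range === undefined) continue;
      let ranges = rangesByDep.get(dep);
      if (ranges === undefined) {
        ranges = new Set<string>();
        rangesByDep.set(dep, ranges);
      }
      ranges.add(range);
    }
  };

  for (const manifest of manifests) {
    record(manifest.dependencies);
    record(manifest.devDependencies);
  }

  // Keep only the dependencies that were seen with more than one range.
  const mismatches: Record<string, string[]> = {};
  rangesByDep.forEach((ranges, dep) => {
    if (ranges.size > 1) {
      mismatches[dep] = Array.from(ranges).sort();
    }
  });
  return mismatches;
}

// Made-up sample data: the two packages disagree on the typescript range.
const sampleManifests: Manifest[] = [
  { name: "pkg-a", devDependencies: { typescript: "^3.1.0", tslib: "^1.9.0" } },
  { name: "pkg-b", devDependencies: { typescript: "^3.0.0", tslib: "^1.9.0" } },
];

console.log(findVersionMismatches(sampleManifests));
```

The comparison is purely textual, so two ranges that happen to resolve to the same version would still be reported. For a consistency check that is usually what you want anyway.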

Conclusion

Despite all the drawbacks (some of which are pretty major), I am still glad we went for this approach and I still think it is worth it, especially in the long term, because:

  • Only one place to manage issues
  • Only one repo to clone
  • Cross-package integration testing is a breeze (this is an absolute hell in a multi repo setup) and proper testing is very important
  • Easy to keep versions consistent and releases compatible. I recommend having only one version number across all packages: bump them all even if only one package changed. That lets you tell users “all packages must be the same version” and guarantees compatibility.
  • Cross-package refactoring is easy and relatively safe (as opposed to tedious to outright dangerous in a multi repo setup)
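The single version number recommended above can be sketched roughly as follows. This is an illustration under assumptions, not the actual vNext release script: the `PackageManifest` shape, the function name and the `@example/*` packages are invented, and pinning internal dependencies to the exact new version is just one possible policy.

```typescript
// Simplified, hypothetical view of a package.json; only the fields used here.
interface PackageManifest {
  name: string;
  version: string;
  dependencies?: Record<string, string>;
}

// Gives every package the same new version and pins dependencies on other
// workspace packages to that exact version. External dependencies are kept.
function bumpAllInLockstep(
  manifests: PackageManifest[],
  newVersion: string
): PackageManifest[] {
  const workspaceNames = new Set<string>(manifests.map((m) => m.name));

  return manifests.map((m) => {
    const current = m.dependencies || {};
    const dependencies: Record<string, string> = {};
    for (const dep of Object.keys(current)) {
      const range = current[dep];
      if (range === undefined) continue;
      // Internal packages follow the shared version; others stay untouched.
      dependencies[dep] = workspaceNames.has(dep) ? newVersion : range;
    }
    return { name: m.name, version: newVersion, dependencies };
  });
}

// Made-up workspace: only "@example/ui" actually changed, but both get bumped.
const before: PackageManifest[] = [
  { name: "@example/core", version: "0.3.0" },
  {
    name: "@example/ui",
    version: "0.3.0",
    dependencies: { "@example/core": "0.3.0", lodash: "^4.17.0" },
  },
];

console.log(JSON.stringify(bumpAllInLockstep(before, "0.4.0"), null, 2));
```

The function returns new objects instead of mutating its input, so a release script could diff the result against the manifests on disk before writing anything.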

Those are all the points I can think of right now. So: it’s an investment, but I believe it pays off.

But you need someone who is comfortable with all the intricacies and willing to figure them out, because it’s certainly not plug-and-play!

@fkleuver thank you for sharing your insights, this is certainly helpful in making a choice!