It's worth noting that with a few clicks from the linked article, you can find that this company is (at least according to LinkedIn) a single person. Which explains how the whole company can fit into a repo. But it also makes you question how valuable the "insights" here are - like, obviously a single-person project should be using a monorepo...
Ah, so "our" company is referring to "me and Claude"? Actually. Claude might be a pretty good co-founder. Half the job is therapy conversations anyway. :)
Have you ever heard that Google is also one repo? At least it was until 2015; I don't know the story after that. So it doesn't have to be a one-person company, yet they are making billions.
I'm not making any claims about monorepo being good or bad and I'm fully aware large companies have monorepos (or at least very large repos). I'm saying that the fact it's a one-person "company" needs to be taken into account when talking about how applicable their experience is to other companies.
I am actually eagerly waiting for someone to show the real deal: actually everything in a GitHub repo, including 'artifacts', or at least those artifacts which can't be reconstructed from the repo itself.
Maybe they could be encrypted, and you could say "well, it's everything but the encryption key, which is owned in physical form by the CEO."
There's a lot of power, I think, in having everything in one place. Maybe GitHub could add the notion of private folders? But now that's ACLs... probably pushing the tool way too far.
> Maybe they could be encrypted, and you could say "well, it's everything but the encryption key, which is owned in physical form by the CEO."
I don't see how this is any different from most projects where keys and the like are kept in some form of secrets manager (AWS services, GHA Secrets, Hashi Vault, etc.).
I am a huge monorepo supporter, including "no development branches".
However there's a big difference between development and releases. You still want to be able to cut stable releases that allow for cherrypicks for example, especially so in a monorepo.
Atomic changes are mostly a lie when talking about calls across an API boundary, e.g. a frontend talking to a backend. You should always define some kind of stable API.
Can you explain this comment? Are you saying to develop directly in the main branch?
How do you manage the various time scales and complexity scales of changes? Task/project length can vary from hours to years and dependencies can range from single systems to many different systems, internal and external.
The complexity comes from releases. Suppose you have a good commit 123 where all your tests pass for some project, you cut a release, and deploy it.
Then development continues until commit 234, but your service is still at 123. Some critical bug is found, and fixed in commit 235. You can't just redeploy at 235 since the in-between may include development of new features that aren't ready, so you just cherry pick the fix to your release.
It's branches in a way, but _only_ release branches. The only valid operations are creating new releases from head, or applying cherrypicks to existing releases.
That's where tags are useful, because the only valid operation (depending on force-push controls) is creating a new tag. If your release process creates tag v0.6.0 for commit 123, your tools (including `git describe`) should show that as the most recent release, even at commit 234. If you need to cut a hotfix release for a critical bug fix, you can easily start the branch from your tag: `git switch -c hotfix/v0.6.1 v0.6.0`. Code review that branch when it is ready and tag v0.6.1 from its end result.
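Concretely, the whole flow is only a few commands (the hashes and version numbers below just mirror the example above; assume the fix landed on main as commit 235):

    # cut a release from a known-good commit on main
    git tag -a v0.6.0 123 -m "release v0.6.0"

    # later: branch the hotfix off the tag, bring in the fix, tag the result
    git switch -c hotfix/v0.6.1 v0.6.0
    git cherry-pick 235
    git tag -a v0.6.1 -m "hotfix v0.6.1"

    # tooling (and `git describe`) now reports v0.6.1 as the latest release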
Ideally you'd do the work in your hotfix branch and merge it to main from there rather than cherry-picking, but I feel that way mostly because git isn't always great at cherry-picking.
> Suppose you have a good commit 123 where all your tests pass for some project, you cut a release, and deploy it.
And you've personally done this for a larger project with significant amount of changes and a longer duration (like maybe 6 months to a year)?
I'm struggling to understand why you would eliminate branches? It would increase complexity, work and duration of projects to try to shoehorn 2 different system models into one system. Your 6 month project just shifted to a 12 to 24 month project.
The reason I said it would impact duration is the assumption that the previous version and new version of the system are all in the code at one time, managed via feature flags or something. I think I was picturing that due to other comments later in the thread, you may not be handling it that way.
Either way, I still don't understand how you can reasonably manage the complexity, or what value it brings.
Example:
main - current production - always matches exactly what is being executed in production, no differences allowed
production_qa - for testing production changes independent of the big project
production_dev_branches - for developing production changes during big project
big_project_qa_branch - tons of changes, currently being used to qa all of the interactions with this system as well as integrations to multiple other systems internal and external
big_project_dev_branches - as these get finalized and ready for qa they move to qa
Questions:
When production changes and project changes are in direct conflict, how can you possibly handle that if everyone is just committing to one branch?
How do you create a clean QA image for all of the different types of testing and ultimately business training that will need to happen for the project?
It depends a lot on the team, as different teams prefer different approaches.
In general, all new code gets added to the tip of main, your only development branch. New features can optionally be put behind feature flags. This allows developers to test and develop on the latest commit. They can enable a flag if they are interested in a particular feature. Ideally new code also comes with relevant automated tests, just to keep the quality of the branch high.
Once a feature is "sufficiently tested", whatever that may mean for your team, it can be enabled by default, but it won't be usable until deployed.
Critically, there is CI that validates every commit, _but_ deployments are not strictly performed from every commit. Release processes can be very varied.
A simple example is we decide to create a release from commit 123, which has some features enabled. You grab the code, build it, run automated tests, and generate artifacts like server binaries or assets. This is a small team with no strict SLAs, so it's okay to trust automated tests and deploy right to production. That's the end; commit 123 is live.
As another example, a more complex service may require more testing. You do the same first steps, grab commit 123, test, build, but now deploy to staging. At this point staging will be fixed to commit 123, even as development continues. A QA team can perform heavy testing, fixes are made to main and cherry picked, or the release dropped if something is very wrong. At some point the release is verified and you just promote it to production.
So development is always driven from the tip of the main branch. Features can optionally be behind flags. And releases allow for as much control as you need.
There's no rule that says you can only have one release or anything like that. You could have 1 automatic release every night if you want to.
Some points that make it work in my experience are:
1. Decent test culture. You really want to have at least some metric for which commits are good release candidates.
2. You'll need some real release management system. The common tools available like to tie together CI and CD, which is not the right way to think about it IMO (for example, your GitHub CI making a deployment).
TL;DR:
Multiple releases, use flags or configuration for the different deployments. They could all even be from the same or different commits.
I don't see how you're avoiding development branches. Surely while a change is in development the author doesn't simply push to main. Otherwise concurrent development, and any code review process—assuming you have one—would be too impractical.
So you can say that you have short-lived development branches that are always rebased on main. Along with the release branch and cherry-pick process, the workflow you describe is quite common.
They don’t do code reviews or any sort of parallel development.
They’re under the impression that “releases are complex and this is how they avoid it” but they just moved the complexity and sacrificed things like parallel work, code reviews, reverts of whole features.
Not sure what GP had in mind, but I have a few reasons:
Cherry picks are useful for fixing releases or adding changes without having to make an entirely new release. This is especially true for large monorepos which may have all sorts of changes in between. Cherry picks are a much safer way to “patch” releases without having to create an entirely new release, especially if the release process itself is long and you want to use a limited scope “emergency” one.
Atomic changes - assuming this is related to releases as well, it’s because the release process for the various systems might not be in sync. If you make a change where the frontend release that uses a new backend feature is released alongside the backend feature itself, you can get version drift issues unless everything happens in lock-step and you have strong regional isolation. Cherry picks are a way to circumvent this, but it’s better to not make these changes “atomic” in the first place.
If your monorepo compiles to one binary on one host then fine, but what do you do when one webserver runs vN, another runs v(N-1), and half the DB cluster is stuck on v(N-17)?
A monorepo only allows you to reason about the entire product as it should be. The details of how to migrate a live service atomically have little to do with how the codebase migrates atomically.
That's why I mention having real stable APIs for cross-service interaction, as you can't guarantee that all teams deploy the exact same commit everywhere at once. It is possible but I'd argue that's beyond what a monorepo provides. You can't exactly atomically update your postgres schema and JavaScript backend in one step, regardless of your repo arrangement.
Adding new APIs is always easy. Removing them not so much since other teams may not want to do a new release just to update to your new API schema.
But isn't that a self-inflicted wound then? I mean is there some reason your devs decided not to fix the DB cluster? Or did management tell you "Eh, we have other things we want to prioritize this month/quarter/year?"
This seems like simply not following the rules with having a monorepo, because the DB Cluster is not running the version in the repo.
Maybe the database upgrade from v(N-17) to v(N-16) simply takes a while, and hasn't completed yet? Or the responsible team is looking at it, but it doesn't warrant the whole company to stop shipping?
Being 17 versions behind is an extreme example, but always having everything run the latest version in the repo is impossible, if only because deployments across nodes aren't perfectly synchronised.
This is why you have active/passive setup and you don't run half-deployed code in production. Using API contracts is a weak solution, because eventually you will write a bug. It's simpler to just say "everything is running the same version" and make that happen.
Do you take down all of your projects and then bring them back up at the new version? If not, then you have times at which the change is only partially complete.
I would allow a potentially more liberal use of "atomic": if the repo state reflects the totality of what I need to get to the new version AND return to the current one, then I have all I need from a reproducibility perspective. Human actions could be allowed in this, if fully documented. I am not a purist, obviously.
Blue/green might allow you to do (approximately) atomic deploys for one service, but it doesn't allow you to do an atomic deploy of the clients of that service as well.
Why is that? In a very simple case, all services of a monorepo run on a single VM. Spin up a new VM, deploy the new code, verify, switch routing. Obviously, this doesn't work with humongous systems, but the idea can be expanded upon: make sure that components only communicate with compatible versions of other components. And don't break the database schema in a backward-incompatible way.
So yes, in theory you can always deploy sets of compatible services, but it's not really workable in practice: you either need to deploy the world on every change, or you need to have complicated logic to determine which services are compatible with which deployment sets of other services.
There's a bigger problem though: in practice there's almost always a client that you don't control, and can't switch along with your services, e.g. an old frontend loaded by a user's browser.
The only way I could read their answer as being close to correct is if the clients they're referring to are not managed by the deployment.
But (in my mind) even a front end is going to get told it is out of date/unusable and needs to be upgraded when it next attempts to interact with the service, and, at least in my mind, that means that it will have to upgrade, which isn't "atomic" in the strictest sense of the word, but it's as close as you're going to get.
Each deployment is a separate "atomic change". So if a one-file commit downstream affects 2 databases, 3 websites, and 4 APIs (made-up numbers), then that is actually 9 different independent atomic changes.
I can guarantee that your codebase is spaghetti of conditional functionality that no developer understands, and that most of those conditionals are leftovers that are no longer needed, but nobody dares to remove.
Feature flags are a good idea, but they require a lot of discipline and maintenance. In practice, they tend to be overused, and provide more negatives than positives. They're a complement, but certainly not a replacement for VCS branches, especially in monorepos.
Not OP, but I think building feature flags yourself really isn’t hard and worth doing. It’s such an important component that I wouldn’t want to depend on a third party
I agree, but it's hard to get the nuances right. It's easy to roll out a feature to half of your user base. It's a bit harder to roll a feature out to half of users who are in a certain region, and have the flag be sticky on them.
We use Unleash at work, which is open source, and it works pretty well.
I generally agree, but see some more nuance. I think feature-flagging is an overloaded term that can mean two things.
First, my philosophy is that long-lived feature branches are bad, and lead to pain and risk once complete and need to be merged.
Instead, prefer to work in small, incremental PRs that are quickly merged to main but dormant in production. This ensures the team is aware of the developing feature and cannot break your in-progress code (e.g. with a large refactor).
This usage of "feature flags" is simple enough that it's fine and maybe even preferable to build yourself. It could be as simple as env vars or a config file.
--
However, feature flagging may also refer to deploying two variants of completed code for A/B testing or just an incremental rollout. This requires the ability to expose different code paths to selected users and measure the impact.
This sort of tooling is more difficult to build. It's not impossible, but comparatively complex because it probably needs to be adjustable easily without releases (i.e. requires a persistence layer) and by non-engineers (i.e. requires an admin UI). This becomes a product, and unless it's core to your business, it's probably better to pick something off the shelf.
Something I learned later in my career is that measuring the impact is actually a separate responsibility. Product metrics should be reported on anyway, and this is merely adding the ability to tag requests or other units of work with the variants applied, and slice your reporting on it. It's probably better not to build this either, unless you have a niche requirement not served by the market.
--
These are clearly two use cases, but share the overloaded term "feature flag":
1. Maintaining unfinished code in `main` without exposing it to users, which is far superior than long-lived feature branches but requires the ability to toggle.
2. Choosing which completed features to show to users to guide your product development.
(2) is likely better served by something off the shelf. And although they're orthogonal use cases, sometimes the same tool can support both. But if you only need (1), I wouldn't invest in a complex tool that's designed to support (2)—which I think is where I agree with you :)
I like keeping old branches, but a lot of places ditch them; I've never understood why. I also dislike git squash: it means you have to make a brand new branch for your next PR, a waste of time when I should be able to pull down master / dev / main / whatever and merge it into my working branch. I guess this is another reason I prefer the forking approach of GitHub: let devs have their own sandbox and their own branches, and let them get their work done; they will PR when it's ready.
Squashing only results in a cleaner commit history if you're making a mess of the history on your branches. If you're structuring the commit history on your branches logically, squashing just throws information away.
I’m all ears for a better approach because squashing seems like a good way to preserve only useful information.
My history ends up being:
- add feature x
- linting
- add e2e tests
- formatting
- additional comments for feature
- fix broken test (ci caught this)
- update README for new feature
- linting
With a squash it can boil down to just “added feature x” with smaller changes inside the description.
If my change is small enough that it can be treated as one logical unit, that will be reviewed, merged and (hopefully not) reverted as one unit, all these follow-up commits will be amended into the original commit. There's nothing wrong with small changes containing just one commit, even if the work wasn't written or committed at one time.
Where logical commits (also called atomic commits) really shine is when you're making multiple logically distinct changes that depend on each other. E.g. "convert subsystem A to use api Y instead of deprecated api X", "remove now-unused api X", "implement feature B in api Y", "expose feature B in subsystem A". Now they can be reviewed independently, and if feature B turns out to need more work, the first commits can be merged independently (or if that's discovered after it's already merged, the last commits can be reverted independently).
If after creating (or pushing) this sequence of commits, I need to fix linting/formatting/CI, I'll put the fixes in a fixup commit for the appropriate commit and meld them using a rebase. Takes about 30s to do manually, and can be automated using tools like git-absorb. However, in reality I don't need to do this often: the breakdown of bigger tasks into logical chunks is something I already do, as it helps me to stay focused, and I add tests and run linting/formatting/etc before I commit.
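For reference, the manual version of that is roughly:

    # record the lint/CI fix as a fixup of the commit it logically belongs to
    git commit --fixup=<sha-of-target-commit>

    # fold all fixup! commits back into their targets
    git rebase -i --autosquash main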
And yes, more or less the same result can be achieved by creating multiple MRs and using squashing; but usually that's a much worse experience.
You can always take advantage of the graph structure itself. With `--first-parent` git log just shows your integration points (top level merge commits, PR merges with `--no-ff`) like `Added feature X`. `--first-parent` applies to blame, bisect, and other commands as well. When you "need" or most want linear history you have `--first-parent` and when you need the details "inside" a previous integration you can still get to them. You can preserve all information and yet focus only on the top-level information by default.
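In command form (the file path is just a placeholder):

    # one entry per integration point / merged PR
    git log --first-parent --oneline main

    # the same trick works for archaeology
    git blame --first-parent -- path/to/file
    git bisect start --first-parent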
It's just too bad not enough graphical UIs default to `--first-parent` and a drill-down like approach over cluttered "subway graphs".
Stacked diffs are the best approach. Working at a company that uses them and reading about the "pull request" workflow that everyone else subjects themselves to makes me wonder why everyone isn't using stacked diffs instead of repeating this "squash vs. not squash" debate eternally.
Every commit is reviewed individually. Every commit must have a meaningful message, no "wip fix whatever" nonsense. Every commit must pass CI. Every commit is pushed to master in order.
Not everyone develops and commits the same way and mandating squashing is a much simpler management task than training up everyone to commit in a similar manner.
Besides, they probably shouldn't make PR commits atomic, but should commit as often as needed. It's a good way to avoid losing work. This is in tension with leaving behind clean commits, and squashing resolves it.
The solution there is to make your commit history clean by rebasing it. I often end my day with a “partial changes done” commit and then the next day I’ll rebase it into several commits, or merge some of the changes into earlier commits.
Even if we squash it into main later, it’s helpful for reviewing.
At work there was only one way to test a feature, and that was to deploy it to our dev environment. The only way to deploy to dev was to check the repo into a branch, and deploy from that branch.
So one branch had 40x "Deploy to Dev" commits. And those got merged straight into the repo.
Good luck getting 100+ devs to all use the same logical commit style. And if tests fail in CI you get the inevitable "fix tests" commit in the branch, which now spams your main branch more than the meaningful changes. You could rebase the history by hand, but what's the point? You'd have to force push anyway. Squashing is the only practical method of clean history for large orgs.
Also, rebasing is just so fraught with potential errors - every month or two, the devs who were rebasing would screw up some feature branch that had work they needed on it, and would look to me to fix it for some reason. Such a time sink for so little benefit.
I eventually banned rebasing, force pushes, and mandated squash merges to main - and we magically stopped having any of these problems.
We squash, but still rebase. For us, this works quite well. As you said, rebasing needs to be done carefully... But the main history does look nice this way.
True, but there's a huge trade-off in time management.
I can spend hours OCDing over my git branch commit history.
-or-
I can spend those hours getting actual work done and squash at the end to clean up the disaster of commits I made along the way so I could easily roll back when needed.
Squash loses the commit history - all you end up with is merge merge merge
It's harder to debug as well (this 3000-line commit has a change causing the bug... best of luck finding it AND why it was changed that way in the first place).
I, myself, prefer that people tidy up their branches such that their commits are clear on intent, and then rebase into main, with a merge commit at the tip (meaning that you can see the commits AND where the PR began/ended).
"squash results in a cleaner commit history" Isn't the commit history supposed to be the history of actual commits? I have never understood why people put so much effort into falsifying git commit histories.
Here is how I think of it. When I am actively developing a feature I commit a lot. I like the granularity at that stage and typically it is for an audience of 1 (me). I push these commits up in my feature branch as a sort of backup. At this stage it is really just whatever works for your process.
When I am ready to make my PR I delete my remote feature branch and then squash the commits. I can use all my granular commit comments to write a nice verbose comment for that squashed commit. Rarely I will have more than one commit if a user story was bigger than it should be. Usually this happens when more necessary work is discovered. At this stage each larger squashed commit is a fully complete change.
The audience for these commits is everyone who comes after me to look at this code. They aren’t interested in seeing it took me 10 commits to fix a test that only fails in a GitHub action runner. They want the final change with a descriptive commit description. Also if they need to port this change to an earlier release as a hotfix they know there is a single commit to cherry pick to bring in that change. They don’t need to go through that dev commit history to track it all down.
There are several valid reasons to "falsify" commit history.
- You need to remove trash commits that appear when you need to rerun CI.
- You need to remove commits with that extra change you forgot.
- You want to perform any other kind of rebase to clean up messages.
I assume in this thread some people mean squashing from the perspective of a system like GitLab where it's done automatically, but for me squashing can mean simply running an interactive rebase (or a fixup) and leaving only important commits that provide meaningful information to the target branch.
“Falsifying” is complete hyperbole.
Git commit history is a tool and not everyone derives the same ROI from the effort of preserving it. Also squashing is pretty effortless.
I'm very fortunate to not have to use PR style forges at work (branch based, that is). Instead each commit is its own unit of code to review, test, and merge individually. I never touch branches anymore since I also use JJ locally.
People talk about "one change, everywhere, all at once." That is a great way to break production on any API change. If you have a DB and >2 nodes, you will have the old system using the old schema and the new system using the new schema unless you design for forwards-backwards compatible changes. While more obvious with a DB schema, it is true for any networked API.
At some point, you will have many teams. And one of them _will not_ be able to validate and accept some upgrade. Maybe a regression causes something only they use to break. Now the entire org is held hostage by the version needs of one team. Yes, this happens at slightly larger orgs. I've seen it many times.
And since you have to design your changes to be backwards compatible already, why not leverage a gradual roll out?
Do you update your app lock-step when AWS updates something? Or when your email service provider expands their API? No, of course not. And you don't have to lock yourself to other teams in your org for the same reason.
Monorepos are hotbeds of cross contamination and reaching beyond API boundaries. Having all the context for AI in one place is hard to beat though.
100%, this is all true and something you have to tackle eventually. Companies like this one (Kasava) can get away with it because, well, they likely don't have very many customers and it doesn't really matter. But when you're operating at a scale where you have international customers relying on your SaaS product 24/7, suddenly deploys having a few minutes of downtime matters.
This isn't to say monorepo is bad, though, but they're clearly naive about some things:
> No sync issues. No "wait, which repo has the current pricing?" No deploy coordination across three teams. Just one change, everywhere, instantly.
It's literally impossible to deploy "one change" simultaneously, even with the simplest n-tier architecture. As you mention, a DB schema is a great example. You physically cannot change a database schema and application code at the exact same time. You either have to ensure backwards compatibility or accept that there will be an outage while old application code runs against a new database, or vice-versa. And the latter works exactly up until an incident where your automated DB migration fails due to unexpected data in production, breaking the deployed code and causing a panic as on-call engineers try to determine whether to fix the migration or roll back the application code to fix the site.
To be a lot more cynical; this is clearly an AI-generated blog post by a fly-by-night OpenAI-wrapper company and I suspect they have few paying customers, if any, and they probably won't exist in 12 months. And when you have few paying customers, any engineering paradigm works, because it simply does not matter.
The only way to do a sane migration without downtime is to have the application handle both schema versions at the same time. This is easily doable with a monorepo.
I’m not sure why you made the logical leap from having all code stored in a single repo to updating/deploying code in lockstep. Where you put your code (the repo) can and should be decoupled from how you deploy changes.
> you will have the old system using the old schema and the new system using the new schema unless you design for forwards-backwards compatible changes
Of course you design changes to be backwards compatible. Even if you have a single node and have no networked APIs. Because what if you need to rollback?
> Maybe a regression causes something only they use to break. Now the entire org is held hostage by the version needs of one team.
This is an organizational issue not a tech issue. Who gives that one team the power to hold back large changes that benefit the entire org? You need a competent director or lead to say no to this kind of hostage situation. You need defined policies that balance the needs of any individual team versus the entire org. You need to talk and find a mutually accepted middle ground between teams that want new features and teams that want stability and no regressions.
The point is that the realities of not being able to deploy in lockstep erode away at a lot of the claimed benefits the monorepo gives you in being able to make a change everywhere at once.
If my code has to be backwards compatible to survive the deployment, then having the code in two different repos isn’t such a big deal, because it’ll all keep working while I update the consumer code.
The point is atomic code changes, not atomic deployments. If I want to rename some common library function, it's just a single search and replace operation in a monorepo. How do you do this with multiple repos?
> If I want to rename some common library function, it's just a single search and replace operation in a monorepo. How do you do this with multiple repos?
Multiple repos shouldn't depend on a single shared library that needs to be updated in lockstep. If they do, something has gone horribly wrong.
They do; it's just that instead of a library call it's usually a network call, which is even worse. It makes it nigh impossible to refactor your codebase in any meaningful way.
But if you need to rename an endpoint, for example, you need to route service A version Y to a compatible version of service B. After changing the endpoint, you now need to route service A version Z to a new version of service B. Am I missing something? Meaning that it doesn't truly matter whether you have 1 repo, 2 repos or 10 repos. Deployments MUST be done in sequence and there MUST be a backwards-compatible commit in between, OR you must have some mesh that's going to take care of rerouting requests for you.
You just deploy all the services at once, A/B style. Just flip to the new services once they're all deployed and make the old ones inactive, in one go. Yes, you'll probably need a somewhat central router; maybe you do this per-client or per-user or whatever makes sense.
You can do phased deployments with blue green, that's what we do. It depends on your application but ours has a natural segmentation by client. And when you roll back you just flip the active and passive again.
> This is an organizational issue not a tech issue.
It’s both. Furthermore, you _can_ solve organizational problems with tech. (Personally, I prefer solutions to problems that do not rely strictly on human competence)
We have a monorepo, we use automated code generation (openapi-generator) for API clients for each service derived from an OpenAPI.json generated by the server framework. Service client changes cascade instantly. We have a custom CI job that trawls git and figures out which projects changed (including dependencies) to compute which services need to be rebuilt/redeployed. We may just not be at scale—thank God. We're a small team.
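The core of a job like that doesn't have to be much more than this (a simplified sketch; the base ref and path layout are assumptions):

    # files changed since the merge base with main
    changed=$(git diff --name-only origin/main...HEAD)

    # map changed paths to top-level projects, e.g. services/billing/... -> services/billing
    echo "$changed" | cut -d/ -f1-2 | sort -u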
Monorepo vs multiple repos isn't really relevant here, though. It's all about how many independently deployed artifacts you have. e.g. a very simple modern SaaS app has a database, backend servers and some kind of frontend that calls the backend servers via API. These three things are all deployed independently in different physical places, which means when you deploy version N, there will be some amount of time they are interacting with version N-1 of the other components. So you either have to have a way of managing compatibility, or you accept potential downtime. It's just a physical reality of distributed systems.
> We may just not be at scale—thank God. We're a small team.
It's perfectly acceptable for newer companies and small teams to not solve these problems. If you don't have customers who care that your website might go down for a few minutes during a deploy, take advantage of that while you can. I'm not saying that out of arrogance or belittlement or anything; zero-downtime deployments and maintaining backwards compatibility have an engineering cost, and if you don't have to pay that cost, then don't! But you should at least be cognizant that it's an engineering decision you're explicitly making.
Exactly. Monorepo-enjoyers like to pretend that workspaces don't a) exist, and b) provide >90% of the benefits of a monorepo, with none of the drawbacks.
> At some point, you will have many teams. And one of them _will not_ be able to validate and accept some upgrade. Maybe a regression causes something only they use to break. Now the entire org is held hostage by the version needs of one team. Yes, this happens at slightly larger orgs. I've seen it many times.
The alternative of every service being on their own version of libraries and never updating is worse.
Atomic updates in particular are one of those things that sound good to the C-suite, but fall apart extremely badly at the lower levels.
Months-long delays on important updates, due to some large project doing extremely bad things and pushing off a minor refactor endlessly, have been the norm for me. But they're big, so they wield a lot of political power, so they get away with it every time.
Or worse, as a library owner: spending INCREDIBLE amounts of time making sure a very minor change is safe, because you can't gradually roll it out to low-risk early-adopter teams unless it's feature-flagged to hell and back. And if you missed something: roll back, write a report and say "oops" with far too many words in several meetings, spend a couple weeks triple-checking that feature flagging actually works like everyone thought (it does not, for at least 27 teams using your project), and then try again. While everyone else working on it is also stuck behind that queue.
Monorepos suck, imo. They're mostly company lock-in, because they teach most people absolutely no skills they'd need in another job (or for contributing to open source - it's a brain drain on the ecosystem), and all external skill is useless because every monorepo is a fractal snowflake of garbage.
I really have never been able to grasp how people who believe that forward-compatible data schema changes are daunting can ever survive contact with the industry at scale. It's extremely simple to not have this problem. "design for forwards-backwards compatible changes" is what every grown-up adult programmer does.
You always have this problem; that's why you have a release process for APIs.
And monorepo or not, bad software developers will always run into this issue. Most software will not have 'many teams'. Most software is written by a lot of small companies doing niche things. Big software companies with more than one team normally have release managers.
My tip: use architecture unit tests for external-facing APIs. If you are a smaller company, 24/7 doesn't have to be the thing; just communicate this to your customers. But overall, if you run SaaS software and still don't know how to do zero-downtime deployment in 2025/2026, just do whatever you are still doing, because man, come on...
I used to be against monorepos... Then I got really into Claude Code, and a monorepo makes sense for the first time in my life, specifically because of tools like Claude. I mean, technically I could open all the different repos from the parent directory I suppose, but it's much nicer in one spot. Front-end and back-end changes are always in sync this way too.
Opening Claude from the parent directory is what I do, and it seems to work pretty well, but I do like this monorepo idea so that a single commit can change things in the front end and back end together, since this is a use case that's quite common
Yeah, I used to hate it, but as I was building a new project I was like, oh man, I can't believe I'm even thinking of doing this, but it makes more sense, LOL. Instead of prompting twice, I can prompt once in one shot and it has the context of both pieces too. I guess if I ever need them to be separate I can always do that too.
Except of course the rollout will not be atomic anyway, and making changes in a single commit might lead devs to make changes without thinking about backwards compat.
Even if the rollout were atomic on the servers, you will still have old clients with cached old front ends talking to updated backends. Depending on the importance of the changes in question, you can sometimes accept breakage or force a full UI refresh. But that should be a conscious decision. It's better to support old clients at the same time as new clients, and to deprecate the old behavior and remove it over time. Likewise, if there's a critical change where you can't risk new front ends breaking when talking to old backends (what if you had to roll back), you can often deploy support for the new changes first, and activate the UI changes in a subsequent release or with a feature flag.
I think it’s better to always ask your devs to be concerned about backwards compatibility, and sometimes forwards compatibility, and to add test suites if possible to monitor for unexpected incompatible changes.
Rollout should be within a minute. Let's say you ship one thing a day and 1/3 things involve a backwards-incompatible api change. That's 1 minute of breakage per 3 days. Aka it's broken 0.02% of the time. Life is too short to worry about such things
You might have old clients for several hours, days or forever(mobile). This has to be taken into account, for example by aggressively forcing updates which can be annoying for users, especially if their hardware doesn't support updating.
I changed my biggest project to a monorepo based on the same issue. I tinker with a lot of the bleeding-edge LLM tools and it was a nightmare trying to wire them all up properly so they would look at the different bits. So I refactored it into one just to make life easier for a computer.
I've been a big fan of monorepos for a while, but like the author, not a huge fan of using e.g. yarn workspaces. React Native can get pretty pissy with hoisting. I just started putting things like implementation plans and PRDs in the repo and I'm loving it so far. It helps give AI more of the context to make good choices.
And think about what it's like for humans as well—spreading a feature over several repos with separate PRs either makes a mockery of the review process (if the PRs have to be merged in one repo to be able to test things together) or significantly increases the cognitive overhead of reviewing code.
Claude Code can actually work on multiple directories, so this is not strictly necessary! I do this when I'm working on a project whose dependencies also need to be refactored.
Seems like a limitation/assumption that is introduced by the tooling (Claude) and could also be improved in the tooling to work equally well with multiple repos.
Yeah it reads like it, and if a random AI detector (GPTZero) is to be believed it's pretty much all AI generated.
Crazy that nobody can be bothered to get rid of the obvious AI-isms "This isn't just for...", "The Challenges (And How We Handle Them)", "One PR. One review. One merge. Everything ships together." It's an immediate signal that whoever wrote this DGAF.
The obvious tell for me is when the article is packed full of 'It's not just x, it's y' statements. I am not sure why LLMs gravitated so heavily towards their current style of writing. Pre-LLMs, I can't recall seeing that much written content in that format. If I did, it was in short-form content.
I hadn't come across GPTZero before and wondered if it worked. Just testing on a sample of my blog posts (I do one each year) I got a 100% AI generated mark for a post in... 2022, and 2023. Both before AI tools were around.
Not to say this post isn't AI generated but you might want a better tool (if one exists)
Yeah, it's got a real issue with false positives. And I've tried a bunch of other tools (Sapling, ZeroGPT, a few others) and actually GPTZero was the best of the bunch. The others would miss obviously AI generated content that I'd just generated to test them.
I've had a blog post kicking around about this for a while, it's CRAZY how much more expensive AI detection is than AI generation.
In my mind content generated today with AI "tells" like the above and a general zero-calorie-feel that also trip an AI detector are very likely AI generated.
Pff the mental list of what I can’t use when I write is getting pretty big. Em dashes are done for, as are deep dives, delving, anything too enthusiastic, and Oxford commas…
A text either has value to you or it doesn’t. I don’t really understand what the level of AI involvement has to do with it. A human can produce slop, an AI can produce an insightful piece. I rely mostly on HN to tell them apart value-wise.
No, can’t say I noticed it. But I’m not a native English speaker. For me the AI transforms my poor Dunglish (Dutch-English) into perfect English. I do tell it to not sound like an American waiter though.
Yes, we're looking for some other human sharing something interesting. There is no requirement to put things out into the world. So when somebody shares something to a discussion board like HN the hope is that if I'm going to spend my time reading it, they spent the time to write it. If I wanted to read an AI response I could just ask it "Tell me about how you could organize an entire business in a monorepo".
Or honesty about the author. If it's written by ChatGPT, say that. If I start to read an article with the expectation of it being written by a human, then see something like this, I instantly check out.
> Last week, I updated our pricing limits. One JSON file. The backend started enforcing the new caps, the frontend displayed them correctly, the marketing site showed them on the pricing page, and our docs reflected the change—all from a single commit.
If you ask an AI that question, it would tell you all the ways this is a bad idea, which isn't in this article (which is one of the reasons I think this wasn't written by AI, but just formatted by it)
Human articles on HN are largely shit. I would personally prefer to see either AI articles, or human articles by experts (which we get almost none of on HN)
Agreed. Especially when a lot of people just pick out x, y, and z as if they're the definitive sign of AI, disregarding the possibility of it being normal outside of their own writing and what they read. Not to mention cultural differences. That certain characters or ways of structuring text have become more pervasive lately is a sign, yes, but it does not mean that the presence of it in a text is anything definitive towards the use of AI.
It's almost as if when you seek to find patterns, you'll find patterns, even if there are none. I think it'd benefit these kinds of people to remember the scientific "rule" of correlation does not equal causation and vice versa.
> When you ask Claude to "update the pricing page to reflect the new limits," it can...
wat. You are running the marketing page from the same repo, yet having an LLM make the updates? You have the data file available. Just read the pricing info from your config file and display it?
This post is obviously (almost insultingly) written by AI. That being said, the idea behind the post is a good one (IaC taken to an extreme). This leaves me at a really weird spot in terms of how I feel about it.
It's weird that it looks like only a small % of comments on here have caught on to the obvious LLM-ness of it all (I missed it the first go-around but on second read, you're absolutely correct).
I'm wondering once the exceedingly obvious LLM style creeps more and more into the public mind if we're going to look back at these blog posts and just cringe at how blatant they were in retrospect. The models are going to improve (and people will catch on that you can't just use vanilla output from the models as blog posts without some actual editing) and these posts will just stand out like some very sore thumbs.
It feels like intellectual dishonesty when it's not declared at the top of the article. I have no issues with AI, when the authors are honest about their usage. But if you stamp your name to an article without clear mention that LLMs wrote at least a significant piece of it, it feels dishonest and I disconnect from it.
Company website in the same repo means you can find branding material and company tone from blogs, meaning you can generate customer slides, video demos, etc.
Going further than Docs + Code, why not also store Bugs, Issues, etc.? I wonder.
I built something like this at my previous startup, Pangea [1]. Overall I think looking back on our journey I'd sign up for it again, but it's not a panacea.
Here were the downsides we ran into:
- Getting buy-in to do everything through the repo. We had our feature flags controlled via a yaml file in the repo as well, and pretty quickly people got mad at the time it took for us to update a feature flag (open MR -> merge MR -> have CI update feature flag in our envs), and optimizing that took quite a while. It then made branch invariants harder to reason about (everything in the production branch is what is in our live environments, except for feature flags). So, we moved that out of the monorepo into an actual service.
- CI time and complexity. When we started getting to around 20 services that deployed independently, GitLab started choking on the size of our CI configuration and we'd see a spinner for about 5 minutes before our pipeline even launched. Couple that with special snowflakes like the feature flag system I mentioned above, eventually it got to the point that only a few people knew exactly how rollouts edge cases worked. The juice was not worth the squeeze at that point (the juice being - "the repo is the source of truth for everything")
- Test times. We ran some e2e UI tests with Cypress that required a lot of beefy instances, and for safety we'd run them every single time. Couple that with flakiness, and you'd have a lot of red pipelines when the goal was 100% green all the time.
That being said, we got a ton of good stuff out of it too. I distinctly remember one day that I updated all but 2 of our services to run on ARM without involving service authors and our compute spend went down by 70% for that month because nobody was using the m8g spot instances, which had just been released.
Did you use turbo, buck or Bazel? Without monorepo tooling (and the blood, sweat, and tears it takes to hone them for your use cases), you start hitting all kinds of scaling limits in CI.
We had python scripts that generated GitLab CI/CD yaml [1]. Tooting my own horn here, but it was super cool to ship fairly fast for the first year or so. By the end, we had something like 5 MB of yaml, but in order for the GitLab SaaS backend to process it, it took something like 32 gigs of ram on their MergeRequestProcessor SideKiq worker.
They had to open a whole epic in order to reduce the memory usage, but I think all that work just let us continue to use GitLab as the number of services we grew increased. They recommended we use something called parent/child pipelines, but it would have been a fairly large rewrite of our logic.
I promise I only self promote when it is relevant, but this is exactly what I am building https://nimbalyst.com/ for.
We build a user-friendly way for non-technical users to interact with a repo using Claude Code. It's especially focused on markdown, giving red/green diffs on RENDERED markdown files which nobody else has. It supports developers as well, but our goal is to be much more user friendly than VSCode forks.
Internally we have been doing a lot of what they talk about here, doing our design work, business planning, and marketing with Claude Code in our main repo.
I'm curious about the author's experience with a monorepo for marketing. I've found that using static site generators with nontechnical PMs resulted in dissatisfaction, and in engineers doing work that those PMs could have handled independently in WordPress/Contentful. As a huge believer in monorepos, I'd love to hear how folks have approached incorporating non-engineers into monorepo workflows.
So the insane thing I do is I don't use worktrees. I am using multiple Claude code instances on the same project doing different things at the same time like one is editing the CSS for the login screen while another one is changing up the settings section of the project.
Yep. If the project is large enough, there are usually changes to be made that don't overlap, allowing multiple agents to work concurrently without worktrees.
For example, I can have one prompt writing Playwright tests for happy paths while another prompt is fixing a bug of duplicated rows in a table caused by a missing SQL JOIN condition.
The thing I dislike about monorepos is that people don't ship stuff. Multiple versions of numpy and torch exist within the codebase, mitigated by bazel or some other build tool, instead of building binaries and deb packages and shipping actual products with well-documented APIs so that one team never needs to actually touch another team's code to get stuff done.
The people who say polyrepos cause breakage aren't doing it right. When you depend across repos in a polyrepo setup, you should depend on specific versions of things across repos, not the git head. Also, ideally, depend on properly installed binaries, not sources.
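Concretely (package and path names are made up), that means pinning a released artifact rather than tracking a branch:

    # consume a published client at a fixed version, not the repo head
    npm install @acme/billing-client@1.4.2

    # or, if submodules are unavoidable, pin them at a tag
    git -C vendor/billing checkout v1.4.2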
That makes sense when you depend on a shared library. However, if service A depends on endpoint x in service B, then you still have to work out synchronized deployments (or have developers handle this by making multiple separate deployments).
To be fair, this problem is not solved at all by monorepos. Basically, only careful use of gRPC (and similar technology) can help solve this… and it doesn’t really solve for application layer semantics, merely wire protocol compatibility. I’m not aware of any general comprehensive and easy solution.
> However, if service A depends on endpoint x in service B, then you still have to work out synchronized deployments (or have developers handle this by making multiple separate deployments).
In a polyrepo environment, either:
- B updates their endpoint in a backward compatible fashion, making sure older stuff still works
OR
- B releases a new version of their API at /api/2.0 but keeps /api/1.0 active and working until nothing depends on it anymore, releasing deprecation messages to devs of anyone depending on 1.0
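The second option is mostly mechanical in any web framework; here's an Express-flavoured sketch (routes and payloads are made up):

    import express from "express";

    const app = express();
    const v1 = express.Router();
    const v2 = express.Router();

    // old shape stays available until nothing depends on it anymore
    v1.get("/widgets", (_req, res) => res.json({ items: [], deprecated: true }));
    // new shape lives alongside it
    v2.get("/widgets", (_req, res) => res.json({ items: [], nextCursor: null }));

    app.use("/api/1.0", v1);
    app.use("/api/2.0", v2);

    app.listen(3000);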
I leverage git submodules and avoid the same pitfalls of monorepo scale hell we had 20 years ago. Glad it works for you though. I feel like this is the path to ARR until you need to scale engineering beyond just you and your small team. The good news here is that the author has those domains segregated out as subfolders so in the future, he/she could just pull that out into its own repo if that time came.
Still averse to the monorepo though, but I understand why it's attractive.
"Conclusion
Our monorepo isn't about following a trend. It's about removing friction between things that naturally belong together, something that is critical when related context is everything.
When a feature touches the backend API, the frontend component, the documentation, and the marketing site—why should that be four repositories, four PRs, four merge coordination meetings?
The monorepo isn't a constraint. It's a force multiplier."
You could easily scoff the same way about some number of API endpoints, class methods, config options, etc, and it still wouldn't be meaningful without context.
There may not be a universally correct granularity, but that doesn't mean clearly incorrect ones don't exist. 50+ services is almost always too many, except for orgs with hundreds or thousands of engineers.
What’s the value proposition? I mean the fact that you have
- frontend
- backend
- website
is already confusing to me.
I understand that one commit seems nice, but you could have achieved this with e.g. 3 repos and very easily maintained all of them. There's a bit of overhead of course, but having some experience working with a team that has a few "monorepos", I know that the cost to actually make it work is significant.
I love the idea. It's bold. But, I hate it from an information architecture perspective.
This is something that is, of course, super relevant given context management for agentic AI. So there's great appeal in doing this.
And today, it might even be the best decision. But this really feels like an alpha version of something that will have much better tooling in the near future. JSON and Markdown are beautifully simple information containers, but they aren't as friendly for humans as something like Notion or Excel. Again I'll say, I'm confident that in the near future we'll start to see solutions emerge that structure documentation in a way that's friendly to both AIs and humans.
Well written; it anticipated my questions about pain points at the end, except one: have you hit a point yet where deploying is a pain because it's happening so frequently? I understand there's good separation of concerns, so a change in marketing/ won't cause conflicts or anything that impacts frontend/, but I have to imagine eventually you'll hit that pain point. But FWIW I'm a big fan of a monorepo containing multiple services, and only breaking up the monorepo when it starts to cause problems. Sounds like the author is doing that.
I really want the world to move on from monorepos to multirepos. Git submodules set multirepos back by 10 years, but they still make more sense. They are composable!
For me, integrating features that span multiple repositories means coordinating changes, multiple PRs, and switching branches on many repos to do testing. Quite time consuming. I did use submodules, but I find a monorepo easier to manage.
Interesting approach to giving LLMs full context. My only concern is the "no workspaces" approach; manual `cd && npm install` usually leads to dependency drift and "it works on my machine" issues once you start sharing logic between the API and the frontend. It’s a great setup for velocity now, but I'm curious if you've hit any friction with types or shared utils without a more formal monorepo tool?
I used to dread this approach (it’s part of why I like Typescript monorepos now), but LLMs are fantastic at translating most basic types/shapes between languages. Much less tedious to do this than several years ago.
Of course, it’s still a pretty rough and dirty way to do it. But it works for small/demo projects.
Each layer of your stack should have different types.
Never expose your storage/backend type. Whenever you do, any consumers (your UI, consumers of your API, whatever) will take dependencies on it in ways you will not expect or predict. It makes changes somewhere between miserable and impossible depending on the exact change you want to make.
A UI-specific type means you can refactor the backend, make whatever changes you want, and have it invisible to the UI. When the UI eventually needs to know, you can expose that in a safe way and then update the UI to process it.
This completely misses the point of what sharing types is about. The idea behind sharing types is not exposing your internal backend classes to the frontend. Sharing types is about sharing DTO definitions between the backend and the frontend. In other words, sharing the return types of your public API to ensure when you change a public API, you instantly see all affected frontend code that needs to be changed as well. No one is advocating for sharing internal representations.
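In TypeScript monorepo terms, that's roughly this (file and field names are made up):

    // shared/api-types.ts - the only thing both sides import
    export interface InvoiceDto {
      id: string;
      totalCents: number;
      issuedAt: string; // ISO timestamp over the wire, not a Date
    }

    // frontend usage: if the backend changes the DTO, the compiler flags every
    // affected call site instead of it failing at runtime
    export async function loadInvoice(id: string): Promise<InvoiceDto> {
      const res = await fetch(`/api/invoices/${id}`);
      return (await res.json()) as InvoiceDto;
    }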
Protobuf is decent enough, I've used Avro and Thrift before (way way before protobuf came to be), and the dev experience of protobuf has been the best so far.
It's definitely not amazing, code generation in general will always have its quirks, but protobuf has some decent guardrails to keep the protocol backwards-forwards compatible (which was painful with Avro without tooling for enforcement), it can be used with JSON as a transport for marshaling if needed/wanted, and is mature enough to have a decent ecosystem of libraries around.
Not that I absolutely love it but it gets the job done.
Good Christ. Imagine having decided that your price structures should be a JSON file instead of persisted in a database and then thinking that any decision made by that person/team is a good idea.
I look forward to when we see the article about breaking the monorepo nightmare.
Sometimes this sort of thing is not a bad idea. If it's a simple data structure that doesn't change very often, you get an admin interface (vi), change tracking, and audit trail for free. Just think of it as configuration rather than data and most folks would think it's normal to do this.
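A sketch of the "configuration rather than data" framing (file name and shape invented for illustration; assumes `resolveJsonModule` is enabled): the limits live in a reviewed JSON file and the backend simply imports it, so git gives you the history and audit trail.

```ts
// pricing.ts
import pricing from "./pricing.json"; // e.g. { "plans": [{ "plan": "pro", ... }] }

interface PlanLimits {
  plan: string;
  maxProjects: number;
  maxApiCallsPerDay: number;
}

// Changing a limit is a normal commit: reviewed, diffable, revertable.
export function limitsFor(plan: string): PlanLimits | undefined {
  return (pricing.plans as PlanLimits[]).find((p) => p.plan === plan);
}
```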
I have a question about monorepos. Do companies really expose their entire source code in one repo for their devs to download? I understand that people can always do bad things if they want, but with a monorepo, you are literally letting me download everything, right?
This is probably different between startups and enterprises. My background is purely startups, and I can't imagine not having access to 100% of the code for the company I work for.
I work at Google, and yes. We use a monorepo for absolutely everything you can think of. But good luck getting that code off a corp device without being caught!
While not talked about on HN as much, the big corps doing monorepo use something like Perforce which has "protects" tables allowing very granular access control
Hosting a developer environment remotely that you SSH into is very common. That’s how you would approach working with a monorepo that has any serious size to it.
Fuck yes I love this attitude to transparency and code-based organization. This is the kind of stuff that gets me going in the morning for work, the kind of organization and utility I honestly aspire to implement someday.
As many commenters rightly point out, this doesn't run the human side of the company. It could, though, if the company took this approach seriously enough. My personal two cents, it could be done as a separate monorepo, provided the company and its staff remain disciplined in its execution and maintenance. It'd be far easier to have a CSV dictate employees and RBAC rather than bootstrapping Active Directory and fussing with its integrations/tentacles. Putting department processes into open documentation removes obfuscation and a significant degree of process politics, enabling more staff to engage in self-service rather than figuring out who wields the power to do a thing.
I really love everything about this, and I'd like to see more of it, AI or not. Less obfuscation and more transparency is how you increase velocity in any organization.
This is sort of a whole product, but it’s hardly managing the whole company. Financials? HR? Contracts? Pictures of the last team meeting?
It just looks like a normal frontend+backend product monorepo, with the only somewhat unusual inclusion of the marketing folder.
It's worth noting with a few clicks from the linked article, you can find that this company is (at least according to LinkedIn) a single person. Which explains how the whole company can fit into a repo. But also makes you question how valuable the "insights" here are, like obviously a single-person project should be using a monorepo...
Ah, so "our" company is referring to "me and Claude"? Actually. Claude might be a pretty good co-founder. Half the job is therapy conversations anyway. :)
Have you ever heard that Google is also one repo? At least it was until 2015; I don't know the story after that. So it doesn't have to be a one-person company, yet they are making billions.
I'm not making any claims about monorepo being good or bad and I'm fully aware large companies have monorepos (or at least very large repos). I'm saying that the fact it's a one-person "company" needs to be taken into account when talking about how applicable their experience is to other companies.
Google isn’t a monorepo since the acquisition of Android. That one never made it into google3.
Yes but AI! AI!
Not even infrastructure as code is in the repository from what one can see.
i am actually eagerly waiting for someone to show the real deal: actually everything in a GitHub repo, including 'artifacts', or at least those artifacts which can't be reconstructed from the repo itself.
maybe they could be encrypted, and you could say "well it's everything but the encryption key, which is owned in physical form by the CEO."
there's a lot of power i think to having everything in one place. maybe GitHub could add the notion of private folders? but now that's ACLs... probably pushing the tool way too far.
https://dev.azure.com/byteterrace/Koholint/_git/Azure.Resour...
How close do you think this is? Deploys everything but the actual backend/frontend code.
At a previous job we put compilers and standard libraries in version control, with custom tooling to pull the right version for what you need.
We used p4 rather than git though.
I am a huge monorepo supporter, including "no development branches".
However there's a big difference between development and releases. You still want to be able to cut stable releases that allow for cherrypicks for example, especially so in a monorepo.
Atomic changes are mostly a lie when talking about cross API functions, i.e. frontend talking to a backend. You should always define some kind of stable API.
> including "no development branches"
Can you explain this comment? Are you saying to develop directly in the main branch?
How do you manage the various time scales and complexity scales of changes? Task/project length can vary from hours to years and dependencies can range from single systems to many different systems, internal and external.
Yeah, all new commits are merged to main.
The complexity comes from releases. Suppose you have a good commit 123 where all your tests pass for some project, you cut a release, and deploy it.
Then development continues until commit 234, but your service is still at 123. Some critical bug is found, and fixed in commit 235. You can't just redeploy at 235 since the in-between may include development of new features that aren't ready, so you just cherry pick the fix to your release.
It's branches in a way, but _only_ release branches. The only valid operations are creating new releases from head, or applying cherrypicks to existing releases.
That's where tags are useful because the only valid operations (depending on force push controls) are creating a new tag. If your release process creates tag v0.6.0 for commit 123 your tools (including `git describe`) should show that as the most recent release, even at commit 234. If you need to cut a hotfix release for a critical bug fix you can easily start the branch from your tag: `git switch -c hotfix/v0.6.1 v0.6.0`. Code review that branch when it is ready and tag v0.6.1 from its end result.
Ideally you'd do the work in your hotfix branch and merge it to main from there rather than cherry-picking, but I feel that way mostly because git isn't always great at cherry-picking.
> Suppose you have a good commit 123 where all your tests pass for some project, you cut a release, and deploy it.
And you've personally done this for a larger project with a significant amount of change and a longer duration (like maybe 6 months to a year)?
I'm struggling to understand why you would eliminate branches. It would increase the complexity, work, and duration of projects to try to shoehorn 2 different system models into one. Your 6-month project just shifted to a 12-to-24-month project.
Can you clarify why it would impact project duration?
In my experience development branches vastly increase complexity by hiding the integration issues until very late when you try to merge.
The reason I said it would impact duration is the assumption that the previous version and the new version of the system are all in the code at one time, managed via feature flags or something. I think I was picturing that due to other comments later in the thread; you may not be handling it that way.
Either way, I still don't understand how you can reasonably manage the complexity, or what value it brings.
Example:
main - current production - always matches exactly what is being executed in production, no differences allowed
production_qa - for testing production changes independent of the big project
production_dev_branches - for developing production changes during big project
big_project_qa_branch - tons of changes, currently being used to qa all of the interactions with this system as well as integrations to multiple other systems internal and external
big_project_dev_branches - as these get finalized and ready for qa they move to qa
Questions:
When production changes and project changes are in direct conflict, how can you possibly handle that if everyone is just committing to one branch?
How do you create a clean QA image for all of the different types of testing and ultimately business training that will need to happen for the project?
It depends a lot on a team-by-team basis, as different teams prefer different approaches.
In general, all new code gets added to the tip of main, your only development branch. Then, new features can also be behind feature flags optionally. This allows developers to test and develop on the latest commit. They can enable a flag if they are interested in a particular feature. Ideally new code also comes with relevant automated tests just to keep the quality of the branch high.
Once a feature is "sufficiently tested" whatever that may mean for your team it can be enabled by default, but it won't be usable until deployed.
Critically, there is CI that validates every commit, _but_ deployments are not strictly performed from every commit. Release processes can be very varied.
A simple example: we decide to create a release from commit 123, which has some features enabled. You grab the code, build it, run automated tests, and generate artifacts like server binaries or assets. This is a small team with loose SLAs, so it's okay to trust automated tests and deploy right to production. That's the end; commit 123 is live.
As another example, a more complex service may require more testing. You do the same first steps, grab commit 123, test, build, but now deploy to staging. At this point staging will be fixed to commit 123, even as development continues. A QA team can perform heavy testing, fixes are made to main and cherry picked, or the release dropped if something is very wrong. At some point the release is verified and you just promote it to production.
So development is always driven from the tip of the main branch. Features can optionally be behind flags. And releases allow for as much control as you need.
There's no rule that says you can only have one release or anything like that. You could have 1 automatic release every night if you want to.
Some points that make it work in my experience are:
1. Decent test culture. You really want to have at least some metric for which commits are good release candidates.
2. A real release management system. The common tools available like to tie together CI and CD, which is not the right way to think about it IMO (for example, your GitHub CI performing a deployment).
TL;DR:
Multiple releases; use flags or configuration for the different deployments. They could all even be from the same commit or from different ones.
I don't see how you're avoiding development branches. Surely while a change is in development the author doesn't simply push to main. Otherwise concurrent development, and any code review process—assuming you have one—would be too impractical.
So you can say that you have short-lived development branches that are always rebased on main. Along with the release branch and cherry-pick process, the workflow you describe is quite common.
Their dev branch is _the_ development branch.
They don’t do code reviews or any sort of parallel development.
They're under the impression that "releases are complex and this is how they avoid it", but they just moved the complexity and sacrificed things like parallel work, code reviews, and reverts of whole features.
I'm not sure where you got that from. There is a single branch, which obviously has code review, and reverts work just the same way.
What there isn't, is long lived feature branches with non-integrated changes.
Very interesting points. Would you mind sharing a few examples of when cherry-picking is necessary and why atomic changes are a lie?
I'm using a monorepo for my company across 3+ products and so far we're deploying from stable release to stable release without any issues.
Atomic changes are a lie in the sense that there is no atomic deployment of a repo.
The moment you have two production services that talk to each other, you end up with one of them being deployed before the other.
Atomicity also rarely matters as much as people think it does if contracts are well defined and maintained.
A selling point of monorepos is that you don't need to maintain backwards compatible contracts and can make changes to both sides of an API at once.
If you have a monolith you get atomic deployment, too.
Not sure what GP had in mind, but I have a few reasons:
Cherry picks are useful for fixing releases or adding changes without having to make an entirely new release. This is especially true for large monorepos which may have all sorts of changes in between. Cherry picks are a much safer way to “patch” releases without having to create an entirely new release, especially if the release process itself is long and you want to use a limited scope “emergency” one.
Atomic changes - assuming this is related to releases as well, it’s because the release process for the various systems might not be in sync. If you make a change where the frontend release that uses a new backend feature is released alongside the backend feature itself, you can get version drift issues unless everything happens in lock-step and you have strong regional isolation. Cherry picks are a way to circumvent this, but it’s better to not make these changes “atomic” in the first place.
If your monorepo compiles to one binary on one host then fine, but what do you do when one webserver runs vN, another runs v(N-1), and half the DB cluster is stuck on v(N-17)?
A monorepo only allows you to reason about the entire product as it should be. The details of how to migrate a live service atomically have little to do with how the codebase migrates atomically.
That's why I mention having real stable APIs for cross-service interaction, as you can't guarantee that all teams deploy the exact same commit everywhere at once. It is possible but I'd argue that's beyond what a monorepo provides. You can't exactly atomically update your postgres schema and JavaScript backend in one step, regardless of your repo arrangement.
Adding new APIs is always easy. Removing them not so much since other teams may not want to do a new release just to update to your new API schema.
But isn't that a self-inflicted wound then? I mean is there some reason your devs decided not to fix the DB cluster? Or did management tell you "Eh, we have other things we want to prioritize this month/quarter/year?"
This seems like simply not following the rules with having a monorepo, because the DB Cluster is not running the version in the repo.
Maybe the database upgrade from v(N-17) to v(N-16) simply takes a while, and hasn't completed yet? Or the responsible team is looking at it, but it doesn't warrant the whole company to stop shipping?
Being 17 versions behind is an extreme example, but always having everything run the latest version in the repo is impossible, if only because deployments across nodes aren't perfectly synchronised.
This is why you have active/passive setup and you don't run half-deployed code in production. Using API contracts is a weak solution, because eventually you will write a bug. It's simpler to just say "everything is running the same version" and make that happen.
Do you take down all of your projects and then bring them back up at the new version? If not, then you have times at which the change is only partially complete.
I would take a somewhat more liberal view of "atomic": if the repo state reflects the totality of what I need to get to the new version AND return to the current one, then I have all I need from a reproducibility perspective. Human actions could be allowed in this, if fully documented. I am not a purist, obviously.
Nah, these days the new thing is Vibe Deployments, just ship the change and pray.
People that Blue Green are doing that, aren't they?
Canary/Incremental, not so much
Blue/green might allow you to do (approximately) atomic deploys for one service, but it doesn't allow you to do an atomic deploy of the clients of that service as well.
Why is that? In a very simple case, all services of a monorepo run on a single VM. Spin up a new VM, deploy the new code, verify, switch routing. Obviously, this doesn't work with humongous systems, but the idea can be expanded upon: make sure that components only communicate with compatible versions of other components. And don't break the database schema in a backward-incompatible way.
So yes, in theory you can always deploy sets of compatible services, but it's not really workable in practice: you either need to deploy the world on every change, or you need complicated logic to determine which services are compatible with which deployment sets of other services.
There's a bigger problem though: in practice there's almost always a client that you don't control, and can't switch along with your services, e.g. an old frontend loaded by a user's browser.
The only way I could read their answer as being close to correct is if the clients they're referring to are not managed by the deployment.
But even a front end is going to get told it is out of date/unusable and needs to be upgraded when it next attempts to interact with the service, and, in my mind at least, that means it will have to upgrade, which isn't "atomic" in the strictest sense of the word, but it's as close as you're going to get.
Each deployment is a separate "atomic change". So if a one-file commit downstream affects 2 databases, 3 websites, and 4 APIs (made-up numbers), then that is actually 9 different independent atomic changes.
We use a monorepo and feature-flag new features, which gives us control over deployment timing.
I can guarantee that your codebase is a spaghetti of conditional functionality that no developer understands, and that most of those conditionals are leftovers that are no longer needed, but nobody dares to remove.
Feature flags are a good idea, but they require a lot of discipline and maintenance. In practice, they tend to be overused, and provide more negatives than positives. They're a complement, but certainly not a replacement for VCS branches, especially in monorepos.
What do you use for feature flags?
Not OP, but I think building feature flags yourself really isn't hard and is worth doing. It's such an important component that I wouldn't want to depend on a third party.
I agree, but it's hard to get the nuances right. It's easy to roll out a feature to half of your user base. It's a bit harder to roll a feature out to half of users who are in a certain region, and have the flag be sticky on them.
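The sticky part is usually handled by hashing a stable user attribute into a bucket rather than rolling a die per request; a rough sketch (flag name, region rule, and percentage are made up):

```ts
import { createHash } from "node:crypto";

// Deterministic bucketing: the same user always lands in the same bucket,
// so the rollout stays sticky without storing any per-user state.
function bucket(userId: string, flagName: string): number {
  const digest = createHash("sha256").update(`${flagName}:${userId}`).digest();
  return digest.readUInt32BE(0) % 100; // 0..99
}

export function newCheckoutEnabled(userId: string, region: string): boolean {
  if (region !== "eu") return false;          // targeting rule
  return bucket(userId, "new-checkout") < 50; // 50% rollout
}
```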
We use Unleash at work, which is open source, and it works pretty well.
I generally agree, but see some more nuance. I think feature-flagging is an overloaded term that can mean two things.
First, my philosophy is that long-lived feature branches are bad, and lead to pain and risk once complete and need to be merged.
Instead, prefer to work in small, incremental PRs that are quickly merged to main but dormant in production. This ensures the team is aware of the developing feature and cannot break your in-progress code (e.g. with a large refactor).
This usage of "feature flags" is simple enough that it's fine and maybe even preferable to build yourself. It could be as simple as env vars or a config file.
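A sketch of that simple, build-it-yourself flavour (flag names hypothetical); the flag just keeps dormant code switched off until it's flipped at deploy time:

```ts
// flags.ts
const flags = {
  newBillingPage: process.env.FLAG_NEW_BILLING_PAGE === "true",
  bulkExport: process.env.FLAG_BULK_EXPORT === "true",
} as const;

export function isOn(name: keyof typeof flags): boolean {
  return flags[name];
}

// In a handler: if (isOn("bulkExport")) { /* merged but dormant code path */ }
```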
--
However, feature flagging may also refer to deploying two variants of completed code for A/B testing or just an incremental rollout. This requires the ability to expose different code paths to selected users and measure the impact.
This sort of tooling is more difficult to build. It's not impossible, but comparatively complex because it probably needs to be adjustable easily without releases (i.e. requires a persistence layer) and by non-engineers (i.e. requires an admin UI). This becomes a product, and unless it's core to your business, it's probably better to pick something off the shelf.
Something I learned later in my career is that measuring the impact is actually a separate responsibility. Product metrics should be reported on anyway, and this is merely adding the ability to tag requests or other units of work with the variants applied, and slice your reporting on it. It's probably better not to build this either, unless you have a niche requirement not served by the market.
--
These are clearly two use cases, but share the overloaded term "feature flag":
1. Maintaining unfinished code in `main` without exposing it to users, which is far superior than long-lived feature branches but requires the ability to toggle.
2. Choosing which completed features to show to users to guide your product development.
(2) is likely better served by something off the shelf. And although they're orthogonal use cases, sometimes the same tool can support both. But if you only need (1), I wouldn't invest in a complex tool that's designed to support (2)—which I think is where I agree with you :)
If statements?
Unleash
You can also do them in GitLab.
I like keeping old branches, but a lot of places ditch them; I never understood why. I also dislike git squash: it means you have to make a brand new branch for your next PR, a waste of time when I should be able to pull down master/dev/main/whatever and merge it into my working branch. I guess this is another reason I prefer the forking approach of GitHub: let devs have their own sandbox and their own branches, and let them get their work done; they will PR when it's ready.
squash results in a cleaner commit history. at least that’s why we mandate it at my work. not everyone feels the same about it I guess
Squashing only results in a cleaner commit history if you're making a mess of the history on your branches. If you're structuring the commit history on your branches logically, squashing just throws information away.
I’m all ears for a better approach because squashing seems like a good way to preserve only useful information.
My history ends up being:
- add feature x
- linting
- add e2e tests
- formatting
- additional comments for feature
- fix broken test (CI caught this)
- update README for new feature
- linting
With a squash it can boil down to just “added feature x” with smaller changes inside the description.
If my change is small enough that it can be treated as one logical unit, that will be reviewed, merged and (hopefully not) reverted as one unit, all these followup commits will be amends into the original commit. There's nothing wrong with small changes containing just one commit; even if the work wasn't written or committed at one time.
Where logical commits (also called atomic commits) really shine is when you're making multiple logically distinct changes that depend on each other. E.g. "convert subsystem A to use api Y instead of deprecated api X", "remove now-unused api X", "implement feature B in api Y", "expose feature B in subsystem A". Now they can be reviewed independently, and if feature B turns out to need more work, the first commits can be merged independently (or if that's discovered after it's already merged, the last commits can be reverted independently).
If after creating (or pushing) this sequence of commits I need to fix linting/formatting/CI, I'll put the fixes in a fixup commit for the appropriate commit and meld them using a rebase. It takes about 30s to do manually, and can be automated using tools like git-absorb. However, in reality I don't need to do this often: breaking bigger tasks into logical chunks is something I already do, as it helps me stay focused, and I add tests and run linting/formatting/etc. before I commit.
And yes, more or less the same result can be achieved by creating multiple MRs and using squashing; but usually that's a much worse experience.
You can always take advantage of the graph structure itself. With `--first-parent` git log just shows your integration points (top level merge commits, PR merges with `--no-ff`) like `Added feature X`. `--first-parent` applies to blame, bisect, and other commands as well. When you "need" or most want linear history you have `--first-parent` and when you need the details "inside" a previous integration you can still get to them. You can preserve all information and yet focus only on the top-level information by default.
It's just too bad not enough graphical UIs default to `--first-parent` and a drill-down like approach over cluttered "subway graphs".
Stacked diffs are the best approach. Working at a company that uses them and reading about the "pull request" workflow that everyone else subjects themselves to makes me wonder why everyone isn't using stacked diffs instead of eternally repeating this "squash vs. not squash" debate.
every commit is reviewed individually. every commit must have a meaningful message, no "wip fix whatever" nonsense. every commit must pass CI. every commit is pushed to master in order.
Not everyone develops and commits the same way and mandating squashing is a much simpler management task than training up everyone to commit in a similar manner.
Besides, people probably shouldn't try to make their PR commits atomic; they should commit as often as needed. It's a good way to avoid losing work. This is in tension with leaving behind clean commits, and squashing resolves it.
The solution there is to make your commit history clean by rebasing it. I often end my day with a “partial changes done” commit and then the next day I’ll rebase it into several commits, or merge some of the changes into earlier commits.
Even if we squash it into main later, it’s helpful for reviewing.
We also do conventional commits: https://www.conventionalcommits.org/
Other than that pretty free how you write commit messages
At work there was only one way to test a feature, and that was to deploy it to our dev environment. The only way to deploy to dev was to check the repo into a branch, and deploy from that branch.
So one branch had 40x "Deploy to Dev" commits. And those got merged straight into the repo.
They added no information.
Good luck getting 100+ devs to all use the same logical commit style. And if tests fail in CI you get the inevitable "fix tests" commit in the branch, which now spams your main branch more than the meaningful changes. You could rebase the history by hand, but what's the point? You'd have to force push anyway. Squashing is the only practical method of clean history for large orgs.
This - even 5 devs.
Also, rebasing is just so fraught with potential errors. Every month or two, the devs who were rebasing would screw up some feature branch with work on it they needed, and would look to me to fix it for some reason. Such a time sink for so little benefit.
I eventually banned rebasing, force pushes, and mandated squash merges to main - and we magically stopped having any of these problems.
We squash, but still rebase. For us, this works quite well. As you said, rebasing needs to be done carefully... But the main history does look nice this way.
Why bother with the rebase if you squash anyway? That history just gets destroyed?
Rebase before creating PR, merge after creating PR.
> Good luck getting 100+ devs to all use the same logical commit style
The Linux kernel manages to do it for 1000+ devs.
What you really need is stacked changes, where each commit is reviewed, ran on ci, and merged independently.
No information loss, and every commit is valid on its own, so cherry-picks maintain the same level of quality.
True but. There's a huge trade-off in time management.
I can spend hours OCDing over my git branch commit history.
-or-
I can spend those hours getting actual work done and squash at the end to clean up the disaster of commits I made along the way so I could easily roll back when needed.
it's also very easy to rewrite commit history in a few seconds.
If I'm rewriting history ... why not just squash?
But also, rewriting history only works if you haven't pushed code and are working as a solo developer.
It doesn't work when the team is working on a feature in a branch and we need to be pushing to run and test deployment via pipelines.
> But also, rewriting history only works if you haven't pushed code and are working as a solo developer.
Weird, works fine in our team. Force with lease allows me to push again and the most common type of branch is per-dev and short lived.
Squash loses the commit history - all you end up with is merge merge merge
It's harder to debug as well (this 3000-line commit has a change causing the bug... best of luck finding it AND why it was changed that way in the first place).
I, myself, prefer that people tidy up their branches such that their commits are clear on intent, and then rebase into main, with a merge commit at the tip (meaning that you can see the commits AND where the PR began/ended).
git bisect is a tonne easier when you have that
What about separate, atomic, commits? Are they squashed too? Makes reverting a fix harder without impacting the rest, no?
PRs should be atomic, if they need to be separated for reverting, they should be multiple PRs.
"squash results in a cleaner commit history" Isn't the commit history supposed to be the history of actual commits? I have never understood why people put so much effort into falsifying git commit histories.
Here is how I think of it. When I am actively developing a feature I commit a lot. I like the granularity at that stage and typically it is for an audience of 1 (me). I push these commits up in my feature branch as a sort of backup. At this stage it is really just whatever works for your process.
When I am ready to make my PR I delete my remote feature branch and then squash the commits. I can use all my granular commit comments to write a nice verbose comment for that squashed commit. Rarely I will have more than one commit if a user story was bigger than it should be. Usually this happens when more necessary work is discovered. At this stage each larger squashed commit is a fully complete change.
The audience for these commits is everyone who comes after me to look at this code. They aren’t interested in seeing it took me 10 commits to fix a test that only fails in a GitHub action runner. They want the final change with a descriptive commit description. Also if they need to port this change to an earlier release as a hotfix they know there is a single commit to cherry pick to bring in that change. They don’t need to go through that dev commit history to track it all down.
The "cleaner" commit history should be a separate layer and the actual commit history should never be altered.
There are several valid reasons to "falsify" commit history.
- You need to remove trash commits that appear when you need to rerun CI.
- You need to remove commits with that extra change you forgot.
- You want to perform any other kind of rebase to clean up messages.
I assume in this thread some people mean squashing from the perspective of a system like GitLab where it's done automatically, but for me squashing can mean simply running an interactive rebase (or fixup) and leaving only the important commits that provide meaningful information to the target branch.
> You need to remove trash commits that appear when you need to rerun CI
Serious question, what's going on here?
Are you using a "trash commit" to trigger your CI?
Is your CI creating "trash commits" (because build artefacts)?
“Falsifying” is complete hyperbole. Git commit history is a tool and not everyone derives the same ROI from the effort of preserving it. Also squashing is pretty effortless.
I'm very fortunate to not have to use PR style forges at work (branch based, that is). Instead each commit is its own unit of code to review, test, and merge individually. I never touch branches anymore since I also use JJ locally.
What is JJ?
https://github.com/jj-vcs/jj
> you have to make a brand new branch for your next PR
Is there overhead to creating a branch?
People talk about "one change, everywhere, all at once." That is a great way to break production on any API change. If you have a DB and more than 2 nodes, you will have the old system using the old schema and the new system using the new schema, unless you design for forwards-backwards compatible changes. While more obvious with a DB schema, it is true for any networked API.
At some point, you will have many teams. And one of them _will not_ be able to validate and accept some upgrade. Maybe a regression causes something only they use to break. Now the entire org is held hostage by the version needs of one team. Yes, this happens at slightly larger orgs. I've seen it many times.
And since you have to design your changes to be backwards compatible already, why not leverage a gradual roll out?
Do you update your app lock-step when AWS updates something? Or when your email service provider expands their API? No, of course not. And you don't have to lock yourself to other teams in your org for the same reason.
Monorepos are hotbeds of cross contamination and reaching beyond API boundaries. Having all the context for AI in one place is hard to beat though.
100%, this is all true and something you have to tackle eventually. Companies like this one (Kasava) can get away with it because, well, they likely don't have very many customers and it doesn't really matter. But when you're operating at a scale where you have international customers relying on your SaaS product 24/7, suddenly deploys having a few minutes of downtime matters.
This isn't to say a monorepo is bad, though, but they're clearly naive about some things:
> No sync issues. No "wait, which repo has the current pricing?" No deploy coordination across three teams. Just one change, everywhere, instantly.
It's literally impossible to deploy "one change" simultaneously, even with the simplest n-tier architecture. As you mention, a DB schema is a great example. You physically cannot change a database schema and application code at the exact same time. You either have to ensure backwards compatibility or accept that there will be an outage while old application code runs against a new database, or vice-versa. And the latter works exactly up until an incident where your automated DB migration fails due to unexpected data in production, breaking the deployed code and causing a panic as on-call engineers try to determine whether to fix the migration or roll back the application code to fix the site.
To be a lot more cynical; this is clearly an AI-generated blog post by a fly-by-night OpenAI-wrapper company and I suspect they have few paying customers, if any, and they probably won't exist in 12 months. And when you have few paying customers, any engineering paradigm works, because it simply does not matter.
The only way to do a sane migration without downtime is to have the application handle both schema versions at the same time. This is easily doable with a monorepo.
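In practice "handle both schema versions" usually looks like an expand/contract migration: read whichever column is populated, write both until the backfill finishes, then drop the old column in a later release. A sketch with invented column names:

```ts
// During the migration window a row may have the old column, the new one, or both.
interface UserRow {
  id: string;
  full_name?: string | null; // new column (expand step)
  name?: string | null;      // old column, removed in a later release (contract step)
}

export function displayName(row: UserRow): string {
  return row.full_name ?? row.name ?? "(unknown)";
}

// Writes populate both columns until every reader runs the new code.
export function nameColumnsFor(name: string): Partial<UserRow> {
  return { full_name: name, name };
}
```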
I’m not sure why you made the logical leap from having all code stored in a single repo to updating/deploying code in lockstep. Where you put your code (the repo) can and should be decoupled from how you deploy changes.
> you will have the old system using the old schema and the new system using the new schema unless you design for forwards-backwards compatible changes
Of course you design changes to be backwards compatible. Even if you have a single node and have no networked APIs. Because what if you need to rollback?
> Maybe a regression causes something only they use to break. Now the entire org is held hostage by the version needs of one team.
This is an organizational issue not a tech issue. Who gives that one team the power to hold back large changes that benefit the entire org? You need a competent director or lead to say no to this kind of hostage situation. You need defined policies that balance the needs of any individual team versus the entire org. You need to talk and find a mutually accepted middle ground between teams that want new features and teams that want stability and no regressions.
The point is that the realities of not being able to deploy in lockstep erode away at a lot of the claimed benefits the monorepo gives you in being able to make a change everywhere at once.
If my code has to be backwards compatible to survive the deployment, then having the code in two different repos isn’t such a big deal, because it’ll all keep working while I update the consumer code.
The point is atomic code changes, not atomic deployments. If I want to rename some common library function, it's just a single search and replace operation in a monorepo. How do you do this with multiple repos?
> If I want to rename some common library function, it's just a single search and replace operation in a monorepo. How do you do this with multiple repos?
Multiple repos shouldn't depend on a single shared library that needs to be updated in lockstep. If they do, something has gone horribly wrong.
They do, it's just instead of it being a library call it's a network call usually, which is even worse. Makes it nigh impossible to refactor your codebase in any meaningful way.
But if you need to rename an endpoint, for example, you need to route service A version Y to a compatible version of service B. After changing the endpoint, you now need to route service A version Z to the new version of service B. Am I missing something? Meaning that it doesn't truly matter whether you have 1 repo, 2 repos, or 10 repos: deployments MUST be done in sequence, and there MUST be a backwards-compatible commit in between, OR you must have some mesh that's going to take care of rerouting requests for you.
You just deploy all the services at once, A/B style. Just flip to the new services once they're all deployed and make the old ones inactive, in one go. Yes, you'll probably need a somewhat central router; maybe you do this per-client or per-user or whatever makes sense.
So that's blue green with added version aware routing. What if you need to rollback? Good luck I guess.
You can do phased deployments with blue green, that's what we do. It depends on your application but ours has a natural segmentation by client. And when you roll back you just flip the active and passive again.
It doesn't need to, it's just much more convenient when you can do everything in a single commit.
> This is an organizational issue not a tech issue.
It’s both. Furthermore, you _can_ solve organizational problems with tech. (Personally, I prefer solutions to problems that do not rely strictly on human competence)
I think I disagree.
We have a monorepo; we use automated code generation (openapi-generator) for API clients for each service, derived from an OpenAPI.json generated by the server framework. Service client changes cascade instantly. We have a custom CI job that trawls git and figures out which projects changed (including dependencies) so as to compute which services need to be rebuilt/redeployed. We may just not be at scale—thank God. We're a small team.
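Not their code, but the heart of that kind of CI job is often just a diff against the target branch mapped onto project directories; a rough sketch (paths and the dependency map are placeholders):

```ts
import { execSync } from "node:child_process";

// Which directories each service depends on (hand-maintained here; real setups
// often derive this from package manifests or the build graph).
const deps: Record<string, string[]> = {
  api: ["services/api/", "libs/shared/"],
  frontend: ["apps/frontend/", "libs/shared/"],
};

const changedFiles = execSync("git diff --name-only origin/main...HEAD", {
  encoding: "utf8",
})
  .split("\n")
  .filter(Boolean);

const toRebuild = Object.entries(deps)
  .filter(([, dirs]) => changedFiles.some((f) => dirs.some((d) => f.startsWith(d))))
  .map(([service]) => service);

console.log(toRebuild.join("\n")); // feed this into later pipeline stages
```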
Monorepo vs multiple repos isn't really relevant here, though. It's all about how many independently deployed artifacts you have. e.g. a very simple modern SaaS app has a database, backend servers and some kind of frontend that calls the backend servers via API. These three things are all deployed independently in different physical places, which means when you deploy version N, there will be some amount of time they are interacting with version N-1 of the other components. So you either have to have a way of managing compatibility, or you accept potential downtime. It's just a physical reality of distributed systems.
> We may just not be at scale—thank God. We're a small team.
It's perfectly acceptable for newer companies and small teams to not solve these problems. If you don't have customers who care that your website might go down for a few minutes during a deploy, take advantage of that while you can. I'm not saying that out of arrogance or belittlement or anything; zero-downtime deployments and maintaining backwards compatibility have an engineering cost, and if you don't have to pay that cost, then don't! But you should at least be cognizant that it's an engineering decision you're explicitly making.
> Having all the context for AI in one place is hard to beat though.
Seems like a weird workaround, you could just clone multiple repos into a workspace. Agree with all your other points though.
Exactly. Monorepo-enjoyers like to pretend that workspaces don't a) exist, and b) provide >90% of the benefits of a monorepo, with none of the drawbacks.
> At some point, you will have many teams. And one of them _will not_ be able to validate and accept some upgrade. Maybe a regression causes something only they use to break. Now the entire org is held hostage by the version needs of one team. Yes, this happens at slightly larger orgs. I've seen it many times.
The alternative of every service being on their own version of libraries and never updating is worse.
Atomic updates in particular are one of those things that sound good to the C-suite, but fall apart extremely badly at the lower levels.
months-long delays on important updates due to some large project doing extremely bad things and pushing off a minor refactor endlessly has been the norm for me. but they're big so they wield a lot of political power so they get away with it every time.
or worse, as a library owner: spending INCREDIBLE amounts of time making sure a very minor change is safe, because you can't gradually roll it out to low-risk early adopter teams unless it's feature-flagged to hell and back. and if you missed something, roll back, write a report and say "oops" with far too many words in several meetings, spend a couple weeks triple checking feature flagging actually works like everyone thought (it does not, for at least 27 teams using your project), and then try again. while everyone else working on it is also stuck behind that queue.
monorepos suck imo. they're mostly company lock-in, because they teach most people absolutely no skills they'd need in another job (or for contributing to open source - it's a brain drain on the ecosystem), and all external skill is useless because every monorepo is a fractal snowflake of garbage.
I really have never been able to grasp how people who believe that forward-compatible data schema changes are daunting can ever survive contact with the industry at scale. It's extremely simple to not have this problem. "design for forwards-backwards compatible changes" is what every grown-up adult programmer does.
You always have this problem; that's why you have a release process for APIs.
And monorepo or not, bad software developers will always run into this issue. Most software will not have 'many teams'. Most software is written by a lot of small companies doing niche things. Big software companies with more than one team, normally have release managers.
My tip: use architecture unit tests for external-facing APIs. If you are a smaller company, 24/7 doesn't have to be the thing; just communicate this to your customers. But overall, if you run SaaS software and still don't know how to do zero-downtime deployment in 2025/2026, just do whatever you are still doing, because man, come on...
I used to be against monorepos... Then I got really into Claude Code, and a monorepo makes sense for the first time in my life, specifically because of tools like Claude. I mean, technically I could open all the different repos from the parent directory I suppose, but it's much nicer in one spot. Front-end and back-end changes are always in sync this way too.
I guess I could work with either option now.
Opening Claude from the parent directory is what I do, and it seems to work pretty well, but I do like this monorepo idea so that a single commit can change things in the front end and back end together, since this is a use case that's quite common
Yeah, I used to hate it, but as I was building a new project I was like, oh man, I can't believe I'm even thinking of doing this, but it makes more sense, LOL. Instead of prompting twice, I can prompt once in one shot and it has the context of both pieces too. I guess if I ever need them to be separate I can always do that too.
Except of course the rollout will not be atomic anyway, and making changes in a single commit might lead devs to make changes without thinking about backwards compat.
Even if the rollout were atomic on the servers, you will still have old clients with cached old front ends talking to updated back ends. Depending on the importance of the changes in question, you can sometimes accept breakage or force a full UI refresh. But that should be a conscious decision. It's better to support old clients at the same time as new clients, and to deprecate the old behavior and remove it over time. Likewise, if there's a critical change where you can't risk new front ends breaking when talking to old back ends (what if you had to roll back), you can often deploy support for the new changes first, and activate the UI changes in a subsequent release or with a feature flag.
I think it’s better to always ask your devs to be concerned about backwards compatibility, and sometimes forwards compatibility, and to add test suites if possible to monitor for unexpected incompatible changes.
This is a systems problem that can and should be fixed in the system IMO, not by relying on devs executing processes in some correct order.
This is where unit testing / integration testing should be implemented as guard rails in my eyes.
Rollout should be within a minute. Let's say you ship one thing a day and 1/3 of those involve a backwards-incompatible API change. That's 1 minute of breakage per 3 days, i.e. it's broken 0.02% of the time. Life is too short to worry about such things.
> Rollout should be within a minute
And if it's not, it breaks everything. This is an assumption you can't make.
You might have old clients for several hours, days, or forever (mobile). This has to be taken into account, for example by aggressively forcing updates, which can be annoying for users, especially if their hardware doesn't support updating.
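For web clients, one blunt but common way to do the forced update is to have the bundle poll a version endpoint and reload when the server has moved on; a sketch (the endpoint and build-time constant are hypothetical):

```ts
// BUILD_VERSION is injected at build time (e.g. via the bundler's define/env step).
declare const BUILD_VERSION: string;

async function checkForNewVersion(): Promise<void> {
  const res = await fetch("/api/version", { cache: "no-store" });
  if (!res.ok) return; // don't force a reload on transient errors
  const { version } = (await res.json()) as { version: string };
  if (version !== BUILD_VERSION) {
    window.location.reload(); // stale client picks up the new bundle
  }
}

setInterval(checkForNewVersion, 5 * 60 * 1000); // every five minutes
```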
backend-repo $ claude --add-dir ../frontend-repo
Opting for a monorepo because you don't want to alias this flag is... something you can do, I guess.
What does the flag do? Just allow Claude to access that directory?
I changed my biggest project to a monorepo based on the same issue. I tinker with a lot of the bleeding-edge LLM tools and it was a nightmare trying to wire them all up properly so they would look at the different bits. So I refactored it into one just to make life easier for a computer.
I've been a big fan of monorepos for awhile, but like the author, not a huge fan of using e.g. yarn workspaces. React Native can get pretty pissy with hoisting. I just started putting things like implementation plans and PRDs in the repo and I'm loving it so far. It helps give AI more of the context to make good choices.
And think about what it's like for humans as well—spreading a feature over several repos with separate PRs makes either a mockery of the review process (if the PRs have to be merged in one repo to be able to test things together), or significantly increases the cognitive overhead of reviewing code.
Claude Code can actually work on multiple directories, so this is not strictly necessary! I do this when I'm working on a project whose dependencies also need to be refactored.
Seems like a limitation/assumption that is introduced by the tooling (Claude) and could also be improved in the tooling to work equally well with multiple repos.
Is there any concern/issue regarding Claude’s context limit?
This article reads like 4o wrote it. It's so exhausting not being able to find content produced by a human being.
Yeah it reads like it, and if a random AI detector (GPTZero) is to be believed it's pretty much all AI generated.
Crazy that nobody can be bothered to get rid of the obvious AI-isms "This isn't just for...", "The Challenges (And How We Handle Them)", "One PR. One review. One merge. Everything ships together." It's an immediate signal that whoever wrote this DGAF.
The obvious tell for me is when the article is packed full of "It's not just x, it's y" statements. I am not sure why LLMs gravitated so heavily towards their current style of writing. Pre-LLMs, I can't recall seeing that much written content in that format. If I did, it was in short-form content.
I hadn't come across GPTZero before and wondered if it worked. Just testing on a sample of my blog posts (I do one each year) I got a 100% AI generated mark for a post in... 2022, and 2023. Both before AI tools were around.
Not to say this post isn't AI generated but you might want a better tool (if one exists)
Yeah, it's got a real issue with false positives. And I've tried a bunch of other tools (Sapling, ZeroGPT, a few others) and actually GPTZero was the best of the bunch. The others would miss obviously AI generated content that I'd just generated to test them.
I've had a blog post kicking around about this for a while, it's CRAZY how much more expensive AI detection is than AI generation.
In my mind content generated today with AI "tells" like the above and a general zero-calorie-feel that also trip an AI detector are very likely AI generated.
Hmm I'm curious which blog post tripped it? I tried a few from your site in 2023 and none of them were flagged as AI generated.
Pff the mental list of what I can’t use when I write is getting pretty big. Em dashes are done for, as are deep dives, delving, anything too enthusiastic, and Oxford commas…
A text either has value to you or it doesn’t. I don’t really understand what the level of AI involvement has to do with it. A human can produce slop, an AI can produce an insightful piece. I rely mostly on HN to tell them apart value-wise.
Did this not read as AI generated to you?
No, can’t say I noticed it. But I’m not a native English speaker. For me the AI transforms my poor Dunglish (Dutch-English) into perfect English. I do tell it to not sound like an American waiter though.
What's exhausting is seeing people complain about AI writing. What exactly are you looking for instead? A poorly written article?
Yes, we're looking for some other human sharing something interesting. There is no requirement to put things out into the world. So when somebody shares something to a discussion board like HN the hope is that if I'm going to spend my time reading it, they spent the time to write it. If I wanted to read an AI response I could just ask it "Tell me about how you could organize an entire business in a monorepo".
Or honesty about the author. If it's written by ChatGPT, say that. If I start to read an article with the expectation of it being written by a human, then see something like this, I instantly check out.
> Last week, I updated our pricing limits. One JSON file. The backend started enforcing the new caps, the frontend displayed them correctly, the marketing site showed them on the pricing page, and our docs reflected the change—all from a single commit.
If you ask an AI that question, it would tell you all the ways this is a bad idea, which isn't in this article (which is one of the reasons I think this wasn't written by AI, but just formatted by it)
Human articles on HN are largely shit. I would personally prefer to see either AI articles, or human articles by experts (which we get almost none of on HN)
Somehow I suspect this would be a nonissue if it easy to determine whether an article is written by AI or not.
Agreed. Especially when a lot of people just pick out x, y and z thing as if it's the definitive sign of AI, disregarding the possibility of it being normal outside of their own writing and what they read. Not to mention cultural differences. That certain characters or ways of structuring text have become more pervasive lately is a sign, yes, but it does not mean that the presence of it in a text is anything definitive towards the use of AI.
It's almost as if when you seek to find patterns, you'll find patterns, even if there are none. I think it'd benefit these kinds of people to remember the scientific "rule" of correlation does not equal causation and vice versa.
> When you ask Claude to "update the pricing page to reflect the new limits," it can...
wat. You are running the marketing page from the same repo, yet having an LLM make the updates? You have the data file available. Just read the pricing info from your config file and display it?
AI is turning in to an addiction and crutch for some people.
Code review still exists, you know.
AI didn’t magically uninvent “let’s have someone else check this over before it’s shipped”.
This post is obviously (almost insultingly) written by AI. That being said, the idea behind the post is a good one (IaC taken to an extreme). This leaves me at a really weird spot in terms of how I feel about it.
It's weird; it looks like only a small % of comments on here have caught on to the obvious LLM-ness of it all (I missed it the first go-around, but on second read, you're absolutely correct).
I'm wondering once the exceedingly obvious LLM style creeps more and more into the public mind if we're going to look back at these blog posts and just cringe at how blatant they were in retrospect. The models are going to improve (and people will catch on that you can't just use vanilla output from the models as blog posts without some actual editing) and these posts will just stand out like some very sore thumbs.
(ps all of the above 100% human written ;)
You’d think people would at least spend 2 minutes changing obvious tells like “Why This Matters”…
It feels like intellectual dishonesty when it's not declared at the top of the article. I have no issues with AI, when the authors are honest about their usage. But if you stamp your name to an article without clear mention that LLMs wrote at least a significant piece of it, it feels dishonest and I disconnect from it.
I like this for adjacent things too.
Company website in the same repo means you can find branding material and company tone from blogs, meaning you can generate customer slides, video demos
Going further than docs + code: why not also store bugs, issues, etc., I wonder.
I built something like this at my previous startup, Pangea [1]. Overall I think looking back on our journey I'd sign up for it again, but it's not a panacea.
Here were the downsides we ran into
- Getting buy-in to do everything through the repo. We had our feature flags controlled via a YAML file in the repo as well, and pretty quickly people got mad at the time it took for us to update a feature flag (open MR -> merge MR -> have CI update the feature flag in our envs), and optimizing that took quite a while. It also made branch invariants harder to reason about (everything in the production branch is what is in our live environments, except for feature flags). So, we moved that out of the monorepo into an actual service.
- CI time and complexity. When we started getting to around 20 services that deployed independently, GitLab started choking on the size of our CI configuration and we'd see a spinner for about 5 minutes before our pipeline even launched. Couple that with special snowflakes like the feature flag system I mentioned above, eventually it got to the point that only a few people knew exactly how rollouts edge cases worked. The juice was not worth the squeeze at that point (the juice being - "the repo is the source of truth for everything")
- Test times. We ran some e2e UI tests with Cypress that required a lot of beefy instances, and for safety we'd run them every single time. Couple that with flakiness, and you'd have a lot of red pipelines when the goal was 100% green all the time.
That being said, we got a ton of good stuff out of it too. I distinctly remember one day that I updated all but 2 of our services to run on ARM without involving service authors and our compute spend went down by 70% for that month because nobody was using the m8g spot instances, which had just been released.
[1]: https://pangea.cloud/
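Purely as an illustration of the flag-file-in-repo idea mentioned above (not Pangea's actual setup; the file path and flag semantics are made up), a minimal TypeScript loader might look like:

```typescript
// Hypothetical sketch only: a feature-flag file versioned in the repo, read at startup.
import { readFileSync } from "node:fs";
import { load } from "js-yaml";

type Flags = Record<string, boolean>;

// config/feature-flags.yaml lives next to the code; merging an MR is what
// "flips" a flag, which is exactly the slow part we eventually moved away from.
const flags = load(readFileSync("config/feature-flags.yaml", "utf8")) as Flags;

export function isEnabled(name: string): boolean {
  return flags[name] ?? false; // unknown flags default to off
}
```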
Did you use turbo, buck or Bazel? Without monorepo tooling (and the blood, sweat, and tears it takes to hone them for your use cases), you start hitting all kinds of scaling limits in CI.
We had Python scripts that generated GitLab CI/CD YAML [1]. Tooting my own horn here, but it was super cool to ship fairly fast for the first year or so. By the end, we had something like 5 MB of YAML, and in order for the GitLab SaaS backend to process it, it took something like 32 gigs of RAM on their MergeRequestProcessor Sidekiq worker.
They had to open a whole epic in order to reduce the memory usage, but I think all that work just let us continue to use GitLab as our number of services grew. They recommended we use something called parent/child pipelines, but it would have been a fairly large rewrite of our logic.
[1]: https://docs.gitlab.com/ci/yaml/
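Not their actual generator (and the original was Python), but as a rough TypeScript sketch of the "emit per-service CI YAML" idea, with service names and the job shape invented:

```typescript
// Hypothetical sketch: one GitLab CI job per service, triggered only when
// that service's files change. Everything here is illustrative.
const services = ["auth", "billing", "audit"]; // in practice, discovered from the repo layout

const jobs = services
  .map(
    (svc) => `
build-${svc}:
  stage: build
  rules:
    - changes:
        - services/${svc}/**/*
  script:
    - make -C services/${svc} build`
  )
  .join("\n");

console.log(jobs); // written out as the generated .gitlab-ci.yml
```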
I promise I only self promote when it is relevant, but this is exactly what I am building https://nimbalyst.com/ for.
We're building a user-friendly way for non-technical users to interact with a repo using Claude Code. It's especially focused on markdown, giving red/green diffs on RENDERED markdown files, which nobody else has. It supports developers as well, but our goal is to be much more user friendly than VSCode forks.
Internally we have been doing a lot of what they talk about here, doing our design work, business planning, and marketing with Claude Code in our main repo.
I'm curious about the author's experience with a monorepo for marketing. I've found that using static site generators with non-technical PMs resulted in dissatisfaction and more work for engineers on tasks those PMs could have handled independently in Wordpress/Contentful. As a huge believer in monorepos, I'd love to hear how folks have approached incorporating non-engineers into monorepo workflows.
So the insane thing I do is I don't use worktrees. I'm using multiple Claude Code instances on the same project doing different things at the same time, like one editing the CSS for the login screen while another changes up the settings section of the project.
yep. if the project is large enough, there are usually changes to be made that don't overlap, allowing multiple agents to work concurrently without work trees.
for example I can have a prompt writing playwright tests for happy paths while another prompt is fixing a bug of duplicated rows in a table because of a missing SQL JOIN condition.
> Nimbalyst is SOC-Type 2 certified
What does this mean in context of downloadable desktop apps?
The thing I dislike about monorepos is that people don't ship stuff. Multiple versions of numpy and torch exist within the codebase, mitigated by bazel or some other build tool, instead of building binaries and deb packages and shipping actual products with well-documented APIs so that one team never needs to actually touch another team's code to get stuff done.
The people who say polyrepos cause breakage aren't doing it right. When you depend across repos in a polyrepo setup, you should depend on specific versions of things across repos, not the git head. Also, ideally, depend on properly installed binaries, not sources.
That makes sense when you depend on a shared library. However, if service A depends on endpoint x in service B, then you still have to work out synchronized deployments (or have developers handle this by making multiple separate deployments).
To be fair, this problem is not solved at all by monorepos. Basically, only careful use of gRPC (and similar technology) can help solve this… and it doesn’t really solve for application layer semantics, merely wire protocol compatibility. I’m not aware of any general comprehensive and easy solution.
> However, if service A depends on endpoint x in service B, then you still have to work out synchronized deployments (or have developers handle this by making multiple separate deployments).
In a polyrepo environment, either:
- B updates their endpoint in a backward compatible fashion, making sure older stuff still works
OR
- B releases a new version of their API at /api/2.0 but keeps /api/1.0 active and working until nothing depends on it anymore, sending deprecation notices to the devs of anything still depending on 1.0 (a minimal sketch of this option follows below)
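A minimal Express sketch of that second option (routes, response shapes, and the header are illustrative, not anything from the article):

```typescript
import express from "express";

const app = express();

// /api/1.0 keeps working for existing consumers, with a deprecation hint.
app.get("/api/1.0/users/:id", (req, res) => {
  res.setHeader("Deprecation", "true");
  res.json({ id: req.params.id, name: "Ada Lovelace" }); // old response shape
});

// /api/2.0 ships the new shape side by side until nothing depends on 1.0.
app.get("/api/2.0/users/:id", (req, res) => {
  res.json({ id: req.params.id, profile: { displayName: "Ada Lovelace" } });
});

app.listen(3000);
```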
Right, so all of that is independent of mono vs poly repo.
I leverage git submodules and avoid the same pitfalls of monorepo scale hell we had 20 years ago. Glad it works for you though. I feel like this is the path to ARR until you need to scale engineering beyond just you and your small team. The good news here is that the author has those domains segregated out as subfolders so in the future, he/she could just pull that out into its own repo if that time came.
Still averse to the monorepo though, but I understand why it's attractive.
"Conclusion Our monorepo isn't about following a trend. It's about removing friction between things that naturally belong together, something that is critical when related context is everything.
When a feature touches the backend API, the frontend component, the documentation, and the marketing site—why should that be four repositories, four PRs, four merge coordination meetings?
The monorepo isn't a constraint. It's a force multiplier."
Thank you Claude :)
It wrote the code, so it's best placed to write the copy too.
That is exactly right!
55 business logic services? Sounds extremely overengineered. I'm sure at least half of those services should be consolidated into others.
There is no universally "correct" granularity.
You could easily scoff the same way about some number of API endpoints, class methods, config options, etc, and it still wouldn't be meaningful without context.
It's ok to split or lump as the team sees fit.
> There is no universally "correct" granularity.
There may not be a universally correct granularity, but that doesn't mean clearly incorrect ones don't exist. 50+ services is almost always too many, except for orgs with hundreds or thousands of engineers.
What’s the value proposition? I mean the fact that you have
- frontend
- backend
- website
is already confusing to me.
I understand that one commit seems nice, but you could have achieved this with e.g. 3 repos and very easily maintained all of them. There's a bit of overhead of course, but having some experience working with a team that has a few "monorepos", I know that the cost to actually make it work is significant.
I love the idea. It's bold. But, I hate it from an information architecture perspective.
This is something that is, of course, super relevant given context management for agentic AI. So there's great appeal in doing this.
And today, it might even be the best decision. But this really feels like an alpha version of something that will have much better tooling in the near future. JSON and Markdown are beautifully simple information containers, but they aren't friendly for humans compared with something like Notion or Excel. Again I'll say, I'm confident that in the near future we'll start to see solutions emerge that structure documentation in a way that is friendly to both AIs and humans.
Well written, anticipated my questions about pain points at the end except one: have you hit a point yet where deploying is a pain because it’s happening so frequently? I understand there’s good separation of concerns so a change in marketing/ won’t cause conflicts or anything to impact frontend/ but I have to imagine eventually you’ll hit that pain point. But fwiw I’m a big fan of monorepo containing multiple services, and only breaking up the monorepo when it starts to cause problems. Sounds like author is doing that
I really want the world to move on from monorepos to multirepos. Git submodules set multirepos back by 10 years, but they still make more sense. They are composable!
For me, integrating features that span multiple repositories means coordinating changes, multiple PRs, and switching branches on many repos to do testing. Quite time consuming. I did use submodules, but I find a monorepo easier to manage.
I don't doubt that it is. Monorepo tools are much better right now. But monorepos don't compose. They don't branch. They don't scale.
My impression is that the world moved on from multirepo to monorepo and I vaguely remember that git submodules have some serious gotchas.
https://diziet.dreamwidth.org/14666.html#what-is-wrong-with-...
yeah, I dunno how else to say it except that if this feature worked right people would like it
Interesting approach to giving LLMs full context. My only concern is the "no workspaces" approach; manual cd && npm install usually leads to dependency drift and "it works on my machine" issues once you start sharing logic between the API and the frontend. It’s a great setup for velocity now, but I'm curious if you've hit any friction with types or shared utils without a more formal monorepo tool?
How do you guys share types between your frontend and backend? I've looked into tRPC, but don't like having to use their RPC system.
I do it naively. Maintain the backend and frontend separately. Roll out each change in a backwards compatible manner.
I used to dread this approach (it’s part of why I like Typescript monorepos now), but LLMs are fantastic at translating most basic types/shapes between languages. Much less tedious to do this than several years ago.
Of course, it’s still a pretty rough and dirty way to do it. But it works for small/demo projects.
So in short you don't share types. Manually writing them for both is easy, but also tedious and error prone.
Each layer of your stack should have different types.
Never expose your storage/backend type. Whenever you do, any consumers (your UI, consumers of your API, whatever) will take dependencies on it in ways you will not expect or predict. It makes changes somewhere between miserable and impossible depending on the exact change you want to make.
A UI-specific type means you can refactor the backend, make whatever changes you want, and have it invisible to the UI. When the UI eventually needs to know, you can expose that in a safe way and then update the UI to process it.
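To make that concrete, a made-up TypeScript sketch: the storage row never leaves the backend, and the UI only ever sees the DTO.

```typescript
// Hypothetical example of keeping storage types out of the UI.

// Internal row as stored in the database - never serialized to clients.
interface UserRow {
  id: number;
  email: string;
  password_hash: string;
  created_at: Date;
}

// UI-facing shape; the backend can refactor UserRow freely behind it.
interface UserDto {
  id: number;
  email: string;
  memberSince: string; // ISO date string
}

function toDto(row: UserRow): UserDto {
  return { id: row.id, email: row.email, memberSince: row.created_at.toISOString() };
}
```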
Usually you only share API function signatures and response types.
It's tempting to return a db table type but you don't have to.
This completely misses the point of what sharing types is about. The idea behind sharing types is not exposing your internal backend classes to the frontend. Sharing types is about sharing DTO definitions between the backend and the frontend. In other words, sharing the return types of your public API to ensure when you change a public API, you instantly see all affected frontend code that needs to be changed as well. No one is advocating for sharing internal representations.
I have a library translate the backend types into Typescript. What language do you use on the back?
Typescript, using Zod with Express for parameter validation.
Why do you even have to ask, then? TS on both sides is the easiest case.
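For what it's worth, one common pattern here (module and schema names are hypothetical): define the schema once with Zod, infer the type, and import it from both sides.

```typescript
// shared/user.ts - a hypothetical module imported by both backend and frontend.
import { z } from "zod";

export const UserSchema = z.object({
  id: z.string(),
  email: z.string().email(),
});

// One source of truth: change the schema and both sides surface the mismatch.
export type User = z.infer<typeof UserSchema>;
```

The backend validates with `UserSchema.parse(req.body)`, and the frontend can reuse the same `User` type (or parse responses with the same schema), so an API change shows up as a type error on both sides rather than a runtime surprise.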
Typespec is up and coming. Otherwise there are plenty of options like OpenAPI
FastAPI -> OpenAPI -> openapi-typescript
protobuf?
Protobuf is decent enough, I've used Avro and Thrift before (way way before protobuf came to be), and the dev experience of protobuf has been the best so far.
It's definitely not amazing, code generation in general will always have its quirks, but protobuf has some decent guardrails to keep the protocol backwards-forwards compatible (which was painful with Avro without tooling for enforcement), it can be used with JSON as a transport for marshaling if needed/wanted, and is mature enough to have a decent ecosystem of libraries around.
Not that I absolutely love it but it gets the job done.
I like this a lot. Every time I am forced to open Notion or Slite, I just wish so much it would just be .md files in a git repository.
Good Christ. Imagine having decided that your price structures should be a JSON file instead of persisted in a database and then thinking that any decision made by that person/team is a good idea.
I look forward to when we see the article about breaking the monorepo nightmare.
Depending on how often you need to change your pricing and how many products you offer, flat files might make a lot of sense.
Sometimes this sort of thing is not a bad idea. If it's a simple data structure that doesn't change very often, you get an admin interface (vi), change tracking, and audit trail for free. Just think of it as configuration rather than data and most folks would think it's normal to do this.
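As a sketch of the "configuration, not data" framing (plan names and numbers invented), a reviewed, version-controlled pricing file could be as small as:

```typescript
// pricing.ts - hypothetical pricing kept as config in the repo.
// Every change goes through a PR, so git history doubles as the audit trail.
export const pricing = {
  free: { monthlyUsd: 0, seats: 1, requestsPerDay: 1_000 },
  team: { monthlyUsd: 29, seats: 10, requestsPerDay: 50_000 },
} as const;

export type PlanName = keyof typeof pricing;
```

Both the marketing site and the backend can import the same object, which is presumably the point the article is making.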
I have a question about monorepos. Do companies really expose their entire source code in one repo for their devs to download? I understand that people can always do bad things if they want, but with a monorepo, you are literally letting me download everything, right?
This is probably different between startups and enterprises. My background is purely startups, and I can't imagine not having access to 100% of the code for the company I work for.
I work at Google, and yes. We use a monorepo for absolutely everything you can think of. But good luck getting that code off a corp device without being caught!
While not talked about on HN as much, the big corps doing monorepo use something like Perforce which has "protects" tables allowing very granular access control
Hosting a developer environment remotely that you SSH into is very common. That’s how you would approach working with a monorepo that has any serious size to it.
The opening blurb about updating a JSON file and seeing it reflected right away in the live web app just reminds me of a CMS.
I’m not sure how seemingly most of us forgot about the context window being a finite _window_.
“It’s all there Claude just read it.”
Ok…
You can still have all the context in one place, just clone the repos to one folder on your machine, problem solved.
That introduces the problem of coordinating changes between repositories
Oddly enough, I wrote an article about this very topic recently: https://medium.com/@jensenbox/why-monorepos-are-winning-in-t...
I envy this confidence, the less you know the more confident you are.
Had me interested up until the word "AI"
Honestly, from the enterprise IT perspective?
Fuck yes I love this attitude to transparency and code-based organization. This is the kind of stuff that gets me going in the morning for work, the kind of organization and utility I honestly aspire to implement someday.
As many commenters rightly point out, this doesn't run the human side of the company. It could, though, if the company took this approach seriously enough. My personal two cents, it could be done as a separate monorepo, provided the company and its staff remain disciplined in its execution and maintenance. It'd be far easier to have a CSV dictate employees and RBAC rather than bootstrapping Active Directory and fussing with its integrations/tentacles. Putting department processes into open documentation removes obfuscation and a significant degree of process politics, enabling more staff to engage in self-service rather than figuring out who wields the power to do a thing.
I really love everything about this, and I'd like to see more of it, AI or not. Less obfuscation and more transparency is how you increase velocity in any organization.
For the purpose of AI Tools, you can also have one workspace, or one directory where multiple repos are cloned to as a parent. Just saying...
Sounds like a pain in the ass for non developers to contribute.
Also, are we just upvoting obvious AI gen marketing slop now?