Very excited for git 3.0, and also ready to be immediately frustrated by it :D
`jj` has done git users an amazing service simply by showing that a more intuitive VCS front-end is possible.
Competition is good for an ecosystem.
> He said that GitLab hosts one repository with about 20-million references; each time a reference is deleted, the packed-refs file has to be completely rewritten which means rewriting 2GB of data. "To add insult to injury, this repository typically deletes references every couple seconds."
What in the world... why?
And also: why should he or git care?
GitLab pays his salary.
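On the "why" of the quoted cost: 2GB across 20 million references works out to roughly 100 bytes per reference, and removing one entry from a single flat file means writing every other entry back out. Below is a minimal sketch of that pattern, assuming a simplified file of `<hash> <refname>` lines. It ignores the header and peeled-tag lines of the real packed-refs format, and `delete_packed_ref` is a hypothetical name, not git's implementation.

```python
import os
import tempfile


def delete_packed_ref(path: str, refname: str) -> int:
    """Remove one ref from a flat '<hash> <refname>' file.

    Returns the number of bytes rewritten, to show that the cost scales
    with the size of the whole file, not with the single entry removed.
    """
    directory = os.path.dirname(os.path.abspath(path))
    written = 0
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8", newline="\n") as out, \
            open(path, encoding="utf-8", newline="\n") as src:
        for line in src:
            # Keep every line except the one naming the deleted ref.
            if line.rstrip("\n").split(" ", 1)[-1] == refname:
                continue
            out.write(line)
            written += len(line.encode("utf-8"))
    # Atomically swap the rewritten file into place.
    os.replace(tmp_path, path)
    return written
```

With 20 million entries, each call rewrites nearly the entire file, which is the behaviour the quote complains about.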
Git is a beautiful piece of software but it does expose complexity in a very by-programmers-for-programmers kind of way.
I've successfully gotten people in many non-tech roles to use git, but there's usually a lot of nuance and power within their reach that never quite gets adopted.
The learn git branching site/game [1] has always been an awesome resource, but you'd like something like that UX to be almost part of the initial usage. Intuitive defaults, progressive learning at the right times, etc.
Nowadays, if you can get your users to use an agent CLI like Claude/Codex/OpenCode then it's easier to get that last experience, but the more git itself uses accessible abstractions, the easier it should be.
[1] - https://learngitbranching.js.org/?locale=en_US
No, the problem isn't that Git exposes complexity. The problem is that Git exposes complexity through inappropriate terms, muddled concepts and bad documentation.
Fair enough. I think I'm trying to say the same thing you're saying. It's tough to navigate that world without a lot of investment in understanding the internals, which most people wouldn't do (rational behaviour for "yet another tool", if you don't come from a software development background).
Storing large files somewhere else is a step towards the centralized model, and that goes against the initial design principles of git.
No it's not? It's simply an addressing model and interface. Sure you could use a fixed or centralised store but you could also use IPFS for example.
Exactly. Git is supposed to be a DVCS, not a generic DVFS. Choose the right tool for the right task. I needed a generic DVFS to store my docs, so I wrote one. It's easy and quick and does its job :)
As explained, the storage backend in git is pluggable but still not flexible enough.
There have been efforts to store git repos in torrents, and to store git repos on crypto blockchains, but all of them are big architectural challenges. For example, people want everything to be backwards compatible for starters, and some want to be able to partially store some content somewhere else, while still keeping all existing use cases efficient.
The transition away from SHA1 is taking absurdly long. The hash function should have been more modular from the beginning.
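For a sense of what "modular" means at the object-ID level: a git blob ID is the hash of a short header (`blob <size>\0`) followed by the content, so in principle the hash algorithm is just a parameter. A minimal sketch, with `git_object_id` as a hypothetical helper name; it says nothing about how git's own code is organised or why the migration is hard in practice.

```python
import hashlib


def git_object_id(content: bytes, obj_type: str = "blob", algo: str = "sha1") -> str:
    """Hash content the way git addresses objects: '<type> <size>\\0' + content.

    `algo` is any name accepted by hashlib.new, e.g. "sha1" or "sha256".
    """
    header = f"{obj_type} {len(content)}\0".encode("ascii")
    return hashlib.new(algo, header + content).hexdigest()


if __name__ == "__main__":
    # The empty blob has a well-known SHA-1 ID.
    print(git_object_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
    # Same scheme, different hash function: a 64-character hex ID.
    print(git_object_id(b"", algo="sha256"))
```

The hashing itself is the easy part; every stored ID, protocol message, and tool that assumes 40 hex characters is where the real cost sits.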
Git needs to track software licenses on a per commit basis.
No. Git should never do that, it would make git worse.
There is a lot of other metadata that you could imagine storing per commit, but git already supports storing arbitrary data in every commit. You don’t need special-casing for one type of metadata: just store it in the commit as everyone already does, and perhaps build your own tools on top of that if you have special needs.
I don't fully understand what you mean, but you certainly don't want that in git. Git is a source code management system and that's all it should be. Any additional functionality should be added as an add-on (like git-annex) by extending its splendidly extensible replicated content addressed storage system.
It's not even necessarily for source code. It can be for anything text-based. In that case, software licenses wouldn't even apply.
Use trailers, as has become common for LLM-assisted code.
Co-Authored-By: Whatever LLM
License: WTFPL
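If you go the trailer route, building your own tooling on top is straightforward. Here is a rough sketch of pulling `Key: value` trailers out of a commit message, assuming the trailers form the last blank-line-separated paragraph. It is a simplification of what `git interpret-trailers` does, `parse_trailers` is a hypothetical helper, and `License` is only the trailer key suggested above, not a git convention.

```python
import re

# A trailer line looks like "Token: value"; tokens are letters, digits, hyphens.
TRAILER_RE = re.compile(r"^([A-Za-z0-9-]+):\s*(.*)$")


def parse_trailers(message: str) -> list[tuple[str, str]]:
    """Return (key, value) pairs from the final paragraph of a commit message.

    Simplified rule: the last paragraph counts as a trailer block only if
    every line in it looks like a trailer and it is not the subject line.
    """
    paragraphs = [p for p in message.strip().split("\n\n") if p.strip()]
    if len(paragraphs) < 2:
        return []
    trailers = []
    for line in paragraphs[-1].splitlines():
        match = TRAILER_RE.match(line)
        if match is None:
            return []
        trailers.append((match.group(1), match.group(2).strip()))
    return trailers


if __name__ == "__main__":
    msg = "Fix parser\n\nHandle empty input.\n\nCo-Authored-By: Whatever LLM\nLicense: WTFPL\n"
    print(parse_trailers(msg))
```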