Why wouldn't it be a good idea, assuming we had tools capable of scaling indefinitely? We currently split large codebases into "repos" that are effectively isolated even though they're logically connected, and it would often be useful to do, say, atomic commits across them. This seems like a limitation of our current tools, not an intrinsically good thing.
The hypothetical infinitely scaling tool would inevitably be functionally equivalent to splitting repos. Basically, all you would be doing is swapping terminology around. Instead of "here is Foo, our great big git system which stores thousands of repos" you would be saying "here is our Foo repo, it stores thousands of projects". Objections that git cannot be used because a git repo cannot hold as much data as a Perforce repo are more or less based on an improperly strict mapping of terminology ("we have a single Perforce repo, so we must have a single git repo").
In other words, those infinitely scaling systems already exist, and they are built on top of git (or hypothetically Hg, though I cannot think of any examples).
For one publicly visible example of the sort of thing that I am talking about, look into how Android development works. Android is 'in git', even though it is too large for a 'single' git repo.
(Note that due to the ways that code and responsibility are typically organized, most organizations are probably in a situation where migrating from a single monolithic repo to many git repos would be a conceptually straightforward task, provided that they can break from the notion of having a 'single' repo. Atomic commits across repos are the primary pain point, but you would be surprised how much that disappears as you grow used to working with many repos. Supporting a strong notion of versioned dependencies between packages goes a long way.)
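To make "versioned dependencies between packages" a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the package names, the `major.minor.patch` scheme, and the compatibility rule (same major version, at least as new) are hypothetical and not something any particular tool prescribes.

```python
from typing import Dict, List, Tuple

Version = Tuple[int, int, int]


def parse_version(text: str) -> Version:
    """Parse a 'major.minor.patch' string into a comparable tuple."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def is_compatible(required: str, available: str) -> bool:
    """Assumed rule: same major version, and available is not older than required."""
    req, avail = parse_version(required), parse_version(available)
    return avail[0] == req[0] and avail >= req


def check_dependencies(
    requirements: Dict[str, str], published: Dict[str, str]
) -> List[str]:
    """Return a human-readable problem for each dependency that cannot be satisfied."""
    problems = []
    for name, required in sorted(requirements.items()):
        available = published.get(name)
        if available is None:
            problems.append(f"{name}: not published")
        elif not is_compatible(required, available):
            problems.append(f"{name}: need {required}, found {available}")
    return problems


# Hypothetical packages, each living in its own repo.
requirements = {"libfoo": "1.2.0", "libbar": "2.0.0"}
published = {"libfoo": "1.4.0", "libbar": "1.9.0"}
print(check_dependencies(requirements, published))
```

With a check like this in place, a change that would have been one atomic commit in a monolith becomes a release of the library followed by a version bump in its consumers.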
Thanks for the pointer to Android, I hadn't looked at how that was organized before.
I agree that atomic commits are a red herring. They're nice to have, but by the time you outgrow a single git repo, you also have projects with different release schedules, and once you have that, you have to deal with version skew anyway, and then you don't really need atomic commits.
I disagree that "those infinitely scaling systems already exist," though. Looking at Android, they had to build a nontrivial wrapper around git to make it work, and it's not totally transparent. You have to think about when to use 'repo' and when to use 'git', and where the boundaries between repos are.
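To illustrate what "where the boundaries between repos are" means in practice, here is a rough sketch in Python. Android's 'repo' tool describes its checkout with an XML manifest of `<project>` entries; the manifest below is made up for illustration (the URL, project names, and revisions are hypothetical), and the helper functions are my own, not part of 'repo'.

```python
import xml.etree.ElementTree as ET

# Hypothetical manifest in the general shape that 'repo' uses.
MANIFEST = """\
<manifest>
  <remote name="origin" fetch="https://example.com/git/" />
  <default remote="origin" revision="main" />
  <project name="platform/build" path="build" />
  <project name="platform/frameworks/base" path="frameworks/base" />
  <project name="platform/external/zlib" path="external/zlib" revision="release-1" />
</manifest>
"""


def load_projects(manifest_xml):
    """Map each checkout path to the revision its git project is pinned to."""
    root = ET.fromstring(manifest_xml)
    default = root.find("default")
    default_rev = default.get("revision") if default is not None else None
    projects = {}
    for node in root.findall("project"):
        # Assumption: when 'path' is omitted, the project name is used as the path.
        path = node.get("path", node.get("name"))
        projects[path] = node.get("revision", default_rev)
    return projects


def owning_project(projects, file_path):
    """Return the checkout path of the git project that contains file_path, if any."""
    matches = [
        path
        for path in projects
        if file_path == path or file_path.startswith(path + "/")
    ]
    return max(matches, key=len) if matches else None


projects = load_projects(MANIFEST)
print(owning_project(projects, "frameworks/base/core/Foo.java"))
print(owning_project(projects, "docs/readme.txt"))
```

A change confined to one project is an ordinary git commit; a change that touches files owned by two different projects is where you have to stop thinking in 'git' and start thinking in 'repo'.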
There are huge benefits to having everything be in one giant pile of code and being able to import and sometimes modify code from far away parts of the tree with minimal overhead. The key is to let any directory be a "project" that you can refer to, without any arbitrary distinction between top-level project directories and others. This lets you do things like spin out a part of a project as a semi-independent library without moving files around or creating new repos.
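As a rough sketch of the "any directory can be a project" idea, here is a Python example. The directory names and the dependency table are hypothetical, and this is not how any real build system stores its metadata; it only shows that when projects are addressed by path, a nested directory can be depended on exactly like a top-level one.

```python
# Hypothetical dependency table: every key is just a directory in one big tree.
DEPS = {
    "search/frontend": ["search/ranking", "common/strings"],
    "search/ranking": ["common/math", "search/ranking/features"],
    # A subdirectory of search/ranking, usable as a library in its own right
    # without moving any files or creating a new repo.
    "search/ranking/features": ["common/strings"],
    "common/strings": [],
    "common/math": [],
}


def transitive_deps(project, deps):
    """Return every directory that 'project' depends on, directly or indirectly."""
    seen = set()
    stack = list(deps[project])
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(deps[current])
    return sorted(seen)


print(transitive_deps("search/frontend", DEPS))
```

Here `search/ranking/features` sits inside another project's directory, yet other code can refer to it by path the same way it refers to `common/strings`.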
There are some downsides too, of course. Google eventually broke from this model slightly by introducing components, which had issues of their own. Perhaps Facebook has done it better.
So what you're saying is: if you manually manage the versions of your dependencies, your _version_ control system works?
I think that's a shame, and it doesn't work very well either. Mercurial does the same thing incidentally (unless Facebook have somehow solved that), so it's no better there.
Splitting repos is a pain; perhaps a necessary one, but hardly ideal. It just introduces a bunch of extra administration, and it reduces the power of your VCS primitives (such as branching and merging).
> I think that's a shame, and it doesn't work very well either. Mercurial does the same thing incidentally (unless Facebook have somehow solved that), so it's no better there.
The way I interpreted the Reddit post from the Facebook engineer is that they looked into customizing git to scale better for them, but the codebase wasn't to their liking, so they're going to customize Mercurial's instead.
> The matter of customizing git came up and people looked at the code and decided it's pretty convoluted when compared to the Mercurial code.
So it's not that Mercurial is able to scale in a way that git can't; it's that they plan to make Mercurial scale in a way that git can't. (Any over/under on when they give up on that and decide to use Perforce's Git Fusion?)