But then when you find an unmaintained project and want to find out if it has a maintained fork, you have to browse through hundreds of forks that were created just to submit one patch.
> hundreds of forks that were created just to submit one patch.
Or, worse (IMHO), when they created the fork to house a patch - sometimes a meaningful feature, bugfix, or security fix - and then abandoned it because the submission process was too onerous to bother with. "Onerous" can also mean never getting the PR reviewed, which is far more damaging (IMHO) because it disincentivizes future contributions.
I actually would love an "implied PR" view of the forks, which would make it quick to filter out the -0,+0 "shallow forks" from the -50,+2000 "oh, that's likely doing something interesting" ones, provided it has a sane equivalent of ?w=1 to hide forks that felt it necessary to push up simple reformatting changes.
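A minimal sketch of that filter, assuming each fork already comes with lines removed/added relative to upstream, counted both normally and with whitespace-only changes ignored. The `ForkDiff` type, its field names, and the sample forks are all hypothetical; nothing here calls a real GitHub API.

```python
from dataclasses import dataclass


@dataclass
class ForkDiff:
    name: str           # "owner/repo" (made-up sample data)
    removed: int        # lines removed relative to upstream
    added: int          # lines added relative to upstream
    removed_nows: int   # same count, ignoring whitespace-only changes
    added_nows: int     # same count, ignoring whitespace-only changes


def interesting_forks(forks, min_changed=1):
    """Keep forks with real (non-whitespace) changes, largest diff first."""
    kept = [f for f in forks if f.removed_nows + f.added_nows >= min_changed]
    return sorted(kept, key=lambda f: f.removed_nows + f.added_nows, reverse=True)


forks = [
    ForkDiff("alice/proj", 0, 0, 0, 0),            # shallow fork
    ForkDiff("bob/proj", 50, 2000, 48, 1990),      # likely doing something
    ForkDiff("carol/proj", 300, 300, 0, 0),        # reformatting only
]

for f in interesting_forks(forks):
    print(f"{f.name}: -{f.removed},+{f.added}")
```

The threshold is deliberately a parameter: raising `min_changed` would also hide the one-line-patch forks.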
It's a shared/copy-on-write backend. An "entire fork" consists of a 20-byte commit ID and whatever metadata identifies the owner and organization, and you create the "entire fork" in the UX by clicking a single button. I don't understand the criticism here. Do you just not like the use of the word "fork" because it implies a copy that doesn't exist?
Not doubting you, but do you have a source where I could read about this? I feel that would create some wild problems if the original repository was deleted.
Edit: I guess they would just not delete the data if there were more references to it.
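To make the "don't delete the data while something still references it" idea concrete, here is a toy model of a shared object pool. It is only an illustration under that assumption, not a description of GitHub's actual backend; `SharedStore` and its methods are invented for the example.

```python
class SharedStore:
    """Toy model: one object pool shared by a repository and all of its forks."""

    def __init__(self):
        self.objects = {}  # commit id -> payload
        self.repos = {}    # repo name -> head commit id

    def create(self, name, commit_id, payload):
        self.objects[commit_id] = payload
        self.repos[name] = commit_id

    def fork(self, source, name):
        # A fork is just a new name pointing at an existing commit id;
        # no object data is copied.
        self.repos[name] = self.repos[source]

    def delete(self, name):
        commit_id = self.repos.pop(name)
        # Drop the object only when no remaining repo still points at it.
        if commit_id not in self.repos.values():
            del self.objects[commit_id]


store = SharedStore()
store.create("upstream/proj", "c1", b"project data")
store.fork("upstream/proj", "me/proj")
store.delete("upstream/proj")      # original repository goes away
print("c1" in store.objects)       # the fork still keeps the data alive
```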
> What would the alternative be, as you can't just edit someone else's repo?
git push directly to the repository, in a separate branch namespace. This is how e.g. Gerrit works (pushing to a special ref makes a review, which is essentially the same as a pull request).
> Besides just cloning it and making the change locally, of course.
With GitHub, you cannot do that and get a PR out the other end. You _must_ fork the repository into your own user/organization, push to that, and then send a PR from that.
> git push directly to the repository, in a separate branch namespace. This is how e.g. Gerrit works (pushing to a special ref makes a review, which is essentially the same as a pull request).
And that `git push` doesn't need to be literally to the one and only repository. The SSH daemon could create an isolated environment (e.g. a QEMU VM or a FreeBSD jail) that contains a copy of the repository, and run the commands in there. Obviously this could also check SSH keys and the requested git commands before doing anything at all.
It would probably be like what Sourcehut does[1] for letting you SSH into build VMs, but instead of a build it's a push. And they already do some logic during a push[2], so their code for those two places is probably a good place to look for how to implement this kind of thing.
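As a rough sketch of the "check the requested git commands before doing anything" step: a git client pushing over SSH asks the server to run something like `git-receive-pack '/path/to/repo.git'`, and an OpenSSH forced command can read that request from the `SSH_ORIGINAL_COMMAND` environment variable. The function below only validates such a string; the name `parse_git_command` and the path rules are assumptions for illustration, and this is not how Sourcehut or anyone else actually implements it.

```python
import shlex

# The two services a git client requests over SSH for push and fetch.
ALLOWED = {"git-receive-pack", "git-upload-pack"}


def parse_git_command(original_command):
    """Return (command, repo_path) for an allowed git request, else None."""
    try:
        parts = shlex.split(original_command)
    except ValueError:
        return None  # e.g. unbalanced quotes
    if len(parts) != 2 or parts[0] not in ALLOWED:
        return None
    repo = parts[1]
    # Refuse path traversal out of the repository root.
    if ".." in repo.split("/"):
        return None
    return parts[0], repo


print(parse_git_command("git-receive-pack '/user/proj.git'"))
print(parse_git_command("rm -rf /"))
```

Only after a request passes a check like this would the server go on to spin up the isolated environment and run the command inside it.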
> And that `git push` doesn't need to be literally to the one and only repository
I believe GitHub already has their own implementation of a git server, so any commands submitted to it are abstracted away. They probably don't have a literal .git directory sitting on a server.
> git push directly to the repository, in a separate branch namespace. This is how e.g. Gerrit works (pushing to a special ref makes a review, which is essentially the same as a pull request).
What's the material difference? They build special mechanisms to provide access control for sub namespaces, which sound a lot like "forks".
Also, I have no clue about their backend (IIRC this info is researchable though), but I wouldn't be surprised if functionally that is exactly how they do it anyway. It's all content-addressed; I doubt they pay 2x the storage any time you fork a repo, right?
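For anyone unfamiliar with what "content addressed" means here: git names a blob by the SHA-1 of a short header plus the file contents, so identical content always gets the same ID and only needs to be stored once. A small sketch (the `store` dict and the repo names are just illustration, and this says nothing about how GitHub lays out its storage):

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the object ID git assigns to a blob with this content."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


store = {}  # object id -> content, shared by every repo in the toy example

# Two "repositories" adding the same file end up with a single stored object.
for repo in ("upstream/proj", "me/proj"):
    content = b"hello\n"
    store[git_blob_id(content)] = content

print(len(store))
print(git_blob_id(b"hello\n"))
```

The ID printed for `b"hello\n"` should match what `git hash-object` reports for a file with the same contents.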
The big difference is how it is organized for the users viewing it. My GitHub account is littered with old forks of repositories I created just to submit one-line patches; and despite the pull request to a repository in some sense being data relating to the history of that project -- certainly the discussion is all organized under that project -- if the original person actually removes the fork to garbage collect their namespace, the commits referenced in that pull request just disappear. Meanwhile, despite people now using the word "fork" for this purpose due to GitHub, there is actual value in being able to search for actual forks of a project--things that people are choosing to publicly distribute and maybe maintain themselves--rather than seeing a thousand repositories which exist only for the purpose of contributing a single patch (or, though this is another topic, people making the metaphorical equivalent of a "backup copy" within the strange set of semantics and ownership that is GitHub).
So not really. It's a special branch path that only exists for opening PRs, and doesn't do anything other than opening a PR. Yes, they share an object space, but so do forks in the first place, so any security issues with this flow are the same ones in the fork-PR flow.
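A toy version of that "special branch path" routing, to show what is and isn't touched. `refs/for/<branch>` is the prefix Gerrit uses for pushes that should become reviews; everything else here (`ToyServer`, `receive_push`, the `can_write` flag) is made up for the sketch and does not mirror any real server's code.

```python
MAGIC_PREFIX = "refs/for/"  # Gerrit's "push for review" namespace


class ToyServer:
    def __init__(self):
        self.branches = {}  # ref name -> commit id
        self.reviews = []   # pending reviews (the PR equivalent)

    def receive_push(self, user, ref, commit_id, can_write=False):
        if ref.startswith(MAGIC_PREFIX):
            # Magic ref: record a review against the target branch.
            # No branch is created or moved.
            target = "refs/heads/" + ref[len(MAGIC_PREFIX):]
            self.reviews.append(
                {"author": user, "target": target, "commit": commit_id}
            )
            return "review opened"
        if not can_write:
            return "rejected"
        self.branches[ref] = commit_id
        return "branch updated"


server = ToyServer()
print(server.receive_push("contributor", "refs/for/main", "abc123"))
print(server.receive_push("contributor", "refs/heads/main", "abc123"))
print(server.branches)
```

In this model a contributor without write access can only ever add an entry to `reviews`; the real branches stay behind the same permission check as before.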
It's still a security problem. If you put an unlocked outside door on your house and rely on the interior doors to be locked, we'd agree that's not safe, right? Or to keep it safe would require the kind of uniform attentiveness that people are generally bad at.
Folks with accounts "littered with old forks created to make PRs" may not have that kind of attentiveness.
I don't understand the analogy. What's the actual risk exposed by pushing to a magic branch path that opens a PR instead of actually creating a branch, compared to creating a fork and making a PR in that way?