
Git uses SHA-1 hashes, which have not been considered cryptographically secure since 2005.

Git and CVS are both just tools. They each provide a server implementation, but it's uncommon to use either of these for write access in large projects. It's more common to wrap CVS or Git with a different frontend like HTTPS or SSH. My guess is that the OpenBSD guys use OpenSSH.

This team is fanatical about security and process. I am completely comfortable with them using whichever tools they want.




SHA-1 has been shown to not be collision-resistant. Correct me if you've heard otherwise, but I believe SHA-1 is still believed to be second-preimage resistant.

In other words, with fewer than 2^80 trials, an attacker can generate two files that have the same SHA-1 hash. (In the case of C source files, this probably means embedding a nonsense comment in the middle of each file, probably several hundred to several thousand ASCII characters.) If an attacker can get the very carefully constructed benign file past code review and has the ability to modify the repository, they can substitute the very carefully crafted malicious file for the very carefully crafted benign file without changing the root of the Merkle tree.

Assuming that SHA-1 is still second-preimage resistant, it will take an attacker about 2^159 attempts to come up with a file that has the same hash as a legitimate file not carefully constructed by the attacker.
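To put the two work factors side by side, here is a small arithmetic sketch. It only restates the figures quoted above for a 160-bit hash (the birthday bound for collisions and the 2^159 estimate for a second preimage); the variable names are made up for illustration, and it ignores the 2005-era shortcuts that lower the collision cost further.

```python
# Work factors for a 160-bit hash such as SHA-1, using the figures quoted above.
HASH_BITS = 160

# Generic (birthday-bound) collision search: about 2^(n/2) trials.
collision_trials = 2 ** (HASH_BITS // 2)

# Second-preimage estimate used in the comment above: about 2^(n-1) trials.
second_preimage_trials = 2 ** (HASH_BITS - 1)

# How many times harder the second-preimage search is than the collision search.
ratio = second_preimage_trials // collision_trials

print(f"collision:       2^{collision_trials.bit_length() - 1}")
print(f"second preimage: 2^{second_preimage_trials.bit_length() - 1}")
print(f"ratio:           2^{ratio.bit_length() - 1}")
```

The gap is the whole argument: an attacker who must match a file they did not craft faces a search roughly 2^79 times larger than one who gets to craft both files.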

So, the weaknesses in SHA-1 probably mean that exploiting those weaknesses in the context of git still requires a mole in the development team. Though, I wouldn't want to bet my life on nobody noticing a big nonsense comment in the middle of a C file or someone figuring out how to construct a reasonably reviewable C file as the carefully crafted benign file.

In any case, the weaknesses in SHA-1 still likely pose a significant difficulty in forging a git history without planting a mole in the dev team. It's much better than no cryptographic barriers to forgery.


Conversely, there are other signing techniques. GPG-signed tags are an officially supported method.

Probably more importantly though, Git encourages everyone to have the full repository lying around. Even if you inserted a vulnerability in a master, there would still be thousands of copies of code which could be independently compared to find the exact changes which were made.


I think GPG signs just the SHA-1 that the tag points to (the root of the Merkle tree). Also, when comparing the local repo against the remote repo during a fetch, I think Git assumes that as long as the SHA-1 of a commit did not change, there is no need to compare further. So the substitution will not get propagated to people who do "git pull", but people who do "git clone" will get it.
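A rough sketch of why the commit hash acts as the root of a Merkle tree: Git names each object by the SHA-1 of a `<type> <size>\0` header plus the object body, a tree body lists the blob ids of its files, and a commit body names its tree id. The helper names below are hypothetical, the author/committer line is a made-up fixed value, and the tree builder is simplified to regular files in a single directory. It shows only the hash chain, not GPG signing or fetch behavior.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """SHA-1 over '<type> <size>\\0' + body, the way Git names its objects."""
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def blob_id(content: bytes) -> str:
    return git_object_id("blob", content)


def tree_id(entries: dict) -> str:
    # entries maps file name -> blob id (hex). Simplification: regular files
    # only (mode 100644), one flat directory, sorted by name.
    body = b"".join(
        b"100644 " + name.encode() + b"\x00" + bytes.fromhex(oid)
        for name, oid in sorted(entries.items())
    )
    return git_object_id("tree", body)


def commit_id(tree: str, message: str) -> str:
    # The author/committer identity and timestamp are illustrative placeholders.
    who = "Example Dev <dev@example.com> 0 +0000"
    body = f"tree {tree}\nauthor {who}\ncommitter {who}\n\n{message}\n".encode()
    return git_object_id("commit", body)


benign = blob_id(b"int main(void) { return 0; }\n")
altered = blob_id(b"int main(void) { return 1; }\n")

commit_a = commit_id(tree_id({"main.c": benign}), "initial")
commit_b = commit_id(tree_id({"main.c": altered}), "initial")

# Different blob ids propagate upward into different tree and commit ids.
print(benign != altered, commit_a != commit_b)
```

The commit id depends on file contents only through the blob ids. That is why two files crafted to share a blob id would leave the tree id, the commit id, and anything signed over the commit id unchanged.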


Linus Torvalds: "the point is the SHA-1, as far as Git is concerned, isn't even a security feature. It's purely a consistency check. The security parts are elsewhere, so a lot of people assume that since Git uses SHA-1 and SHA-1 is used for cryptographically secure stuff, they think that, OK, it's a huge security feature. It has nothing at all to do with security, it's just the best hash you can get."


The edit button is gone. I guess these expire?

My reply was not intended as an attack on Git. I use it daily and would choose it 10 times out of 10 vs. CVS for a new project. I just think the assertion that Git 'saved' Linux from some backdooring attempts because it's decentralized and uses cryptographic hashes is wrong; it's not the tools that make this happen, it's the processes around the use of these tools which do that.

I don't know any OpenBSD developers nor do I have any inside knowledge of how their team works, but I know from observation that they are a small team with high standards for code style and quality. They don't just let anyone commit code and appear to be thorough with code review. When procedural/practice problems are identified in the industry, they are proactive about mitigating or fixing those. They have a demonstrated track record of good releases. Basically, I don't see any reason to question their use of CVS.


(first note that I didn't assert anything: I asked question(s) and used "IIRC" etc.)

I tracked the story down and things are, IMHO, actually quite interesting... If only because the attempt was made after someone ill-intentioned gained access to Linux's CVS repository.

Back then Linux was still using BitKeeper (decentralized), because Linus hadn't created Git yet (so I was not remembering things correctly here). But apparently some people didn't like BitKeeper, so there was a CVS clone of the BitKeeper version. And it's in the CVS repo that the attempt took place (after someone hacked his way into the server hosting the CVS repo).

Here's the story:

https://freedom-to-tinker.com/blog/felten/the-linux-backdoor...

Now even though Linus didn't choose SHA-1 for its cryptographic properties and even if SHA-1 is not SHA-256 nor SHA-3, it still looks like an attacker gaining access to a CVS repo would have a much easier time inserting a backdoor than an attacker gaining access to a DVCS that uses cryptographic hashes (which user KMag here explained nicely).


> Basically, I don't see any reason to question their use of CVS.

Why not?

With CVS, security relies on the security of a single server. Anybody with root access to the CVS server can modify history, and nobody would notice.


> Git uses SHA-1 hashes, which have not been considered cryptographically secure since 2005.

That's a vast oversimplification. There are still no publicly known preimage or second-preimage attacks against SHA-1. Even the collision attack in 2005 was limited in that it only reduced the search from a brute-force 2^80 to 2^69.

Perhaps you're referring to a length extension attack, but I admit I don't know much about those in the context of Git's use of SHA-1.


In terms of C source files, unless the first C source file being extended is at least 64 petabytes long, a length extension attack is going to embed null bytes in the C source file. I don't know chapter and verse of any of the C standards or GCC/Clang extensions, but I wouldn't be surprised if even string literals and comments including nulls cause problems for both GCC and Clang.

Anyone care to chime in/experiment with ways to embed nulls in C files such that either GCC or Clang will continue compiling code after hitting a null byte in the middle of a file? (I'm not talking about an escaped null in a character or string literal, but actually a 00 showing up in hexdump -C of the source.)

EDIT: I think it's safe to assume people will notice someone trying to sneak a 64-petabyte C source file into the codebase. With apologies to Sweet Brown, ain't nobody got time for [downloading] that.


"I am completely comfortable with them using whichever tools they want."

Forgive me for saying so, and I doubt you intend it this way, but statements like this are what get us into trouble. At a certain point we need to trust the people who build the foundation for us, but only once we've done our due diligence. Most people (maybe not you, since you seem to know more about the team) have not done so yet.



