No, it's not. You can't distinguish the complete works of Shakespeare from the complete works of Shakespeare with the words changed a bit, or from a version that switches to Harry Potter for a stretch in the middle, or from a completely dazzling and original work of fiction. Any single hash value could correspond to all three of these, and to essentially boundless other works. It's a library of Babel.
Forgive me if I've misunderstood - the premise was that, out of the infinite set of inputs that hash to a particular checksum, there is a unique one that is "obviously" the real data? Or at least, that the set of plausible candidates is meaningfully bounded? My reply is that this is not the case: there will be infinitely many "plausible" inputs as well.
You could collide every hash in existence merely by making undetectably tiny alterations to Shakespeare.
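To put a rough pigeonhole number on that - a sketch, assuming the complete works run to roughly 5 million characters and each altered character has about 25 plausible substitutes (both figures are my assumptions, not anything from upthread) - the count of versions differing in only a handful of characters already dwarfs the 2^160 possible SHA-1 digests, so many of those near-identical texts must share a hash:

```python
from math import comb, log10

TEXT_LEN = 5_000_000      # assumed: ~5 million characters in the complete works
ALTERNATIVES = 25         # assumed: ~25 plausible substitutes per altered character
SHA1_DIGESTS = 2 ** 160   # number of distinct SHA-1 outputs (~1.5 * 10^48)

for k in range(1, 9):
    # number of texts that differ from the original in exactly k characters
    variants = comb(TEXT_LEN, k) * ALTERNATIVES ** k
    side = "more" if variants > SHA1_DIGESTS else "fewer"
    print(f"{k} changed chars: ~10^{log10(variants):.0f} variants ({side} than the ~10^48 SHA-1 digests)")
```

By around seven altered characters the variant count already exceeds the number of possible digests, so collisions among "undetectably tiny alterations" are guaranteed by counting alone.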
Not obviously, no - just with very high probability. Perhaps that provides some degree of noise immunity. If it does, it is a form of AI.
This admittedly open question presumes a very large, fuzzy 'code book' with which to re-assemble the data. The length of the cleartext input is valuable metadata that would speed up the search.
You still seem to be missing the point. The problem is that the SHA-1 hash is one of 2^160 possible values, while a 1GB plaintext is one of 2^8000000000 possible values. That means the hash only narrows your message down to one of 2^7999999840 remaining candidates.
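To make that arithmetic concrete, here's a quick sketch (assuming 1 GB = 10^9 bytes): even after the hash eliminates a factor of 2^160, the expected number of 1 GB preimages per digest is a number with roughly 2.4 billion decimal digits.

```python
from math import log10

GIGABYTE_BITS = 8 * 10**9   # assuming 1 GB = 10^9 bytes = 8,000,000,000 bits
SHA1_BITS = 160

# candidate plaintexts before and after learning the hash, as powers of two
total_exp = GIGABYTE_BITS              # 2^8000000000 possible 1 GB files
remaining_exp = total_exp - SHA1_BITS  # 2^7999999840 still map to the same digest on average

print(f"possible 1 GB plaintexts:      2^{total_exp}")
print(f"expected preimages per digest: 2^{remaining_exp}")
print(f"...a number with about {remaining_exp * log10(2):,.0f} decimal digits")
```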
Let's go smaller. Say our plaintext is a single kilobyte - about half an 80x25 terminal's worth of ASCII. Knowing the SHA-1 hash narrows our search space, yes, but the space is so absurdly large that it only solves 0.0000... (insert over two thousand zeros here) ...0001% of the problem.
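Same sketch for the 1 KB case, to show where that string of zeros comes from: 1024 bytes is 8192 bits, so 2^(8192-160) = 2^8032, or about 10^2418, candidates remain, and knowing the hash has "solved" only about 10^-2416 percent of the guessing problem.

```python
from math import ceil, log10

KILOBYTE_BITS = 1024 * 8    # 8192 bits in a 1 KB plaintext
SHA1_BITS = 160

remaining_exp = KILOBYTE_BITS - SHA1_BITS       # 8032 bits of ambiguity left
candidate_digits = remaining_exp * log10(2)     # ~2418 decimal digits of remaining candidates

# "percentage of the problem solved" = 100% divided by the number of remaining candidates
percent_solved_exp = 2 - candidate_digits       # exponent of that percentage, ~ -2416
leading_zeros = ceil(-percent_solved_exp) - 1   # zeros between the decimal point and the first digit

print(f"remaining 1 KB candidates: 2^{remaining_exp} (roughly 10^{candidate_digits:.0f})")
print(f"share of the problem solved: roughly 10^{percent_solved_exp:.0f} percent")
print(f"written out: a decimal point followed by about {leading_zeros} zeros before the first digit")
```

That works out to roughly 2,415 zeros after the decimal point, which is where the "over two thousand zeros" above comes from.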