It's entirely valid C and (assuming this and that are byte pointers) copies a range of bytes until (and including) a zero byte is reached.
With a sufficient warning level (e.g. -Wall on gcc, which should always be enabled anyway, together with -Wextra), compilers will complain about the '=' and ask you to add a pair of parentheses to make clear that this is actually intended:
while( (*this++ = *that++) );
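For anyone who wants to try it, here is a minimal sketch of the loop wrapped in a function. The name copy_bytes is made up for illustration, and like strcpy() it assumes the destination has room for the whole string including the zero byte:

```c
/* Hypothetical helper, not from any library: copies bytes from `that`
   to `this` up to and including the terminating zero byte.
   (`this` is an ordinary identifier in C; it is only a keyword in C++.) */
static char *copy_bytes(char *this, const char *that)
{
    char *start = this;
    /* The extra parentheses tell gcc/clang that '=' is intended. */
    while ((*this++ = *that++))
        ;
    return start;
}
```

Calling copy_bytes(buf, "abc") writes four bytes into buf: 'a', 'b', 'c' and the terminating zero.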
It's also one of those cases where the C code matches the output assembly pretty well: https://www.godbolt.org/z/nz1jbz4Er
As far as "obfuscated C" goes, this is a very tame example though. It's just a straightforward usage of language features, which might look strange only when coming from other languages that don't have pointers or a post-increment operator.
That extra pair of parentheses doesn't make the code 'ugly' ;)
And the code without the parentheses is still entirely valid standard C. The warning is essentially just a lint to protect against typos (similar to JS linters warning about '===' vs '==').
PS: let's see if the alternatives would be any more readable:
char c;
while (c = *that++) {
    *this++ = c;
}
...this is already buggy because it doesn't copy the final zero byte, so the test must happen inside the loop body. Let's also try to get rid of the post-increment:
while (true) {
    char c = *that;
    *this = c;
    this += 1;
    that += 1;
    if (c == 0) {
        break;
    }
}
...hmm not really any more readable...
Let's try with an index...
size_t i = 0;
while (true) {
    char c = that[i];
    this[i] = c;
    i += 1;
    if (c == 0) {
        break;
    }
}
...might be a bit easier to grasp when used to other languages, but readability hasn't improved all that much I'd say...
For reference, MUSL also just uses the original approach:
I don’t write much C, but to an outsider like me this is a pretty big improvement.
It is a shame post-test loops aren’t more popular, given the similarity to the assembly they output. Seems more mechanically sympathetic. Oh well, at least it is an excuse to whip out the goto.
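For what it's worth, a post-test version could look roughly like the sketch below. copy_post_test is a made-up name, and this is just one way to write it, not necessarily what a compiler or libc would do:

```c
/* Hypothetical helper: same copy, written as a post-test (do/while) loop. */
static void copy_post_test(char *dst, const char *src)
{
    char c;
    do {
        c = *src++;    /* read one byte, then advance the source */
        *dst++ = c;    /* write it, then advance the destination */
    } while (c != 0);  /* test after the copy, so the zero byte is copied too */
}
```

Because the test comes after the store, the terminating zero gets copied without needing a break.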
I find it crazy that you improve the readability so much and say readability hasn't improved that much.
The order things happen in *this++ is not obvious unless you know a bunch of C-specific rules, while the ordering of multiple statements is obvious even to someone who doesn't know C. Perhaps C programmers should find this obvious, but it seems to me more like cognitive overhead which has a non zero chance of confusing someone at some point.
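To make the C-specific rules concrete: postfix ++ binds tighter than unary *, so *p++ means *(p++), and the value of p++ is the old pointer. A rough sketch of one loop step written both ways (both function names are hypothetical, just for illustration):

```c
/* One step of the copy, spelled out as separate statements. */
static char copy_one_spelled_out(char **dst_p, const char **src_p)
{
    char *dst = *dst_p;
    const char *src = *src_p;

    char c = *src;   /* 1. read the byte the source pointer refers to */
    *dst = c;        /* 2. store it where the destination pointer refers to */
    src = src + 1;   /* 3. advance the pointers (not the bytes) */
    dst = dst + 1;

    *dst_p = dst;
    *src_p = src;
    return c;        /* 4. the assignment's value is the byte stored */
}

/* The same step in the compact form. */
static char copy_one_compact(char **dst_p, const char **src_p)
{
    char *dst = *dst_p;
    const char *src = *src_p;

    char c = (*dst++ = *src++);   /* parses as *(dst++) = *(src++) */

    *dst_p = dst;
    *src_p = src;
    return c;
}
```

Both versions store the same byte and leave both pointers advanced by one; the while loop keeps going as long as the returned byte is non-zero.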
That's almost a philosophical question ;) Should code in a specific language be more readable to programmers familiar with that language or to programmers who are not familiar with it?
E.g. I guess for a mathematician, all imperative languages are probably 'weird', while something like Haskell feels more familiar?
I was unclear, sorry: I didn't mean to say that the extra parentheses make it uglier, I meant to point out that something that was described as beautiful was actually flawed.
The flaw was minor in this case because the identifier names and lack of body make the intention clear, but my point is that there are a lot of minor things in C that can come and bite you at any time.
Edit: You are right, I don't see a way this could have been implemented more readably without sacrificing some performance.
I've added a couple of examples trying to find a more readable version, which actually isn't trivial. Sorry for the 'post-edit' :)
As for performance: I don't think such details matter much. For one, compilers are pretty good at turning "readable but inefficient" code into the same optimal output (aka "zero cost abstraction").
And a really performance-oriented strcpy() wouldn't simply copy byte by byte anyway, but try to move data in bigger chunks (like 64-bit or even SIMD registers). Whether this is then actually faster also depends on the CPU though.
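As a very rough sketch of the "bigger chunks" idea (not how any particular libc does it): copy_chunked and has_zero_byte are made-up names, and the code loudly assumes that the source storage can be read in whole 8-byte chunks up to and including the chunk that holds the terminator, e.g. because the buffer size is a multiple of 8. Real implementations deal with alignment instead of relying on such an assumption.

```c
#include <stdint.h>
#include <string.h>

/* Nonzero if any of the 8 bytes in w is zero (well-known bit trick). */
static int has_zero_byte(uint64_t w)
{
    return ((w - 0x0101010101010101ULL) & ~w & 0x8080808080808080ULL) != 0;
}

/* Hypothetical sketch. ASSUMPTION: the storage behind `src` is readable in
   whole 8-byte chunks up to and including the chunk containing the
   terminating zero. `dst` only needs room for the string plus terminator. */
static char *copy_chunked(char *dst, const char *src)
{
    char *start = dst;
    for (;;) {
        uint64_t w;
        memcpy(&w, src, sizeof w);   /* alignment-safe 8-byte load */
        if (has_zero_byte(w))
            break;                   /* terminator is inside this chunk */
        memcpy(dst, &w, sizeof w);   /* 8-byte store */
        src += sizeof w;
        dst += sizeof w;
    }
    while ((*dst++ = *src++))        /* finish the last chunk byte by byte */
        ;
    return start;
}
```

The tail is still the original one-liner; the chunked part only skips ahead while no zero byte is in sight.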
`this` and `that` are arrows which range over a stream of data; `=` is copy, and `++` moves the arrow along the stream.
This isn't a "clever one-liner"; it is a clear and precise syntax for expressing the operation the machine actually performs.
while(copy(current(stream_a), current(stream_b)) and not end_of_stream(stream_a))
You might prefer the above, but then, that's every other major language. The beauty of C is that the above code has to compile to something like the C version. C just allows you to actually express it.
GCC warnings can be overly pedantic. It's set up to warn about common footguns but doesn't know what your intent is. In this case it's a common enough idiom to assign within a control statement that GCC has the extra parens escape hatch.
You shouldn't just blindly let your tooling dictate how you work. It's a tool that's supposed to work for you, not control you. -Wall and -Wextra are good baselines but I always disable some of their warnings because I don't need the hassle on known good code.
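One way to do that locally rather than project-wide is GCC's diagnostic pragmas (clang accepts them as well). A small sketch, with copy_known_good being a made-up name:

```c
/* Switch -Wparentheses off for this one function only, then restore
   whatever warning settings were active before. */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wparentheses"
/* Hypothetical helper holding the "known good" idiom. */
static void copy_known_good(char *dst, const char *src)
{
    while (*dst++ = *src++)   /* no extra parentheses needed here */
        ;
}
#pragma GCC diagnostic pop
```

Passing -Wno-parentheses on the command line would do the same for a whole translation unit, at the cost of also hiding genuine '=' vs '==' typos elsewhere.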