Positive values are a particular case of signed values, you can still use signed...

dkersten · on April 2, 2021

> No need to enforce your semantics through type

Maybe I'm spoiled by other languages with more powerful type systems, but this is exactly what I want my types to do! Isn't this why we have type traits and concepts and whatnot in C++ now? If not for semantics, why have types at all, the compiler could figure out what amount of bytes it needs to store my data in, after all.

I use types for two things: to map semantics to hardware (if memory or performance optimization are important, which is rare) and to enforce correctness in my code. You're telling me that the latter is not a valid use of types and I say that's the single-biggest reason I use statically typed languages over dynamically typed languages, when I do so.

But even if that's not the case, why would I use a more general type than I need, when I know the constraints of my code? If I know that negative values are not semantically valid, why not use a type that doesn't allow those? What benefit would I get from not doing that? I mean, why do we have different sizes of integers when all the possible ones I could want can be represented as a machine-native size and I can enforce size constraints in software instead? We could also just use double's for all numbers, like some languages do.

jcelerier · on April 2, 2021

> Maybe I'm spoiled by other languages with more powerful type systems, but this is exactly what I want my types to do! Isn't this why we have type traits and concepts and whatnot in C++ now? If not for semantics, why have types at all, the compiler could figure out what amount of bytes it needs to store my data in, after all.

yes, but understand that, despite the name, what unsigned models in C / C++ is not "positive numbers" but "modulo 2^N" arithmetic (while signed models the usual arithmetic).

There is no good type that says "always positive" by default in C or C++ - any type which gives you an infinite loop if you do

    for({int,unsigned,whatever} i = 0; i < n - 1; i++) {
       // oops, n was zero, n - 1 is 4.something billion, see you tomorrow
    }

is not a good type.

If you want a "always positive" type use some safe_int template such as https://github.com/dcleblanc/SafeInt - here if you do "x - y" and the result of the computation should be negative, then you'll get the rightful runtime error that you want, not some arbitrarily high and incorrect number

The correct uses of unsigned are for instance for computations of hashes, crypto algorithms, random number generation, etc... as those are in general defined in modular arithmetic

oddthink · on April 2, 2021

+1 for this. I was just bitten by this last week, when I switched from using a custom container where size() was an int to a std::vector where size() is size_t.

The code was check-all-pairs, e.g.

  for (int i = 0; i < container.size() - 1; ++i) {
    for (int j = i + 1; j < container.size(); ++j) {
      stuff(container[i], container[j]);
    }
  }

Which worked just fine for int size, but failed spectacularly for size_t size when size==0.

I totally should have caught that one, but I just couldn't see it until someone else pointed it out. And then it was obvious, like many bugs.

jcelerier · on April 2, 2021

I recommend using -fsanitize=undefined -fsanitize=integer if you can build with clang - it will print a warning when an unsigned int underflows which catches a terrifying amount of similar bugs the first time it is run (there are a lot of false positives in hash functions, etc though but imho it's well worth using regularly)

enriquto · on April 2, 2021

Would you really write a function find_prime_factors() that takes an input of type "integer" and an output of type "prime", that you have previously defined? Then if you want to sum or multiply such primes you have to cast them back to integers. Maybe it makes sense for you, but for me this is the textbook example of useless over-engineering.

The same ugliness occurs when using unsigned types to store values that happen to be positive. Well, in that case it is even worse, because it is incomplete and asymmetric. What's so special about the lower bound of the possible set of values? If it's an index to an array of length N, you'll surely want an integer type whose values cannot exceed N. And this is a can of worms that I prefer not to open...

dkersten · on April 2, 2021

> Would you really write a function find_prime_factors() that takes an input of type "integer" and an output of type "prime", that you have previously defined?

If the language allows me to and its an important semantic part of my program, then yes. The same way as I would create types for units that need conversion.

Unless I'm writing low level performance sensitive code, yes, I want to encode as much of my semantics as I can, so that I can catch mistakes and mismatches at compile time, make sure units get properly converted and whatnot.

> What's so special about the lower bound of the possible set of values?

Nothing, I would encode a range if I can. But many things don't have a knowable upper-bound but do have a lower bound at zero: you can't have a negative size (for most definitions of size), usually when you have a count of things you don't have negatives, you know that a dynamically sized array can never have an element index less than 0, but you may not know the upper bound.

Also, the language has limitations, so I have to work within them. I don't understand your objection for using what is available to make sure software is correct. Also, remember that many of the security bugs we've seen in recent years came about because of C not being great at enforcing constraints. Are you really suggesting not to even try?

> And this is a can of worms that I prefer not to open...

And yet many languages do and even C++20 is introducing ranges which kind of sort of fall into this space.

giomasce · on April 2, 2021

To me it could totally make sense. It depends on the context, but I can very well see contexts where such a choice could make sense. For example, in line of principle it would make sense, for an RSA implementation, to accept to construct a type PublicKey only computing the product of two Prime's, and not two arbitrary numbers. And the Prime type would only be constructible by procedures that provably (perhaps with high probability) generate an actual prime number. It would be a totally sensible form of defensive programming. You don't want to screw up your key generation algorithm, so it makes sense to have your compiler help you to not construct keys from anything.

For the same reason, say, in an HTTP server I could store a request as a char* or std::string, but I would definitely create a class that ensures, upon construction, that the request is valid and legitimate. Code that processes the request would accept HTTPRequest, but not char*, so that unverified requests cannot even risk to cross the trust boundary.

UncleMeat · on April 2, 2021

But "unsigned" doesn't actually enforce the semantics you want. Missing an overflow check means your value will never be negative, but it is almost certainly still a bug. And because unsigned overflow is defined, the compiler isn't allowed to prevent you from doing it!

This is just enough type semantics to injure oneself.

dkersten · on April 2, 2021

So, because its not perfect, should you throw it all out?

UncleMeat · on April 2, 2021

No. Because people tend to make more mistakes if they try to use unsigned values in this manner in C/C++.

dkersten · on April 3, 2021

I’ve personally never encountered a bug that turned out to be caused by an unsigned value. YMMV, I guess.

UncleMeat · on April 3, 2021

If seen all sorts of bugs caused by surprise conversions, as well as overflows that cause bugs that would be statically detectable but can't become blocking errors because unsigned overflow is well defined.

masklinn · on April 2, 2021

> Positive values are a particular case of signed values, you can still use signed ints to store positive values.

And yet Java's lack of unsigned integers is considered a major example of its (numerous) design errors.

> No need to enforce your semantics through type, and especially not when the values of the type are trivially particular cases of the values of another type.

Of course not, there's no need for any type at all, you can do everything with just the humble byte.

> The same thing for positive numbers

No?

> You can and should do everything with signed integers

You really should not. If a value should not have negative values, then making it so it can not have negative values is strictly better than the alternative. Making invalid values impossible makes software clearer and more reliable.

> except bitfields, of course.

There's no more justification for that than for the other things you object to.

nayuki · on April 4, 2021

Java's lack of unsigned int is widely (but not universally) seen as a deficiency. This is especially true when Java is compared to C#, a very similar language at its core but which does have uint types. Anyway, I have a separate article arguing why Java should not have uint, and many ideas from there can be adapted to C/C++ too: https://www.nayuki.io/page/unsigned-int-considered-harmful-f...

enriquto · on April 2, 2021

Well, you and me are different persons and we don't have to agree on everything. In this case, it seems that we don't agree on anything. But it's still OK, if it works for you ;)