It’s been a while since I wrote my last article. I recently read an old book, Expert C Programming, and found many C language details that I had never thought about. Reviewing and rethinking what the C integer promotion rules mean to C is one of the interesting topics that I want to share with you.
What is a type system?
A type is the set of methods that interpret and process your data. Sound familiar from OOP? Defining a class IS defining methods. In C, expressions on different data types produce different instructions. Which instructions get generated depends on the data type (int is handled differently from float, for example).
A chunk of data in memory or in a register can mean anything to a CPU. The type system defines how operators work on different types. Consider the example below:
When adding an integer to a pointer, the compiler derives the offset multiplier from the type the pointer points to. If p were defined as char* p, then the multiplier would be 1. Also, the + operator only makes sense for adding an offset to a pointer; therefore, adding two pointers together is invalid in C.
Review C integer promotion rules
I wrote an article about C integer promotion 2 years ago: Deep C: Integer Promotion. The standard integer promotion rules in C99 are listed below.
If an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.
This is called value preserving. At the machine level, arithmetic works best on a whole register (for example, a 32-bit register). Remember that C’s philosophy is to make your program run as fast as possible, so it loads your small chunk of data into a whole register and then does the calculation.
However, things worked differently in K&R C, where the rule was defined as unsigned preserving: when mixing signed and unsigned integers, the result is of unsigned type. Below is the example copied from the Expert C Programming book.
Mixing unsigned and signed integers in modern C
Although most of the time C will preserve the value for you, you can still get surprising results. When the signed integer type can’t hold every value of the unsigned type, everything is converted to unsigned integer.
This program will execute the else branch.
Conclusion
Always be careful when you mix unsigned and signed integers in expressions. When possible, cast them to signed int. C is designed to make programs fast, not to be safe for beginners. Additional checks would create a lot of runtime overhead, so developers need to know the system and the language well to avoid shooting themselves in the foot. Another possible reason may be that the original C developers were all operating system and compiler writers, so the mapping between assembly and the C language was straightforward to them.