Saturday, March 2, 2013

Basics of signed and unsigned integers in C

Any competent C programmer understands how signed and unsigned integers are handled.  I understand the concepts, but realized I had forgotten some of the details, so I experimented with a simple program to refresh my memory.

basic_int.c

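Before C23, the C language standard did not mandate a single representation for negative numbers (it allowed two's complement, ones' complement, and sign-magnitude), but nearly all modern HW platforms use two's complement, and C23 now requires it.  In two's complement, -1 is represented as all 1's, which is the same bit pattern as the maximum unsigned integer.  Also, printf interprets each argument using the format string, not the type of the variable.  (Strictly speaking, passing a negative int for %u or %x is undefined behavior, but on common platforms it prints the bits reinterpreted as unsigned.)  For example: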
Results in:
int: -1, unsigned int: 4294967295, hex: ffffffff

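Casting from an int to an unsigned int does not change the bits stored in memory on a two's complement machine; only the interpretation changes.  (The standard defines the result as the original value reduced modulo 2^32 for a 32-bit int.)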
Results in:
i: int: -1, unsigned int: 4294967295, hex: ffffffff
ui: int: -1, unsigned int: 4294967295, hex: ffffffff

In two's complement, the most negative number is represented with an MSB of 1, followed by all zeros:
Results in:
i: int: -2147483648, unsigned int: 2147483648, hex: 80000000
ui: int: -2147483648, unsigned int: 2147483648, hex: 80000000

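Addition and subtraction are handled the same way for signed and unsigned (for two's complement).  Subtraction is implemented by inverting the subtrahend, adding it, and adding 1.  For example, i - 1 is implemented as i + 0xFFFFFFFE + 1.  Subtracting 1 from 0x80000000 wraps to 0x7FFFFFFF; for the signed int this is an overflow, which the C standard leaves undefined even though two's complement hardware typically wraps.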
Results in:
i: int: 2147483647, unsigned int: 2147483647, hex: 7fffffff
ui: int: 2147483647, unsigned int: 2147483647, hex: 7fffffff

When an operation is performed where one operand is signed and the other is unsigned, if all values of the unsigned operand's type can be represented in the signed type, then the unsigned operand is converted to the signed type.  If not, then the signed operand is converted to unsigned.  This leads to some interesting cases for relational operations.  Unsuffixed decimal constants are implicitly signed (hex constants that do not fit in an int, such as 0x80000000, are unsigned).

For example, if i = -1 and ui = i (4,294,967,295), then i < 0 is true and ui > 0 is true, which makes sense.  But ui == i is also true, which seems incorrect since i < 0 and ui > 0.  The reason is that C converts i to its unsigned interpretation of 4,294,967,295, which equals ui.
Results in:
i: int: -1, unsigned int: 4294967295, hex: ffffffff
ui: int: -1, unsigned int: 4294967295, hex: ffffffff
i < 0? true
ui < 0? false
i == ui? true

Here's another odd case: compare -1 (implicitly signed) to ui = 0 (unsigned).  -1 is implicitly converted to 0xFFFFFFFF unsigned, which is greater than 0, so -1 < ui is false.
Results in:
ui: int: 0, unsigned int: 0, hex: 0
-1 < ui? false

Casting an unsigned to a signed can also lead to an interesting result.  If ui = 0x80000000 and i = (ui - 1), then i < ui is true.  But if ui is cast to a signed int, it becomes a negative number (on two's complement platforms; converting an out-of-range value to int is implementation-defined), and i < (int) ui is false.
Results in:
i: int: 2147483647, unsigned int: 2147483647, hex: 7fffffff
ui: int: -2147483648, unsigned int: 2147483648, hex: 80000000
i < ui ? true
i < (int) ui ? false

If ui is 1 and i = -1, then (ui < i) is true, which seems incorrect.  But i is implicitly converted to an unsigned int with a value of 0xFFFFFFFF, which is greater than 1.  However, if the signed type can represent all values of the unsigned operand's type, then the unsigned operand is converted to the signed type.  For example, if usi is an unsigned short equal to 1 and i = -1, then (usi < i) is false as expected: usi is promoted to a signed int with a value of 1, and i remains -1.
Results in:
i: int: -1, unsigned int: 4294967295, hex: ffffffff
ui: int: 1, unsigned int: 1, hex: 1
usi: int: 1, unsigned int: 1, hex: 1
ui < i ? true
usi < i ? false

I'm sure you found this post really exciting!