Monday, March 1, 2010

The Joy of Hex

The Joy of Hex

The following program adds two hexadecimal, or "hex," literals and prints the result in hex. What does the program print?

public class JoyOfHex {

public static void main(String[] args) {

System.out.println(

Long.toHexString(0x100000000L + 0xcafebabe));

}

}



Solution : The Joy of Hex


It seems obvious that the program should print 1cafebabe. After all, that is the sum of the hex numbers 10000000016 and cafebabe16. The program uses long arithmetic, which permits 16 hex digits, so arithmetic overflow is not an issue. Yet, if you ran the program, you found that it prints cafebabe, with no leading 1 digit. This output represents the low-order 32 bits of the correct sum, but somehow the thirty-third bit gets lost. It is as if the program were doing int arithmetic instead of long, or forgetting to add the first operand. What's going on here?



Decimal literals have a nice property that is not shared by hexadecimal or octal literals: Decimal literals are all positive [JLS 3.10.1]. To write a negative decimal constant, you use the unary negation operator (-) in combination with a decimal literal. In this way, you can write any int or long value, whether positive or negative, in decimal form, and negative decimal constants are clearly identifiable by the presence of a minus sign. Not so for hexadecimal and octal literals. They can take on both positive and negative values. Hex and octal literals are negative if their high-order bit is set. In this program, the number 0xcafebabe is an int constant with its high-order bit set, so it is negative. It is equivalent to the decimal value -889275714.



The addition performed by the program is a mixed-type computation: The left operand is of type long, and the right operand is of type int. To perform the computation, Java promotes the int value to a long with a widening primitive conversion [JLS 5.1.2] and adds the two long values. Because int is a signed integral type, the conversion performs sign extension: It promotes the negative int value to a numerically equal long value.



The right operand of the addition, 0xcafebabe, is promoted to the long value 0xffffffffcafebabeL. This value is then added to the left operand, which is 0x100000000L. When viewed as an int, the high-order 32 bits of the sign-extended right operand are -1, and the high-order 32 bits of the left operand are 1. Add these two values together and you get 0, which explains the absence of the leading 1 digit in the program's output. Here is how the addition looks when done in longhand. (The digits at the top of the addition are carries.)




    1111111

0xffffffffcafebabeL

+ 0x0000000100000000L

0x00000000cafebabeL



Fixing the problem is as simple as using a long hex literal to represent the right operand. This avoids the damaging sign extension, and the program prints the expected result of 1cafebabe:




public class JoyOfHex {

public static void main(String[] args) {

System.out.println(

Long.toHexString(0x100000000L + 0xcafebabeL));

}

}



The lesson of this puzzle is that mixed-type computations can be confusing, more so given that hex and octal literals can take on negative values without an explicit minus sign. To avoid this sort of difficulty, it is generally best to avoid mixed-type computations. For language designers, it is worth considering support for unsigned integral types, which eliminate the possibility of sign extension. One might argue that negative hex and octal literals should be prohibited, but this would likely frustrate programmers, who often use hex literals to represent values whose sign is of no significance.

No comments:

Post a Comment