Dharma's Blog: Multicast

Multicast

Casts are used to convert a value from one type to another. This program uses three casts in succession. What does it print?

public class Multicast {

    public static void main(String[] args) {

        System.out.println((int) (char) (byte) -1);

    }

}

Solution : Multicast

This program is confusing any way you slice it. It starts with the int value -1, then casts the int to a byte, then to a char, and finally back to an int. The first cast narrows the value from 32 bits down to 8, the second widens it from 8 bits to 16, and the final cast widens it from 16 bits back to 32. Does the value end up back where it started? If you ran the program, you found that it does not. It prints 65535, but why?

The program's behavior depends critically on the sign extension behavior of casts. Java uses two's-complement binary arithmetic, so the int value -1 has all 32 bits set. The cast from int to byte is straightforward. It performs a narrowing primitive conversion [JLS 5.1.3], which simply lops off all but the low-order 8 bits. This leaves a byte value with all 8 bits set, which (still) represents –1.

The cast from byte to char is trickier because byte is a signed type and char unsigned. It is usually possible to convert from one integral type to a wider one while preserving numerical value, but it is impossible to represent a negative byte value as a char. Therefore, the conversion from byte to char is not considered a widening primitive conversion [JLS 5.1.2], but a widening and narrowing primitive conversion [JLS 5.1.4]: The byte is converted to an int and the int to a char.

All of this may sound a bit complicated. Luckily, there is a simple rule that describes the sign extension behavior when converting from narrower integral types to wider: Sign extension is performed if the type of the original value is signed; zero extension if it is a char, regardless of the type to which it is being converted. Knowing this rule makes it easy to solve the puzzle.

Because byte is a signed type, sign extension occurs when converting the byte value –1 to a char. The resulting char value has all 16 bits set, so it is equal to 2¹⁶ – 1, or 65,535. The cast from char to int is also a widening primitive conversion, so the rule tells us that zero extension is performed rather than sign extension. The resulting int value is 65535, which is just what the program prints.

Although there is a simple rule describing the sign extension behavior of widening primitive conversions between signed and unsigned integral types, it is best not to write programs that depend on it. If you are doing a widening conversion to or from a char, which is the only unsigned integral type, it is best to make your intentions explicit.

If you are converting from a char value c to a wider type and you don't want sign extension, consider using a bit mask for clarity, even though it isn't required:

int i = c & 0xffff;

Alternatively, write a comment describing the behavior of the conversion:

int i = c; // Sign extension is not performed

If you are converting from a char value c to a wider integral type and you want sign extension, cast the char to a short, which is the same width as a char but signed. Given the subtlety of this code, you should also write a comment:

int i = (short) c; // Cast causes sign extension

If you are converting from a byte value b to a char and you don't want sign extension, you must use a bit mask to suppress it. This is a common idiom, so no comment is necessary:

char c = (char) (b & 0xff);

If you are converting from a byte to a char and you want sign extension, write a comment:

char c = (char) b; // Sign extension is performed

The lesson is simple: If you can't tell what a program does by looking at it, it probably doesn't do what you want. Strive for clarity. Although a simple rule describes the sign extension behavior of widening conversions involving signed and unsigned integral types, most programmers don't know it. If your program depends on it, make your intentions clear.

Dharma's Blog

Monday, March 1, 2010

Multicast

Multicast

Solution : Multicast

No comments:

Post a Comment