Ashish: Formatters and Java Print Streams

Constructors

Exactly where the output from a Formatter ends up depends on what argument you pass to the constructor. You’ve already seen the constructor that takes a filename:

public Formatter(String fileName) throws FileNotFoundException

If the named file does not exist in the current working directory, this constructor attempts to create it. If that fails for any reason other than a security violation, the constructor throws a FileNotFoundException. Security problems are reported with a SecurityException instead. If the file does exist, its contents are overwritten.

Instead of a filename, you can pass in a File object:

public Formatter(File file) throws FileNotFoundException

You can also use a Formatter to write onto a PrintStream or another kind of OutputStream:

public Formatter(PrintStream out)

public Formatter(OutputStream out)

or onto any Appendable object:

public Formatter(Appendable out)

The Appendable interface is a new Java 5 interface for anything onto which chars can be appended. This includes StringBuffers and StringBuilders. It also includes a number of classes we’ll talk about later, such as CharBuffer and Writer.
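Here is a minimal sketch of the Appendable constructor in use. It formats onto a StringBuilder that already holds some text. The class name AppendableDemo and the method describe() are invented for this illustration.

```java
import java.util.Formatter;

class AppendableDemo {

    // Appends "name=value" to text that is already in the builder.
    static String describe(String name, String value) {
        StringBuilder sb = new StringBuilder("Setting: ");
        // StringBuilder implements Appendable, so it can be the destination.
        Formatter formatter = new Formatter(sb);
        formatter.format("%s=%s", name, value);
        // A StringBuilder is not Closeable, so close() only marks the formatter closed.
        formatter.close();
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe("mode", "fast")); // prints "Setting: mode=fast"
    }
}
```

The formatted characters land directly in the builder, after whatever was there before.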

Finally, the no-args constructor creates a Formatter with no specified destination for output:

public Formatter()

In this case, the Formatter writes everything onto a new StringBuilder object. You can retrieve this object using the out() method at any time before the Formatter is closed:

public Appendable out() throws FormatterClosedException

You might need to use this method if you want to write unformatted output onto the same StringBuilder, but more commonly you’ll just use the toString() method to get the final result. For example:

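// Requires: import java.util.Formatter;
Formatter formatter = new Formatter();
// The loop condition was lost in this copy; an upper bound of 360.0 is assumed here.
for (double degrees = 0.0; degrees < 360.0; degrees++) {
  double radians = Math.PI * degrees / 180.0;
  double grads = 400 * degrees / 360;
  formatter.format("%5.1f %5.1f %5.1f\n", degrees, radians, grads);
}
String table = formatter.toString();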
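For the less common case, here is a small sketch that uses out() to mix unformatted text into the same StringBuilder. OutDemo and build() are names made up for this example. The cast relies on the documented behavior that the no-args constructor writes onto a StringBuilder.

```java
import java.util.Formatter;

class OutDemo {

    static String build() {
        Formatter formatter = new Formatter();
        formatter.format("%s:", "total");
        // The no-args constructor uses a StringBuilder, so this cast is safe here.
        StringBuilder sb = (StringBuilder) formatter.out();
        sb.append(" [unformatted]");
        // Formatted output continues after the text appended by hand.
        formatter.format(" %s", "done");
        return formatter.toString();
    }

    public static void main(String[] args) {
        System.out.println(build()); // prints "total: [unformatted] done"
    }
}
```

Casting to StringBuilder also sidesteps the checked IOException that the Appendable.append() methods declare.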

Formatters and Java Print Streams - Character Sets

As long as you stick to the ASCII character set, a single computer, and System.out, character sets aren’t likely to be a problem. However, as data begins to move between different systems, it becomes important to consider what happens when the other systems use different character sets. For example, suppose I use a Formatter or a PrintStream on a typical U.S. or Western European PC to write the sentence “Au cours des dernières années, XML a été adapte dans des domaines aussi diverse que l’aéronautique, le multimédia, la gestion de hôpitaux, les télécommunications, la théologie, la vente au détail, et la littérature médiévale” in a file. Say I then send this file to a Macintosh user, who opens it up and sees “Au cours des derniËres annÈes, XML a ÈtÈ adapte dans des domaines aussi diverse que l’aÈronautique, le multimÈdia, la gestion de hÙpitaux, les tÈlÈcommunications, la thÈologie, la vente au dÈtail, et la littÈrature mÈdiÈvale.” This is not the same thing at all! The confusion is even worse if you go in the other direction.

If you’re writing to the console (i.e., System.out), you don’t really need to worry about character set issues. The default character set Java writes in is usually the same one the console uses.

Actually, you may need to worry a little. On Windows, the console encoding is usually not the same as the system encoding found in the file.encoding system property. In particular, the console uses a DOS character set such as Cp850 that includes box drawing characters such as ╚ and ╬, while the rest of the system uses an encoding such as Cp1252 that maps these same code points to alphabetic characters like È and Î. To be honest, the console is reliable enough for ASCII, but anything beyond that requires a GUI.

However, there’s more than one character set, and when transmitting files between systems and programs, it pays to be specific. In the previous example, if we knew the file was going to be read on a Macintosh, we might have specified that it be written with the MacRoman encoding:

Formatter formatter = new Formatter("data.txt", "MacRoman");

More likely, we’d just agree on both the sending and receiving ends to use some neutral format such as ISO-8859-1 or UTF-8. In some cases, encoding details can be embedded in the file you write (HTML, XML) or sent as out-of-band metadata (HTTP, SMTP). However, you do need some way of specifying and communicating the character set in which any given document is written. When you’re writing to anything other than the console or a string, you should almost always specify an encoding explicitly. Three of the Formatter constructors take character set names as their second argument:

public Formatter(String fileName, String characterSet)
 throws FileNotFoundException, UnsupportedEncodingException

public Formatter(File file, String characterSet)
 throws FileNotFoundException, UnsupportedEncodingException

public Formatter(OutputStream out, String characterSet)
 throws UnsupportedEncodingException
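To make the effect of the character set argument concrete, here is a short sketch that writes the same string through the OutputStream constructor in two encodings and counts the bytes produced. CharsetDemo and encodedLength() are invented names, and the ByteArrayOutputStream stands in for a real file or socket.

```java
import java.io.ByteArrayOutputStream;
import java.io.UnsupportedEncodingException;
import java.util.Formatter;

class CharsetDemo {

    // Returns how many bytes the text occupies when written in the named encoding.
    static int encodedLength(String text, String characterSet) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            Formatter formatter = new Formatter(out, characterSet);
            formatter.format("%s", text);
            formatter.close(); // flushes the buffered characters to the stream
        } catch (UnsupportedEncodingException ex) {
            // Rethrown unchecked only to keep this sketch short.
            throw new IllegalArgumentException("Unknown encoding: " + characterSet, ex);
        }
        return out.size();
    }

    public static void main(String[] args) {
        String eAcute = "\u00e9"; // the letter e with an acute accent
        System.out.println(encodedLength(eAcute, "ISO-8859-1")); // prints 1
        System.out.println(encodedLength(eAcute, "UTF-8"));      // prints 2
    }
}
```

The characters are identical in both cases. Only the bytes differ, which is why the reader has to know which encoding the writer chose.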

Formatters and Java Print Streams - Locales

Character sets are not the only localization issue in the Formatter class. For instance, in France, a decimal comma is used instead of a decimal point. Thus, a French user running the earlier degree table example would want to see this:

0,0 0,0 0,0

1,0 0,0 1,1

2,0 0,0 2,2

3,0 0,1 3,3

4,0 0,1 4,4

...

Sometimes Java adapts the format to the local conventions automatically, and sometimes it doesn’t. The decimal separator is one of the automatic cases: %5.1f already prints a decimal comma when the formatter’s locale calls for one, so the degree table needs no change to its format string. Grouping separators are not automatic. To get them, you have to write %,5.1f instead of %5.1f. The comma after the percent sign is a flag that tells the formatter to group digits according to the local conventions. (It does not actually say to use commas.) On a typical U.S. English system, the local conventions are a comma between groups of digits and a decimal point, and that’s what you’ll get if you format numbers as %,5.1f.

Of course, sometimes you don’t want a program to adapt to the local conventions. For instance, many companies use PCs adapted to local languages and customs but still need to produce English documents that use American formats. Thus, as an optional third argument to the constructor, you can pass a java.util.Locale object:

public Formatter(String fileName, String characterSet, Locale locale)
 throws FileNotFoundException, UnsupportedEncodingException

public Formatter(File file, String characterSet, Locale locale)
 throws FileNotFoundException, UnsupportedEncodingException

public Formatter(OutputStream out, String characterSet, Locale locale)
 throws UnsupportedEncodingException

For example, to force the use of American conventions regardless of where a program is run, you’d construct a Formatter like this:

Formatter formatter = new Formatter("data.txt", "ISO-8859-1", Locale.US);

You can also specify a locale when writing to an Appendable object or a StringBuilder:

public Formatter(Appendable out, Locale locale)

public Formatter(Locale locale)

Character encodings don’t matter for these two cases because both Appendable and StringBuilder are defined in terms of characters rather than bytes, so there’s no conversion to be done. However, locales can change formatting even when the character set stays the same.

On occasion, you might wish to change the locale for one string you write but not for other strings (in a mixed English/French document, perhaps). In that case, you can pass a locale as the first argument to the format() method before the format string:

public Formatter format(Locale locale, String format, Object... args)

You can do the same thing with the printf() and format() methods in the PrintStream class:

public PrintStream printf(Locale locale, String format, Object... args)

Finally, I’ll note that there’s a getter method that returns the Formatter’s current locale:

public Locale locale()
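The following sketch puts the constructor locale, a per-call locale, and the locale() getter side by side. LocaleDemo and its method names are made up for this illustration. The Formatter calls are the ones listed above.

```java
import java.util.Formatter;
import java.util.Locale;

class LocaleDemo {

    // Formats three numbers; only the middle one uses French conventions.
    static String mixed() {
        Formatter formatter = new Formatter(Locale.US);
        formatter.format("%.1f", 1.5);                 // constructor locale: 1.5
        formatter.format(Locale.FRANCE, " %.1f", 1.5); // this call only: 1,5
        formatter.format(" %.1f", 2.5);                // constructor locale again: 2.5
        return formatter.toString();
    }

    // A per-call locale does not change the locale set at construction.
    static Locale localeAfterOverride() {
        Formatter formatter = new Formatter(Locale.US);
        formatter.format(Locale.FRANCE, "%.1f", 1.5);
        return formatter.locale();
    }

    public static void main(String[] args) {
        System.out.println(mixed()); // prints "1.5 1,5 2.5"
        System.out.println(localeAfterOverride().equals(Locale.US)); // prints "true"
    }
}
```

Passing the locale per call keeps one Formatter usable for a mixed-language document without rebuilding it.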

Error Handling

The Formatter class handles errors in much the same way PrintStream does. That is, it sweeps them under the rug and pretends they didn’t happen. Notice how none of the methods mentioned so far threw IOException?

To find out if the Formatter has encountered an error, invoke its ioException() method:

public IOException ioException()

This returns the last IOException thrown by the underlying output stream. If there was more than one, only the last one is available.

This is marginally better than PrintStream’s boolean checkError() method. At least Formatter will tell you what the problem was. However, it still won’t tell you unless you ask. For simple cases in which you don’t have to write a lot of data before closing the Formatter and checking for any errors, this may be adequate. However, programs that need to write for an extended period of time should probably create strings using a Formatter but write them using a regular OutputStream. That way, if an I/O error does happen, you’ll find out soon enough to do something about it.
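Here is a small sketch of the swallowed-error behavior. BrokenAppendable is a hypothetical destination invented for this example. It fails on every write, the way a full disk or dropped connection might.

```java
import java.io.IOException;
import java.util.Formatter;

class ErrorDemo {

    // Hypothetical destination that rejects every character written to it.
    static class BrokenAppendable implements Appendable {
        public Appendable append(CharSequence csq) throws IOException {
            throw new IOException("disk full");
        }
        public Appendable append(CharSequence csq, int start, int end) throws IOException {
            throw new IOException("disk full");
        }
        public Appendable append(char c) throws IOException {
            throw new IOException("disk full");
        }
    }

    // Formats onto the broken destination and reports what the Formatter recorded.
    static IOException writeAndCheck() {
        Formatter formatter = new Formatter(new BrokenAppendable());
        formatter.format("%s", "hello"); // no exception escapes from this call
        return formatter.ioException();  // the failure is only visible if you ask
    }

    public static void main(String[] args) {
        IOException problem = writeAndCheck();
        System.out.println(problem.getMessage()); // prints "disk full"
    }
}
```

Nothing in the call to format() hints that the write failed, which is why long-running output is better served by a stream that throws.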

Formatters and Java Print Streams - Format Specifiers

The Formatter class and the printf() method in PrintStream that depends on it support several dozen format specifiers. In addition to integer and floating-point numbers, Formatter offers a wide range of date and time formats. It also has a few general formatters that can display absolutely any object or primitive data type.

All format specifiers begin with percent signs. The minimum format specifier is a percent sign followed by an alphabetic conversion code. This code identifies what the corresponding argument is to be formatted as. For instance, %f formats a number with a decimal point, %d formats it as a decimal (base-10) integer, %o formats it as an octal integer, and %x formats it as a hexadecimal integer. None of these specifiers changes what the number actually is; they’re just different ways of creating a string that represents the number.

To use a literal percent character in a format string, just double it. That is, %% is formatted as % in the output.

To get the platform default line separator, use %n. (\n is always a linefeed regardless of platform. %n may be a carriage return, a linefeed, or a carriage return linefeed pair, depending on the platform.)
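A quick sketch of both special cases, using an invented class PercentDemo and method progress():

```java
import java.util.Formatter;

class PercentDemo {

    // Builds a line such as "75% complete" ending in the platform line separator.
    static String progress(String percentDone) {
        // %% produces one literal percent sign; %n produces the line separator.
        return new Formatter().format("%s%% complete%n", percentDone).toString();
    }

    public static void main(String[] args) {
        System.out.print(progress("75")); // prints "75% complete" and a line break
    }
}
```

Neither %% nor %n consumes an argument, so only one value is passed for the three specifiers.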

Integer conversions

Integer conversions can be applied to all integral types (specifically, byte, short, int, and long, as well as the type-wrapper classes Byte, Short, Integer, and Long, and also the java.math.BigInteger class). These conversions are:

%d  A regular base-10 integer, such as 987

%o  A base-8 octal integer, such as 1733

%x  A base-16 lowercase hexadecimal integer, such as 3db

%X  A base-16 uppercase hexadecimal integer, such as 3DB

Example 7-1 prints the number 1023 in all four formats.

Example 7-1. Integer format specifiers

public class IntegerFormatExample {

  public static void main(String[] args) {
    int n = 1023;
    // The same value rendered in four different bases and cases
    System.out.printf("Decimal: %d\n", n);
    System.out.printf("Octal: %o\n", n);
    System.out.printf("Lowercase hexadecimal: %x\n", n);
    System.out.printf("Uppercase hexadecimal: %X\n", n);
  }
}

Here’s the output:

Decimal: 1023

Octal: 1777

Lowercase hexadecimal: 3ff

Uppercase hexadecimal: 3FF
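The same four conversions work on the other integral types too. Here is a brief sketch that applies them to a long, a short, and a BigInteger. IntegralTypesDemo and allBases() are names invented for this example, and Locale.US is fixed only so the digits come out the same on every system.

```java
import java.math.BigInteger;
import java.util.Formatter;
import java.util.Locale;

class IntegralTypesDemo {

    // Formats one integral value as decimal, octal, and both hexadecimal cases.
    static String allBases(Object n) {
        return new Formatter(Locale.US).format("%d %o %x %X", n, n, n, n).toString();
    }

    public static void main(String[] args) {
        System.out.println(allBases(987L));                    // prints "987 1733 3db 3DB"
        System.out.println(allBases((short) 987));             // prints "987 1733 3db 3DB"
        System.out.println(allBases(BigInteger.valueOf(987))); // prints "987 1733 3db 3DB"
    }
}
```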

Formatters and Java Print Streams - Floating-point conversions

Floating-point conversions can be applied to all floating-point types: float and double, the type-wrapper classes Float and Double, and java.math.BigDecimal. These conversions are:

%f  A regular base-10 decimal number, such as 3.141593

%e  A decimal number in scientific notation with a lowercase e, such as 3.141593e+00

%E  A decimal number in scientific notation with an uppercase E, such as 3.141593E+00

%g  A decimal number formatted in either regular or scientific notation, depending on its size and precision, with a lowercase e if scientific notation is used

%G  A decimal number formatted in either regular or scientific notation, depending on its size and precision, with an uppercase E if scientific notation is used

%a  A lowercase hexadecimal floating-point number, such as 0x1.921fb54442d18p1

%A  An uppercase hexadecimal floating-point number, such as 0X1.921FB54442D18P1

Surprisingly, you cannot use these conversions on integer types such as int or BigInteger. Java will not automatically promote the integer type to a floating-point type when formatting. If you try to use them, it throws an IllegalFormatConversionException.

Example 7-2 prints π in all of these formats.

Example 7-2. Floating-point format specifiers

public class FloatingPointFormatExample {

  public static void main(String[] args) {
    // Math.PI rendered with each floating-point conversion in turn
    System.out.printf("Decimal: %f\n", Math.PI);
    System.out.printf("Scientific notation: %e\n", Math.PI);
    System.out.printf("Scientific notation: %E\n", Math.PI);
    System.out.printf("Decimal/Scientific: %g\n", Math.PI);
    System.out.printf("Decimal/Scientific: %G\n", Math.PI);
    System.out.printf("Lowercase Hexadecimal: %a\n", Math.PI);
    System.out.printf("Uppercase Hexadecimal: %A\n", Math.PI);
  }
}

Here’s the output:

Decimal: 3.141593

Scientific notation: 3.141593e+00

Scientific notation: 3.141593E+00

Decimal/Scientific: 3.14159

Decimal/Scientific: 3.14159

Lowercase Hexadecimal: 0x1.921fb54442d18p1

Uppercase Hexadecimal: 0X1.921FB54442D18P1
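The lack of automatic promotion for integer arguments is easy to check. The sketch below, with the invented names PromotionDemo, rejectsInt(), and withCast(), passes an int to %f, catches the resulting exception, and then shows the explicit cast that avoids it.

```java
import java.util.Formatter;
import java.util.IllegalFormatConversionException;
import java.util.Locale;

class PromotionDemo {

    // Returns true if %f refuses an int argument.
    static boolean rejectsInt() {
        try {
            new Formatter(Locale.US).format("%f", 3);
            return false;
        } catch (IllegalFormatConversionException ex) {
            return true;
        }
    }

    // An explicit cast to double makes the same value acceptable to %f.
    static String withCast() {
        int n = 3;
        return new Formatter(Locale.US).format("%f", (double) n).toString();
    }

    public static void main(String[] args) {
        System.out.println(rejectsInt()); // prints "true"
        System.out.println(withCast());   // prints "3.000000"
    }
}
```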

Ashish

Monday, November 1, 2010
