also more work. In specialized cases, you might want to consider taking control of every aspect of the IO right down to the byte-to-char encoding, but for this you need to consider how to maintain compatibility with the JDK. Table 8-1 and Table 8-2 summarize all the results from these experiments.
Table 8-1. Timings of the Long-Line Tests Normalized to the JDK 1.2 Buffered Input Stream Test

                              1.2    1.2 no JIT   1.3    HotSpot 1.0   HotSpot 2nd Run   1.1.6
Unbuffered input stream       1951   3567         1684   1610          1641              1341
Buffered input stream         100    450          52     56            45                174
8K buffered input stream      102    477          50     45            48                225
Buffered reader               47     409          43     74            41                43
Custom-built reader           26     351          37     81            36                15
Custom reader and converter   12     69           18     77            17                10
Table 8-2. Timings of the Short-Line Tests Normalized to the JDK 1.2 Buffered Input Stream Test

                              1.2    1.2 no JIT   1.3    HotSpot 1.0   HotSpot 2nd Run   1.1.6
Unbuffered input stream       1308   2003         1101   1326          1232              871
Buffered input stream         100    363          33     50            54                160
8K buffered input stream      101    367          31     41            54                231
Buffered reader               111    554          39     149           45                127
Custom-built reader           19     237          28     94            26                14
Custom reader and converter   9      56           21     80            53                8
8.4 Serialization
Objects are serialized in a number of situations in Java. The two main reasons to serialize objects are to transfer objects and to store them.
There are several ways to improve the performance of serialization and deserialization. First, fields that are transient do not get serialized, saving both space and time. You can consider implementing readObject() and writeObject() (see the java.io.Serializable documentation) to override the default serialization routine; it may be that you can produce a faster serialization routine for your specific objects. If you need this degree of control, you are better off using the java.io.Externalizable interface (the reason is illustrated shortly). Overriding the default serialization routine in this way is generally only worth doing for large or frequently serialized objects. The tight control this gives you may also be necessary to correctly handle canonicalized objects, to ensure objects remain canonical when deserializing them.
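One standard way to keep canonical objects canonical across deserialization is the readResolve() hook. The following is a minimal sketch of the idea; the CachedSymbol class and its pool are illustrative assumptions, not part of the book's tests:

import java.io.*;
import java.util.*;

class CachedSymbol implements Serializable {
    private static final Map pool = new HashMap();
    private final String name;

    private CachedSymbol(String name) {
        this.name = name;
    }
    //Canonical access point: one instance per name
    public static synchronized CachedSymbol get(String name) {
        CachedSymbol s = (CachedSymbol) pool.get(name);
        if (s == null) {
            s = new CachedSymbol(name);
            pool.put(name, s);
        }
        return s;
    }
    //Called by the deserialization machinery after this object is read:
    //swap the freshly created duplicate for the canonical pooled instance.
    private Object readResolve() throws ObjectStreamException {
        return get(name);
    }
}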
To transfer objects across networks, it is worth compressing the serialized objects. For large amounts of data, the transfer overhead tends to swamp the costs of compressing and decompressing
the data. For storing to disk, it is worth serializing multiple objects to different files rather than to one large file. The granularity of access to individual objects and subsets of objects is often
improved as well.
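As a rough illustration of the compression point, the object streams can simply be wrapped in GZIP streams. This is a sketch only; the helper class, method names, and file name are assumptions for the example, not the book's test code:

import java.io.*;
import java.util.zip.*;

class CompressedSerialization {
    //Serialize an object graph through a GZIP stream
    static void write(Serializable obj, String file) throws IOException {
        ObjectOutputStream out = new ObjectOutputStream(
            new GZIPOutputStream(new BufferedOutputStream(
                new FileOutputStream(file))));
        out.writeObject(obj);
        out.close();   //close() finishes the GZIP stream and flushes to disk
    }
    //Read it back through the matching GZIP input stream
    static Object read(String file)
            throws IOException, ClassNotFoundException {
        ObjectInputStream in = new ObjectInputStream(
            new GZIPInputStream(new BufferedInputStream(
                new FileInputStream(file))));
        Object obj = in.readObject();
        in.close();
        return obj;
    }
}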
It is also possible to serialize objects in a separate thread for storage and network transfers, letting the serialization execute in the background. For objects whose state can change between serializations, consider using transaction logs or change logs (logs of the differences in the objects since they were last fully serialized) rather than reserializing the whole object. This works much like the way full and incremental backups work. You need to maintain the changes somewhere, of course, so it makes the objects more complicated, but this complexity can have a really good payback in terms of performance: consider how much faster an incremental backup is compared to a full backup.
It is worthwhile to spend some time on a basic serialization tuning exercise. I chose a couple of fairly simple objects to serialize, but they are representative of the sorts of issues that crop up in
serialization:
class Foo1 implements Serializable {
    int one;
    String two;
    Bar1[] four;

    public Foo1() {
        two = new String("START");
        one = two.length();
        four = new Bar1[2];
        four[0] = new Bar1();
        four[1] = new Bar1();
    }
}

class Bar1 implements Serializable {
    float one;
    String two;

    public Bar1() {
        two = new String("hello");
        one = 3.14F;
    }
}
Note that I have given the objects default initial values for the tuning tests. The defaults assigned to the various String variables are forced to be unique for every object by making them new Strings. Without doing this, the compiler assigns the identical String to every object. That alters the timings: only one String is written on output, and when created on input, all other String references reference the same string by identity. (Java serialization can maintain relative identity of objects for objects that are serialized together.) Using identical Strings would make the serialization tests quicker, and would not be representative of normal serializations.
Test measurements are easily skewed by rewriting previously written objects. Previously written objects are not converted and written out again; instead, only a reference to the original object is written. Writing
this reference can be faster than writing out the object again. The speed is even more skewed on reading, since only one object gets created. All the other references refer to the same uniquely created object.
Early in my career, I was set the task of testing the throughput of an object database. The first tests registered a fantastically high throughput until we realized we were storing just a few objects once, and
all the other objects we thought we were storing were only references to those first few.
The Foo objects each contain two Bar objects in an array, to make the overall objects slightly more representative of real-world objects. I'll make a baseline using the standard serialization technique:
OutputStream ostream;
if (toDisk)
    ostream = new FileOutputStream("t.tmp");
else
    ostream = new ByteArrayOutputStream();
ObjectOutputStream wrtr = new ObjectOutputStream(ostream);
long time = System.currentTimeMillis();
//write objects: time only the 3 lines for serialization output
wrtr.writeObject(lotsOfFoos);
wrtr.flush();
wrtr.close();
System.out.println("Writing time: " + (System.currentTimeMillis() - time));

InputStream istream;
if (toDisk)
    istream = new FileInputStream("t.tmp");
else
    istream = new ByteArrayInputStream(
        ((ByteArrayOutputStream) ostream).toByteArray());
ObjectInputStream rdr = new ObjectInputStream(istream);
time = System.currentTimeMillis();
//read objects: time only the 2 lines for serialization input
Foo1[] allFoos = (Foo1[]) rdr.readObject();
rdr.close();
System.out.println("Reading time: " + (System.currentTimeMillis() - time));
As you can see, I provide for running tests either to disk or purely in memory. This allows you to break down the cost into separate components. The actual values revealed that 95% of the time is spent in the serialization. Less than 5% is the actual write to disk (of course, the relative times are system-dependent, but these results are probably representative).
When measuring, I used a pregrown ByteArrayOutputStream so that there were no effects from allocating the byte array in memory. Furthermore, to eliminate extra memory copying and garbage-collection effects, I reused the same ByteArrayOutputStream, and indeed the same byte array from that ByteArrayOutputStream object, for reading. The byte array is accessible by subclassing ByteArrayOutputStream and providing an accessor to the ByteArrayOutputStream.buf instance variable.
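A minimal sketch of such a subclass might look like this (the class and accessor names are my own, not the book's test code; buf and count are the protected fields of ByteArrayOutputStream):

import java.io.ByteArrayOutputStream;

class AccessibleByteArrayOutputStream extends ByteArrayOutputStream {
    public AccessibleByteArrayOutputStream(int size) {
        super(size);   //pregrow the internal buffer
    }
    //Return the internal byte array directly, avoiding the copy
    //that toByteArray() would make
    public byte[] getBuf() {
        return buf;
    }
    //Number of valid bytes currently in buf
    public int getCount() {
        return count;
    }
}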
The results of this first test for JDK 1.2[13] are:
[13] Table 8-3 lists the full results of tests with a variety of VMs. I have used the 1.2 results for discussion in this section, and the results are generally applicable to the other VMs tested.
                         Writing (serializing)   Reading (deserializing)
Standard serialization   100                     175
I have normalized the baseline measurements to 100 for the byte array output (i.e., serializing the collection of Foos). On this scale, the reading (deserializing) takes 175. This is not what I expected, because I am used to the idea that writing takes longer than reading. Thinking about exactly what is happening, you can see that for the serialization you take the data in some objects and write that data out to a stream of bytes, which basically accesses and converts objects into bytes. But for the deserializing, you access elements of a byte array and convert these to other object and data types, including creating any required objects. Add to this the fact that the serializing procedures are much more costly than the actual disk writes and reads, and it becomes understandable that deserialization is likely to be the more intensive, and consequently slower, activity.
Considering exactly what the ObjectInputStream and ObjectOutputStream must do, I realize that they are accessing and updating internal elements of the objects they are serializing, without knowing beforehand anything about those objects. This means there must be a heavy usage of the java.lang.reflect package, together with some internal VM access procedures (since the serializing can reach private and protected fields and methods).[14]
All this suggests that you should improve performance by taking explicit control of the serializing.
[14] The actual code is difficult and time-consuming to work through. It was written in parts as one huge iterated/recursed switch, probably for performance reasons.
Alert readers might have noticed that Foo and Bar have constructors that initialize the object, and may be wondering if deserializing could be speeded up by changing the constructors to avoid the unnecessary overhead there. In fact, the deserialization uses internal VM access to create the objects without going through the constructor, similar to cloning the objects. Although the Serializable interface requires serializable objects to have no-arg constructors, deserialized objects do not actually use that or any constructor.
To start with, the
Serializable
interface supports two methods that allow classes to handle their own serializing. So the first step is to try these methods. Add the following two methods to
Foo
:
private void writeObject(java.io.ObjectOutputStream out)
        throws IOException {
    out.writeUTF(two);
    out.writeInt(one);
    out.writeObject(four);
}
private void readObject(java.io.ObjectInputStream in)
        throws IOException, ClassNotFoundException {
    two = in.readUTF();
    one = in.readInt();
    four = (Bar2[]) in.readObject();
}
Bar
needs the equivalent two methods:
private void writeObject(java.io.ObjectOutputStream out)
        throws IOException {
    out.writeUTF(two);
    out.writeFloat(one);
}
private void readObject(java.io.ObjectInputStream in)
        throws IOException, ClassNotFoundException {
    two = in.readUTF();
    one = in.readFloat();
}
The following chart shows the results of running the test with these methods added to the classes:
                                               Writing (serializing)   Reading (deserializing)
Standard serialization                         100                     175
Customized read/writeObject in Foo and Bar     125                     148
We have improved the reads but made the writes worse. I expected an improvement for both, and I cannot explain why the writes are worse, other than perhaps that the ObjectOutputStream class may have suboptimal performance for this method-overriding feature. Instead of analyzing what the ObjectOutputStream class may be doing, let's try further optimizations.
Examining and manipulating objects during serialization takes more time than the actual conversion of data to or from streams. Considering this, and looking at the customized serializing methods, you can see that the Foo methods simply pass control back to the default serializing mechanism to handle the embedded Bar objects. It may be worth handling the serializing more explicitly. For this example, I'll break encapsulation by accessing the Bar fields directly (although going through accessors and updators, or calling serialization methods in Bar, would not make much difference in time here). I redefine the Foo serializing methods as:
private void writeObject(java.io.ObjectOutputStream out)
        throws IOException {
    out.writeUTF(two);
    out.writeInt(one);
    out.writeUTF(four[0].two);
    out.writeFloat(four[0].one);
    out.writeUTF(four[1].two);
    out.writeFloat(four[1].one);
}
private void readObject(java.io.ObjectInputStream in)
        throws IOException, ClassNotFoundException {
    two = in.readUTF();
    one = in.readInt();
    four = new Bar3[2];
    four[0] = new Bar3();
    four[1] = new Bar3();
    four[0].two = in.readUTF();
    four[0].one = in.readFloat();
    four[1].two = in.readUTF();
    four[1].one = in.readFloat();
}
The Foo methods now handle serialization for both Foo and the embedded Bar objects, so the equivalent methods in Bar are now redundant. The following chart illustrates the results of running the test with these altered methods added to the classes (Table 8-3 lists the full results of tests with a variety of VMs):
                                                    Writing (serializing)   Reading (deserializing)
Standard serialization                              100                     175
Customized read/writeObject in Foo and Bar          125                     148
Customized read/writeObject in Foo handling Bar     31                      59
Now this gives a clearer feel for the costs of dynamic object examination and manipulation. Given the overheads the serializing IO classes incur, it has now become obvious that the more
serializing you handle explicitly, the better off you are. This being the case, the next step is to ask the objects explicitly to serialize themselves, rather than going through the
ObjectInputStream
and
ObjectOutputStream
to have them in turn ask the objects to serialize themselves.
The readObject() and writeObject() methods must be defined as private according to the Serializable interface documentation, so they cannot be called directly. You must either wrap them in another public method or copy the implementation to another method so you can access them directly. But in fact, java.io provides a third alternative. The Externalizable interface also provides support for serializing objects using ObjectInputStream and ObjectOutputStream. But Externalizable defines two public methods rather than the two private methods required by Serializable. So you can just change the names of the two methods: readObject(ObjectInputStream) becomes readExternal(ObjectInput), and writeObject(ObjectOutputStream) becomes writeExternal(ObjectOutput). You must also redefine Foo as implementing Externalizable instead of Serializable.
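A minimal sketch of what the renamed methods might look like (the Foo4 class name and the reuse of Bar3 here are illustrative assumptions; the book's actual test classes are not reproduced):

import java.io.*;

class Foo4 implements Externalizable {
    int one;
    String two;
    Bar3[] four;

    //Externalizable requires a public no-arg constructor
    public Foo4() {}

    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeUTF(two);
        out.writeInt(one);
        out.writeUTF(four[0].two);
        out.writeFloat(four[0].one);
        out.writeUTF(four[1].two);
        out.writeFloat(four[1].one);
    }

    public void readExternal(ObjectInput in)
            throws IOException, ClassNotFoundException {
        two = in.readUTF();
        one = in.readInt();
        four = new Bar3[2];
        four[0] = new Bar3();
        four[1] = new Bar3();
        four[0].two = in.readUTF();
        four[0].one = in.readFloat();
        four[1].two = in.readUTF();
        four[1].one = in.readFloat();
    }
}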
All of these are simple changes, but to be sure that nothing untoward has happened as a consequence, rerun the tests (as good tuners should for any changes, even minor ones). The following chart shows the new test results.
                                                        Writing (serializing)   Reading (deserializing)
Standard serialization                                  100                     175
Customized read/writeObject in Foo handling Bar         31                      59
Foo made Externalizable, using last methods renamed     28                      46
Remarkably, the times are significantly faster. This probably reflects the improvement you get from being able to compile and execute a line such as:

((Externalizable) someObject).writeExternal(this);

in the ObjectOutputStream class, rather than having to go through java.lang.reflect and the VM internals to reach the private writeObject() method. This example also shows that you are better off making your classes Externalizable rather than Serializable if you want to control your own serializing.
The drawback to controlling your own serializing is a significantly higher maintenance cost, as any change to the class structure also requires changes to the two Externalizable methods (or the two methods supported by Serializable). In some cases, as in the example presented in this tuning exercise, changes to the structure of one class actually require changes to the Externalizable methods of another class. The example presented here requires that if the structure of Bar is changed, the Externalizable methods in Foo must also be changed to reflect the new structure of Bar. Here, you can avoid the dependency between the classes by having the Foo serialization methods call the Bar serialization methods directly (a sketch of that alternative follows). But the general fragility of serialization, when individual class structures change, still remains.
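A minimal sketch of that delegation, assuming Bar also implements Externalizable with its own readExternal()/writeExternal() pair (the Bar3 name is carried over from the earlier example; this is not the book's test code):

//In Foo: delegate the Bar fields to Bar's own Externalizable methods, so a
//change to Bar's structure only requires changing Bar.
public void writeExternal(ObjectOutput out) throws IOException {
    out.writeUTF(two);
    out.writeInt(one);
    four[0].writeExternal(out);
    four[1].writeExternal(out);
}
public void readExternal(ObjectInput in)
        throws IOException, ClassNotFoundException {
    two = in.readUTF();
    one = in.readInt();
    four = new Bar3[2];
    (four[0] = new Bar3()).readExternal(in);
    (four[1] = new Bar3()).readExternal(in);
}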
You changed the methods in the first place to provide public access to the methods in order to access them directly. Let's continue with this task. Now, for the first time, you will change actual test code, rather than anything in the Foo or Bar classes. The new test looks like:
OutputStream ostream;
if (toDisk)
    ostream = new FileOutputStream("t.tmp");
else
    ostream = new ByteArrayOutputStream();
ObjectOutputStream wrtr = new ObjectOutputStream(ostream);
//The old version of the test just ran the next
//commented line to write the objects
//wrtr.writeObject(lotsOfFoos);
long time = System.currentTimeMillis();
//This new version writes the size of the array,
//then each object explicitly writes itself.
//Time these five lines for serialization output.
wrtr.writeInt(lotsOfFoos.length);
for (int i = 0; i < lotsOfFoos.length; i++)
    lotsOfFoos[i].writeExternal(wrtr);
wrtr.flush();
wrtr.close();
System.out.println("Writing time: " +
    (System.currentTimeMillis() - time));

InputStream istream;
if (toDisk)
    istream = new FileInputStream("t.tmp");
else
    istream = new ByteArrayInputStream(
        ((ByteArrayOutputStream) ostream).toByteArray());
ObjectInputStream rdr = new ObjectInputStream(istream);
//The old version of the test just ran the next
//commented line to read the objects
//Foo1[] allFoos = (Foo1[]) rdr.readObject();
time = System.currentTimeMillis();
//This new version reads the size of the array and creates
//the array, then each object is explicitly created and reads
//itself. Read objects - time these ten lines to
//the close() for serialization input.
int len = rdr.readInt();
Foo[] allFoos = new Foo[len];
Foo foo;
for (int i = 0; i < len; i++) {
    foo = new Foo();
    foo.readExternal(rdr);
    allFoos[i] = foo;
}
rdr.close();
System.out.println("Reading time: " +
    (System.currentTimeMillis() - time));
This test bypasses the serialization overhead completely. You are still using the ObjectInputStream and ObjectOutputStream classes, but really only to write out basic data types, not for any of their object-manipulation capabilities. If you didn't require those specific classes because of the required method signatures, you could have happily used the DataInputStream and DataOutputStream classes for this test. The following chart shows the test results.
                                                        Writing (serializing)   Reading (deserializing)
Standard serialization                                  100                     175
Foo made Externalizable, using last methods renamed     28                      46
Foo as last test, but read/write called directly        8                       36
If you test serializing to and from the disk, you find that the disk IO now takes nearly one-third of the total test times. Because disk IO is now a significant portion of the total time, the CPU is now underworked, and you can even gain some speedup by serializing in several threads; i.e., you can evenly divide the collection into two or more subsets and have each subset serialized by a separate thread (I leave that as an exercise for you).
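A rough sketch of that exercise follows. It is not the book's code: it assumes the Externalizable-style Foo from the preceding test, uses illustrative file names, and assumes the enclosing test method declares the checked exceptions.

final Foo[] foos = lotsOfFoos;       //the test collection, assumed from earlier
final int half = foos.length / 2;

//First half written by one thread, second half by another
Thread first = new Thread(new Runnable() {
    public void run() {
        try {
            ObjectOutputStream out = new ObjectOutputStream(
                new FileOutputStream("t_a.tmp"));
            out.writeInt(half);
            for (int i = 0; i < half; i++)
                foos[i].writeExternal(out);
            out.close();
        } catch (IOException e) { e.printStackTrace(); }
    }
});
Thread second = new Thread(new Runnable() {
    public void run() {
        try {
            ObjectOutputStream out = new ObjectOutputStream(
                new FileOutputStream("t_b.tmp"));
            out.writeInt(foos.length - half);
            for (int i = half; i < foos.length; i++)
                foos[i].writeExternal(out);
            out.close();
        } catch (IOException e) { e.printStackTrace(); }
    }
});
long time = System.currentTimeMillis();
first.start(); second.start();
first.join(); second.join();   //wait for both halves before stopping the clock
System.out.println("Writing time: " + (System.currentTimeMillis() - time));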
Note that since you are now explicitly creating objects by calling their constructors, the instance variables in Bar are being set twice during deserialization: once at the creation of the Bar instance in Foo.readExternal(), and again when reading in the instance variable values and assigning those values. Normally you should move any Bar initialization out of the no-arg constructor to avoid redundant assignments.
Is there any way of making the deserializing faster? Well, not significantly, if you need to read in all the objects and use them all immediately. But more typically, you need only some of the objects immediately. In this case, you can use lazily initialized objects to speed up the deserializing phase (see also Section 4.5.2). The idea is that instead of combining the read with the object creation in the deserializing phase, you decouple these two operations. So each object reads in just the bytes it needs, but does not convert those bytes into objects or data until that object is actually accessed. To test this, add a new instance variable to Foo to hold the bytes between reading and converting to objects or data. You also need to change the serialization methods. I will drop support for the Serializable and Externalizable interfaces, since we are now explicitly requiring the Foo objects to serialize and deserialize themselves, and I'll add a second stream to store the size of the serialized Foo objects. Foo now looks like:
class Foo5 {
    int one;
    String two;
    Bar5[] four;
    byte[] buffer;

    //empty constructor to optimize deserialization
    public Foo5() {}

    //And constructor that creates initialized objects
    public Foo5(boolean init) {
        this();
        if (init)
            init();
    }

    public void init() {
        two = new String("START");
        one = two.length();
        four = new Bar5[2];
        four[0] = new Bar5();
        four[1] = new Bar5();
    }

    //Serialization method
    public void writeExternal(MyDataOutputStream out, DataOutputStream outSizes)
            throws IOException {
        //Get the amount written so far so that we can determine
        //the extra we write
        int size = out.written();
        //write out the Foo
        out.writeUTF(two);
        out.writeInt(one);
        out.writeUTF(four[0].two);
        out.writeFloat(four[0].one);
        out.writeUTF(four[1].two);
        out.writeFloat(four[1].one);
        //Determine how many bytes I wrote
        size = out.written() - size;
        //and write that out to our second stream
        outSizes.writeInt(size);
    }

    //Deserialization method
    public void readExternal(InputStream in, DataInputStream inSizes)
            throws IOException {
        //Determine how many bytes I consist of in serialized form
        int size = inSizes.readInt();
        //And read me into a byte buffer
        buffer = new byte[size];
        int len;
        int readlen = in.read(buffer);
        //be robust and handle the general case of partial reads
        //and incomplete streams
        if (readlen == -1)
            throw new IOException("expected more bytes");
        else
            while (readlen < buffer.length) {
                len = in.read(buffer, readlen, buffer.length - readlen);
                if (len < 1)
                    throw new IOException("expected more bytes");
                else
                    readlen += len;
            }
    }

    //This method does the deserializing of the byte buffer to a real Foo
    public void convert() throws IOException {
        DataInputStream in = new DataInputStream(
            new ByteArrayInputStream(buffer));
        two = in.readUTF();
        one = in.readInt();
        four = new Bar5[2];
        four[0] = new Bar5();
        four[1] = new Bar5();
        four[0].two = in.readUTF();
        four[0].one = in.readFloat();
        four[1].two = in.readUTF();
        four[1].one = in.readFloat();
        buffer = null;
    }
}
As you can see, I have chosen to use DataInputStreams and DataOutputStreams, since they are all that's needed. In addition, I use a subclass of DataOutputStream called MyDataOutputStream. This class adds only one method, MyDataOutputStream.written(), to provide access to the DataOutputStream.written instance variable, so you have access to the number of bytes written.
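A minimal sketch of that subclass, assuming only the one accessor described above (written is a protected field of DataOutputStream, so a subclass can expose it):

import java.io.*;

class MyDataOutputStream extends DataOutputStream {
    public MyDataOutputStream(OutputStream out) {
        super(out);
    }
    //Return the number of bytes written to this stream so far
    public int written() {
        return written;
    }
}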
The timing tests are essentially the same as before, except that you change the stream types and add a second stream for the sizes of the serialized objects (e.g., to file t2.tmp, or a second pair of byte-array input and output streams). The following chart shows the new times:
                                                        Writing (serializing)   Reading (deserializing)
Standard serialization                                  100                     175
Foo as last test, but read/write called directly        8                       36
Foo lazily initialized                                  20                      7
We have lost out on the writes because of the added complexity, but improved the reads considerably. The cost of the
Foo.convert
method has not been factored in, but the strategy illustrated here is for cases where you need to run only that convert method on a small subset of the