
also more work. In specialized cases, you might want to consider taking control of every aspect of the I/O, right down to the byte-to-char encoding (a brief sketch of the idea follows the tables), but for this you need to consider how to maintain compatibility with the JDK. Table 8-1 and Table 8-2 summarize all the results from these experiments.

Table 8-1, Timings of the Long-Line Tests Normalized to the JDK 1.2 Buffered Input Stream

  Test                           1.2     1.2 no JIT   1.3     HotSpot 1.0   HotSpot 2nd Run   1.1.6
  Unbuffered input stream        1951    3567         1684    1610          1641              1341
  Buffered input stream          100     450          52      56            45                174
  8K buffered input stream       102     477          50      45            48                225
  Buffered reader                47      409          43      74            41                43
  Custom-built reader            26      351          37      81            36                15
  Custom reader and converter    12      69           18      77            17                10

Table 8-2, Timings of the Short-Line Tests Normalized to the JDK 1.2 Buffered Input Stream

  Test                           1.2     1.2 no JIT   1.3     HotSpot 1.0   HotSpot 2nd Run   1.1.6
  Unbuffered input stream        1308    2003         1101    1326          1232              871
  Buffered input stream          100     363          33      50            54                160
  8K buffered input stream       101     367          31      41            54                231
  Buffered reader                111     554          39      149           45                127
  Custom-built reader            19      237          28      94            26                14
  Custom reader and converter    9       56           21      80            53                8
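
The custom reader and converter entries refer to taking over the byte-to-char conversion yourself when you know the data is 8-bit. The following is only an illustrative sketch of that idea, not the class used for the measurements; it assumes ASCII or ISO-8859-1 data, for which a byte can be cast directly to a char:

    import java.io.IOException;
    import java.io.InputStream;

    //Illustrative only: convert 8-bit bytes to chars with a direct cast,
    //bypassing the general char-conversion machinery.
    class Latin1Converter {
        static int readChars(InputStream in, byte[] bytes, char[] chars)
                throws IOException {
            int len = in.read(bytes);
            for (int i = 0; i < len; i++)
                chars[i] = (char) (bytes[i] & 0xFF);  //valid only for 8-bit encodings
            return len;
        }
    }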

8.4 Serialization

Objects are serialized in a number of situations in Java. The two main reasons to serialize objects are to transfer them and to store them. There are several ways to improve the performance of serialization and deserialization. First, fields that are transient do not get serialized, saving both space and time. You can consider implementing readObject() and writeObject() (see the java.io.Serializable documentation) to override the default serialization routine; it may be that you can produce a faster serialization routine for your specific objects. If you need this degree of control, you are better off using the java.io.Externalizable interface (the reason is illustrated shortly). Overriding the default serialization routine in this way is generally only worth doing for large or frequently serialized objects. The tight control this gives you may also be necessary to handle canonicalized objects correctly (to ensure objects remain canonical when deserializing them).

To transfer objects across networks, it is worth compressing the serialized objects. For large amounts of data, the transfer overhead tends to swamp the costs of compressing and decompressing the data. For storing to disk, it is worth serializing multiple objects to different files rather than to one large file: the granularity of access to individual objects and subsets of objects is often improved as well. It is also possible to serialize objects in a separate thread for storage and network transfers, letting the serialization execute in the background.

For objects whose state can change between serializations, consider using transaction logs or change-logs (logs of the differences in the objects since they were last fully serialized) rather than reserializing the whole object each time. This works much like the way full and incremental backups work. You need to maintain the changes somewhere, of course, so it makes the objects more complicated, but this complexity can have a very good payback in terms of performance: consider how much faster an incremental backup is compared to a full backup.

It is worthwhile to spend some time on a basic serialization tuning exercise. I chose a couple of fairly simple objects to serialize, but they are representative of the sorts of issues that crop up in serialization:

    class Foo1 implements Serializable {
        int one;
        String two;
        Bar1[] four;

        public Foo1() {
            two = new String("START");
            one = two.length();
            four = new Bar1[2];
            four[0] = new Bar1();
            four[1] = new Bar1();
        }
    }

    class Bar1 implements Serializable {
        float one;
        String two;

        public Bar1() {
            two = new String("hello");
            one = 3.14F;
        }
    }

Note that I have given the objects default initial values for the tuning tests. The defaults assigned to the various String variables are forced to be unique for every object by making them new Strings. Without doing this, the compiler assigns the identical String to every object. That alters the timings: only one String is written on output, and when created on input, all the other String references refer to the same string by identity. (Java serialization can maintain the relative identity of objects that are serialized together.) Using identical Strings would make the serialization tests quicker, but would not be representative of normal serializations.

Test measurements are easily skewed by rewriting previously written objects. Previously written objects are not converted and written out again; instead, only a reference to the original object is written. Writing this reference can be faster than writing out the object again. The speed is even more skewed on reading, since only one object gets created: all the other references refer to the same uniquely created object.
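
You can see this reference-writing behavior directly. The following snippet is not part of the tuning tests; it simply writes the same String twice to an ObjectOutputStream and compares how many bytes each write adds (the second write emits only a back-reference handle):

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.ObjectOutputStream;

    class IdentityDemo {
        public static void main(String[] args) throws IOException {
            String shared = new String("START");
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);

            out.writeObject(shared);
            out.flush();
            int firstWrite = bytes.size();   //stream header plus the full string data

            out.writeObject(shared);         //already written: only a handle is emitted
            out.flush();
            int secondWrite = bytes.size() - firstWrite;

            System.out.println("first write " + firstWrite
                + " bytes, second write " + secondWrite + " bytes");
        }
    }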

Early in my career, I was set the task of testing the throughput of an object database. The first tests registered a fantastically high throughput, until we realized that we were storing just a few objects once, and all the other objects we thought we were storing were only references to those first few.

The Foo objects each contain two Bar objects in an array, to make the overall objects slightly more representative of real-world objects. I'll make a baseline using the standard serialization technique:

    OutputStream ostream;
    if (toDisk)
        ostream = new FileOutputStream("t.tmp");
    else
        ostream = new ByteArrayOutputStream();
    ObjectOutputStream wrtr = new ObjectOutputStream(ostream);
    long time = System.currentTimeMillis();

    //write objects: time only the next 3 lines for the serialization output
    wrtr.writeObject(lotsOfFoos);
    wrtr.flush();
    wrtr.close();

    System.out.println("Writing time: " + (System.currentTimeMillis() - time));

    InputStream istream;
    if (toDisk)
        istream = new FileInputStream("t.tmp");
    else
        istream = new ByteArrayInputStream(
            ((ByteArrayOutputStream) ostream).toByteArray());
    ObjectInputStream rdr = new ObjectInputStream(istream);
    time = System.currentTimeMillis();

    //read objects: time only the next 2 lines for the serialization input
    Foo1[] allFoos = (Foo1[]) rdr.readObject();
    rdr.close();

    System.out.println("Reading time: " + (System.currentTimeMillis() - time));

As you can see, I provide for running the tests either to disk or purely in memory. This allows you to break down the cost into separate components. The actual values revealed that 95% of the time is spent in the serialization; less than 5% is the actual write to disk (of course, the relative times are system-dependent, but these results are probably representative). When measuring, I used a pregrown ByteArrayOutputStream so that there were no effects from allocating the byte array in memory. Furthermore, to eliminate extra memory copying and garbage-collection effects, I reused the same ByteArrayOutputStream, and indeed the same byte array from that ByteArrayOutputStream object, for reading. The byte array is accessible by subclassing ByteArrayOutputStream and providing an accessor to the ByteArrayOutputStream.buf instance variable (a small sketch of such a subclass follows at the end of this discussion). The results of this first test for JDK 1.2 are:[13]

[13] Table 8-3 lists the full results of tests with a variety of VMs. I have used the 1.2 results for discussion in this section, and the results are generally applicable to the other VMs tested.

                              Writing (serializing)   Reading (deserializing)
  Standard serialization      100                     175

I have normalized the baseline measurements to 100 for the byte array output (i.e., serializing the collection of Foos). On this scale, the reading (deserializing) takes 175. This is not what I expected, because I am used to the idea that writing takes longer than reading. Thinking about exactly what is happening, you can see that for the serialization, you take the data in some objects and write that data out to a stream of bytes, which basically accesses and converts objects into bytes. But for the deserialization, you access elements of a byte array and convert these into other object and data types, including creating any required objects. Add to this the fact that the serializing procedures are much more costly than the actual disk writes and reads, and it becomes understandable that deserialization is likely to be the more intensive, and consequently slower, activity.
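
As mentioned, the reused byte array is reachable by subclassing ByteArrayOutputStream, whose buf instance variable is protected. A minimal sketch of such a subclass (the class name is illustrative, not the exact one used for the tests):

    import java.io.ByteArrayOutputStream;

    //Expose the internal buffer (and pregrow it) so the same byte array
    //can be handed straight to a ByteArrayInputStream for the read tests.
    class AccessibleByteArrayOutputStream extends ByteArrayOutputStream {
        public AccessibleByteArrayOutputStream(int initialCapacity) {
            super(initialCapacity);   //pregrown to avoid resizing during the test
        }
        public byte[] getBuffer() {
            return buf;               //the internal byte array, without copying
        }
    }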

Considering exactly what the ObjectInputStream and ObjectOutputStream must do, I realize that they are accessing and updating internal elements of the objects they are serializing, without knowing anything about those objects beforehand. This means there must be heavy usage of the java.lang.reflect package, together with some internal VM access procedures (since serialization can reach private and protected fields and methods).[14] All this suggests that you should be able to improve performance by taking explicit control of the serializing.

[14] The actual code is difficult and time-consuming to work through. It was written in parts as one huge iterated/recursed switch, probably for performance reasons.

Alert readers might have noticed that Foo and Bar have constructors that initialize the object, and may be wondering whether deserialization could be speeded up by changing the constructors to avoid the unnecessary overhead there. In fact, deserialization uses internal VM access to create the objects without going through the constructor, similar to cloning the objects. Although the Serializable interface requires serializable objects to have no-arg constructors, deserialized objects do not actually use that or any other constructor.

To start with, the Serializable interface supports two methods that allow classes to handle their own serializing, so the first step is to try these methods. Add the following two methods to Foo:

    private void writeObject(java.io.ObjectOutputStream out)
            throws IOException {
        out.writeUTF(two);
        out.writeInt(one);
        out.writeObject(four);
    }

    private void readObject(java.io.ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        two = in.readUTF();
        one = in.readInt();
        four = (Bar2[]) in.readObject();
    }

Bar needs the equivalent two methods:

    private void writeObject(java.io.ObjectOutputStream out)
            throws IOException {
        out.writeUTF(two);
        out.writeFloat(one);
    }

    private void readObject(java.io.ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        two = in.readUTF();
        one = in.readFloat();
    }

The following chart shows the results of running the test with these methods added to the classes:

                                                  Writing (serializing)   Reading (deserializing)
  Standard serialization                          100                     175
  Customized read/writeObject in Foo and Bar      125                     148

We have improved the reads but made the writes worse. I expected an improvement for both, and I cannot explain why the writes are worse, other than perhaps that the ObjectOutputStream class has suboptimal performance for this method-overriding feature. Rather than analyzing what the ObjectOutputStream class may be doing, let's try further optimizations. Examining and manipulating objects during serialization takes more time than the actual conversion of data to or from streams. Considering this, and looking at the customized serializing methods, you can see that the Foo methods simply pass control back to the default serializing mechanism to handle the embedded Bar objects. It may be worth handling that serializing more explicitly too. For this example, I'll break encapsulation by accessing the Bar fields directly (although going through accessors and updators, or calling serialization methods in Bar, would not make much difference in time here).

I redefine the Foo serializing methods as:

    private void writeObject(java.io.ObjectOutputStream out)
            throws IOException {
        out.writeUTF(two);
        out.writeInt(one);
        out.writeUTF(four[0].two);
        out.writeFloat(four[0].one);
        out.writeUTF(four[1].two);
        out.writeFloat(four[1].one);
    }

    private void readObject(java.io.ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        two = in.readUTF();
        one = in.readInt();
        four = new Bar3[2];
        four[0] = new Bar3();
        four[1] = new Bar3();
        four[0].two = in.readUTF();
        four[0].one = in.readFloat();
        four[1].two = in.readUTF();
        four[1].one = in.readFloat();
    }

The Foo methods now handle serialization for both Foo and the embedded Bar objects, so the equivalent methods in Bar are now redundant. The following chart illustrates the results of running the test with these altered methods added to the classes (Table 8-3 lists the full results of tests with a variety of VMs):

                                                       Writing (serializing)   Reading (deserializing)
  Standard serialization                               100                     175
  Customized read/writeObject in Foo and Bar           125                     148
  Customized read/writeObject in Foo handling Bar      31                      59

This gives a clearer feel for the costs of dynamic object examination and manipulation. Given the overhead the serializing I/O classes incur, it has now become obvious that the more of the serializing you handle explicitly, the better off you are. This being the case, the next step is to ask the objects to serialize themselves explicitly, rather than going through ObjectInputStream and ObjectOutputStream to have them in turn ask the objects to serialize themselves.

The readObject() and writeObject() methods must be defined as private according to the Serializable documentation, so they cannot be called directly. You must either wrap them in another public method or copy the implementation into another method that you can access directly. But in fact, java.io provides a third alternative. The Externalizable interface also provides support for serializing objects using ObjectInputStream and ObjectOutputStream, but Externalizable defines two public methods rather than the two private methods required by Serializable. So you can just change the names of the two methods: readObject(ObjectInputStream) becomes readExternal(ObjectInput), and writeObject(ObjectOutputStream) becomes writeExternal(ObjectOutput). You must also redefine Foo as implementing Externalizable instead of Serializable (a sketch of the resulting class appears below). These are all simple changes, but to be sure that nothing untoward has happened as a consequence, rerun the tests, as good tuners should for any change, even a minor one. The following chart shows the new test results:

                                                          Writing (serializing)   Reading (deserializing)
  Standard serialization                                  100                     175
  Customized read/writeObject in Foo handling Bar         31                      59
  Foo made Externalizable, using last methods renamed     28                      46

Remarkably, the times are significantly faster. This probably reflects the improvement you get from being able to compile and execute a line such as:

    ((Externalizable) someObject).writeExternal(this);

in the ObjectOutputStream class, rather than having to go through java.lang.reflect and the VM internals to reach the private writeObject() method. This example also shows that you are better off making your classes Externalizable rather than Serializable if you want to control your own serializing.
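
For illustration, here is roughly what the externalizable Foo looks like after these changes, assuming the same fields and the customized method bodies shown earlier (the original tests may have numbered the class differently):

    import java.io.Externalizable;
    import java.io.IOException;
    import java.io.ObjectInput;
    import java.io.ObjectOutput;

    class Foo implements Externalizable {
        int one;
        String two;
        Bar3[] four;

        //Externalizable deserialization does call the public no-arg constructor
        public Foo() {
            two = new String("START");
            one = two.length();
            four = new Bar3[2];
            four[0] = new Bar3();
            four[1] = new Bar3();
        }

        //same body as the customized writeObject(), but public, taking an ObjectOutput
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeUTF(two);
            out.writeInt(one);
            out.writeUTF(four[0].two);
            out.writeFloat(four[0].one);
            out.writeUTF(four[1].two);
            out.writeFloat(four[1].one);
        }

        //same body as the customized readObject(), but public, taking an ObjectInput
        public void readExternal(ObjectInput in)
                throws IOException, ClassNotFoundException {
            two = in.readUTF();
            one = in.readInt();
            four = new Bar3[2];
            four[0] = new Bar3();
            four[1] = new Bar3();
            four[0].two = in.readUTF();
            four[0].one = in.readFloat();
            four[1].two = in.readUTF();
            four[1].one = in.readFloat();
        }
    }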

The drawback to controlling your own serializing is a significantly higher maintenance cost, as any change to the class structure also requires changes to the two Externalizable methods (or the two methods supported by Serializable). In some cases, as in the example presented in this tuning exercise, changes to the structure of one class actually require changes to the Externalizable methods of another class: if the structure of Bar is changed, the Externalizable methods in Foo must also be changed to reflect Bar's new structure. Here, you could avoid the dependency between the classes by having the Foo serialization methods call the Bar serialization methods directly, but the general fragility of serialization when individual class structures change still remains.

We changed the methods in the first place to give them public access so that they could be called directly. Let's continue with this task. Now, for the first time, you will change the actual test code, rather than anything in the Foo or Bar classes. The new test looks like this:

    OutputStream ostream;
    if (toDisk)
        ostream = new FileOutputStream("t.tmp");
    else
        ostream = new ByteArrayOutputStream();
    ObjectOutputStream wrtr = new ObjectOutputStream(ostream);

    //The old version of the test just ran the next commented line
    //to write the objects
    //wrtr.writeObject(lotsOfFoos);

    long time = System.currentTimeMillis();

    //This new version writes the size of the array, then each object
    //explicitly writes itself. Time these five lines for the serialization output.
    wrtr.writeInt(lotsOfFoos.length);
    for (int i = 0; i < lotsOfFoos.length; i++)
        lotsOfFoos[i].writeExternal(wrtr);
    wrtr.flush();
    wrtr.close();

    System.out.println("Writing time: " + (System.currentTimeMillis() - time));

    InputStream istream;
    if (toDisk)
        istream = new FileInputStream("t.tmp");
    else
        istream = new ByteArrayInputStream(
            ((ByteArrayOutputStream) ostream).toByteArray());
    ObjectInputStream rdr = new ObjectInputStream(istream);

    //The old version of the test just ran the next commented line
    //to read the objects
    //Foo1[] allFoos = (Foo1[]) rdr.readObject();

    time = System.currentTimeMillis();

    //This new version reads the size of the array and creates the array;
    //then each object is explicitly created and reads itself.
    //Time these ten lines, down to the close(), for the serialization input.
    int len = rdr.readInt();
    Foo[] allFoos = new Foo[len];
    Foo foo;
    for (int i = 0; i < len; i++) {
        foo = new Foo();
        foo.readExternal(rdr);
        allFoos[i] = foo;
    }
    rdr.close();

    System.out.println("Reading time: " + (System.currentTimeMillis() - time));

This test bypasses the serialization overhead completely. You are still using the ObjectInputStream and ObjectOutputStream classes, but really only to write out basic data types, not for any of their object-manipulation capabilities. If you didn't require those specific classes because of the method signatures, you could happily have used the DataInputStream and DataOutputStream classes for this test. The following chart shows the test results:

                                                             Writing (serializing)   Reading (deserializing)
  Standard serialization                                     100                     175
  Foo made Externalizable, using last methods renamed        28                      46
  Foo as last test, but read/write called directly in test   8                       36

If you test serializing to and from the disk, you find that the disk I/O now takes nearly one-third of the total test time. Because disk I/O is now a significant portion of the total time, the CPU is underworked, and you can even gain some speedup by serializing in several threads, i.e., you can divide the collection evenly into two or more subsets and have each subset serialized by a separate thread (I leave that as an exercise for you).
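
One possible shape for the writing half of that exercise, in which each thread serializes its own subset of the Foo array to its own stream, is sketched below. This is only an illustration: the class, the file names, and the two-way split are all hypothetical.

    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.ObjectOutputStream;

    //Each thread writes a contiguous subset [from, to) of the array to its own file.
    class FooWriterThread extends Thread {
        private final Foo[] foos;
        private final int from, to;
        private final String fileName;

        FooWriterThread(Foo[] foos, int from, int to, String fileName) {
            this.foos = foos;
            this.from = from;
            this.to = to;
            this.fileName = fileName;
        }

        public void run() {
            try {
                ObjectOutputStream out =
                    new ObjectOutputStream(new FileOutputStream(fileName));
                out.writeInt(to - from);
                for (int i = from; i < to; i++)
                    foos[i].writeExternal(out);
                out.flush();
                out.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    //usage: split the collection in two and wait for both threads to finish
    //FooWriterThread t1 =
    //    new FooWriterThread(lotsOfFoos, 0, lotsOfFoos.length / 2, "t_a.tmp");
    //FooWriterThread t2 =
    //    new FooWriterThread(lotsOfFoos, lotsOfFoos.length / 2, lotsOfFoos.length, "t_b.tmp");
    //t1.start(); t2.start();
    //t1.join(); t2.join();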

Note that since you are now explicitly creating objects by calling their constructors, the instance variables in Bar are set twice during deserialization: once at the creation of the Bar instances in Foo.readExternal(), and again when the instance variable values are read in and assigned. Normally you should move any Bar initialization out of the no-arg constructor to avoid the redundant assignments.

Is there any way of making the deserializing faster? Not significantly, if you need to read in all the objects and use them all immediately. But more typically, you need only some of the objects immediately. In that case, you can use lazily initialized objects to speed up the deserializing phase (see also Section 4.5.2). The idea is that instead of combining the read with the object creation in the deserializing phase, you decouple the two operations: each object reads in just the bytes it needs, but does not convert those bytes into objects or data until the object is actually accessed.

To test this, add a new instance variable to Foo to hold the bytes between reading them and converting them to objects or data. You also need to change the serialization methods. I will drop support for the Serializable and Externalizable interfaces, since we are now explicitly requiring the Foo objects to serialize and deserialize themselves, and I'll add a second stream to store the sizes of the serialized Foo objects. Foo now looks like:

    class Foo5 {
        int one;
        String two;
        Bar5[] four;
        byte[] buffer;

        //empty constructor to optimize deserialization
        public Foo5() {}

        //and a constructor that creates initialized objects
        public Foo5(boolean init) {
            this();
            if (init)
                init();
        }

        public void init() {
            two = new String("START");
            one = two.length();
            four = new Bar5[2];
            four[0] = new Bar5();
            four[1] = new Bar5();
        }

        //Serialization method
        public void writeExternal(MyDataOutputStream out, DataOutputStream outSizes)
                throws IOException {
            //Get the amount written so far, so that we can determine
            //how much extra we write
            int size = out.written();
            //write out the Foo
            out.writeUTF(two);
            out.writeInt(one);
            out.writeUTF(four[0].two);
            out.writeFloat(four[0].one);
            out.writeUTF(four[1].two);
            out.writeFloat(four[1].one);
            //Determine how many bytes I wrote
            size = out.written() - size;
            //and write that out to our second stream
            outSizes.writeInt(size);
        }

        public void readExternal(InputStream in, DataInputStream inSizes)
                throws IOException {
            //Determine how many bytes I consist of in serialized form
            int size = inSizes.readInt();
            //And read me into a byte buffer
            buffer = new byte[size];
            int len;
            int readlen = in.read(buffer);
            //be robust and handle the general case of partial reads
            //and incomplete streams
            if (readlen == -1)
                throw new IOException("expected more bytes");
            else
                while (readlen < buffer.length) {
                    len = in.read(buffer, readlen, buffer.length - readlen);
                    if (len < 1)
                        throw new IOException("expected more bytes");
                    else
                        readlen += len;
                }
        }

        //This method does the deserializing of the byte buffer to a real Foo
        public void convert() throws IOException {
            DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(buffer));
            two = in.readUTF();
            one = in.readInt();
            four = new Bar5[2];
            four[0] = new Bar5();
            four[1] = new Bar5();
            four[0].two = in.readUTF();
            four[0].one = in.readFloat();
            four[1].two = in.readUTF();
            four[1].one = in.readFloat();
            buffer = null;
        }
    }

As you can see, I have chosen to use DataInputStreams and DataOutputStreams, since they are all that's needed. In addition, I use a subclass of DataOutputStream called MyDataOutputStream. This class adds only one method, MyDataOutputStream.written(), to provide access to the DataOutputStream.written instance variable, so that you have access to the number of bytes written.
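
A minimal sketch of that subclass (written is a protected field of DataOutputStream, so exposing it is trivial; note that DataOutputStream.size() returns the same counter):

    import java.io.DataOutputStream;
    import java.io.OutputStream;

    class MyDataOutputStream extends DataOutputStream {
        public MyDataOutputStream(OutputStream out) {
            super(out);
        }
        //accessor for the protected DataOutputStream.written byte count
        public int written() {
            return written;
        }
    }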

The timing tests are essentially the same as before, except that you change the stream types and add a second stream for the sizes of the serialized objects (e.g., to file t2.tmp, or to a second pair of byte-array input and output streams). The following chart shows the new times:

                                                             Writing (serializing)   Reading (deserializing)
  Standard serialization                                     100                     175
  Foo as last test, but read/write called directly in test   8                       36
  Foo lazily initialized                                     20                      7

We have lost out on the writes because of the added complexity, but improved the reads considerably. The cost of the Foo.convert() method has not been factored in, but the strategy illustrated here is for cases where you need to run only that convert method on a small subset of the