
10.4.1 The Data Format

The first step is to discuss what gets written to the stream when an instance is serialized. Be warned: it's a lot more information than you might guess from the previous discussion.

An important part of serialization involves writing out class-related metadata associated with an instance. Most instances are instances of more than one class. For example, an instance of String is also an instance of Object. Any given instance, however, is an instance of only a few classes. These classes can be written as a sequence: C1, C2 ... CN, in which C1 is a superclass of C2, C2 is a superclass of C3, and so on. This is actually a linear sequence because Java is a single inheritance language for classes. We call C1 the least superclass and CN the most-derived class. See Figure 10-4.

Figure 10-4. Inheritance diagram

After writing out the associated class information, the serialization mechanism stores the following information for each instance:

• A description of the most-derived class.
• Data associated with the instance, interpreted as an instance of the least superclass.
• Data associated with the instance, interpreted as an instance of the second least superclass.

And so on until:

• Data associated with the instance, interpreted as an instance of the most-derived class.

So what really happens is that the type of the instance is stored, and then all the serializable state is stored in discrete chunks that correspond to the class structure. But there's a question still remaining: what do we mean by a description of the most-derived class? This is either a reference to a class description that has already been recorded (e.g., an earlier location in the stream) or the following information:

• The version ID of the class, which is an integer used to validate the .class files
• A boolean stating whether writeObject() and readObject() are implemented
• The number of serializable fields
• A description of each field (its name and type)
• Extra data produced by ObjectOutputStream's annotateClass() method
• A description of its superclass, if the superclass is serializable

This should, of course, immediately seem familiar. The class descriptions consist entirely of metadata that allows the instance to be read back in. In fact, this is one of the most beautiful aspects of serialization: the serialization mechanism automatically, at runtime, converts class objects into metadata so instances can be serialized with the least amount of programmer work.
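Much of the class description outlined above can be inspected at runtime through java.io.ObjectStreamClass. The following is a minimal sketch, not the serialization mechanism's own code: Shape and Circle are hypothetical classes invented for illustration, and serializableChain() and describe() are helper names assumed here. It walks the serializable part of the C1 ... CN sequence, least superclass first, and reports each class's version ID and serializable fields.

```java
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ClassDescriptionDemo {

    // Hypothetical example classes; they do not come from the text.
    static class Shape implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
    }

    static class Circle extends Shape {
        private static final long serialVersionUID = 2L;
        double radius;
    }

    /** Returns the serializable classes of type, least superclass first. */
    static List<Class<?>> serializableChain(Class<?> type) {
        List<Class<?>> chain = new ArrayList<>();
        // Walk upward from the most-derived class until a
        // non-serializable superclass (such as Object) is reached.
        for (Class<?> c = type;
             c != null && Serializable.class.isAssignableFrom(c);
             c = c.getSuperclass()) {
            chain.add(c);
        }
        Collections.reverse(chain);
        return chain;
    }

    /** Summarizes the runtime descriptor of one serializable class. */
    static String describe(Class<?> type) {
        // lookup() returns null for classes that are not serializable;
        // callers here only pass classes from serializableChain().
        ObjectStreamClass desc = ObjectStreamClass.lookup(type);
        StringBuilder sb = new StringBuilder();
        sb.append(desc.getName())
          .append(" versionID=").append(desc.getSerialVersionUID());
        // getFields() lists only the serializable fields of this class,
        // not those inherited from its superclasses.
        for (ObjectStreamField f : desc.getFields()) {
            sb.append(' ').append(f.getName())
              .append(':').append(f.getType().getSimpleName());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (Class<?> c : serializableChain(Circle.class)) {
            System.out.println(describe(c));
        }
    }
}
```

Running main() prints one line per class in the chain, which mirrors the way instance data is stored in per-class chunks. The sketch does not show the writeObject()/readObject() flag or the annotateClass() data, because ObjectStreamClass does not expose them through public accessors.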

10.4.2 A Simplified Version of the Serialization Algorithm