The last group of methods consists mostly of protected methods that provide hooks. These hooks allow the serialization mechanism itself, rather than the data associated with a particular class, to be customized. These methods are:
    protected boolean enableResolveObject(boolean enable);
    protected Class resolveClass(ObjectStreamClass v);
    protected Object resolveObject(Object obj);
    protected Class resolveProxyClass(String[] interfaces);
    protected ObjectStreamClass readClassDescriptor();
    protected Object readObjectOverride();
    protected void readStreamHeader();
    public void registerValidation(ObjectInputValidation obj, int priority);
    public GetFields readFields();
These methods matter most to people who tailor the serialization algorithm to a particular use or develop their own implementation of serialization. As before, they also require a deeper understanding of the serialization algorithm, so I'll hold off on discussing them right now.
10.3 How to Make a Class Serializable
So far, we've focused on the mechanics of serializing an object. We've assumed we have a serializable object and discussed, from the point of view of client code, how to serialize it. The next step is discussing how to make a class serializable.
There are four basic things you must do when you are making a class serializable. They are:
1. Implement the Serializable interface.
2. Make sure that instance-level, locally defined state is serialized properly.
3. Make sure that superclass state is serialized properly.
4. Override equals() and hashCode().
Let's look at each of these steps in more detail.
10.3.1 Implement the Serializable Interface
This is by far the easiest of the steps. The Serializable interface is an empty interface; it declares no methods at all. So implementing it amounts to adding implements Serializable to your class declaration.
Reasonable people may wonder about the utility of an empty interface. Rather than define an empty interface and require class definitions to implement it, why not simply make every object serializable? The main reason not to do this is that some classes don't have an obvious serialization. Consider, for example, an instance of File. An instance of File represents a file. Suppose it was created using the following line of code:
    File file = new File("c:\\temp\\foo");
It's not at all clear what should be written out when this is serialized. The problem is that the file itself has a different lifecycle than the serialized data. The file might be edited, or deleted entirely, while the serialized information remains unchanged. Or the serialized information might be used to restart the application on another machine, where C:\\temp\\foo is the name of an entirely different file.
Another example is provided by the Thread[4] class. A thread represents a flow of execution within a particular JVM. To serialize one, you would not only have to store the stack and all the local variables, but also all the related locks and threads, and then restart all the threads properly when the instance is deserialized.
[4] If you don't know much about threads, just wait a few chapters and then revisit this example. It will make more sense then.
Things get worse when you consider platform dependencies. In general, any class that involves native code is not really a
good candidate for serialization.
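As a minimal sketch of what the empty interface does in practice, the following compares two otherwise identical classes, one of which opts in and one of which doesn't. The class names PlainPoint and SerializablePoint and the helper canSerialize() are hypothetical and exist only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class MarkerInterfaceDemo {
    // Hypothetical class: same fields as SerializablePoint, but no marker interface.
    static class PlainPoint {
        int x = 1;
        int y = 2;
    }

    // Hypothetical class: opting in takes nothing more than "implements Serializable".
    static class SerializablePoint implements Serializable {
        private static final long serialVersionUID = 1L;
        int x = 1;
        int y = 2;
    }

    // Returns true if the object can be written to an ObjectOutputStream.
    static boolean canSerialize(Object obj) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false; // the object's class never opted in
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("SerializablePoint: " + canSerialize(new SerializablePoint()));
        System.out.println("PlainPoint:        " + canSerialize(new PlainPoint()));
    }
}
```

The serialization mechanism refuses the class that lacks the marker by throwing NotSerializableException, which is how classes such as Thread stay out of the stream.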
10.3.2 Make Sure That Instance-Level, Locally Defined State Is Serialized Properly
Class definitions contain variable declarations. The instance-level, locally defined variables (i.e., the nonstatic variables) are the ones that contain the state of a particular instance. For example, in our Money class, we declared one such field:
    public class Money extends ValueObject {
        private int _cents;
        ....
    }
The serialization mechanism has a nice default behavior: if all the instance-level, locally defined variables have values that are either serializable objects or primitive datatypes, then the serialization mechanism will work without any further effort on our part. For example, our implementations of Account, such as Account_Impl, would present no problems for the default serialization mechanism:
    public class Account_Impl extends UnicastRemoteObject implements Account {
        private Money _balance;
        ...
    }
While _balance doesn't have a primitive type, it does refer to an instance of Money, which is a serializable class.
If, however, some of the fields don't have primitive types and don't refer to serializable classes, more work may be necessary. Consider, for example, the implementation of ArrayList from the java.util package. An ArrayList really has only two pieces of state:
    public class ArrayList extends AbstractList implements List, Cloneable,
            java.io.Serializable {
        private Object elementData[];
        private int size;
        ...
    }
But hidden in here is a huge problem: ArrayList is a generic container class whose state is stored as an array of objects. While arrays are first-class objects in Java, an Object[] can be serialized only if every element it holds is serializable, and writing the array as-is would also write out every unused slot. This means that ArrayList can't just implement the Serializable interface and rely on the default behavior. It has to provide extra information to help the serialization mechanism handle this field.
There are three basic solutions to this problem:
• Fields can be declared to be transient.
• The writeObject()/readObject() methods can be implemented.
• serialPersistentFields can be declared.
10.3.2.1 Declaring transient fields
The first, and easiest, thing you can do is simply mark some fields using the transient keyword. In ArrayList, for example, elementData is really declared to be a transient field:
    public class ArrayList extends AbstractList implements List, Cloneable,
            java.io.Serializable {
        private transient Object elementData[];
        private int size;
        ...
    }
This tells the default serialization mechanism to ignore the variable. In other words, the serialization mechanism simply skips over the transient variables. In the case of ArrayList, the default serialization mechanism would attempt to write out size, but ignore elementData entirely.
This can be useful in two, usually distinct, situations:
The variable isn't serializable
    If the variable isn't serializable, then the serialization mechanism will throw an exception when it tries to serialize the variable. To avoid this, you can declare the variable to be transient.
The variable is redundant
    Suppose that the instance caches the result of a computation. Locally, we might want to store the result of the computation in order to save some processor time. But when we send the object over the wire, we might worry more about consuming bandwidth and thus discard the cached computation, since we can always regenerate it later on.
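The redundant-variable case can be sketched as follows. Circle, its cachedArea field, and the roundTrip() helper are hypothetical names chosen for this illustration; the point is only that a transient cache is dropped on the way out and recomputed on demand after deserialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientCacheDemo {
    // Hypothetical value class; the names are illustrative only.
    static class Circle implements Serializable {
        private static final long serialVersionUID = 1L;
        private final double radius;
        // Redundant state: it can always be recomputed from radius, so it is skipped.
        private transient Double cachedArea;

        Circle(double radius) {
            this.radius = radius;
        }

        double area() {
            if (cachedArea == null) {
                cachedArea = Math.PI * radius * radius; // compute once, then reuse
            }
            return cachedArea;
        }

        boolean isAreaCached() {
            return cachedArea != null;
        }
    }

    // Serializes obj to a byte array and reads it straight back.
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Circle original = new Circle(2.0);
        original.area(); // fills the cache locally
        Circle copy = (Circle) roundTrip(original);
        System.out.println("cache survived the trip: " + copy.isAreaCached());
        System.out.println("area still available:    " + copy.area());
    }
}
```

Because the accessor tolerates an empty cache, the copy behaves exactly like the original; a transient field that the class cannot rebuild would need the techniques described next.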
10.3.2.2 Implementing writeObject and readObject
Suppose that the first case applies. A field takes values that aren't serializable. If the field is still an important part of the state of our instance, such as elementData in the case of an ArrayList, simply declaring the variable to be transient isn't good enough. We need to save and restore the state stored in the variable. This is done by implementing a pair of methods with the following signatures:
    private void writeObject(java.io.ObjectOutputStream out) throws IOException;
    private void readObject(java.io.ObjectInputStream in) throws IOException,
            ClassNotFoundException;
When the serialization mechanism starts to write out an object, it will check to see whether the class implements writeObject(). If so, the serialization mechanism will not use the default mechanism and will not write out any of the instance variables. Instead, it will call writeObject() and depend on the method to store out all the important state. Here is ArrayList's implementation of writeObject():
    private synchronized void writeObject(java.io.ObjectOutputStream stream)
            throws java.io.IOException {
        stream.defaultWriteObject();
        stream.writeInt(elementData.length);
        for (int i = 0; i < size; i++)
            stream.writeObject(elementData[i]);
    }
The first thing this does is call defaultWriteObject(). defaultWriteObject() invokes the default serialization mechanism, which serializes all the nontransient, nonstatic instance variables. Next, the method writes out elementData.length and then calls the stream's writeObject() for each of the size elements stored in elementData.
There's an important point here that is sometimes missed: readObject() and writeObject() are a pair of methods that need to be implemented together. If you do any customization of serialization inside one of these methods, you need to implement the other method. If you don't, the serialization algorithm will fail.
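To show the two methods working as a pair, here is a minimal sketch of a list-like container modeled loosely on the ArrayList code above. SimpleList and the roundTrip() helper are hypothetical, and the readObject() shown is an assumption about how the matching read side could look, not the JDK's actual source.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

public class WriteReadPairDemo {
    // Hypothetical container; only the first "size" slots of elementData are in use.
    static class SimpleList implements Serializable {
        private static final long serialVersionUID = 1L;
        private transient Object[] elementData = new Object[10];
        private int size;

        void add(Object element) {
            if (size == elementData.length) {
                elementData = Arrays.copyOf(elementData, size * 2);
            }
            elementData[size++] = element;
        }

        Object get(int index) {
            if (index < 0 || index >= size) {
                throw new IndexOutOfBoundsException("index " + index);
            }
            return elementData[index];
        }

        int size() {
            return size;
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();            // writes the nontransient field: size
            out.writeInt(elementData.length);    // capacity of the backing array
            for (int i = 0; i < size; i++) {
                out.writeObject(elementData[i]); // only the slots actually in use
            }
        }

        // Must read back exactly what writeObject wrote, in the same order.
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();              // restores size
            int capacity = in.readInt();
            elementData = new Object[Math.max(capacity, size)];
            for (int i = 0; i < size; i++) {
                elementData[i] = in.readObject();
            }
        }
    }

    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        SimpleList list = new SimpleList();
        list.add("first");
        list.add(42);
        SimpleList copy = (SimpleList) roundTrip(list);
        System.out.println(copy.size() + " elements: " + copy.get(0) + ", " + copy.get(1));
    }
}
```

Note that the field initializer for elementData does not run during deserialization, so readObject() has to allocate the array itself.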
Unit Tests and Serialization
Unit tests are used to test a specific piece of functionality in a class. They are explicitly not end-to-end or application-level tests. It's often a good idea to adopt a unit-testing harness such as JUnit when developing an application. JUnit gives you an automated way to run unit tests on individual classes and is available from http://www.junit.org. If you adopt a unit-testing methodology, then any serializable class should pass the following three tests:
• If it implements readObject(), it should implement writeObject(), and vice versa.
• It is equal (using the equals() method) to a serialized copy of itself.
• It has the same hashcode as a serialized copy of itself.
Similar constraints hold for classes that implement the Externalizable interface.
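The three checks can be sketched in plain Java, without depending on JUnit, so that the example stays self-contained. Cents is a hypothetical stand-in for whatever serializable class is under test, and serializedCopy(), declares(), and hooksArePaired() are helper names invented for this sketch; in a real project the same checks would live in test methods of your harness.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationChecks {
    // Hypothetical value class standing in for any serializable class under test.
    static final class Cents implements Serializable {
        private static final long serialVersionUID = 1L;
        private final int cents;

        Cents(int cents) {
            this.cents = cents;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof Cents && ((Cents) other).cents == cents;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(cents);
        }
    }

    // Writes obj to a byte array and reads back an independent copy.
    static Object serializedCopy(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    // True if the class itself declares a method with this name and single parameter.
    static boolean declares(Class<?> type, String name, Class<?> parameter) {
        try {
            type.getDeclaredMethod(name, parameter);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // Test 1: writeObject and readObject are either both present or both absent.
    static boolean hooksArePaired(Class<?> type) {
        return declares(type, "writeObject", ObjectOutputStream.class)
                == declares(type, "readObject", ObjectInputStream.class);
    }

    public static void main(String[] args) {
        Cents original = new Cents(1250);
        Object copy = serializedCopy(original);
        System.out.println("hooks paired:  " + hooksArePaired(Cents.class));
        System.out.println("equals copy:   " + original.equals(copy));                   // test 2
        System.out.println("same hashCode: " + (original.hashCode() == copy.hashCode())); // test 3
    }
}
```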
10.3.2.3 Declaring serialPersistentFields
The final option is to explicitly declare which fields should be stored by the serialization mechanism. This is done using a special static final variable called serialPersistentFields, as shown in the following code snippet:
    private static final ObjectStreamField[] serialPersistentFields =
        { new ObjectStreamField("size", Integer.TYPE), .... };
This line of code declares that the field named size, which is of type int, is a serial persistent field and will be written to the output stream by the serialization mechanism. Declaring serialPersistentFields is almost the opposite of declaring some fields transient. The meaning of transient is, "This field shouldn't be stored by serialization," and the meaning of serialPersistentFields is, "These fields should be stored by serialization."
But there is one important difference between declaring some variables to be transient and others to be serialPersistentFields. In order to declare variables to be transient, they must be locally declared. In other words, you must have access to the code that declares the variable. There is no such requirement for serialPersistentFields. You simply provide the name of the field and the type.
What if you try to do both? That is, suppose you declare some variables to be transient, and then also provide a definition for serialPersistentFields? The answer is that the transient keyword is ignored; the definition of serialPersistentFields is definitive.
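A minimal sketch of serialPersistentFields in use follows. Buffer and its two fields are hypothetical; the example only demonstrates that a field left out of the declaration is not written, even though it isn't marked transient.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamField;
import java.io.Serializable;

public class PersistentFieldsDemo {
    // Hypothetical class with two int fields, only one of which is declared persistent.
    static class Buffer implements Serializable {
        private static final long serialVersionUID = 1L;

        // Only "size" is listed, so only "size" is written by default serialization.
        private static final ObjectStreamField[] serialPersistentFields = {
            new ObjectStreamField("size", Integer.TYPE)
        };

        private int size;
        private int scratch; // not transient, but not listed above either

        Buffer(int size, int scratch) {
            this.size = size;
            this.scratch = scratch;
        }

        int getSize() {
            return size;
        }

        int getScratch() {
            return scratch;
        }
    }

    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Buffer copy = (Buffer) roundTrip(new Buffer(5, 99));
        System.out.println("size:    " + copy.getSize());    // restored from the stream
        System.out.println("scratch: " + copy.getScratch()); // left at the default value
    }
}
```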
So far, we've talked only about instance-level state. What about class-level state? Suppose you have important information stored in a static variable. Static variables won't get saved by serialization unless you add special code to do so. In our context, shipping objects over the wire between clients and servers, statics are usually a bad idea anyway.
10.3.3 Make Sure That Superclass State Is Handled Correctly
After you've handled the locally declared state, you may still need to worry about variables declared in a superclass. If the superclass implements the Serializable interface, then you don't need to do anything. The serialization mechanism will handle everything for you, either by using default serialization or by invoking writeObject()/readObject() if they are declared in the superclass.
If the superclass doesn't implement Serializable, you will need to store its state. There are two different ways to approach this. You can use serialPersistentFields to tell the serialization mechanism about some of the superclass instance variables, or you can use writeObject()/readObject() to handle the superclass state explicitly. Both of these, unfortunately, require you to know a fair amount about the superclass. If you're getting the .class files from another source, you should be aware that versioning issues can cause some really nasty problems. If you subclass a class, and that class's internal representation of instance-level state changes, you may not be able to load in your serialized data. While you can sometimes work around this by using a sufficiently convoluted readObject() method, this may not be a solvable problem. We'll return to this later. However, be aware that the ultimate solution may be to just implement the Externalizable interface instead, which we'll talk about later.
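The writeObject()/readObject() approach to superclass state can be sketched as follows. Shape and Square are hypothetical classes, and the sketch assumes the superclass exposes its state through an accessor and a mutator; without that kind of access, the subclass has no way to save or restore the inherited field.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SuperclassStateDemo {
    // Hypothetical superclass that does NOT implement Serializable.
    static class Shape {
        private String label;

        protected Shape() {            // zero-argument constructor, used during deserialization
            this.label = "unnamed";
        }

        protected Shape(String label) {
            this.label = label;
        }

        String getLabel() {
            return label;
        }

        void setLabel(String label) {
            this.label = label;
        }
    }

    // Hypothetical serializable subclass that stores the inherited state by hand.
    static class Square extends Shape implements Serializable {
        private static final long serialVersionUID = 1L;
        private int side;

        Square(String label, int side) {
            super(label);
            this.side = side;
        }

        int getSide() {
            return side;
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();     // locally declared state: side
            out.writeObject(getLabel());  // superclass state, written explicitly
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            setLabel((String) in.readObject()); // restore superclass state
        }
    }

    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Square copy = (Square) roundTrip(new Square("unit", 3));
        System.out.println(copy.getLabel() + " with side " + copy.getSide());
    }
}
```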
Another aspect of handling the state of a nonserializable superclass is that nonserializable superclasses must have a zero-argument constructor. This isn't important for serializing out an object, but it's incredibly important when deserializing an object. Deserialization works by creating an instance of a class and filling out its fields correctly. During this process, the deserialization algorithm doesn't actually call any of the serialized class's constructors, but does call the zero-argument constructor of the first nonserializable superclass. If there isn't a zero-argument constructor, then the deserialization algorithm can't create instances of the class, and the whole process fails.
If you can't create a zero-argument constructor in the first nonserializable superclass, you'll have to implement the Externalizable interface instead.
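The constructor behavior described above can be observed directly by counting constructor calls. In this sketch, Base, Derived, the two counters, and constructorCallsDuringDeserialization() are hypothetical names introduced only to make the behavior visible.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ConstructorCallDemo {
    // Hypothetical nonserializable superclass that counts its constructor calls.
    static class Base {
        static int baseCalls = 0;

        Base() {
            baseCalls++;
        }
    }

    // Hypothetical serializable subclass that counts its own constructor calls.
    static class Derived extends Base implements Serializable {
        private static final long serialVersionUID = 1L;
        static int derivedCalls = 0;
        int value;

        Derived(int value) {
            this.value = value;
            derivedCalls++;
        }
    }

    // Returns { Base constructor calls, Derived constructor calls } made while
    // a single Derived instance is being deserialized.
    static int[] constructorCallsDuringDeserialization() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(new Derived(7));
            }
            int baseBefore = Base.baseCalls;
            int derivedBefore = Derived.derivedCalls;
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                Derived copy = (Derived) in.readObject();
                if (copy.value != 7) {
                    throw new IllegalStateException("field was not restored");
                }
            }
            return new int[] {
                Base.baseCalls - baseBefore,
                Derived.derivedCalls - derivedBefore
            };
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        int[] calls = constructorCallsDuringDeserialization();
        System.out.println("Base constructor calls:    " + calls[0]);
        System.out.println("Derived constructor calls: " + calls[1]);
    }
}
```

The field value is restored even though the Derived constructor never runs, which is why any work done inside that constructor has to be repeated some other way.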
Simply adding a zero-argument constructor might seem a little problematic. Suppose the object already has several constructors, all of which take arguments. If you simply add a zero-argument constructor, then the serialization mechanism might leave the object in a half-initialized, and therefore unusable, state.
However, since serialization will supply the instance variables with correct values from an active instance immediately after instantiating the object, the only way this problem could arise is if the constructors actually do something with their arguments besides setting variable values.
If all the constructors take arguments and actually execute initialization code as part of the constructor, then you may need to refactor a bit. The usual solution is to move the local initialization code into a new method (usually named something like initialize()), which is then called from the original constructor. That is, change a constructor that looks like:
    public MyObject(arglist) {
        // set local variables from arglist
        // perform local initialization
    }
to something that looks like:
    private MyObject() {
        // zero-argument constructor, invoked by serialization and never by
        // any other piece of code; note that it doesn't call initialize()
    }

    public MyObject(arglist) {
        // set local variables from arglist
        initialize();
    }

    private void initialize() {
        // perform local initialization
    }
After this is done, writeObject()/readObject() should be implemented, and readObject() should end with a call to initialize(). Sometimes this will result in code that simply invokes the default serialization mechanism, as in the following snippet:
    private void writeObject(java.io.ObjectOutputStream stream)
            throws java.io.IOException {
        stream.defaultWriteObject();
    }

    private void readObject(java.io.ObjectInputStream stream)
            throws java.io.IOException, ClassNotFoundException {
        stream.defaultReadObject();
        initialize();
    }
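Here is a runnable sketch of the same pattern applied to a serializable class whose constructor builds derived state. WordList, its transient index, and the roundTrip() helper are hypothetical; the point is that readObject() ends by calling initialize(), because the class's own constructor is skipped during deserialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

public class InitializeHookDemo {
    // Hypothetical class whose constructor does more than copy its arguments.
    static class WordList implements Serializable {
        private static final long serialVersionUID = 1L;
        private final ArrayList<String> words = new ArrayList<>();
        private transient HashSet<String> index; // derived state, rebuilt by initialize()

        WordList(List<String> initial) {
            words.addAll(initial);
            initialize();
        }

        // Local initialization shared by the constructor and readObject.
        private void initialize() {
            index = new HashSet<>(words);
        }

        boolean contains(String word) {
            return index.contains(word);
        }

        private void writeObject(ObjectOutputStream stream) throws IOException {
            stream.defaultWriteObject();
        }

        private void readObject(ObjectInputStream stream)
                throws IOException, ClassNotFoundException {
            stream.defaultReadObject(); // restores words
            initialize();               // rebuilds the transient index
        }
    }

    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        WordList copy = (WordList) roundTrip(new WordList(Arrays.asList("a", "b", "c")));
        System.out.println("contains b: " + copy.contains("b"));
        System.out.println("contains z: " + copy.contains("z"));
    }
}
```

Without the initialize() call in readObject(), the copy's index would be null and the first call to contains() would throw a NullPointerException.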
If creating a zero-argument constructor is difficult (for example, you don't have the source code for the superclass), your class will need to implement the Externalizable interface instead of Serializable.
10.3.4 Override equals and hashCode if Necessary