
Java Serialization and Performance

By Elaine Henry,2014-07-07 15:15
    We have all used serialization in one way or another, usually without ever really noticing that it happens. A good example is any software that uses RMI (Remote Method Invocation). So how can we control Java serialization, and how can we improve its performance?

    Most people enable serialization by having the class implement the java.io.Serializable interface, like this:

import java.io.*;

public class Pojo implements Serializable {

    private static final long serialVersionUID = 1L;

    // ....

}

    This is normally more than adequate for serialization; however, the performance of doing serialization this way is poor. It has to use reflection not only to find out which fields to serialize but also to find out the types of the various fields. Reflection is, as we know, a very time-consuming process.

    So the question is: what can we do to eliminate as much of this overhead as possible? There are a few possible ways.

    1. Using the ObjectStreamField class.

    2. Using the readObject / writeObject methods.

    3. Using the Externalizable interface.

    In the end the best performance is gained by using the Externalizable interface. But to give the whole picture, I'll show an example of all four approaches: plain Serializable plus the three listed above.

    We've already shown the serialization process that uses just the Serializable interface. The only part I didn't mention is how you can exclude fields. To exclude fields from the serialization process, one can use the transient keyword.
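    To make the effect of the transient keyword concrete, here is a minimal sketch. The Account class and the roundTrip helper are hypothetical names invented for this illustration; they are not part of the Pojo used in the rest of the article.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientDemo {

    // Hypothetical class, used only for this illustration.
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;

        String user = "alice";
        transient String password = "secret"; // skipped by default serialization
    }

    // Serializes the object to an in-memory buffer and reads it back.
    static Object roundTrip(Object o) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bout)) {
            out.writeObject(o);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bout.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Account copy = (Account) roundTrip(new Account());
        System.out.println(copy.user);     // alice
        System.out.println(copy.password); // null
    }
}
```

    The password comes back as null because the field was never written, and deserializing a Serializable class does not run its field initializers.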

    The ObjectStreamField class is used to tell the serialization mechanism more about the fields and their types. This should save some time because it is no longer necessary to retrieve the types of the different fields using reflection. Note that once such an array of ObjectStreamField objects is declared, it replaces the default rule: the fields listed in the array are the ones that get serialized, and the transient keyword no longer decides what is excluded.

import java.io.ObjectStreamField;
import java.io.Serializable;

public class Pojo implements Serializable {

    private static final long serialVersionUID = 1L;

    private String valueA = "SomeTextA";
    private int valueB = 10;
    private float valueC = 100f;
    private double valueD = 100.100d;
    private short valueE = 10;

    // Getters and setters go here.

    // The array must be named exactly serialPersistentFields, and each declared
    // type must match the field's type (int.class for an int, not Integer.class).
    private static final ObjectStreamField[] serialPersistentFields = {
        new ObjectStreamField("valueA", String.class),
        new ObjectStreamField("valueB", int.class),
        new ObjectStreamField("valueC", float.class),
        new ObjectStreamField("valueD", double.class),
        new ObjectStreamField("valueE", short.class)
    };
}

    As you can see, not much has to be done to gain a bit more performance. The only 'problem' with this method is that you must list every field you want to serialize, whereas with the plain Serializable interface all fields are included except those marked as transient.
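    The "you must list every field" point can be checked with a small round trip. This is only a sketch: the Settings class, its fields and the roundTrip helper are hypothetical names made up for the illustration, and it assumes the array is declared as a private static final field named serialPersistentFields.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamField;
import java.io.Serializable;

class PersistentFieldsDemo {

    // Hypothetical class, used only for this illustration.
    static class Settings implements Serializable {
        private static final long serialVersionUID = 1L;

        String name = "default";
        int retries = 3;
        int cacheSize = 64; // not listed below, so it is not serialized

        private static final ObjectStreamField[] serialPersistentFields = {
            new ObjectStreamField("name", String.class),
            new ObjectStreamField("retries", int.class)
        };
    }

    // Serializes the object to an in-memory buffer and reads it back.
    static Object roundTrip(Object o) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bout)) {
            out.writeObject(o);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bout.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Settings s = new Settings();
        s.name = "custom";
        s.retries = 5;
        s.cacheSize = 128;

        Settings copy = (Settings) roundTrip(s);
        System.out.println(copy.name);      // custom
        System.out.println(copy.retries);   // 5
        System.out.println(copy.cacheSize); // 0, the field was never written
    }
}
```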

    The second way was by using the readObject / writeObject methods.

import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class Pojo implements Serializable {

    private static final long serialVersionUID = 1L;

    private String valueA = "SomeTextA";
    private int valueB = 10;
    private float valueC = 100f;
    private double valueD = 100.100d;
    private short valueE = 10;

    // Getters and setters go here.

    // Called by the serialization mechanism instead of the default field writing.
    private void writeObject(ObjectOutputStream oos) throws IOException {
        oos.writeUTF(valueA);
        oos.writeInt(valueB);
        oos.writeFloat(valueC);
        oos.writeDouble(valueD);
        oos.writeShort(valueE);
    }

    // Must read the values back in exactly the order they were written.
    private void readObject(ObjectInputStream ois) throws IOException, ClassNotFoundException {
        this.valueA = ois.readUTF();
        this.valueB = ois.readInt();
        this.valueC = ois.readFloat();
        this.valueD = ois.readDouble();
        this.valueE = ois.readShort();
    }
}

    With this approach the serialization mechanism calls one of the two methods upon serialization or deserialization. As you can see, the class must still implement the Serializable interface. The transient keyword, however, has no effect here, because the methods themselves state explicitly which fields are written and read.
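    A small sketch shows why transient stops mattering once you write the fields by hand. The Token class and the roundTrip helper are hypothetical names for this illustration, and the sketch assumes, as in the Pojo above, that writeObject does not call defaultWriteObject.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class CustomWriteDemo {

    // Hypothetical class, used only for this illustration.
    static class Token implements Serializable {
        private static final long serialVersionUID = 1L;

        transient String value = "initial"; // transient, but written by hand below

        private void writeObject(ObjectOutputStream oos) throws IOException {
            oos.writeUTF(value);
        }

        private void readObject(ObjectInputStream ois)
                throws IOException, ClassNotFoundException {
            value = ois.readUTF();
        }
    }

    // Serializes the object to an in-memory buffer and reads it back.
    static Object roundTrip(Object o) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bout)) {
            out.writeObject(o);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bout.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Token t = new Token();
        t.value = "abc123";

        Token copy = (Token) roundTrip(t);
        System.out.println(copy.value); // abc123, despite the transient modifier
    }
}
```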

    The third way was by using the Externalizable interface.

import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;

// Externalizable classes need a public no-arg constructor; the implicit
// default constructor of this public class satisfies that.
public class Pojo implements Externalizable {

    private static final long serialVersionUID = 1L;

    private String valueA = "SomeTextA";
    private int valueB = 10;
    private float valueC = 100f;
    private double valueD = 100.100d;
    private short valueE = 10;

    // Getters and setters go here.

    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeUTF(valueA);
        out.writeInt(valueB);
        out.writeFloat(valueC);
        out.writeDouble(valueD);
        out.writeShort(valueE);
    }

    // Must read the values back in exactly the order they were written.
    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        this.valueA = in.readUTF();
        this.valueB = in.readInt();
        this.valueC = in.readFloat();
        this.valueD = in.readDouble();
        this.valueE = in.readShort();
    }
}

    The difference from the readObject / writeObject methods is that the class no longer names the Serializable interface explicitly, because Externalizable already extends it (the serialVersionUID is still used, however). Otherwise there is not much difference from the readObject / writeObject methods, except that the methods are public and take the ObjectInput and ObjectOutput interfaces instead of the ObjectInputStream and ObjectOutputStream classes directly. As with the readObject / writeObject solution, the transient keyword has no effect here.
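    One practical difference is worth a sketch: an Externalizable object is recreated through its public no-arg constructor before readExternal is called, so anything readExternal does not restore keeps the constructor's value. The Point class and the roundTrip helper below are hypothetical names for this illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

class ExternalizableDemo {

    // Hypothetical class, used only for this illustration.
    public static class Point implements Externalizable {
        private static final long serialVersionUID = 1L;

        int x;
        int y;
        String note = "set by constructor"; // deliberately not written below

        public Point() { } // required: public no-arg constructor

        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(x);
            out.writeInt(y);
        }

        public void readExternal(ObjectInput in) throws IOException {
            x = in.readInt();
            y = in.readInt();
        }
    }

    // Serializes the object to an in-memory buffer and reads it back.
    static Object roundTrip(Object o) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bout)) {
            out.writeObject(o);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bout.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Point p = new Point();
        p.x = 3;
        p.y = 4;
        p.note = "changed";

        Point copy = (Point) roundTrip(p);
        System.out.println(copy.x + "," + copy.y); // 3,4
        System.out.println(copy.note);             // set by constructor
    }
}
```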

    So to find out how much time we can save between the various ways of serialization we

    need to set up a simple test case.

import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;

public class Run {

    private static final int TIMES = 500000;

    public static void main(String[] args) {
        Pojo sp = new Pojo();

        long start = System.currentTimeMillis();
        for (int i = 0; i < TIMES; i++) {
            serialize(sp);
        }
        long duration = System.currentTimeMillis() - start;

        // Change the label to match the Pojo variant under test.
        System.out.println("Externalizable: " + duration + "ms.");
    }

    // Serializes to an in-memory buffer, so no disk or network I/O is timed.
    public static void serialize(Pojo o) {
        try {
            ByteArrayOutputStream bout = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bout);
            out.writeObject(o);
            out.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

    To get a valid test we need to follow certain rules. First of all, don't test just a few objects; we want an average. Second, minimize the overhead of things we don't want to test. For example, if we instantiated a lot of objects quickly we could hit a memory ceiling, which would cause a lot of garbage collection.

    In our case we just measure the time to serialize the same object 500,000 times (the TIMES constant), nothing more.

    We do include the time to perform the loop, because timing each call on its own would cause other problems. These have to do with the limited resolution of System.currentTimeMillis (you can find more information about this in the Java API documentation).

    We don't write the data to disk or to a network interface, because those slow resources would only add overhead that we don't want to measure.

    One other point to notice: if you are testing with more than one String field, make sure that the values of the various String fields are different. The serialization process has an optimization that writes each String object only once per stream and refers back to it afterwards.
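    The String optimization can be made visible by comparing stream sizes. The sketch below is my own illustration, not part of the original test; serializedSize is a hypothetical helper. It assumes the stream tracks already written objects by identity, which is why the second string is built with new String to force a separate instance with the same content.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

class SharedStringDemo {

    // Returns the number of bytes produced by writing all objects to one stream.
    static int serializedSize(Object... objects) throws IOException {
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bout)) {
            for (Object o : objects) {
                out.writeObject(o);
            }
        }
        return bout.size();
    }

    public static void main(String[] args) throws IOException {
        String a = "a fairly long piece of text";
        String b = new String(a); // equal content, but a different instance

        int shared = serializedSize(a, a);   // second write is only a back-reference
        int distinct = serializedSize(a, b); // both strings are written in full

        System.out.println("same instance twice: " + shared + " bytes");
        System.out.println("two instances:       " + distinct + " bytes");
    }
}
```

    Two fields initialized with the same string literal refer to the same instance, so a test Pojo with identical literals would benefit from this shortcut and look faster than a realistic object would.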

    And now the results. For testing I used various versions of the JDK, which is also quite interesting.

Test results:

JDK 1.4.2_12

    Serializable: 9766ms.
    Streamfield: 9656ms.
    Read/Write object: 7781ms.
    Externalizable: 5875ms.

JDK 1.5.0_19

    Serializable: 9016ms.
    Streamfield: 8859ms.
    Read/Write object: 7141ms.
    Externalizable: 5610ms.

JDK 1.6.0 (B103)

    Serializable: 7484ms.
    Streamfield: 7312ms.
    Read/Write object: 5610ms.
    Externalizable: 4828ms.

    What's interesting is that the Externalizable method really makes a big difference compared with the other methods: depending on the JDK, plain Serializable takes between 55% and 66% longer than Externalizable. Another interesting observation is that performance keeps improving across the various versions of the JDK. I will not go further into the exact reason, but it could be that the garbage collector or the HotSpot engine has been improved.
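    For readers who want to check the percentages against the measured times, the arithmetic can be sketched as follows. The class and method names are hypothetical; the input numbers are the Serializable and Externalizable timings from the results above.

```java
import java.util.Locale;

class SpeedupDemo {

    // How much longer the slow variant takes, as a percentage of the fast one.
    static double percentLonger(long slowMs, long fastMs) {
        return 100.0 * (slowMs - fastMs) / fastMs;
    }

    // How much time the fast variant saves, as a percentage of the slow one.
    static double percentSaved(long slowMs, long fastMs) {
        return 100.0 * (slowMs - fastMs) / slowMs;
    }

    public static void main(String[] args) {
        // Serializable vs. Externalizable on JDK 1.6.0 and JDK 1.4.2_12.
        System.out.printf(Locale.ROOT, "JDK 1.6.0: %.1f%% longer, %.1f%% saved%n",
                percentLonger(7484, 4828), percentSaved(7484, 4828));
        System.out.printf(Locale.ROOT, "JDK 1.4.2: %.1f%% longer, %.1f%% saved%n",
                percentLonger(9766, 5875), percentSaved(9766, 5875));
    }
}
```

    Which figure you quote depends on the baseline: measured against Externalizable the gap is about 55% to 66%, while measured against plain Serializable the saving is roughly 35% to 40%.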
