JPPF Issue Tracker
CLOSED  Enhancement JPPF-422  -  Performance improvements for serialization schemes
Posted Nov 02, 2015 - updated Nov 17, 2015
This issue has been closed with status "Closed" and resolution "RESOLVED".
Issue details
  • Type of issue
    Enhancement
  • Status
    Closed
  • Assigned to
     lolo4j
  • Type of bug
    Not triaged
  • Likelihood
    Not triaged
  • Effect
    Not triaged
  • Posted by
     lolo4j
  • Owned by
    Not owned by anyone
  • Category
    Performance
  • Resolution
    RESOLVED
  • Priority
    Normal
  • Targeted for
    JPPF 5.2
Issue description
Currently the JPPF serialization (implemented via the JPPFObjectOutputStream and JPPFObjectInputStream classes) is extremely slow. Depending on the kind of data being serialized (e.g. strings and chars), it can be up to 10 times slower than the standard Java serialization. I propose to profile and improve the code, along with the serialization format, to make the performance more acceptable.

Alternatively, I propose to explore other possibilities:
  • see if the Kryo serialization can be improved by upgrading to the latest version
  • investigate the possibility of compressing the serialized data, for instance using the built-in Gzip API (zlib) and the LZ4 library
  • also, we should explore whether to keep the Kryo serialization as a sample or include it in the JPPF API ... or both
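
To illustrate the compression idea, here is a minimal sketch that wraps standard Java serialization in the JDK's built-in gzip streams. The class GzipJavaSerialization and its methods are hypothetical names used for illustration only; they are not part of the JPPF API, and an LZ4 variant would need a third-party library that is not shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper (not a JPPF class): Java serialization compressed with gzip. */
final class GzipJavaSerialization {
  /** Serializes an object and compresses the resulting bytes with gzip. */
  static byte[] serialize(Object o) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    // Closing the ObjectOutputStream also closes the gzip stream, which writes the gzip trailer.
    try (ObjectOutputStream out = new ObjectOutputStream(new GZIPOutputStream(bytes))) {
      out.writeObject(o);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  /** Decompresses and deserializes an object written by serialize(). */
  static Object deserialize(byte[] data) {
    try (ObjectInputStream in = new ObjectInputStream(new GZIPInputStream(new ByteArrayInputStream(data)))) {
      return in.readObject();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < 100; i++) sb.append("jppf.property=value;");
    String text = sb.toString();
    byte[] data = serialize(text);
    System.out.println("original chars: " + text.length() + ", compressed bytes: " + data.length);
    System.out.println("round trip ok: " + text.equals(deserialize(data)));
  }
}
```

The same wrapping applies to any serializer that writes to an OutputStream, which is why compression can be tested independently of the serialization scheme.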

#2
Comment posted by
 lolo4j
Nov 02, 23:27
I first tested with a JPPFSystemInformation object, which contains a couple of hundred strings distributed across multiple TypedProperties objects. I tried a serialization scheme combining gzip compression with Java serialization: it offered the best compression rate when compared with lz4, but it was too damn slow. The same was true when combined with Kryo serialization. Therefore I removed gzip compression from the tests, and only used lz4 as the compression library.

After multiple iterations of improvements on the JPPF serialization, I got the results below. Each iteration means serialization followed by deserialization of the reference object.
100,000 iterations - object: JPPFSystemInformation (99% strings)
KryoSerialization time:        00:00:08.856; serialized object size: 11,272
KryoLZ4Serialization time:     00:00:13.772; serialized object size:  5,843
DefaultJavaSerialization time: 00:00:18.722; serialized object size: 12,194
DefaultJPPFSerialization time: 00:00:24.838; serialized object size: 25,285
Unfortunately, I didn't keep the numbers I had before the optimization rounds, but for JPPF serialization it was in the order of 00:01:08.000 (68 seconds) for around 35,500 bytes. So this is a huge performance improvement, albeit not up to the other serialization schemes, but the resulting serialized size is poor.
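
For readers who want to reproduce this kind of measurement, the sketch below times repeated serialize/deserialize round trips with standard Java serialization. It is not the actual test code used above: RoundTripBenchmark is a hypothetical class, and a java.util.Properties object filled with strings stands in for JPPFSystemInformation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical micro-benchmark: one iteration = serialization followed by deserialization. */
final class RoundTripBenchmark {
  static byte[] serialize(Object o) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(o);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  static Object deserialize(byte[] data) {
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
      return in.readObject();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Builds a string-heavy reference object (an assumption, standing in for the real test object). */
  static Properties sampleProperties(int count) {
    Properties props = new Properties();
    for (int i = 0; i < count; i++) props.setProperty("sample.property." + i, "value-" + i);
    return props;
  }

  /** Runs the round trips and returns the elapsed time in nanoseconds. */
  static long run(Object reference, int iterations) {
    long start = System.nanoTime();
    for (int i = 0; i < iterations; i++) {
      Object copy = deserialize(serialize(reference));
      if (copy == null) throw new IllegalStateException("unexpected null copy");
    }
    return System.nanoTime() - start;
  }

  public static void main(String[] args) {
    Properties reference = sampleProperties(200);
    run(reference, 1_000); // warm-up, so the JIT compiler has a chance to optimize the hot paths
    int iterations = 10_000;
    long nanos = run(reference, iterations);
    System.out.printf("%,d iterations - time: %,d ms; serialized object size: %,d%n",
      iterations, nanos / 1_000_000L, serialize(reference).length);
  }
}
```

Timings from such a simple loop depend heavily on the JVM, the hardware and the warm-up, so they are only meaningful for comparing schemes within the same run.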

Then I tried with a different type of serialized data, a 300x300 square matrix object with random double values, where the values are stored in a double[][] array. Here's the result:
100,000 iterations - object: test.serialization.Matrix (300x300 matrix with random double values)
KryoSerialization time:        00:01:47.295; serialized object size: 720,933
KryoLZ4Serialization time:     00:02:31.514; serialized object size: 721,437
DefaultJavaSerialization time: 00:02:00.228; serialized object size: 723,109
DefaultJPPFSerialization time: 00:01:58.985; serialized object size: 722,680
Here, performance and footprint are really on par. We can see that compression for this kind of data is mostly useless, due to the values being random.
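
The effect of the data on the compression rate can be checked in isolation. The sketch below uses the JDK's zlib binding (java.util.zip.Deflater) rather than LZ4, and compares random double values with repetitive text. CompressionRatioDemo and its helper methods are hypothetical names, and the sample data is made up for the illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Random;
import java.util.zip.Deflater;

/** Hypothetical demo: compression ratio of random doubles versus repetitive text. */
final class CompressionRatioDemo {
  /** Returns compressed size divided by original size, using zlib at the default level. */
  static double ratio(byte[] input) {
    Deflater deflater = new Deflater();
    deflater.setInput(input);
    deflater.finish();
    byte[] buffer = new byte[8192];
    long compressed = 0L;
    while (!deflater.finished()) {
      compressed += deflater.deflate(buffer); // only the byte count matters here
    }
    deflater.end();
    return (double) compressed / input.length;
  }

  /** Raw big-endian bytes of 'count' random doubles in [0, 1). */
  static byte[] randomDoubles(int count, long seed) {
    Random random = new Random(seed);
    ByteBuffer buffer = ByteBuffer.allocate(count * Double.BYTES);
    for (int i = 0; i < count; i++) buffer.putDouble(random.nextDouble());
    return buffer.array();
  }

  /** Property-like text with a lot of repetition. */
  static byte[] repetitiveText(int lines) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < lines; i++) sb.append("jppf.property.").append(i % 10).append("=value\n");
    return sb.toString().getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // 300 x 300 doubles, the same amount of data as the matrix in the test above
    System.out.printf("random doubles:  ratio = %.3f%n", ratio(randomDoubles(300 * 300, 42L)));
    System.out.printf("repetitive text: ratio = %.3f%n", ratio(repetitiveText(1_000)));
  }
}
```

Random mantissa bits carry close to maximal entropy, so a general-purpose compressor has very little redundancy to exploit, whereas string-heavy data such as system properties repeats many substrings.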

In the next step, I will use an object graph mixing all kinds of data and see what happens.
#5
Comment posted by
 lolo4j
Nov 04, 12:51
After a new round of optimizations, especially for string serialization, I get the following results:

***** testing with object = org.jppf.management.JPPFSystemInformation *****
 
KryoSerialization time:        00:00:09.006; serialized size: 11,272
KryoLZ4Serialization time:     00:00:13.891; serialized size: 5,839
DefaultJavaSerialization time: 00:00:18.809; serialized size: 12,194
DefaultJPPFSerialization time: 00:00:23.734; serialized size: 15,042
 
***** testing with object = test.serialization.Matrix *****
 
KryoSerialization time:        00:00:12.538; serialized size: 80,232
KryoLZ4Serialization time:     00:00:18.569; serialized size: 80,316
DefaultJavaSerialization time: 00:00:15.851; serialized size: 81,109
DefaultJPPFSerialization time: 00:00:16.051; serialized size: 80,668
#6
Comment posted by
 lolo4j
Nov 05, 14:29
After yet another set of optimizations around strings, I get:

***** testing with object = org.jppf.management.JPPFSystemInformation *****
 
KryoSerialization time:        00:00:08.793; serialized size: 11,272
KryoLZ4Serialization time:     00:00:13.813; serialized size:  5,843
DefaultJavaSerialization time: 00:00:18.965; serialized size: 12,194
DefaultJPPFSerialization time: 00:00:21.779; serialized size: 12,950
 
***** testing with object = test.serialization.Matrix *****
 
KryoSerialization time:        00:00:12.386; serialized size: 80,232
KryoLZ4Serialization time:     00:00:18.030; serialized size: 80,316
DefaultJavaSerialization time: 00:00:15.611; serialized size: 81,109
DefaultJPPFSerialization time: 00:00:16.653; serialized size: 80,662
#7
Comment posted by
 lolo4j
Nov 08, 11:51
Current implementation in trunk revision 3887.

After more profiling, I realized that looking up the readObject(ObjectInputStream) and writeObject(ObjectOutputStream) methods of the serialized classes was very costly, essentially due to calls to Class.getDeclaredMethod(String, Class...). It is in fact much more efficient to iterate over the array returned by Class.getDeclaredMethods(). Also, caching the corresponding Method objects (in a map of soft references) results in a significant performance gain. The same applies to the fields of the serialized classes.
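
The lookup-and-cache approach can be sketched as follows. This is not the code committed to trunk: SerializationMethodCache, Hooks, Custom and Plain are hypothetical names, and the map keeps strong references to the Class keys for simplicity (only the cached values are softly referenced).

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.ref.SoftReference;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical cache of the custom serialization methods declared by a class. */
final class SerializationMethodCache {
  /** The two optional hooks; a field is null when the class does not declare the method. */
  static final class Hooks {
    final Method readObject;
    final Method writeObject;
    Hooks(Method readObject, Method writeObject) {
      this.readObject = readObject;
      this.writeObject = writeObject;
    }
  }

  private static final Map<Class<?>, SoftReference<Hooks>> CACHE = new ConcurrentHashMap<>();

  /** Returns the cached hooks for a class, scanning its declared methods on a cache miss. */
  static Hooks hooksFor(Class<?> type) {
    SoftReference<Hooks> ref = CACHE.get(type);
    Hooks hooks = (ref == null) ? null : ref.get();
    if (hooks == null) { // never computed, or cleared by the garbage collector
      hooks = scan(type);
      CACHE.put(type, new SoftReference<>(hooks));
    }
    return hooks;
  }

  /** Single pass over getDeclaredMethods() instead of two getDeclaredMethod() calls. */
  private static Hooks scan(Class<?> type) {
    Method read = null;
    Method write = null;
    for (Method m : type.getDeclaredMethods()) {
      Class<?>[] params = m.getParameterTypes();
      if (params.length != 1) continue;
      if ("readObject".equals(m.getName()) && params[0] == ObjectInputStream.class) read = m;
      else if ("writeObject".equals(m.getName()) && params[0] == ObjectOutputStream.class) write = m;
    }
    // A real serializer would also call setAccessible(true) before invoking these methods.
    return new Hooks(read, write);
  }

  /** Sample class declaring both hooks. */
  static final class Custom implements Serializable {
    private static final long serialVersionUID = 1L;
    private void writeObject(ObjectOutputStream out) throws IOException { out.defaultWriteObject(); }
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException { in.defaultReadObject(); }
  }

  /** Sample class without any custom hook. */
  static final class Plain implements Serializable {
    private static final long serialVersionUID = 1L;
    int value;
  }
}
```

One likely reason for the cost of getDeclaredMethod() is that a missing method is reported with a NoSuchMethodException, and most serialized classes declare neither hook; a single scan avoids creating those exceptions.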

So now, I get the following performance results:

***** testing with object = org.jppf.management.JPPFSystemInformation *****
 
org.jppf.serialization.kryo.KryoSerialization time:     00:00:09.022; serialized size: 11,299
LZ4 org.jppf.serialization.kryo.KryoSerialization time: 00:00:13.842; serialized size:  5,826
org.jppf.serialization.DefaultJavaSerialization time:   00:00:18.856; serialized size: 12,221
org.jppf.serialization.DefaultJPPFSerialization time:   00:00:17.817; serialized size: 12,977
 
***** testing with object = test.serialization.Matrix *****
 
org.jppf.serialization.kryo.KryoSerialization time:     00:00:12.515; serialized size: 80,232
LZ4 org.jppf.serialization.kryo.KryoSerialization time: 00:00:18.257; serialized size: 80,316
org.jppf.serialization.DefaultJavaSerialization time:   00:00:15.822; serialized size: 81,109
org.jppf.serialization.DefaultJPPFSerialization time:   00:00:14.378; serialized size: 80,662
So, we're finally doing better than the JDK! But let's not claim victory yet. The above results are for tests that serialize a single, relatively large object at each iteration. An implication of this is that the overhead of the serialization framework is diluted by the size of the data, which is mostly held in arrays that are easy and fast to serialize.

So instead, I tested with a different kind of data structure, similar to what a JPPF client sends to a server when it submits a job: a job header plus each task as a separate object graph. I made the tasks small and there are 300 of them, so at each iteration 301 objects are serialized and deserialized. Here are the results:

org.jppf.serialization.kryo.KryoSerialization time:     00:00:04.350; serialized size: 18,806
LZ4 org.jppf.serialization.kryo.KryoSerialization time: 00:00:21.121; serialized size: 31,353
org.jppf.serialization.DefaultJavaSerialization time:   00:00:20.896; serialized size: 94,234
org.jppf.serialization.DefaultJPPFSerialization time:   00:00:47.610; serialized size: 81,743
Yep, it's not pretty. That means more profiling, pooling, caching, refactoring etc...
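
The fixed per-object overhead that this test exposes can be illustrated with standard Java serialization alone. In the sketch below, SmallObjectsOverhead and Task are hypothetical names, and the task content is made up; it compares the total size of 300 small tasks serialized as separate object graphs (one stream each) with the size of the same tasks written to a single stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical demo of the per-object overhead when many small objects are serialized separately. */
final class SmallObjectsOverhead {
  /** Small task, standing in for a real JPPF task. */
  static final class Task implements Serializable {
    private static final long serialVersionUID = 1L;
    final int id;
    final String name;
    Task(int id) {
      this.id = id;
      this.name = "task-" + id;
    }
  }

  static List<Task> sampleTasks(int count) {
    List<Task> tasks = new ArrayList<>(count);
    for (int i = 0; i < count; i++) tasks.add(new Task(i));
    return tasks;
  }

  /** One stream per task: the stream header and the class descriptor are written every time. */
  static int separateStreamsSize(List<Task> tasks) {
    int total = 0;
    for (Task task : tasks) {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(task);
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
      total += bytes.size();
    }
    return total;
  }

  /** A single stream: the class descriptor is written once, then referenced by a handle. */
  static int singleStreamSize(List<Task> tasks) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      for (Task task : tasks) out.writeObject(task);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.size();
  }

  public static void main(String[] args) {
    List<Task> tasks = sampleTasks(300);
    System.out.println("separate streams: " + separateStreamsSize(tasks) + " bytes");
    System.out.println("single stream:    " + singleStreamSize(tasks) + " bytes");
  }
}
```

With small objects, the class metadata and the setup work dominate both the size and the time, which is why this kind of test is far more sensitive to framework overhead than the single large object tests.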
#8
Comment posted by
 lolo4j
Nov 12, 08:56
Trunk revision 3888. A slight improvement:
***** testing with 301 objects *****
org.jppf.serialization.kryo.KryoSerialization time:     00:00:04.374; serialized size: 18,806
LZ4 org.jppf.serialization.kryo.KryoSerialization time: 00:00:21.719; serialized size: 31,353
org.jppf.serialization.DefaultJavaSerialization time:   00:00:21.978; serialized size: 94,234
org.jppf.serialization.DefaultJPPFSerialization time:   00:00:37.476; serialized size: 81,743
#9
Comment posted by
 lolo4j
Nov 12, 11:44
Trunk revision 3889. Victory! I managed to refactor the serializer/deserializer code so they can be pooled and reused effectively. I like the results much better:
***** testing with 301 objects *****
org.jppf.serialization.kryo.KryoSerialization time:     00:00:04.426; serialized size: 18,806
LZ4 org.jppf.serialization.kryo.KryoSerialization time: 00:00:20.509; serialized size: 31,353
org.jppf.serialization.DefaultJavaSerialization time:   00:00:20.759; serialized size: 94,234
org.jppf.serialization.DefaultJPPFSerialization time:   00:00:04.830; serialized size: 22,328
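
The pooling idea can be sketched as below. This is an assumption about the general technique, not the actual JPPF code: SerializerPoolSketch, Pool and ReusableSerializer are hypothetical names, and the serializer only handles strings to keep the example short.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Supplier;

/** Hypothetical sketch of pooling reusable serializer instances. */
final class SerializerPoolSketch {
  /** Minimal unbounded pool: borrow an idle instance, or create one when none is available. */
  static final class Pool<T> {
    private final Queue<T> idle = new ConcurrentLinkedQueue<>();
    private final Supplier<T> factory;

    Pool(Supplier<T> factory) {
      this.factory = factory;
    }

    T borrow() {
      T instance = idle.poll();
      return (instance != null) ? instance : factory.get();
    }

    void release(T instance) {
      idle.offer(instance);
    }
  }

  /** Serializer whose internal buffers are reset and reused instead of reallocated. */
  static final class ReusableSerializer {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream(256);
    private final DataOutputStream out = new DataOutputStream(buffer);

    byte[] serialize(String value) {
      buffer.reset(); // drops the previous content but keeps the allocated array
      try {
        out.writeUTF(value);
        out.flush();
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
      return buffer.toByteArray();
    }
  }

  static final Pool<ReusableSerializer> POOL = new Pool<>(ReusableSerializer::new);

  /** Borrows a serializer, uses it, and always returns it to the pool. */
  static byte[] serialize(String value) {
    ReusableSerializer serializer = POOL.borrow();
    try {
      return serializer.serialize(value);
    } finally {
      POOL.release(serializer);
    }
  }

  public static void main(String[] args) {
    for (int i = 0; i < 301; i++) serialize("task-" + i);
    System.out.println("serialized 301 small objects using pooled serializers");
  }
}
```

When hundreds of small objects are serialized per iteration, reusing the serializer state avoids paying the allocation and initialization cost for every object, which is consistent with the large gain seen on the 301 objects test.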