JPPF Issue Tracker
CLOSED  Bug report JPPF-64  -  Deadlock: state transition in NodeClass and ClientClass
Posted Sep 13, 2012 - updated Oct 19, 2012
This issue has been closed with status "Closed" and resolution "RESOLVED".
Issue details
  • Type of issue
    Bug report
  • Status
    Closed
  • Assigned to
     jandam
  • Progress
       
  • Type of bug
    Not triaged
  • Likelihood
    Not triaged
  • Effect
    Not triaged
  • Posted by
     jandam
  • Owned by
    Not owned by anyone
  • Time spent
    1 hour
  • Category
    Server
  • Resolution
    RESOLVED
  • Priority
    Critical
  • Reproducibility
    Often
  • Severity
    Critical
  • Targeted for
    JPPF 3.2
Issue description
A state transition for a channel always uses the channel itself as the monitor object. The deadlock occurs because transitions for the Client and Node channels run at the same time and acquire the two monitors in opposite order.

Transition for Client channel: lock CLIENT monitor, then lock NODE monitor
Transition for Node channel: lock NODE monitor, then lock CLIENT monitor

"NodeClass-thread-3" prio=10 tid=0x00000000008f4000 nid=0x300b waiting for monitor entry [0x0000000042114000]
   java.lang.Thread.State: BLOCKED (on object monitor)
 at org.jppf.server.nio.classloader.node.WaitingNodeRequestState.processDynamic(WaitingNodeRequestState.java:195)
  - waiting to lock <0x00000000d8006a00> (a org.jppf.server.nio.SelectionKeyWrapper)
  at org.jppf.server.nio.classloader.node.WaitingNodeRequestState.performTransition(WaitingNodeRequestState.java:88)
  at org.jppf.server.nio.classloader.node.WaitingNodeRequestState.performTransition(WaitingNodeRequestState.java:36)
  at org.jppf.server.nio.StateTransitionTask.run(StateTransitionTask.java:82)
 - locked <0x00000000d80f1088> (a org.jppf.server.nio.SelectionKeyWrapper)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
 at java.util.concurrent.FutureTask.run(FutureTask.java:166)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
  at java.lang.Thread.run(Thread.java:722)
    Locked ownable synchronizers:
 - <0x00000000d80ba340> (a java.util.concurrent.ThreadPoolExecutor$Worker)
 
"ClientClass-thread-2" prio=10 tid=0x00000000006b3000 nid=0x3045 waiting for monitor entry [0x000000004291c000]
   java.lang.Thread.State: BLOCKED (on object monitor)
  at org.jppf.server.nio.classloader.client.WaitingProviderResponseState.performTransition(WaitingProviderResponseState.java:80)
  - waiting to lock <0x00000000d80f1088> (a org.jppf.server.nio.SelectionKeyWrapper)
  at org.jppf.server.nio.classloader.client.WaitingProviderResponseState.performTransition(WaitingProviderResponseState.java:33)
  at org.jppf.server.nio.StateTransitionTask.run(StateTransitionTask.java:82)
 - locked <0x00000000d8006a00> (a org.jppf.server.nio.SelectionKeyWrapper)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
 at java.util.concurrent.FutureTask.run(FutureTask.java:166)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
  at java.lang.Thread.run(Thread.java:722)
    Locked ownable synchronizers:
 - <0x00000000d8006a90> (a java.util.concurrent.ThreadPoolExecutor$Worker)
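
For illustration only, here is a minimal sketch of one generic way to avoid the reverse-order locking shown in the dump above: always take the two monitors in a fixed global order. It is not the change that went into trunk. The Channel class, its id field and the transition method are hypothetical names and do not come from the JPPF code base.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class OrderedLocking {
  /** Hypothetical stand-in for a channel wrapper. Ids are assumed unique and define the lock order. */
  static final class Channel {
    final long id;
    int transitions; // only modified while both monitors are held
    Channel(long id) { this.id = id; }
  }

  /** Locks both channels in ascending id order, whichever side starts the transition. */
  static void transition(Channel self, Channel peer) {
    Channel first = self.id < peer.id ? self : peer;
    Channel second = (first == self) ? peer : self;
    synchronized (first) {
      synchronized (second) {
        self.transitions++;
      }
    }
  }

  /** Runs client-to-node and node-to-client transitions concurrently; true if both threads finish in time. */
  static boolean runOpposing(Channel client, Channel node, int iterations) {
    CountDownLatch done = new CountDownLatch(2);
    Thread t1 = new Thread(() -> {
      for (int i = 0; i < iterations; i++) transition(client, node);
      done.countDown();
    });
    Thread t2 = new Thread(() -> {
      for (int i = 0; i < iterations; i++) transition(node, client);
      done.countDown();
    });
    // Daemon threads so a deadlock would not keep the JVM alive.
    t1.setDaemon(true);
    t2.setDaemon(true);
    t1.start();
    t2.start();
    try {
      return done.await(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    Channel client = new Channel(1);
    Channel node = new Channel(2);
    System.out.println("completed: " + runOpposing(client, node, 100_000));
  }
}
```

If each thread instead locked its own channel first and the peer second, as in the dump above, the same two threads could block each other indefinitely.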
Steps to reproduce this issue
Configuration: 1 driver, 1 node with ForkJoinExecutor, DelegationModel=URL

#3
Comment posted by
 jandam
Sep 14, 08:36
This bug is a regression in trunk (confirmed in the latest revision, 2376). The 3.1 branch works correctly.
#9
Comment posted by
 lolo4j
Sep 14, 10:45
With delegation model = url and a task implementation that triggers hundreds of class loading requests, I also get a deadlock in the node:

"node processing-thread-4":
  waiting to lock monitor 0x000000000f3f4180 (object 0x00000000d592c6c0, a java.lang.Object),
  which is held by "node processing-thread-2"
 
"node processing-thread-2":
  waiting to lock monitor 0x000000000f5c8658 (object 0x00000000d591d7e0, a org.jppf.classloader.JPPFClassLoader),
  which is held by "node processing-thread-4"
 
"node processing-thread-4":
  at org.jppf.classloader.AbstractJPPFClassLoader.findClass(AbstractJPPFClassLoader.java:124)
 - waiting to lock <0x00000000d592c6c0> (a java.lang.Object)
 at org.jppf.classloader.AbstractJPPFClassLoader.loadClassLocalFirst(AbstractJPPFClassLoader.java:371)
 at org.jppf.classloader.AbstractJPPFClassLoader.loadClass(AbstractJPPFClassLoader.java:331)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
  at java.lang.Class.getDeclaredMethods0(Native Method)
 at java.lang.Class.privateGetDeclaredMethods(Class.java:2442)
 at java.lang.Class.getDeclaredMethod(Class.java:1952)
 at java.io.ObjectStreamClass.getPrivateMethod(ObjectStreamClass.java:1411)
  at java.io.ObjectStreamClass.access$1700(ObjectStreamClass.java:69)
 at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:481)
  at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:455)
  at java.security.AccessController.doPrivileged(Native Method)
 at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:455)
  at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:352)
 at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:589)
 at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1601)
  at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1514)
 at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1601)
  at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1514)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1750)
  at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
 at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
 at org.jppf.utils.ObjectSerializerImpl.deserialize(ObjectSerializerImpl.java:169)
 at org.jppf.utils.ObjectSerializerImpl.deserialize(ObjectSerializerImpl.java:155)
 at org.jppf.io.IOHelper.unwrappedData(IOHelper.java:161)
  at org.jppf.server.node.JPPFContainer$ObjectDeserializationTask.call(JPPFContainer.java:185)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
 at java.util.concurrent.FutureTask.run(FutureTask.java:166)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
  at java.lang.Thread.run(Thread.java:722)
 
 "node processing-thread-2":
  at java.lang.ClassLoader.checkCerts(ClassLoader.java:933)
 - waiting to lock <0x00000000d591d7e0> (a org.jppf.classloader.JPPFClassLoader)
 at java.lang.ClassLoader.preDefineClass(ClassLoader.java:657)
 at java.lang.ClassLoader.defineClass(ClassLoader.java:785)
  at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
  at org.jppf.classloader.AbstractJPPFClassLoader.findClass(AbstractJPPFClassLoader.java:151)
 - locked <0x00000000d592c6c0> (a java.lang.Object)
  at org.jppf.classloader.AbstractJPPFClassLoader.loadClassLocalFirst(AbstractJPPFClassLoader.java:367)
 at org.jppf.classloader.AbstractJPPFClassLoader.loadClass(AbstractJPPFClassLoader.java:331)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
  at java.lang.Class.forName0(Native Method)
  at java.lang.Class.forName(Class.java:264)
  at java.io.ObjectInputStream.resolveClass(ObjectInputStream.java:622)
 at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1593)
  at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1514)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1750)
  at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
 at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
 at org.jppf.utils.ObjectSerializerImpl.deserialize(ObjectSerializerImpl.java:169)
 at org.jppf.utils.ObjectSerializerImpl.deserialize(ObjectSerializerImpl.java:155)
 at org.jppf.io.IOHelper.unwrappedData(IOHelper.java:161)
  at org.jppf.server.node.JPPFContainer$ObjectDeserializationTask.call(JPPFContainer.java:185)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
 at java.util.concurrent.FutureTask.run(FutureTask.java:166)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
  at java.lang.Thread.run(Thread.java:722)
#10
Comment posted by
 lolo4j
Sep 15, 16:38
Fixed the deadlock in the node: trunk revision 2383

I'm still unable to reproduce the deadlock on the server side, even with delegation model = url, the FJ thread manager, and tasks that trigger thousands of class loading requests (it takes over 5s for the first job to load all the classes it needs, with client + server + node on the same machine).
#11
Comment posted by
 jandam
Sep 17, 09:48
Temporary workaround in trunk revision 2397: brought the synchronization from the 3.1 branch into the class loaders. The deadlock is no longer reproducible. It looks like it was introduced by the multithreaded resource requests.
#13
Comment posted by
 jandam
Sep 19, 10:39
Removed unnecessary synchronization that led to the deadlock. Fixed in trunk revision 2401.

The issue was updated with the following change(s):
  • This issue has been closed
  • The status has been updated, from Being worked on to Closed.
  • This issue's progression has been updated to 100 percent completed.
  • The resolution has been updated, from Not determined to RESOLVED.
  • Information about the user working on this issue has been changed, from jandam to Not being worked on.
  • Time spent on this issue, from No time spent to 1 hour.
#15
Comment posted by
 jandam
Oct 19, 23:21
Fixed in trunk revision 2492. The problem was in the ClassContext synchronization: the state was idle while a pending request existed.

The issue was updated with the following change(s):
  • This issue has been closed
  • The status has been updated, from New to Closed.
  • This issue's progression has been updated to 100 percent completed.
  • The resolution has been updated, from Not determined to RESOLVED.