JPPF Issue Tracker
CLOSED  Enhancement JPPF-413  -  Job stuck when dispatched to a peer driver with no node
Posted Sep 23, 2015 - updated Aug 15, 2018
This issue has been closed with status "Closed" and resolution "RESOLVED".
Issue details
  • Type of issue
  • Status
  • Assigned to
  • Type of bug
    Not triaged
  • Likelihood
    Not triaged
  • Effect
    Not triaged
  • Posted by
  • Owned by
    Not owned by anyone
  • Category
  • Resolution
  • Priority
  • Reproducibility
  • Severity
  • Targeted for
    JPPF 5.0.x
Issue description
From this forum thread: in a topology with 2 drivers connected to each other, where one of the drivers has no node attached, a job dispatched to that driver will never be executed and will just hang.
Steps to reproduce this issue
see description

Comment posted by
Sep 23, 08:11
We can distinguish 2 use cases:

1) One of the drivers is used solely as failover, as is the case in the originating forum thread. In this situation an easy fix would be to have the other driver mark it as a peer (e.g. "jppf.peer.driver = true") in the information available to execution policies. It is then enough to set an appropriate execution policy on the job to work around the issue, like this:
import org.jppf.client.JPPFJob;
import org.jppf.node.policy.Equal;

JPPFJob job = new JPPFJob();
// only allow dispatch to channels that are not peer driver connections
job.getSLA().setExecutionPolicy(new Equal("jppf.peer.driver", false));
This is the solution I will implement for JPPF 3.3.7 as it satisfies the scenario in the forum thread.
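For context, the workaround might be wired into a client program roughly as follows. This is a sketch against the JPPF 5.x client API; the task class MyTask is a placeholder, not part of this issue:

import java.util.List;
import org.jppf.client.JPPFClient;
import org.jppf.client.JPPFJob;
import org.jppf.node.policy.Equal;
import org.jppf.node.protocol.Task;

try (JPPFClient client = new JPPFClient()) {
  JPPFJob job = new JPPFJob();
  job.setName("avoid peer drivers");
  // exclude any channel that identifies itself as a peer driver connection
  job.getSLA().setExecutionPolicy(new Equal("jppf.peer.driver", false));
  job.add(new MyTask()); // MyTask stands in for an actual JPPF task implementation
  List<Task<?>> results = client.submitJob(job);
}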

2) In the general case, for instance when the peer driver initially has nodes attached but they later disconnect for any reason, there are several possibilities:
  • in the more recent versions, we can use a job dispatch timeout to cause the job dispatch to be resubmitted, in the hope that it will be resubmitted to a driver with nodes attached
  • a more effective, but more complex and time-consuming, solution would be to monitor the peer driver, so the submitting driver knows how many nodes it has and can exclude it from scheduling when it has none. I might register this as a new feature for JPPF 5.2, as we are already horribly late for 5.1.
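The dispatch timeout mentioned in the first bullet can be set on the job SLA. A sketch, assuming the dispatch expiration API introduced in JPPF 4.1 (the 5-second timeout and 3 retries are illustrative values, not prescribed by this issue):

import org.jppf.client.JPPFJob;
import org.jppf.scheduling.JPPFSchedule;

JPPFJob job = new JPPFJob();
// if a dispatch is not completed within 5 seconds, cancel it and resubmit the tasks
job.getSLA().setDispatchExpirationSchedule(new JPPFSchedule(5000L));
// after 3 expirations of the same dispatch, the dispatched tasks are cancelled for good
job.getSLA().setMaxDispatchExpirations(3);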
Comment posted by
Sep 29, 08:49
Fixed in: