JPPF Issue Tracker
CLOSED  Bug report JPPF-418  -  Memory leak in client queue
Posted Oct 16, 2015 - updated Oct 25, 2015
This issue has been closed with status "Closed" and resolution "RESOLVED".
Issue details
  • Type of issue
    Bug report
  • Status
  • Assigned to
  • Progress
  • Type of bug
    Not triaged
  • Likelihood
    Not triaged
  • Effect
    Not triaged
  • Posted by
  • Owned by
    Not owned by anyone
  • Category
  • Resolution
  • Priority
  • Reproducibility
  • Severity
  • Targeted for
    JPPF 4.2.9
Issue description
From this forum thread:

When a job is cancelled after it was only partially sent to the server (due to load-balancer settings, for instance), the sizeMap field of the class AbstractJPPFQueue is not properly cleaned up, which leads to an OutOfMemoryError.
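The kind of leak described above can be illustrated with a minimal sketch. It is not the JPPF implementation: the class SizeMapLeakSketch, its fields and its methods are hypothetical, and the layout of the real sizeMap in AbstractJPPFQueue may differ. It only shows how a secondary index keeps jobs reachable when cancellation removes them from the main queue alone.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical queue with a secondary index by job size; not the JPPF class. */
final class SizeMapLeakSketch {
  // Main queue of pending jobs.
  private final List<Object> queue = new ArrayList<>();
  // Assumed index: number of tasks -> jobs holding that many tasks.
  private final Map<Integer, List<Object>> sizeMap = new HashMap<>();

  void add(Object job, int size) {
    queue.add(job);
    sizeMap.computeIfAbsent(size, k -> new ArrayList<>()).add(job);
  }

  /** Leaky cancel: the job leaves the queue but stays referenced by sizeMap. */
  void cancelLeaky(Object job) {
    queue.remove(job);
  }

  /** Clean cancel: the job is removed from both structures. */
  void cancelClean(Object job, int size) {
    queue.remove(job);
    List<Object> jobs = sizeMap.get(size);
    if (jobs != null) {
      jobs.remove(job);
      if (jobs.isEmpty()) sizeMap.remove(size);
    }
  }

  /** Number of jobs still reachable through the index. */
  int indexedJobs() {
    int count = 0;
    for (List<Object> jobs : sizeMap.values()) count += jobs.size();
    return count;
  }

  public static void main(String[] args) {
    SizeMapLeakSketch leaky = new SizeMapLeakSketch();
    for (int i = 0; i < 100; i++) {
      Object job = new byte[16]; // stands in for a job carrying large tasks
      leaky.add(job, 2);
      leaky.cancelLeaky(job);
    }
    System.out.println("jobs still indexed after leaky cancel: " + leaky.indexedJobs());
  }
}
```

With jobs that each hold several megabytes of task data, every entry left behind in such an index keeps that memory from being garbage collected.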
Steps to reproduce this issue
Using the attached reproducing code:
  • configure a client with 256 MB of heap and these load balancer settings, to ensure the client only sends one task at a time:
jppf.load.balancing.algorithm = manual
jppf.load.balancing.profile = manual
jppf.load.balancing.profile.manual.size = 1
  • run the sample; it will attempt to submit 10,000 jobs with two tasks each, each task having a memory footprint of 5 MB.
==> the sample fails with an OOME at the 15th job

Comment posted by
Oct 16, 13:03
A file was uploaded: self-contained reproducing code
Comment posted by
Oct 18, 06:57
Now that I have fixed the memory leak, I uncovered another bug in the client. In 4.2.8, JPPFJob.awaitResults(timeout) (used internally in JPPFClient.submitJob() for blocking jobs) is sometimes not notified of the job completion, causing the application thread that awaits the job results to be stuck in a wait(). This can happen in extreme cases where the job completes between the call to SubmissionManager.submitJob() and the call to JPPFJob.awaitResults().

awaitResults() ultimately calls AbstractJPPFJob.await(), which has this code:
void await(final long timeout, final boolean raiseTimeoutException) throws TimeoutException {
  long millis = timeout > 0L ? timeout : Long.MAX_VALUE;
  long elapsed = 0L;
  long start = System.currentTimeMillis();
  // sleeps for the whole remaining time, so a missed notification is never recovered
  while ((results.size() < tasks.size()) && ((elapsed = System.currentTimeMillis() - start) < millis)) {
    results.goToSleep(millis - elapsed);
  }
  if ((elapsed >= millis) && raiseTimeoutException) throw new TimeoutException("timeout expired");
}
The main problem here is results.goToSleep(millis - elapsed), which performs a synchronized call to results.wait(...). If the job has already completed by the time the loop body runs, the completion notification is missed and the thread gets stuck for the whole remaining time, which is forever when no timeout is given (millis is then Long.MAX_VALUE). Instead, we should call results.goToSleep(1L), so that the loop re-checks its condition after at most one millisecond.
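The short-sleep approach can be sketched as follows. This is an illustration under stated assumptions, not the actual JPPF fix: the class PollingAwaitSketch and its methods are hypothetical, and a plain list stands in for the job's results holder. The producer deliberately never calls notify(), which simulates the missed notification.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a job waiting on its results; not the JPPF class. */
final class PollingAwaitSketch {
  private final List<Object> results = new ArrayList<>();
  private final int expected;

  PollingAwaitSketch(int expected) {
    this.expected = expected;
  }

  /** Adds a result without notifying waiters, to simulate a missed notification. */
  void addResultWithoutNotify(Object result) {
    synchronized (results) {
      results.add(result);
    }
  }

  /**
   * Waits until all results are in or the timeout expires.
   * A timeout <= 0 means wait indefinitely, as in the quoted code.
   * @return true if all results arrived, false on timeout.
   */
  boolean await(long timeout) throws InterruptedException {
    long millis = timeout > 0L ? timeout : Long.MAX_VALUE;
    long start = System.currentTimeMillis();
    synchronized (results) {
      while (results.size() < expected) {
        if (System.currentTimeMillis() - start >= millis) return false;
        // Wait at most 1 ms, then re-check the condition: a missed
        // notification costs one extra iteration instead of a hang.
        results.wait(1L);
      }
    }
    return true;
  }

  public static void main(String[] args) throws InterruptedException {
    PollingAwaitSketch job = new PollingAwaitSketch(2);
    new Thread(() -> {
      job.addResultWithoutNotify("r1");
      job.addResultWithoutNotify("r2");
    }).start();
    System.out.println("completed: " + job.await(5000L));
  }
}
```

The trade-off is a small amount of polling overhead while the job is running, in exchange for never depending on a single notification being delivered.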

As a side note, this fix is already implemented in 5.0, 5.1 and in the trunk.
Comment posted by
Oct 18, 09:02
Fixed in: