JPPF Issue Tracker
star_faded.png
Please log in to bookmark issues
bug_report_small.png
CLOSED  Bug report JPPF-498  -  Client reconnect to driver failure
Posted Apr 05, 2017 - updated Apr 07, 2017
icon_info.png This issue has been closed with status "Closed" and resolution "RESOLVED".
Issue details
  • Type of issue
    Bug report
  • Status
     
    Closed
  • Assigned to
     lolo4j
  • Progress
       
  • Type of bug
    Not triaged
  • Likelihood
    Not triaged
  • Effect
    Not triaged
  • Posted by
     steve4j
  • Owned by
    Not owned by anyone
  • Category
    J2EE
  • Resolution
    RESOLVED
  • Priority
    Critical
  • Reproducability
    Always
  • Severity
    Critical
  • Targetted for
    icon_milestones.png JPPF 5.2.7
Issue description
We Testet this on 2 different machines.

Here the mixed output of client and node:

<Apr 5, 2017, 11:29:36,582 AM MESZ> <Warning> <org.jppf.client.ClassServerDelegateImpl> <BEA-000000> <[jppf_discovery-1-4 - ClassServer] caught java.io.EOFException: could only read 0 bytes out of 4, will re-initialise ...> 
[client: jppf_discovery-1-4 - ClassServer] Attempting connection to the class server at FDAC8040.HR.LOCAL:11111
Apr 05, 2017 11:29:36 AM org.jppf.client.ClassServerDelegateImpl run
WARNING: [jppf_discovery-1-1 - ClassServer] caught java.io.EOFException: could only read 0 bytes out of 4, will re-initialise ...
Apr 05, 2017 11:29:36 AM org.jppf.client.ClassServerDelegateImpl init
INFO: [client: jppf_discovery-1-1 - ClassServer] Attempting connection to the class server at FDAC8040.HR.LOCAL:11111
Apr 05, 2017 11:29:36 AM org.jppf.client.ClassServerDelegateImpl run
WARNING: [jppf_discovery-1-5 - ClassServer] caught java.io.EOFException: could only read 0 bytes out of 4, will re-initialise ...
Apr 05, 2017 11:29:36 AM org.jppf.client.ClassServerDelegateImpl init
INFO: [client: jppf_discovery-1-5 - ClassServer] Attempting connection to the class server at FDAC8040.HR.LOCAL:11111
Apr 05, 2017 11:29:36 AM org.jppf.client.ClassServerDelegateImpl run
WARNING: [jppf_discovery-1-2 - ClassServer] caught java.io.EOFException: could only read 0 bytes out of 4, will re-initialise ...
Apr 05, 2017 11:29:36 AM org.jppf.client.ClassServerDelegateImpl init
INFO: [client: jppf_discovery-1-2 - ClassServer] Attempting connection to the class server at FDAC8040.HR.LOCAL:11111
Apr 05, 2017 11:29:36 AM org.jppf.client.ClassServerDelegateImpl run
WARNING: [jppf_discovery-1-3 - ClassServer] caught java.io.EOFException: could only read 0 bytes out of 4, will re-initialise ...
Apr 05, 2017 11:29:36 AM org.jppf.client.ClassServerDelegateImpl init
INFO: [client: jppf_discovery-1-3 - ClassServer] Attempting connection to the class server at FDAC8040.HR.LOCAL:11111
Apr 05, 2017 11:29:36 AM org.jppf.client.ClassServerDelegateImpl run
WARNING: [jppf_discovery-1-4 - ClassServer] caught java.io.EOFException: could only read 0 bytes out of 4, will re-initialise ...
Apr 05, 2017 11:29:36 AM org.jppf.client.ClassServerDelegateImpl init
INFO: [client: jppf_discovery-1-4 - ClassServer] Attempting connection to the class server at FDAC8040.HR.LOCAL:11111
Attempting connection to the class server at 10.225.120.160:11111
process exited with code 0
Apr 05, 2017 11:29:39 AM GenericClientCommunicatorAdmin close
INFO: java.io.IOException: The connection is not currently established.
java.io.IOException: The connection is not currently established.
 at com.sun.jmx.remote.generic.ClientSynchroMessageConnectionImpl.checkState(ClientSynchroMessageConnectionImpl.java:514)
  at com.sun.jmx.remote.generic.ClientSynchroMessageConnectionImpl.sendOneWay(ClientSynchroMessageConnectionImpl.java:217)
  at javax.management.remote.generic.GenericConnector.close(GenericConnector.java:292)
  at javax.management.remote.generic.GenericConnector.close(GenericConnector.java:265)
  at javax.management.remote.generic.GenericClientCommunicatorAdmin.doStop(GenericClientCommunicatorAdmin.java:145)
 at com.sun.jmx.remote.opt.internal.ClientCommunicatorAdmin.restart(ClientCommunicatorAdmin.java:238)
  at com.sun.jmx.remote.opt.internal.ClientCommunicatorAdmin.gotIOException(ClientCommunicatorAdmin.java:133)
 at javax.management.remote.generic.GenericConnector$RequestHandler.execute(GenericConnector.java:372)
 at com.sun.jmx.remote.generic.ClientSynchroMessageConnectionImpl$RemoteJob.run(ClientSynchroMessageConnectionImpl.java:477)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)
resetting with stopJmx=true
node process id: 10724, uuid: A647D6AA-9961-27FF-7EB9-E7F7C8B5E311
Attempting connection to the class server at localhost:11111
resetting with stopJmx=true
node process id: 10724, uuid: A647D6AA-9961-27FF-7EB9-E7F7C8B5E311
Attempting connection to the class server at localhost:11111
resetting with stopJmx=true
node process id: 10724, uuid: A647D6AA-9961-27FF-7EB9-E7F7C8B5E311
Attempting connection to the class server at localhost:11111
resetting with stopJmx=true
node process id: 10724, uuid: A647D6AA-9961-27FF-7EB9-E7F7C8B5E311
Attempting connection to the class server at localhost:11111
resetting with stopJmx=true
node process id: 10724, uuid: A647D6AA-9961-27FF-7EB9-E7F7C8B5E311
Attempting connection to the class server at localhost:11111
resetting with stopJmx=true
node process id: 10724, uuid: A647D6AA-9961-27FF-7EB9-E7F7C8B5E311
Attempting connection to the class server at localhost:11111
resetting with stopJmx=true
node process id: 10724, uuid: A647D6AA-9961-27FF-7EB9-E7F7C8B5E311
Attempting connection to the class server at localhost:11111
resetting with stopJmx=true
node process id: 10724, uuid: A647D6AA-9961-27FF-7EB9-E7F7C8B5E311
Attempting connection to the class server at localhost:11111
resetting with stopJmx=true
node process id: 10724, uuid: A647D6AA-9961-27FF-7EB9-E7F7C8B5E311
Attempting connection to the class server at 10.225.120.160:11111
[client: jppf_discovery-1-1 - ClassServer] Reconnected to the class server
[client: jppf_discovery-1-4 - ClassServer] Reconnected to the class server
[client: jppf_discovery-1-3 - ClassServer] Reconnected to the class server
[client: jppf_discovery-1-5 - ClassServer] Reconnected to the class server
Apr 05, 2017 11:30:41 AM org.jppf.client.ClassServerDelegateImpl init
INFO: [client: jppf_discovery-1-1 - ClassServer] Reconnected to the class server
Apr 05, 2017 11:30:41 AM org.jppf.client.ClassServerDelegateImpl init
INFO: [client: jppf_discovery-1-4 - ClassServer] Reconnected to the class server
Apr 05, 2017 11:30:41 AM org.jppf.client.ClassServerDelegateImpl init
INFO: [client: jppf_discovery-1-3 - ClassServer] Reconnected to the class server
Apr 05, 2017 11:30:41 AM org.jppf.client.ClassServerDelegateImpl init
INFO: [client: jppf_discovery-1-5 - ClassServer] Reconnected to the class server
Apr 05, 2017 11:30:41 AM org.jppf.client.ClassServerDelegateImpl init
INFO: [client: jppf_discovery-1-2 - ClassServer] Reconnected to the class server
[client: jppf_discovery-1-2 - ClassServer] Reconnected to the class server
RemoteClassLoaderConnection: Reconnected to the class server
JPPF Node management initialized on port 12001
Attempting connection to the node server at 10.225.120.160:11111
Reconnected to the node server
Node successfully initialized
Apr 05, 2017 11:31:38 AM org.jppf.client.balancer.ChannelWrapperRemote$RemoteRunnable run
WARNING: java.net.SocketException: Software caused connection abort: socket write error
Apr 05, 2017 11:31:38 AM org.jppf.client.balancer.ChannelWrapperRemote$RemoteRunnable run
SEVERE: future already removed
java.lang.IllegalStateException: future already removed
  at org.jppf.client.balancer.ClientJob.taskCompleted(ClientJob.java:327)
 at org.jppf.client.balancer.ClientTaskBundle.taskCompleted(ClientTaskBundle.java:174)
 at org.jppf.client.balancer.ChannelWrapperRemote$RemoteRunnable.run(ChannelWrapperRemote.java:250)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)
 
Apr 05, 2017 11:31:38 AM org.jppf.client.TaskServerConnectionHandler init
INFO: [client: jppf_discovery-1-3 - TasksServer] Attempting connection to the task server at FDAC8040.HR.LOCAL:11111
Apr 05, 2017 11:31:38 AM org.jppf.client.TaskServerConnectionHandler init
INFO: [client: jppf_discovery-1-3 - TasksServer] Reconnected to the JPPF task server
<Apr 5, 2017, 11:31:38,371 AM MESZ> <Warning> <org.jppf.client.balancer.ChannelWrapperRemote> <BEA-000000> <java.net.SocketException: Software caused connection abort: socket write error> 
[client: jppf_discovery-1-3 - TasksServer] Attempting connection to the task server at FDAC8040.HR.LOCAL:11111
<Apr 5, 2017, 11:31:38,385 AM MESZ> <Error> <org.jppf.client.balancer.ChannelWrapperRemote> <BEA-000000> <future already removed
java.lang.IllegalStateException: future already removed
 at org.jppf.client.balancer.ClientJob.taskCompleted(ClientJob.java:327)
 at org.jppf.client.balancer.ClientTaskBundle.taskCompleted(ClientTaskBundle.java:174)
 at org.jppf.client.balancer.ChannelWrapperRemote$RemoteRunnable.run(ChannelWrapperRemote.java:250)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  Truncated. see log file for complete stacktrace
> 
[client: jppf_discovery-1-3 - TasksServer] Reconnected to the JPPF task server


Steps to reproduce this issue
  1. Install the Test WebLogic Client in a WebLogic 12c
  2. Install a Node and Driver
  3. Start a Submit job over the Web
  4. Stop the Driver
  5. Wait like 10s
  6. Start the Driver
  7. Try to Submit a job over the Web

#3
Comment posted by
 lolo4j
Apr 06, 09:27
I reproduced with Weblogic 12c, but also with a standalone client, using the following JUnit test:
public class TestJPPFDriverAdminMBean2 extends Setup1D1N1C {
  @Test(timeout = 10000)
  public void testRestartDriver() throws Exception {
    int nbTasks = 1;
    long duration = 1L;
    final JMXDriverConnectionWrapper driver = BaseSetup.getJMXConnection(client);
    print(false, false, "submitting job 1");
    List<Task<?>> results = client.submitJob(BaseTestHelper.createJob(ReflectionUtils.getCurrentMethodName() + "-1", true, false, nbTasks, LifeCycleTask.class, duration));
    assertNotNull(results);
    assertEquals(nbTasks, results.size());
    for (Task<?> task: results) {
      assertNotNull(task);
      assertNotNull(task.getResult());
      assertNull(task.getThrowable());
    }
    print(false, false, "restarting driver");
    new Thread(new Runnable() {
      @Override
      public void run() {
        try {
          driver.restartShutdown(1L, 1000L);
        } catch (Exception e) {
          e.printStackTrace();
        }
      }
    }).start();
    JPPFConnectionPool pool;
    print(false, false, "waiting for 0 connection");
    while (!client.findConnectionPools(JPPFClientConnectionStatus.ACTIVE, JPPFClientConnectionStatus.EXECUTING).isEmpty()) Thread.sleep(10L);
    print(false, false, "waiting for 1 connection");
    while ((pool = client.awaitWorkingConnectionPool()) == null) Thread.sleep(10L);
    print(false, false, "submitting job 2");
    results = client.submitJob(BaseTestHelper.createJob(ReflectionUtils.getCurrentMethodName() + "-2", true, false, nbTasks, LifeCycleTask.class, duration));
    assertNotNull(results);
    assertEquals(nbTasks, results.size());
    for (Task<?> task: results) {
      assertNotNull(task);
      assertNotNull(task.getResult());
      assertNull(task.getThrowable());
    }
  }
}
#4
Comment posted by
 lolo4j
Apr 06, 09:38
A file was uploaded. Generated logs from TestJPPFDriverAdminMBean2 testicon_open_new.png This comment was attached:

Attached full set of logs generated by the TestJPPFDriverAdminMBean2 test JUnit test
#6
Comment posted by
 lolo4j
Apr 07, 07:30
Fixed in:
#7
Comment posted by
 lolo4j
Apr 07, 08:00
Released the fix in patch 01 for jPPF 5.2.6