The current design of the provisioning is that the master node only knows its slaves as local processes (as in executables) on the same machine, basically as java.lang.Process objects. When a slave node is closed, it is not shut down gracefully as with the JMX operation; it is killed with a Process.destroy() call, the equivalent of (I think) a kill -9 on Linux. So currently the master doesn't know what its slaves are doing: it can only start them, detect when they die, and kill them. One consequence is that the master does not know the JMX port of its slaves, which prevents any management operation from occurring within a master/slaves group.
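To illustrate, the master's entire view of a slave today is essentially a Process handle: start it, see whether it is alive, kill it forcibly. A minimal sketch (the "sleep" command is just a placeholder standing in for the actual slave launch command):

```java
import java.io.IOException;

// Minimal sketch of the master's current view of a slave: just a local
// java.lang.Process handle. It can start the slave, see whether it is
// alive, and kill it forcibly -- nothing more.
public class SlaveHandle {
    private final Process process;

    private SlaveHandle(Process process) {
        this.process = process;
    }

    // Start a slave as a plain local process; the master keeps only this handle.
    public static SlaveHandle launch(String... command) throws IOException {
        return new SlaveHandle(new ProcessBuilder(command).start());
    }

    public boolean isAlive() {
        return process.isAlive();
    }

    // "Closing" a slave today is a hard kill (roughly kill -9 on Linux),
    // not a graceful shutdown.
    public void kill() throws InterruptedException {
        process.destroyForcibly();
        process.waitFor(); // block until the OS reports the process as dead
    }
}
```

Note that nothing here carries a JMX port or any other management handle, which is exactly the limitation described above.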
On the other hand, there is a mechanism which creates a TCP connection between a master and each of its slaves. Each slave reads on that connection, but nothing is ever sent by the master. The intent is that, when a master dies, an IOException is raised in each slave, which then automatically closes itself; this avoids leaving Java processes hanging on the local machine. We should be able to expand this mechanism into a basic protocol, allowing the master to tell a node whether to shut down immediately or to wait until the current tasks are complete. Toggling the node's active state is not an option here, because that state is only maintained on the server side: it indicates whether the node is available for job scheduling. So we'd need to slightly modify the protocol between node and server, simply adding a flag in the node's response header, to prevent the server from scheduling a new job on the node in the window between the node sending its response and actually shutting down.
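The slave's end of that expanded channel could look roughly like this; the opcodes and method names are assumptions for illustration, not part of JPPF:

```java
import java.io.IOException;
import java.io.InputStream;

// Sketch of the slave's end of the master->slave TCP channel. Today the
// slave only blocks on read() and closes itself when the read fails
// (master died); the two opcodes below are a hypothetical extension.
public class SlaveChannelListener {
    public static final int SHUTDOWN_NOW = 1;       // assumed opcode
    public static final int SHUTDOWN_WHEN_IDLE = 2; // assumed opcode

    // Blocks until a known command arrives, or returns -1 when the stream
    // ends or breaks -- the existing "master died, close yourself" case.
    public static int awaitCommand(InputStream fromMaster) {
        try {
            int b;
            while ((b = fromMaster.read()) != -1) {
                if (b == SHUTDOWN_NOW || b == SHUTDOWN_WHEN_IDLE) {
                    return b;
                }
                // other bytes are ignored, preserving today's silent behavior
            }
        } catch (IOException e) {
            // connection broke: treat the same as end of stream
        }
        return -1;
    }
}
```

The key property preserved from the current design is that a dead master still results in the slave closing itself, since both EOF and IOException fall through to the same exit path.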
In comparison, it will be much easier to add an "onTaskComplete" flag to the shutdown() JMX call.
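A hypothetical sketch of that simpler change follows; the interface and member names are assumptions, not JPPF's actual management API:

```java
// Hypothetical sketch of the proposed change: an overload of shutdown()
// carrying an "onTaskComplete" flag. Names are assumed for illustration.
public class NodeAdminSketch {
    public interface NodeAdminMBean {
        void shutdown();                        // existing: shut down immediately
        void shutdown(boolean onTaskComplete);  // proposed: defer until tasks finish
    }

    // Trivial in-memory implementation showing the intended semantics.
    public static class RecordingNodeAdmin implements NodeAdminMBean {
        public boolean shutdownRequested;
        public boolean deferredUntilTasksComplete;

        @Override public void shutdown() {
            shutdown(false);
        }

        @Override public void shutdown(boolean onTaskComplete) {
            shutdownRequested = true;
            deferredUntilTasksComplete = onTaskComplete;
        }
    }
}
```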
Thanks for considering this.
I didn't think it was appropriate for a maintenance release but the form demanded I enter a target, and 4.3, which I thought it might fit into, was not available. I will expand on the use case of how/why I would find such a feature useful.
In my environment (Rackspace) I pay per minute for my nodes, so it is important to me to structure my network in such a way as to shut down nodes as soon as possible when they complete their work. When my queue gets short I want to restructure some servers on which I have excess nodes, to make sure each node is using the full CPU to finish the work more quickly (e.g., I run a master + 3 slave nodes on a 2GB/2CPU machine, but when the queue gets short I want to reduce that "smoothly" to only 2 nodes, letting the 3rd and 4th nodes' current work complete without starting new tasks on them).
At present I have a monitoring process that detects when the queue is empty and nodes are idle, and shuts them down. But it would be nice to just do a single loop when the queue empties, telling every node to shut down when it's done. And even better, for nodes to be able to do a final task upon shutting down (sending the API a request to delete themselves, although this can probably be server-side scripted...).
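That single drain loop could look roughly like this; Node, the queue-size check, and shutdown(boolean) are placeholders rather than JPPF's real API:

```java
import java.util.List;

// Sketch of the drain loop described above: once the queue is empty, tell
// every node to shut down as soon as its current task completes. Node and
// shutdown(boolean) are placeholders, not JPPF's actual management API.
public class DrainMonitor {
    public interface Node {
        void shutdown(boolean onTaskComplete);
    }

    // Returns how many nodes were told to drain; does nothing while the
    // queue still has work.
    public static int drainIfEmpty(int queueSize, List<Node> nodes) {
        if (queueSize > 0) {
            return 0;
        }
        for (Node node : nodes) {
            node.shutdown(true); // finish the current task, then exit
        }
        return nodes.size();
    }
}
```

This replaces per-node idle polling with one pass triggered by the queue becoming empty, which matches the use case of winding down pay-per-minute machines quickly.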
Incidentally, this will make it very easy to implement Feature request JPPF-6 - Improvements for nodes in idle mode