Driver and node JVM health monitoring MBean

From JPPF 6.2 Documentation


The JPPF management APIs provide basic capabilities to monitor the JVM of remote nodes and drivers in a grid.

These capabilities include:

  • memory usage (heap and non-heap)
  • CPU load
  • count of live threads and deadlock detection
  • triggering remote thread dumps and displaying them locally
  • triggering remote garbage collections
  • triggering remote heap dumps
  • customizable health and diagnostic indicators


These features are available via the built-in MBean interface DiagnosticsMBean, defined as follows:

public interface DiagnosticsMBean {
  // The name of this MBean in a driver
  String MBEAN_NAME_DRIVER = "org.jppf:name=diagnostics,type=driver";
  // The name of this MBean in a node
  String MBEAN_NAME_NODE = "org.jppf:name=diagnostics,type=node";
  
  // Get the memory usage info for the whole JVM
  MemoryInformation memoryInformation() throws Exception;
  // Perform a garbage collection, equivalent to System.gc()
  void gc() throws Exception;
  // Get a full thread dump, including detection of deadlocks
  ThreadDump threadDump() throws Exception;
  // Determine whether a deadlock is detected in the JVM
  Boolean hasDeadlock() throws Exception;
  // Get a summarized snapshot of the JVM health
  HealthSnapshot healthSnapshot() throws Exception;
  // Trigger a heap dump of the JVM
  String heapDump() throws Exception;
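  // Get an approximation of the current CPU load
  Double cpuLoad();
  // Get the descriptors of the properties found in the health snapshots
  List<JPPFProperty<?>> getMonitoringDataProperties();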
}

Example usage:

// connect to the driver's JMX server
JMXDriverConnectionWrapper driver = new JMXDriverConnectionWrapper(driverHost, driverPort);
driver.connectAndWait(5000L);
// obtain a proxy to the diagnostics MBean
DiagnosticsMBean diagnosticsMBean = driver.getDiagnosticsProxy();
// get a thread dump of the remote JVM
ThreadDump threadDump = diagnosticsMBean.threadDump();
// format the thread dump as easily readable text
String s = TextThreadDumpWriter.printToString(threadDump, "driver thread dump");
System.out.println(s);

The MBean interface is exactly the same for nodes and drivers, only the MBean name varies.

1 Memory usage

Detailed memory usage information can be obtained by calling the method DiagnosticsMBean.memoryInformation(). This method returns an instance of MemoryInformation, defined as follows:

public class MemoryInformation implements Serializable {
  // Get the heap memory usage
  public MemoryUsageInformation getHeapMemoryUsage()

  // Get the non-heap memory usage
  public MemoryUsageInformation getNonHeapMemoryUsage()
}

Both heap and non-heap usage are provided as instances of the class MemoryUsageInformation:

public class MemoryUsageInformation implements Serializable {
  // Get the initial memory size
  public long getInit()

  // Get the current memory size
  public long getCommitted()

  // Get the used memory size
  public long getUsed()

  // Get the maximum memory size
  public long getMax()

  // Return the ratio of used memory over max available memory
  public double getUsedRatio()
}
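
For comparison, the standard java.lang.management API exposes the same four figures (init, committed, used, max) for the local JVM. The following is a minimal sketch that reads them and derives a used-over-max ratio. The class LocalMemorySketch is hypothetical and not part of JPPF. It is also an assumption that getUsedRatio() is computed as used / max, and that an undefined maximum should be reported as -1.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of JPPF: reads the memory figures of the local JVM.
class LocalMemorySketch {
  // Ratio of used over max memory. Assumption: -1 is returned when the JVM
  // reports no defined maximum (MemoryUsage.getMax() returns -1 in that case).
  static double usedRatio(long used, long max) {
    return (max <= 0L) ? -1d : (double) used / (double) max;
  }

  // Format one MemoryUsage with the same fields as the getters listed above.
  static String describe(String label, MemoryUsage usage) {
    return String.format("%s: init=%d committed=%d used=%d max=%d usedRatio=%.3f",
      label, usage.getInit(), usage.getCommitted(), usage.getUsed(), usage.getMax(),
      usedRatio(usage.getUsed(), usage.getMax()));
  }

  public static void main(String[] args) {
    MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
    System.out.println(describe("heap", memory.getHeapMemoryUsage()));
    System.out.println(describe("non-heap", memory.getNonHeapMemoryUsage()));
  }
}
```

Non-heap memory often has no defined maximum, which is why the sketch guards against a non-positive max before dividing.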

2 Thread dumps

You can trigger and obtain a full thread dump of the remote JVM by calling DiagnosticsMBean.threadDump(). This method returns an instance of ThreadDump. Please see the Javadoc for full details of this class.

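JPPF also provides facilities to easily translate thread dumps into readable formats. There are two classes that you can use to print a thread dump to a character stream:

  • TextThreadDumpWriter prints a thread dump as plain text
  • HTMLThreadDumpWriter prints a thread dump as HTML
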
Both implement the ThreadDumpWriter interface, defined as follows:

public interface ThreadDumpWriter extends Closeable {
  // Print the specified string without line terminator
  void printString(String message);

  // Print the deadlocked threads information
  void printDeadlocks(ThreadDump threadDump);

  // Print information about a thread
  void printThread(ThreadInformation threadInformation);

  // Print the specified thread dump
  void printThreadDump(ThreadDump threadDump);
}

Each of these classes provides a static method to print a thread dump directly into a String:

  • static String TextThreadDumpWriter.printToString(ThreadDump tdump, String title)
  • static String HTMLThreadDumpWriter.printToString(ThreadDump tdump, String title)


Example usage:

// get a thread dump from a remote node or driver
DiagnosticsMBean diagnostics = ...;
ThreadDump tdump = diagnostics.threadDump();
// we will print it to an HTML file
FileWriter fileWriter = new FileWriter("MyThreadDump.html");
HTMLThreadDumpWriter htmlPrinter = new HTMLThreadDumpWriter(fileWriter, "My Title");
htmlPrinter.printThreadDump(tdump);
// close the underlying writer
htmlPrinter.close();

Here is an example of the output, as rendered in the JPPF administration console:

[Image: tdump.gif]
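
To give an idea of what a thread dump contains, the sketch below builds a plain text dump of the local JVM with the standard ThreadMXBean API. It is only an illustration: the class LocalThreadDumpSketch is hypothetical, and nothing here shows how JPPF collects or formats its own ThreadDump objects.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of JPPF: plain text thread dump of the local JVM.
class LocalThreadDumpSketch {
  static String dump(String title) {
    ThreadMXBean threads = ManagementFactory.getThreadMXBean();
    // Only request lock information that this JVM is able to provide
    ThreadInfo[] infos = threads.dumpAllThreads(
      threads.isObjectMonitorUsageSupported(), threads.isSynchronizerUsageSupported());
    StringBuilder sb = new StringBuilder(title).append('\n');
    // ThreadInfo.toString() prints the thread name, its state, the locks it
    // holds or waits for, and an abbreviated stack trace
    for (ThreadInfo info : infos) sb.append(info);
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(dump("local thread dump"));
  }
}
```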

3 Health snapshots

You can obtain a summarized snapshot of the JVM state by calling DiagnosticsMBean.healthSnapshot(), which returns an object of type HealthSnapshot, defined as follows:

public class HealthSnapshot implements Serializable {
  // Get all key / value pairs representing monitoring data in this snapshot
  public TypedProperties getProperties()
  // Get the value of a property as an int
  public int getInt(String name)
  // Get the value of a property as a long
  public long getLong(String name)
  // Get the value of a property as a float
  public float getFloat(String name)
  // Get the value of a property as a double
  public double getDouble(String name)
  // Get the value of a property as a boolean
  public boolean getBoolean(String name)
  // Get the value of a property as a String
  public String getString(String name)
  // Get the value of a property as a String representing a duration
  public String getDurationString(String name)
}

The complete list of properties available in a snapshot is obtained by invoking the getProperties() method. Some of these properties may be contributed by custom monitoring data providers and therefore cannot be known in advance. JPPF provides a number of built-in properties, which are enumerated in the MonitoringConstants class.

Example usage:

JMXNodeConnectionWrapper node = ...;
DiagnosticsMBean diag = node.getDiagnosticsProxy();
// get a snapshot of the node's health
HealthSnapshot snapshot = diag.healthSnapshot();
// retrieve and print the node's system cpu load in %
double cpuLoad = snapshot.getDouble(MonitoringConstants.SYSTEM_CPU_LOAD);
System.out.println("node cpu load is " + cpuLoad + " %");

4 Introspecting the properties of the health snapshots

The method DiagnosticsMBean.getMonitoringDataProperties() provides metadata about the properties found in health snapshots. It returns a list of JPPFProperty instances, as described in the JPPF configuration API section.

Example usage:

JMXDriverConnectionWrapper driver = ...;
DiagnosticsMBean diag = driver.getDiagnosticsProxy();
// get the descriptors for all available monitoring data properties
List<JPPFProperty<?>> properties = diag.getMonitoringDataProperties();
// print the name and type of each property
for (JPPFProperty<?> property: properties) {
  System.out.println("property " + property.getName() + " is a " + property.valueType());
}

5 CPU load

The CPU load of a remote JVM can be obtained separately by calling DiagnosticsMBean.cpuLoad(). This method returns an approximation of the latest computed CPU load in the JVM. It is important to understand that the CPU load is not computed each time this method is called. Instead, it is computed at regular intervals, and the latest computed value is returned. The purpose of this is to prevent the computation itself from using up too much CPU time, which would throw off the computed value.

The actual computed value is equal to SUM_i{cpuTime(thread_i)} / interval, where the sum is taken over all the threads i that are live in the JVM at the time of the computation. This can introduce errors, since threads may have been created and then died between two computations. However, in most cases this is a reasonable approximation that does not tax the CPU too heavily.

The interval between computations can be adjusted by setting the following property in a node or driver configuration: jppf.cpu.load.computation.interval = time_in_millis. If unspecified, the default value is 1000 ms.
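
The computation described above can be sketched with the standard ThreadMXBean API. This is an illustration under stated assumptions, not JPPF's actual code: the class CpuLoadSketch is hypothetical, and cpuTime is interpreted as the CPU time consumed by the live threads between two samples. The sketch uses the documented default interval of 1000 ms.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of JPPF: approximates the CPU load of the local JVM.
class CpuLoadSketch {
  // Sum of the cumulative CPU times, in nanoseconds, of all currently live threads.
  static long totalCpuTimeNanos(ThreadMXBean threads) {
    if (!threads.isThreadCpuTimeSupported() || !threads.isThreadCpuTimeEnabled()) return 0L;
    long sum = 0L;
    for (long id : threads.getAllThreadIds()) {
      long cpuTime = threads.getThreadCpuTime(id); // -1 if the thread died in the meantime
      if (cpuTime > 0L) sum += cpuTime;
    }
    return sum;
  }

  // CPU time consumed between two samples, divided by the interval between them.
  // The difference is clamped at 0: threads that died between the samples no
  // longer contribute to the second sum, which is the source of error noted above.
  static double load(long previousNanos, long currentNanos, long intervalNanos) {
    if (intervalNanos <= 0L) return 0d;
    return (double) Math.max(0L, currentNanos - previousNanos) / (double) intervalNanos;
  }

  public static void main(String[] args) throws InterruptedException {
    long intervalMillis = 1000L; // same value as the documented default interval
    ThreadMXBean threads = ManagementFactory.getThreadMXBean();
    long startWall = System.nanoTime();
    long startCpu = totalCpuTimeNanos(threads);
    Thread.sleep(intervalMillis);
    long endCpu = totalCpuTimeNanos(threads);
    long elapsed = System.nanoTime() - startWall;
    System.out.printf("approximate CPU load: %.4f%n", load(startCpu, endCpu, elapsed));
  }
}
```

On a multi-core machine this raw ratio can exceed 1. Whether and how the value is normalized by the number of processors is not stated here, so the sketch leaves it unnormalized.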

Example usage:

JMXDriverConnectionWrapper driver = ...;
DiagnosticsMBean diag = driver.getDiagnosticsProxy();
// get the cpu load and convert to percentage
double cpuPercent = 100d * diag.cpuLoad();
// format with a precision of 2 decimals
String formatted = String.format("%.2f", cpuPercent);
System.out.println("the driver CPU load is " + formatted + " %");

6 Deadlock indicator

To determine whether a JVM has deadlocked threads, simply call DiagnosticsMBean.hasDeadlock(). This method is useful if you only need that information, without having to process an entire thread dump, which represents a significant overhead, especially from a network perspective.

Example usage:

DiagnosticsMBean diag = ...;
// check whether the node has a deadlock
if (diag.hasDeadlock()) {
  // print the thread dump, including the deadlock information
  ThreadDump threadDump = diag.threadDump();
  System.out.println(threadDump.toPlainTextString("deadlock information"));
}
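
The difference in cost between the two calls can be seen in a local equivalent: a deadlock check only needs to return a boolean, whereas a thread dump carries the state and stack of every thread. The sketch below performs such a check with the standard ThreadMXBean API. The class DeadlockCheckSketch is hypothetical, and it is an assumption that a JPPF node or driver detects deadlocks in a comparable way.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of JPPF: deadlock check on the local JVM.
class DeadlockCheckSketch {
  static boolean hasDeadlock() {
    ThreadMXBean threads = ManagementFactory.getThreadMXBean();
    // findDeadlockedThreads() also covers java.util.concurrent locks, but it is
    // only available when the JVM can monitor ownable synchronizers
    long[] ids = threads.isSynchronizerUsageSupported()
      ? threads.findDeadlockedThreads()
      : threads.findMonitorDeadlockedThreads();
    // both methods return null when no thread is deadlocked
    return ids != null && ids.length > 0;
  }

  public static void main(String[] args) {
    System.out.println("deadlock detected: " + hasDeadlock());
  }
}
```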

7 Triggering a heap dump

Remotely triggering a JVM heap dump is done by calling DiagnosticsMBean.heapDump(). This method is JVM implementation-dependent, as it relies on non-standard APIs. Thus, it will only work with the Oracle standard and JRockit JVMs, along with the IBM JVM. The returned value is a description of the outcome: it will contain the name of the generated heap dump file for Oracle JVMs, and only a success indicator for IBM JVMs.
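
As an illustration of what relying on a non-standard API involves, the sketch below triggers a heap dump of the local JVM through the HotSpot-specific HotSpotDiagnostic MBean, and returns a description of the outcome. The class HeapDumpSketch is hypothetical, and it is an assumption that this is the kind of API meant above: which APIs JPPF actually calls is not stated here. The MBean is invoked by name so that the code still compiles and runs on JVMs that do not provide it.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical helper, not part of JPPF: heap dump of the local JVM on HotSpot.
class HeapDumpSketch {
  // Name of the HotSpot-specific diagnostic MBean, absent on other JVM implementations
  static final String HOTSPOT_DIAGNOSTIC = "com.sun.management:type=HotSpotDiagnostic";

  static boolean isSupported() {
    try {
      MBeanServer server = ManagementFactory.getPlatformMBeanServer();
      return server.isRegistered(new ObjectName(HOTSPOT_DIAGNOSTIC));
    } catch (Exception e) {
      return false;
    }
  }

  // Returns a description of the outcome rather than throwing an exception
  static String heapDump(String fileName) {
    if (!isSupported()) return "heap dump not supported on this JVM";
    try {
      MBeanServer server = ManagementFactory.getPlatformMBeanServer();
      // dumpHeap(String outputFile, boolean live): 'true' dumps only reachable objects
      server.invoke(new ObjectName(HOTSPOT_DIAGNOSTIC), "dumpHeap",
        new Object[] { fileName, Boolean.TRUE },
        new String[] { "java.lang.String", "boolean" });
      return "heap dump saved to " + fileName;
    } catch (Exception e) {
      return "heap dump failed: " + e;
    }
  }

  public static void main(String[] args) {
    // recent JDKs require the .hprof extension and refuse to overwrite a file
    System.out.println(heapDump("heapdump-" + System.currentTimeMillis() + ".hprof"));
  }
}
```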

8 Triggering a garbage collection

A remote garbage collection can be triggered by calling DiagnosticsMBean.gc(). This method remotely calls System.gc() and thus has the exact same semantics and constraints.
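
The main constraint is that System.gc() is only a request: the JVM gives no guarantee about how much memory is reclaimed, and some JVM options disable explicit collections entirely. The sketch below, using a hypothetical class GcSketch that is not part of JPPF, measures the used heap of the local JVM before and after such a request, which is one way to observe the actual effect.

```java
// Hypothetical helper, not part of JPPF: observes the effect of System.gc() locally.
class GcSketch {
  // Heap currently in use, in bytes, as seen by the Runtime API
  static long usedHeap() {
    Runtime runtime = Runtime.getRuntime();
    return runtime.totalMemory() - runtime.freeMemory();
  }

  public static void main(String[] args) {
    long before = usedHeap();
    System.gc(); // a request only: the JVM may reclaim little or nothing
    long after = usedHeap();
    System.out.println("used heap before gc: " + before + " bytes, after gc: " + after + " bytes");
  }
}
```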

JPPF Copyright © 2005-2020 JPPF.org