What does the sample do?
This sample performs the multiplication of two square dense matrices on a GPU.
The GPU computation is handled by the APARAPI library, which provides Java bindings for OpenCL.
How does it work?
This sample submits a JPPF job with one or more tasks to be executed on a GPU.
The tasks contain APARAPI-conformant code, whose bytecode is introspected at runtime to generate OpenCL code (see an example here).
This generated code is then compiled and executed on an OpenCL device, if one is available.
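To illustrate what the kernel computes, here is the same row-by-column multiplication written in plain Java over row-major one-dimensional float arrays. The class and method names below are illustrative, not the sample's actual code; in the APARAPI version, only the per-element computation is expressed in a Kernel's run() method, and each (row, col) pair becomes one OpenCL work item.

```java
// Plain-Java sketch of the matrix multiplication the generated OpenCL code performs.
// Class and method names are illustrative, not taken from the sample.
public class MatrixMultiplyDemo {

  /**
   * Multiplies two n x n matrices stored as row-major 1-D float arrays.
   * On the GPU, each (row, col) result is computed by a separate work item.
   */
  public static float[] multiply(float[] a, float[] b, int n) {
    float[] c = new float[n * n];
    for (int row = 0; row < n; row++) {
      for (int col = 0; col < n; col++) {
        float sum = 0f;
        for (int k = 0; k < n; k++) {
          sum += a[row * n + k] * b[k * n + col];
        }
        c[row * n + col] = sum;
      }
    }
    return c;
  }

  public static void main(String[] args) {
    float[] a = {1f, 2f, 3f, 4f};          // 2x2 matrix [[1, 2], [3, 4]]
    float[] identity = {1f, 0f, 0f, 1f};   // 2x2 identity matrix
    float[] c = multiply(a, identity, 2);
    System.out.println(java.util.Arrays.toString(c)); // prints [1.0, 2.0, 3.0, 4.0]
  }
}
```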
How do I run it?
Before running this sample application, you must have a JPPF server and at least one node running.
For information on how to set up a node and server, please refer to the JPPF documentation.
The node requires some additional configuration: since the APARAPI library loads a native library, the file "aparapi.jar" must be added directly to the node's classpath.
If you simply keep it in the client's classpath, the node will attempt to load it for each distinct client; because the JVM allows a native library to be loaded by only one class loader, this will work the first time and fail on subsequent attempts.
For your convenience, we have included a set of files that will take care of this:
- copy the file jppf-node.properties in GPU/config/node to your node's config/ folder to replace the existing config file with the new version.
You will notice that this configuration file has APARAPI-specific settings in the jppf.jvm.options property.
- copy the file GPU/lib/aparapi.jar to your node's lib/ folder
- copy the appropriate native library from GPU/lib/ for your platform to the node's lib/ folder:
- aparapi_x86.dll or aparapi_x86_64.dll for Windows 32/64 bits platforms
- libaparapi_x86.so or libaparapi_x86_64.so for Linux 32/64 bits
- libaparapi_x86_64.dylib for 64 bits Mac OS
- Once this is done, you can start the server and node, then run the sample by typing "run.bat" on Windows or "./run.sh" on Linux/Unix
- During the execution, the node will print out a message indicating whether the task was actually executed on a GPU, plus additional information on the OpenCL devices available to the platform
- if the task cannot be executed on a GPU, it will fall back to executing in a Java thread pool (i.e. CPU-bound)
How do I use it?
This sample doesn't have a graphical user interface; however, you can modify some of the parameters in the JPPF configuration file:
- open the file "config/jppf-client.properties" in a text editor
- at the end of the file, you will see the following properties:
# number of jobs to submit in sequence
iterations = 10
# number of tasks in each job
tasksPerJob = 1
# the size of the matrices to multiply
matrixSize = 1500
# execution mode, either GPU or JTP (Java Thread Pool)
execMode = GPU
- You can experiment with various values, for instance to find out the JTP vs. GPU execution speedup.
You may find that, for relatively small matrix sizes, there is no speedup, due to the overhead of generating and compiling the OpenCL code.
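The settings above form a small properties file. As a minimal plain-Java sketch of how such keys could be parsed (the sample itself presumably uses JPPF's own configuration API; the helper and class names below are illustrative only):

```java
import java.io.StringReader;
import java.util.Properties;

// Illustrative sketch of reading the sample's tuning keys with java.util.Properties.
// The actual sample presumably relies on JPPF's configuration mechanism instead.
public class ConfigDemo {

  /** Reads an integer property, falling back to a default when the key is absent. */
  public static int readInt(Properties p, String key, int def) {
    String v = p.getProperty(key);
    return v == null ? def : Integer.parseInt(v.trim());
  }

  public static void main(String[] args) throws Exception {
    String text = "iterations = 10\ntasksPerJob = 1\nmatrixSize = 1500\nexecMode = GPU\n";
    Properties p = new Properties();
    p.load(new StringReader(text));
    int iterations = readInt(p, "iterations", 1);
    int tasksPerJob = readInt(p, "tasksPerJob", 1);
    int matrixSize = readInt(p, "matrixSize", 1000);
    String execMode = p.getProperty("execMode", "JTP").trim();
    System.out.println(iterations + " " + tasksPerJob + " " + matrixSize + " " + execMode);
  }
}
```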
Sample's source files
- MatrixKernel.java: this is the class that will be translated into OpenCL code
- GeneratedOpenCL.c: the OpenCL code generated from MatrixKernel which will be actually executed on the GPU
- AparapiTask.java: the JPPF task which invokes the GPU bindings API
- AparapiRunner.java: the JPPF client application which submits the jobs to the grid
- SquareMatrix.java: a simple representation of a square dense matrix, whose values are stored in a one-dimensional float array
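As a sketch of what a class like SquareMatrix might look like, the one-dimensional storage means an element at (row, col) lives at index row * size + col in the backing float array. The class and method names below are hypothetical and may differ from the sample's actual SquareMatrix.java:

```java
// Hypothetical sketch of a square dense matrix backed by a row-major 1-D float array.
// Names are illustrative; the sample's SquareMatrix.java may differ.
public class SquareMatrixDemo {
  private final int size;
  private final float[] values;

  public SquareMatrixDemo(int size) {
    this.size = size;
    this.values = new float[size * size];
  }

  /** Returns the element at (row, col), using row-major 1-D indexing. */
  public float getValueAt(int row, int col) {
    return values[row * size + col];
  }

  /** Sets the element at (row, col). */
  public void setValueAt(int row, int col, float value) {
    values[row * size + col] = value;
  }

  public int getSize() {
    return size;
  }
}
```

Storing the values in a single flat array (rather than a float[][]) keeps the data in one contiguous buffer, which is the layout OpenCL kernels expect.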
How can I build the sample?
To compile the source code, from a command prompt, type: "ant compile"
To generate the Javadoc, from a command prompt, type: "ant javadoc"
I have additional questions and comments, where can I go?
If you need more insight into the code of this demo, you can consult the source, or have a look at the
In addition, there are two privileged places you can go to: