What does it do?
From JPPF 6.2 Documentation
1 Definition
In JPPF, the goal of a load-balancer is to compute the distribution of the tasks in a job, to one or more drivers on the client side, or to one or more nodes on the server side, in order to optimize the performance of the job's execution.
Practically, a load-balancer will determine how the tasks in a job are split into multiple disjoint subsets, where each subset is sent to a separate driver or node. Since the tasks in a job are known and ordered, it is enough for the load-balancer to compute the size of the subsets.
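Because the tasks are ordered, the split itself is mechanical once the sizes are known. The following is a minimal sketch of that idea, not JPPF code: `BundleSplitter` and its `split` method are hypothetical names introduced for illustration, and the bundle sizes are assumed to have been computed already, one per channel.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the JPPF API.
final class BundleSplitter {
  private BundleSplitter() {}

  /**
   * Splits an ordered task list into consecutive, disjoint bundles.
   * sizes[i] is the bundle size assumed to have been computed for the i-th channel.
   * The last bundle is truncated if fewer tasks remain; tasks beyond the sum
   * of the sizes are simply not included in any bundle.
   */
  static <T> List<List<T>> split(List<T> tasks, int[] sizes) {
    List<List<T>> bundles = new ArrayList<>();
    int start = 0;
    for (int size : sizes) {
      if (size <= 0) throw new IllegalArgumentException("bundle size must be positive");
      if (start >= tasks.size()) break; // no tasks left to distribute
      int end = Math.min(start + size, tasks.size());
      bundles.add(new ArrayList<>(tasks.subList(start, end)));
      start = end;
    }
    return bundles;
  }
}
```

With 7 tasks and sizes of 3, 2 and 5, this yields bundles of 3, 2 and 2 tasks: the last bundle only receives what is left.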
In JPPF terminology, the task subsets are called task bundles or just bundles, and the code that computes the bundle size is called a load-balancing algorithm, or just algorithm or bundler from here on. JPPF provides a number of built-in algorithms, along with an API that allows you to define your own algorithms.
2 Impact on grid resources usage
Beyond splitting jobs into task bundles to send to the nodes, the load-balancer has a significant impact on how the resources in the grid infrastructure will be used:
- shape of the network traffic: sending the tasks one by one or in larger bundles will directly influence the number and frequency of data packets sent over the network
- CPU utilization: if the number of tasks sent to a node is less than its number of processing threads, then the CPU may be under-utilized
- task wait time: conversely, when a node is sent more tasks than it has processing threads, some of the tasks may wait for a thread to become available when they could have been sent to another node instead
- heap/memory usage: the more tasks are sent at once, the more memory they will consume. Limiting the bundle size can help avoid out-of-memory conditions.
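The CPU utilization and wait time items describe the two sides of the same comparison between the bundle size and the node's number of processing threads. As a rough sketch, assuming each task occupies exactly one thread, the trade-off could be expressed as follows. `BundleFit` is a hypothetical class written for this page, not a JPPF API.

```java
// Hypothetical helper for illustration only; not part of the JPPF API.
// Assumes each task occupies exactly one processing thread while it runs.
final class BundleFit {
  private BundleFit() {}

  /** Threads left without work when the bundle is smaller than the thread pool. */
  static int idleThreads(int bundleSize, int threads) {
    return Math.max(0, threads - bundleSize);
  }

  /** Tasks that must wait for a free thread when the bundle is larger than the thread pool. */
  static int waitingTasks(int bundleSize, int threads) {
    return Math.max(0, bundleSize - threads);
  }
}
```

For a node with 8 processing threads, a bundle of 5 tasks leaves 3 threads idle, while a bundle of 12 tasks leaves 4 tasks waiting. Both values are zero only when the bundle size equals the thread count.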
3 Server-side vs client-side load balancing
Load-balancing is performed in both the JPPF clients and servers. The question that then arises is: what do we balance against? JPPF introduces the generic notion of an execution channel, or just channel, which has a different meaning depending on where it is applied.
For a server, an execution channel can be a connection to a node, or a connection to another server. For example, if server A has 3 nodes connected, server B has 2 nodes, and servers A and B are connected to each other, then server A will load-balance against 4 channels (3 nodes + server B) and server B will load-balance against 3 channels (2 nodes + server A).
For a client, an execution channel is either a connection to a server or a local executor, knowing that a client can have any combination of one or more connections to a single server, one or more connections to multiple servers or a single local executor. For example, if a client has 3 connections to server A, 2 connections to server B and its local executor enabled, then it will load-balance against 6 execution channels.
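The channel counts in the server and client examples above are simple sums. The sketch below reproduces that arithmetic; `ChannelCount` and its methods are hypothetical names used for illustration and do not correspond to any JPPF API.

```java
// Hypothetical helper for illustration only; not part of the JPPF API.
final class ChannelCount {
  private ChannelCount() {}

  /** Server side: one channel per connected node, plus one per connected peer server. */
  static int forServer(int nodes, int peerServers) {
    return nodes + peerServers;
  }

  /**
   * Client side: one channel per server connection, plus one if the local executor is enabled.
   * connectionsPerServer[i] is the number of connections the client has to the i-th server.
   */
  static int forClient(int[] connectionsPerServer, boolean localExecutorEnabled) {
    int total = localExecutorEnabled ? 1 : 0;
    for (int connections : connectionsPerServer) {
      total += connections;
    }
    return total;
  }
}
```

Here `ChannelCount.forServer(3, 1)` gives the 4 channels of server A, `ChannelCount.forServer(2, 1)` gives the 3 channels of server B, and `ChannelCount.forClient(new int[] {3, 2}, true)` gives the 6 channels of the client.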
Adding to this, let's remember that both clients and servers can handle multiple jobs concurrently, and that these jobs can vary widely in how they use grid resources. Load-balancing is therefore a fairly non-trivial task.
4 Qualitative characteristics of the algorithms
Load-balancing algorithms can compute a bundle size in many different ways, with various levels of complexity, and can base their computations on multiple sources of information. To characterize these algorithms and make it easier to understand how they work, JPPF uses three distinguishing characteristics:
Static vs. adaptive: a static algorithm always returns the same bundle size for a given (node, job) pair, whereas an adaptive algorithm will adjust the bundle size based on feedback from, and information on, the node and/or job.
Deterministic vs. heuristic: a deterministic algorithm does not use any random step in its computations. In other words, given the same input and information, it will always return the same value. A heuristic algorithm, on the other hand, will make informed random guesses while exploring the solution space.
Local vs. global: a local algorithm only computes the bundle size for a single channel, based on information that only applies to this channel (hence the locality), whereas a global algorithm will recompute or at least impact the bundle size computation on all the available channels.
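To make the static vs. adaptive distinction concrete, here is a minimal sketch with two toy algorithms. The `SketchBundler` interface and both classes are hypothetical and written for this page: they are not the JPPF load-balancing API, and the adaptive rule (aim for a target execution time per bundle) is an assumption chosen for simplicity, not one of the built-in algorithms. Both classes are deterministic and local, since they use no random step and only consider the feedback of a single channel.

```java
// Hypothetical interface for illustration only; not the JPPF API.
interface SketchBundler {
  /** Number of tasks to send to this channel in the next bundle. */
  int bundleSize();

  /** Reports that a bundle of nbTasks tasks took totalTimeMs milliseconds on this channel. */
  void feedback(int nbTasks, double totalTimeMs);
}

/** Static: always returns the same size and ignores feedback. */
final class FixedSizeBundler implements SketchBundler {
  private final int size;

  FixedSizeBundler(int size) {
    if (size <= 0) throw new IllegalArgumentException("size must be positive");
    this.size = size;
  }

  @Override
  public int bundleSize() {
    return size;
  }

  @Override
  public void feedback(int nbTasks, double totalTimeMs) {
    // A static algorithm does not adjust to feedback.
  }
}

/** Adaptive: resizes the bundle so that it should take about targetMs to execute. */
final class TargetTimeBundler implements SketchBundler {
  private final double targetMs;
  private int size = 1; // start small until feedback is available

  TargetTimeBundler(double targetMs) {
    if (targetMs <= 0) throw new IllegalArgumentException("targetMs must be positive");
    this.targetMs = targetMs;
  }

  @Override
  public int bundleSize() {
    return size;
  }

  @Override
  public void feedback(int nbTasks, double totalTimeMs) {
    if (nbTasks <= 0 || totalTimeMs <= 0) return; // ignore unusable measurements
    double msPerTask = totalTimeMs / nbTasks;
    size = (int) Math.max(1, Math.round(targetMs / msPerTask));
  }
}
```

With a target of 100 ms, reporting that 4 tasks took 200 ms (50 ms per task) moves the adaptive bundle size from 1 to 2, whereas a `FixedSizeBundler` keeps its size whatever feedback it receives. A heuristic variant would add a random exploration step to the size update, and a global variant would share state across all channels.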