GPU cluster

A GPU cluster is a computer cluster in which each node is equipped with a graphics processing unit (GPU). By harnessing the computational power of modern GPUs through general-purpose computing on graphics processing units (GPGPU), a GPU cluster can perform highly parallel calculations very quickly.

Hardware (GPU)

The hardware of a GPU cluster falls into one of two categories: heterogeneous or homogeneous.

Heterogeneous

Hardware from both of the major independent hardware vendors (IHVs), AMD and NVIDIA, can be used. A cluster is also considered heterogeneous if it mixes different models of the same GPU family (e.g. 8800GT with 8800GTX).

Homogeneous

Every GPU is of the same hardware class, make, and model (e.g. a homogeneous cluster comprising 100 8800GTs, all with the same amount of memory).

Classifying a GPU cluster in this way largely directs software development on the cluster, because different GPUs have different capabilities that the software can exploit.
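The classification rule above can be sketched in a few lines of Python. This is only an illustration: the Gpu record, its field names, the classify_cluster function, and the memory sizes are hypothetical and are not part of any vendor tool.

```python
from collections import namedtuple

# Hypothetical record describing one GPU in the cluster.
Gpu = namedtuple("Gpu", ["vendor", "model", "memory_mb"])


def classify_cluster(gpus):
    """Return 'homogeneous' if every GPU has the same make, model, and memory size."""
    if not gpus:
        raise ValueError("cluster has no GPUs")
    # A set collapses identical records, so one distinct entry means all GPUs match.
    return "homogeneous" if len(set(gpus)) == 1 else "heterogeneous"


# Illustrative inventories; the memory sizes are assumed values.
uniform = [Gpu("NVIDIA", "8800GT", 512)] * 100
mixed = [Gpu("NVIDIA", "8800GT", 512), Gpu("NVIDIA", "8800GTX", 768)]

print(classify_cluster(uniform))  # homogeneous
print(classify_cluster(mixed))    # heterogeneous
```

Comparing whole records means that two cards of the same model with different memory sizes also make the cluster heterogeneous, which matches the "same amount of memory" condition in the homogeneous definition.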

Hardware (Other)

Interconnect

In addition to the compute nodes and their respective GPUs, a sufficiently fast interconnect is needed to shuttle data among the nodes. The choice of interconnect depends largely on the number of nodes present. Examples of interconnects include Gigabit Ethernet and InfiniBand.
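To give a feel for why interconnect speed matters, the following back-of-the-envelope sketch estimates how long one node-to-node transfer takes under an idealized model (latency plus size divided by bandwidth). The function name, the 512 MiB payload, and the 32 Gbit/s figure standing in for a faster interconnect are assumptions made for illustration; real throughput depends on protocol overhead and hardware.

```python
def transfer_seconds(num_bytes, bandwidth_bits_per_s, latency_s=0.0):
    """Idealized time to move num_bytes over one link: latency + size / bandwidth."""
    if bandwidth_bits_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return latency_s + (num_bytes * 8) / bandwidth_bits_per_s


GIGABIT_ETHERNET = 1e9   # nominal line rate of Gigabit Ethernet, bits per second
FASTER_LINK = 32e9       # assumed rate for a faster interconnect, bits per second

payload = 512 * 1024 * 1024  # 512 MiB of data, an arbitrary example size

slow = transfer_seconds(payload, GIGABIT_ETHERNET)
fast = transfer_seconds(payload, FASTER_LINK)
print(f"Gigabit Ethernet: {slow:.2f} s")  # about 4.29 s
print(f"Assumed 32 Gbit/s link: {fast:.2f} s")  # about 0.13 s
```

If each node exchanges data of this size on every step of a computation, the slower link can easily dominate the total run time, which is one reason larger clusters tend to need faster interconnects.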

Vendors

NVIDIA provides a list of dedicated Tesla Preferred Partners (TPP) capable of building and delivering a fully configured GPU cluster using the Tesla 20-series GPGPUs. AMAX Information Technologies, Dell, Hewlett-Packard and Silicon Graphics are among the companies that provide a complete line of GPU clusters and systems.^[1]

Software

The software components that are required to make many GPU-equipped machines act as one include:

Operating system
GPU driver for each type of GPU present in each cluster node.
Clustering API (such as the Message Passing Interface, MPI).
VirtualCL (VCL) cluster platform [1], a wrapper for OpenCL™ that allows most unmodified applications to transparently utilize multiple OpenCL devices in a cluster as if all the devices were on the local computer.
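As a rough illustration of how a clustering API and the per-node GPU drivers fit together, one possible arrangement is to run one process per GPU and derive each process's node and local device from its rank. The sketch below is pure Python and does not call MPI itself; the function name place_rank, the gpus_per_node parameter, and the assumption that ranks fill one node before moving to the next are all hypothetical, and a real launcher may order ranks differently.

```python
def place_rank(rank, gpus_per_node):
    """Map a global process rank to (node index, local GPU index).

    Assumption: ranks are assigned consecutively, filling one node
    before moving to the next.
    """
    if rank < 0 or gpus_per_node <= 0:
        raise ValueError("rank must be >= 0 and gpus_per_node must be > 0")
    return divmod(rank, gpus_per_node)


# Eight processes on a hypothetical cluster with 4 GPUs per node.
for rank in range(8):
    node, device = place_rank(rank, 4)
    print(f"rank {rank} -> node {node}, GPU {device}")
```

In a heterogeneous cluster the number of GPUs can differ from node to node, so a fixed gpus_per_node would not be enough and a per-node table would be needed instead.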

Algorithm mapping

Mapping an algorithm to run on a GPU cluster is somewhat similar to mapping an algorithm to run on a traditional computer cluster. For example, rather than distributing pieces of an array from RAM, a texture is divided up among the nodes of the GPU cluster.
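One simple way to divide a texture among nodes is to cut it into horizontal bands of rows, one band per node. The following is a minimal sketch of that idea, assuming a 2D texture split by rows only; the function name split_rows and the example sizes are hypothetical, and other decompositions (tiles, columns) are equally possible.

```python
def split_rows(height, num_nodes):
    """Divide `height` rows into `num_nodes` contiguous bands.

    Returns a list of (start, stop) row ranges, stop exclusive.
    Band sizes differ by at most one row.
    """
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    base, extra = divmod(height, num_nodes)
    bands = []
    start = 0
    for node in range(num_nodes):
        # The first `extra` nodes take one additional row each.
        rows = base + (1 if node < extra else 0)
        bands.append((start, start + rows))
        start += rows
    return bands


print(split_rows(1024, 4))  # [(0, 256), (256, 512), (512, 768), (768, 1024)]
print(split_rows(10, 3))    # [(0, 4), (4, 7), (7, 10)]
```

Each node would then upload only its own band to its GPU. Algorithms that read neighbouring texels also need a few overlapping border rows from adjacent bands, which this sketch leaves out.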

References and external links

Are Magnus Bruaset, Aslak Tveito (2006). Numerical Solution of Partial Differential Equations on Parallel Computers. Birkhäuser. ISBN 3-540-29076-1.
NCSA's Accelerator Cluster
GPU Clusters for High-Performance Computing
GPU cluster at STFC Daresbury Laboratory
GPU Cores Temperature Monitoring

^ http://www.nvidia.com/object/tesla_wtb.html
