CLAIX / RWTH Compute Cluster

 

The IT Center (formerly the Computing and Communication Center) has been operating high-performance computers (HPC systems) for many years to support research and teaching at RWTH Aachen University. The current system is called CLAIX (Cluster Aix-la-Chapelle) and consists of three parts: the two Tier-2 shares from the 2016 and 2018 procurement phases and a Tier-3 share. The technical documentation for the RWTH Compute Cluster services provided by the IT Center can be found under "RWTH Compute Cluster".

Tier-2 System CLAIX-2016

The IT Center selected NEC as the supplier for the first expansion stage of the high-performance computer CLAIX (Cluster Aix-la-Chapelle) for RWTH Aachen University. NEC prevailed over its competitors in a procurement process that aimed, in particular, at minimizing the total cost of running a collection of representative computational jobs, called the RWTH Job Mix; this total took into account acquisition costs, execution time, and energy and cooling costs. The system consists of just over 600 nodes equipped with 2x Intel Broadwell processors. Specialized node types with up to 144 computing cores and 1 terabyte of main memory, or with integrated GPGPUs or NVRAM, supplement the system for special tasks. All nodes, as well as the parallel Lustre file system with a capacity of 3 petabytes, are interconnected through an Intel Omni-Path network at 100 Gbit/s. The overall system achieves a computing power of approx. 670 teraflop/s.

Photo: the high-performance computer CLAIX-2018. Copyright: IT Center

Tier-2 System CLAIX-2018

In July 2018, NEC was again selected as the supplier for the second expansion stage. CLAIX-2018 consists of 1032 computing nodes with 2x Intel Skylake processors, each with 24 cores and 192 GB RAM. In addition, there are 48 computing nodes of identical architecture, each equipped with two NVIDIA Volta V100 GPUs (incl. NVLink) as accelerators and available for special applications such as machine learning.
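The node counts above can be turned into aggregate figures with a little arithmetic. The following is a minimal sketch, not an official inventory: it reads "2x Intel Skylake processors, each with 24 cores" as 24 cores per processor, and the names `NodeGroup`, `CLAIX_2018` and `totals` are hypothetical helpers introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeGroup:
    """One homogeneous group of compute nodes (hypothetical helper type)."""
    name: str
    nodes: int
    sockets_per_node: int
    cores_per_socket: int
    gpus_per_node: int = 0

    @property
    def cores(self) -> int:
        # Total CPU cores in this group.
        return self.nodes * self.sockets_per_node * self.cores_per_socket

    @property
    def gpus(self) -> int:
        # Total GPU accelerators in this group.
        return self.nodes * self.gpus_per_node


# Figures taken from the text; 24 cores per processor is an assumption
# about how the sentence is meant to be read.
CLAIX_2018 = [
    NodeGroup("cpu", nodes=1032, sockets_per_node=2, cores_per_socket=24),
    NodeGroup("gpu", nodes=48, sockets_per_node=2, cores_per_socket=24,
              gpus_per_node=2),
]


def totals(groups):
    """Return (total CPU cores, total GPUs) over all node groups."""
    return sum(g.cores for g in groups), sum(g.gpus for g in groups)


if __name__ == "__main__":
    cores, gpus = totals(CLAIX_2018)
    print(f"CPU cores: {cores}, GPUs: {gpus}")
```

Under these assumptions the CPU partition contributes 1032 × 48 = 49,536 cores and the GPU partition another 2,304 cores and 96 GPUs. The dialogue systems are not included.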

For interactive work on the system, CLAIX also has eight additional dialogue systems, which are equipped with the same CPUs but with 384 GB more RAM. All CLAIX-2018 computing nodes are connected to an Intel Omni-Path 100G network. A high-performance Lustre-based storage system offers a file system capacity of 10 petabytes and a bandwidth of 150 gigabytes/s (read and write); this parallel file system is available as $HPCWORK.
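Because the parallel file system is exposed through the `$HPCWORK` environment variable, job scripts can build paths from it instead of hard-coding a mount point. The sketch below assumes only that the variable is set in the job's environment, as the text states; the function name `hpcwork_dir` and the example subdirectory are hypothetical.

```python
import os
from pathlib import Path


def hpcwork_dir(subdir: str = "", env=None) -> Path:
    """Return a path below $HPCWORK, failing loudly if the variable is unset.

    `env` defaults to the process environment; passing a dict makes the
    function easy to try out off the cluster.
    """
    env = os.environ if env is None else env
    root = env.get("HPCWORK")
    if not root:
        raise RuntimeError("HPCWORK is not set; are you on a cluster node?")
    return Path(root) / subdir


if __name__ == "__main__":
    # Hypothetical project layout, for illustration only.
    try:
        print(hpcwork_dir("my_project/results"))
    except RuntimeError as exc:
        print(exc)
```

Failing with an explicit error, rather than silently falling back to the current directory, avoids writing large outputs to a file system that is not designed for them.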

CLAIX-2018 entered test operation in November 2018 and has been available without restrictions for computational projects since January 2019. Because it is a Tier-2 cluster in the HPC supply pyramid of the Gauß-Allianz in Germany, researchers from all over Germany can apply for computing time on the system.

CLAIX-2018 demonstrates the technological development of the preceding two years. For the multitude of simulation applications, CLAIX-2018 achieves a significant performance improvement over the first stage from 2016: the average per-core performance on the benchmarks of the RWTH Job Mix increases by 30% using the same data sets. In the list of the 500 fastest high-performance computers worldwide, the system ranked 92nd in November 2018 with a theoretical computing power of 3.55 petaflops, making it the fastest university computer in Germany. In the Green500 ranking, CLAIX-2018 achieved 51st position.

 

Tier-3 HPC at RWTH Aachen University

Due to their funding structure, Tier-2 systems are not intended for computing time requirements in teaching and learning. In order to close this supply gap, an application was successfully submitted to the federal state of North Rhine-Westphalia (NRW); as a result, in January 2019, an additional 215 computing nodes with 2x Intel Skylake CPUs (24 cores each) as well as six computing nodes with two NVIDIA Volta V100 GPUs each (incl. NVLink) were procured for approx. 2 million euros and integrated into the high-performance computer. These systems, which are identical in construction to CLAIX-2018, form the Tier-3 system for RWTH Aachen University. A basic computing time quota is now available to all employees and students.

 

High-Performance Computing (HPC) – Operational Strategy

The high-performance computer is operated by the IT Center at RWTH Aachen University and is available to members of RWTH and scientists from all over Germany. Following the "1-cluster concept", the operational strategy makes all resources of the cluster available to users via a single interface, so that different expansion stages, innovative architectures and data can be used by means of the same processes.