Summit (supercomputer)


Summit, or OLCF-4, is a supercomputer developed by IBM for use at Oak Ridge National Laboratory. Capable of a peak 200 petaFLOPS, it was the fastest supercomputer in the world on the TOP500 list from June 2018 to June 2020. Its LINPACK benchmark result stands at 148.6 petaFLOPS. As of November 2019, the supercomputer is also the fifth most energy-efficient in the world, with a measured power efficiency of 14.668 gigaFLOPS/watt. Summit was the first supercomputer to reach exaflop speed, achieving 1.88 exaflops during a genomic analysis, and is expected to reach 3.3 exaflops using mixed-precision calculations.
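The figures quoted above can be related to one another with simple arithmetic. The following is a minimal sketch, assuming that the 200 petaFLOPS figure is peak capability and that the efficiency figure was measured against the same 148.6 petaFLOPS LINPACK result; the function names are illustrative and do not come from any benchmark tool.

```python
# Figures taken from the text above.
PEAK_PFLOPS = 200.0                  # quoted capability, treated here as peak
LINPACK_PFLOPS = 148.6               # quoted LINPACK result
EFFICIENCY_GFLOPS_PER_WATT = 14.668  # quoted power efficiency


def linpack_fraction_of_peak(linpack_pflops, peak_pflops):
    """Return the share of peak performance sustained under LINPACK."""
    return linpack_pflops / peak_pflops


def implied_power_megawatts(linpack_pflops, gflops_per_watt):
    """Return the power draw implied by performance divided by efficiency.

    Assumption: efficiency was measured on the same LINPACK run.
    """
    flops = linpack_pflops * 1e15          # petaFLOPS -> FLOPS
    flops_per_watt = gflops_per_watt * 1e9  # gigaFLOPS/W -> FLOPS/W
    return flops / flops_per_watt / 1e6     # watts -> megawatts


fraction = linpack_fraction_of_peak(LINPACK_PFLOPS, PEAK_PFLOPS)
power_mw = implied_power_megawatts(LINPACK_PFLOPS, EFFICIENCY_GFLOPS_PER_WATT)
print(f"LINPACK / peak: {fraction:.1%}")
print(f"Implied power:  {power_mw:.2f} MW")
```

Under these assumptions, the LINPACK result is roughly three quarters of the quoted peak, and the implied power draw is on the order of 10 megawatts.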

History

The United States Department of Energy awarded a $325 million contract in November 2014 to IBM, Nvidia and Mellanox. The effort resulted in the construction of Summit and Sierra. Summit is tasked with civilian scientific research and is located at the Oak Ridge National Laboratory in Tennessee. Sierra is designed for nuclear weapons simulations and is located at the Lawrence Livermore National Laboratory in California. Summit is estimated to cover the space of two basketball courts and to require 136 miles of cabling. Researchers use Summit in fields as diverse as cosmology, medicine and climatology.
In 2015, the project, called Collaboration of Oak Ridge, Argonne and Lawrence Livermore (CORAL), included a third supercomputer named Aurora, which was planned for installation at Argonne National Laboratory. By 2018, Aurora had been re-engineered as an exascale computing project with completion anticipated in 2021, and Frontier and El Capitan were to be completed shortly thereafter.

Uses

The Summit supercomputer gives scientists and researchers the opportunity to tackle complex problems in energy, artificial intelligence, human health and other research areas.

Design

Each of Summit's 4,608 nodes has over 600 GB of coherent memory, addressable by all CPUs and GPUs, plus 800 GB of non-volatile RAM that can be used as a burst buffer or as extended memory. The POWER9 CPUs and Volta GPUs are connected using Nvidia's high-speed NVLink, which allows for a heterogeneous computing model. To provide a high rate of data throughput, the nodes are connected in a non-blocking fat-tree topology using a dual-rail Mellanox EDR InfiniBand interconnect for both storage and inter-process communication traffic. The interconnect delivers 200 Gb/s of bandwidth between nodes as well as in-network computing acceleration for communication frameworks such as MPI and SHMEM/PGAS.
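The per-node figures above can be scaled to system-wide totals. The following is a rough sketch, assuming decimal units (1 PB = 1,000,000 GB), taking "over 600 GB" as a lower bound of exactly 600 GB, and assuming the nominal 100 Gb/s rate of a single EDR InfiniBand link, which the text does not state directly but which is consistent with 200 Gb/s over two rails. The function names are illustrative.

```python
# Per-node figures from the text above.
NODES = 4608
COHERENT_GB_PER_NODE = 600  # "over 600 GB", so this is a lower bound
NVRAM_GB_PER_NODE = 800

# Assumption: nominal rate of one EDR InfiniBand link, in Gb/s.
EDR_GBITS_PER_RAIL = 100
RAILS = 2  # dual-rail interconnect


def total_petabytes(nodes, gb_per_node):
    """Return aggregate capacity in decimal petabytes."""
    return nodes * gb_per_node / 1e6


def node_bandwidth_gbits(rails, gbits_per_rail):
    """Return the combined per-node link bandwidth in Gb/s."""
    return rails * gbits_per_rail


def gbits_to_gbytes(gbits):
    """Convert gigabits per second to gigabytes per second."""
    return gbits / 8


coherent_pb = total_petabytes(NODES, COHERENT_GB_PER_NODE)
nvram_pb = total_petabytes(NODES, NVRAM_GB_PER_NODE)
link_gbits = node_bandwidth_gbits(RAILS, EDR_GBITS_PER_RAIL)

print(f"Coherent memory: at least {coherent_pb:.2f} PB")
print(f"Non-volatile RAM: {nvram_pb:.2f} PB")
print(f"Per-node bandwidth: {link_gbits} Gb/s ({gbits_to_gbytes(link_gbits):.0f} GB/s)")
```

Under these assumptions, the system holds at least about 2.76 PB of coherent memory and about 3.69 PB of non-volatile RAM, and the 200 Gb/s node bandwidth corresponds to 25 GB/s.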