December 23, 2024

Brighton Journal

Over 63K Xeon GPU Max, 21K Xeon CPU Max

Intel and Argonne National Laboratory have announced the successful installation of the blades in the Aurora supercomputer, bringing it one step closer to full functionality.

The Intel-based Aurora supercomputer boasts 2 ExaFLOPS of computing power, which could push AMD to its limits

The Aurora supercomputer has faced many delays since its inception, but we may finally see it in action. For those unaware, Aurora is built on Intel’s Xeon CPU Max and Data Center GPU Max series, which are expected to bring its peak performance up to 2 ExaFLOPS. One application of the Aurora platform will be to support a modern artificial intelligence model for science.

The system consists of 10,624 nodes containing 21,248 Xeon CPUs from the Sapphire Rapids-SP lineup and a total of 63,744 GPUs based on the Ponte Vecchio design. Its interconnect delivers a peak injection bandwidth of 2.12 PB/s and a peak bisection bandwidth of 0.69 PB/s.
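The node, CPU, and GPU totals above imply a fixed per-node configuration, which a few lines of arithmetic can confirm. The sketch below uses only the figures quoted in this article; the helper name `devices_per_node` is hypothetical, and the per-node injection figure assumes decimal units (1 PB = 10^15 bytes) and an even split across nodes.

```python
# Totals as quoted in the article.
NODES = 10_624
XEON_CPUS = 21_248
MAX_GPUS = 63_744
PEAK_INJECTION_PB_PER_S = 2.12  # system-wide peak injection figure, PB/s


def devices_per_node(total_devices: int, nodes: int = NODES) -> int:
    """Return devices per node, raising if the total does not divide evenly."""
    per_node, remainder = divmod(total_devices, nodes)
    if remainder:
        raise ValueError(
            f"{total_devices} devices do not divide evenly across {nodes} nodes"
        )
    return per_node


cpus_per_node = devices_per_node(XEON_CPUS)
gpus_per_node = devices_per_node(MAX_GPUS)

# Assumption: decimal units (1 PB = 1e15 bytes, 1 GB = 1e9 bytes), evenly shared.
injection_gb_per_s_per_node = PEAK_INJECTION_PB_PER_S * 1e15 / NODES / 1e9

print(f"CPUs per node: {cpus_per_node}")
print(f"GPUs per node: {gpus_per_node}")
print(f"Injection bandwidth per node: {injection_gb_per_s_per_node:.1f} GB/s")
```

The totals work out to two CPUs and six GPUs per node, with just under 200 GB/s of injection bandwidth per node.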

Here’s where the Intel-powered Aurora supercomputer has an advantage, as previously outlined by Jeff McVeigh, Vice President of Intel’s Super Compute Group:

  • Intel’s Data Center GPU Max Series outperforms the Nvidia H100 PCIe card by an average of 30% on diverse workloads, while independent software vendor Ansys reports a speedup of up to 50% for the Max Series GPU over the H100 on AI-accelerated HPC applications.
  • The Xeon Max Series CPU, the only x86 processor with high-bandwidth memory, shows a 65% improvement over AMD’s Genoa processor on the HPCG benchmark while using less power. High memory bandwidth has been noted as among the most desirable features for HPC customers.
  • The 4th generation Intel Xeon Scalable processors, the most widely used in HPC, offer an average speedup of 50% over AMD’s Milan, and energy company BP’s latest Xeon HPC cluster delivers an 8x performance increase over its previous-generation processors while improving energy efficiency.
  • The Gaudi2 deep learning accelerator performs competitively in deep learning training and inference, with performance up to 2.4 times faster than the Nvidia A100.

In terms of memory capacity, the Aurora supercomputer features 10.9 PB of DDR5 system memory, 1.36 PB of HBM on the CPUs, and 8.16 PB of HBM on the GPUs. Moreover, it uses an array of 1,024 storage nodes that provides a total capacity of 220 PB. If you’re curious about what this giant system will be used for, here’s a quick explanation:

From tackling climate change to finding cures for deadly diseases, researchers face daunting challenges that require advanced computing technologies on a massive scale. Aurora is poised to meet the needs of the HPC and AI communities, providing the tools to push the boundaries of scientific exploration.
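For a sense of scale, the memory totals quoted earlier can be broken down per device. This is a rough sketch that uses only the article’s figures and assumes decimal units (1 PB = 10^15 bytes, 1 GB = 10^9 bytes) and an even split across nodes, CPUs, and GPUs; the variable names are illustrative, and the results are approximate because the published totals are rounded.

```python
# Totals as quoted in the article.
NODES = 10_624
XEON_CPUS = 21_248
MAX_GPUS = 63_744

DDR5_PB = 10.9      # system DDR5 memory
CPU_HBM_PB = 1.36   # HBM attached to the CPUs
GPU_HBM_PB = 8.16   # HBM attached to the GPUs

GB_PER_PB = 1_000_000  # assumption: decimal units


def gb_per_device(total_pb: float, devices: int) -> float:
    """Convert a system-wide total in PB to GB per device (even split assumed)."""
    return total_pb * GB_PER_PB / devices


ddr5_gb_per_node = gb_per_device(DDR5_PB, NODES)
hbm_gb_per_cpu = gb_per_device(CPU_HBM_PB, XEON_CPUS)
hbm_gb_per_gpu = gb_per_device(GPU_HBM_PB, MAX_GPUS)
total_memory_pb = DDR5_PB + CPU_HBM_PB + GPU_HBM_PB

print(f"DDR5 per node: ~{ddr5_gb_per_node:.0f} GB")
print(f"HBM per CPU:   ~{hbm_gb_per_cpu:.0f} GB")
print(f"HBM per GPU:   ~{hbm_gb_per_gpu:.0f} GB")
print(f"Total memory:  ~{total_memory_pb:.2f} PB")
```

Under these assumptions each node carries roughly 1 TB of DDR5, each CPU about 64 GB of HBM, and each GPU about 128 GB of HBM.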

Intel’s latest Data Center GPU Max Series 1550, which runs in Aurora, delivers the best SimpleFOMP performance, outperforming the NVIDIA A100 and AMD Instinct MI250X accelerators. However, the supercomputer has not yet completed its initial tests. Once it does, it is expected to appear on the Top500.org list, potentially overtaking the AMD-powered Frontier supercomputer. The Aurora supercomputer is on track to be fully operational this year.