Pawsey Supercomputing Centre is welcoming a new GPU cluster, Garrawarla, a critical resource for the Murchison Widefield Array (MWA) radio telescope project in Australia, a precursor to the Square Kilometre Array. Procured by CSIRO, Australia’s national science agency, from HPE in early 2020 at a cost of $2 million, Garrawarla is now fully operational and available to astronomy researchers. The upgrade is one part of the $70 million Capital Refresh project funded by the Australian government.
Garrawarla will enable astronomers to process in excess of 30 PB of MWA telescope data using Pawsey infrastructure. “Garrawarla” means spider in the language of the Wajarri people, whose land the Murchison Radio-astronomy Observatory is on. The 546 teraflops (peak) system offers GPU capabilities to power computational work, machine learning workflows and data analytics, all in support of accelerating astronomy research.
Garrawarla spans 78 HPE XL190 Gen10 server nodes, each equipped with two Intel Xeon 6230 Cascade Lake processors (2×20 cores running at 2.1 GHz) and a single Nvidia V100 GPU with 32 GB of high-bandwidth memory, plus 960 GB of local NVMe storage and 384 GB of main memory. Housed within two racks, nodes are connected by HDR100 InfiniBand.
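The quoted 546-teraflops peak lines up with the GPUs alone. A quick back-of-the-envelope check, assuming the figure counts double-precision GPU peak only and using Nvidia's 7.0 TFLOPS FP64 spec for the PCIe variant of the V100:

```python
# Back-of-the-envelope check of Garrawarla's quoted 546 TF peak.
# Assumption: the figure counts GPU double-precision peak only;
# 7.0 TFLOPS is Nvidia's FP64 spec for the PCIe V100.
nodes = 78
v100_fp64_tflops = 7.0

gpu_peak_tflops = nodes * v100_fp64_tflops
print(gpu_peak_tflops)  # 546.0 -- matches the quoted system peak
```

The match suggests the headline number excludes the CPUs' contribution, which is common practice for GPU-centric clusters.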
Software experts are facilitating the transfer of MWA workloads over to the new cluster, which will support researchers from 20 organizations across five countries, according to MWA Director Professor Melanie Johnston-Hollitt.
“The MWA collects a petabyte of data a month, and we’ve already archived over 30 petabytes of data at Pawsey,” said Johnston-Hollitt. “Having a dedicated system means that we can work with Pawsey to fine-tune the system and maximise the efficiency of our workflows. Garrawarla and Pawsey’s expertise will accelerate our investigation and processing of this data, helping us to reveal more about the origins and structure of the Universe.”
Early adopters are benefiting from faster processing capability, noting speedups of two to eight times versus a previous HPC resource.
CSIRO has also contracted with HPE to procure beefier ingest nodes for the Australian Square Kilometre Array Pathfinder (ASKAP) project at a cost of $1 million.
Sixteen ASKAP ingest nodes are being replaced to keep pace with the torrents of data coming off the telescope. In the coming months, HPE will provide 18 data ingest nodes (16 plus two spares), each powered by a single AMD Epyc 7402P “Rome” processor (24 cores, operating at 2.8 GHz). Three additional general-purpose nodes, using the same AMD SKU, will be available for running support services. A dedicated ClusterStor E1000 Lustre appliance will provide approximately half a petabyte of NVMe storage, capable of 160 Gbps read and 120 Gbps write performance.
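Taking the quoted capacity and throughput figures at face value, a rough illustrative calculation (not from the source) puts the storage headroom in perspective:

```python
# Rough fill-time estimate for the ClusterStor E1000 appliance,
# taking the article's figures at face value: ~0.5 PB of NVMe
# capacity and 120 Gbps of sustained write performance.
capacity_bytes = 0.5e15     # ~half a petabyte
write_bits_per_s = 120e9    # 120 Gbps

seconds_to_fill = capacity_bytes * 8 / write_bits_per_s
print(round(seconds_to_fill / 3600, 1))  # ~9.3 hours at full write rate
```

In other words, the appliance is a fast staging buffer rather than long-term storage: sustained ingest at the full write rate would fill it in well under a day, so data must flow onward to Pawsey's data store.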
“The ASKAP ingest nodes are one of the most critical components of the pipeline between ASKAP and the Pawsey data store, which houses the telescope’s final data products,” reports Pawsey. “The nodes are responsible for receiving data in real time from the correlators located at CSIRO’s Murchison Radio-astronomy Observatory (MRO) in the Mid West region of Western Australia, and writing the data to disk for processing on Pawsey’s Galaxy supercomputer.”
Garrawarla will take some of the load off Galaxy, Pawsey’s real-time supercomputing system dedicated to radio astronomy.
The larger $70 million HPC refresh project is being carried out as a series of upgrades to Pawsey’s supercomputing infrastructure. The funding follows an $80 million investment made in 2009 to establish a petascale supercomputing facility.
Current systems Magnus (a Cray XC40) and Galaxy (a Cray XC30) will be transitioned over to a vastly more powerful next-generation Pawsey supercomputer in two phases. Phase one is scheduled to become operational this year, and a larger phase-two expansion is on track to be deployed in mid-2022.
Investments are also being made in remote visualization, network interconnect, long-term storage and cloud high-throughput computing.
Pawsey leaders see Garrawarla and the ASKAP ingest nodes as major milestones that reflect the soaring data processing demands of advanced astronomical instruments.