On Thursday, November 17, the Association for Computing Machinery’s 2022 Gordon Bell Prize will be announced at the SC22 conference in Dallas, Texas, in recognition of the year’s most innovative work in computing. Among the finalists is a team of researchers from King Abdullah University of Science and Technology (KAUST) in Saudi Arabia, who have developed methods to increase computational performance by distinguishing the components within large simulations that require high precision from those that can be calculated less precisely. By focusing on large-scale climate simulations, the team was able to increase performance twelvefold over the state-of-the-art implementation.
The Gordon Bell Prize usually rewards simulations run on the largest computers in the world, and in this case the KAUST team turned to Fugaku – a supercomputer at Japan’s RIKEN Center for Computational Science that is currently the second-fastest system in the world – to complete the run cited in the list of finalists for the prize. To gain access to this system, however, the researchers first had to demonstrate the effectiveness and scalability of their method. To do this, they tested it on other HPC systems with a variety of architectures, including the High-Performance Computing Center Stuttgart (HLRS) supercomputer Hawk, one of the world’s first supercomputers to use AMD hardware offering unprecedented cache capacity and bandwidth.
“For new algorithmic advances to be adopted by the scientific community, they must be portable across a variety of high-performance computing architectures, and Hawk represents an important architecture for us,” said Dr. David Keyes, Director of the KAUST Extreme Computing Research Center (ECRC) and Principal Investigator on the KAUST team’s project. In addition to Hawk, the team also tested the method on the Shaheen-2 supercomputer at KAUST and on Summit at Oak Ridge National Laboratory in the United States.
While pursuing computer science gold, the team wanted to ensure that their efforts would benefit research that aligned with KAUST’s core mission, which led them to focus on how to improve the performance of global climate models.
Variety is the spice of science
Since 2012, the high performance computing (HPC) world has increasingly turned to the fast computing capabilities of graphics processing units (GPUs) in search of performance gains. In the Top500 list released in June 2022, most of the 20 fastest machines used GPUs or similar accelerators to achieve peak computing performance. Nevertheless, the increasing complexity of simulations still limits the amount of detail that researchers can include in ambitious large-scale simulations, such as global climate models.
With more powerful computers, climate scientists want to incorporate as many factors as possible into their simulations, from cloud cover and historical weather data to sea ice change and particles in the atmosphere. Rather than calculating these elements from first principles, an approach that even today would be computationally prohibitive, researchers include this information in the form of statistical models. The KAUST researchers focused specifically on a subset of statistical modeling, called spatial statistics, which relates statistical data to specific geographic regions in a simulation. Although this approach reduces the computational load of many physical equations, even the world’s largest systems continue to bend under the sheer volume of linear algebra – in this case, the solving of large, complex matrix-based problems – needed to fit these spatial statistical models.
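To give a concrete sense of where that linear algebra comes from, the short Python sketch below assembles a covariance matrix over a set of measurement sites and evaluates a Gaussian log-likelihood, whose dominant cost is an O(n³) Cholesky factorization of a dense n-by-n matrix. It is a minimal illustration of the general technique, not the team’s code; the covariance kernel, its parameters, and the data are illustrative assumptions.

```python
# Minimal sketch of why spatial statistics strains linear algebra: fitting a
# Gaussian model over n locations requires factorizing a dense n-by-n
# covariance matrix, an O(n^3) operation. All values here are illustrative.
import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(0)
n = 2000                                  # real climate problems use millions
locations = rng.uniform(size=(n, 2))      # hypothetical measurement sites
z = rng.standard_normal(n)                # hypothetical observations

# Exponential covariance kernel: entries decay with distance between sites.
dists = np.linalg.norm(locations[:, None, :] - locations[None, :, :], axis=-1)
length_scale = 0.1                        # assumed model parameter
C = np.exp(-dists / length_scale) + 1e-6 * np.eye(n)  # jitter for stability

# The dominant cost: Cholesky factorization of the dense covariance matrix.
L, lower = cho_factor(C, lower=True)
log_det = 2.0 * np.sum(np.log(np.diag(L)))
quad = z @ cho_solve((L, lower), z)
log_likelihood = -0.5 * (log_det + quad + n * np.log(2.0 * np.pi))
print(f"log-likelihood for n={n}: {log_likelihood:.2f}")
```

With millions of sites instead of a few thousand, this single factorization is what bends even the largest systems.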
Recently, GPUs have offered new options for handling these data-intensive applications, including artificial intelligence (AI) workflows. In addition to solving relatively simple calculations very quickly, innovative algorithms can take advantage of GPUs to run large, complex simulations more efficiently than traditional processors by reducing precision in parts of a simulation while concentrating expensive, high-precision calculations in the parts where they are most needed. In collaboration with computational scientists, the KAUST team has developed new ways to reduce the computational cost of these large simulations without sacrificing accuracy.
Practical precision in pursuit of progress
Traditional HPC applications perform “double precision” calculations, which occupy 64 bits of computer memory (FP64). Calculations that can still provide meaningful input to a simulation but are less exact than FP64 are called single precision and occupy only 32 bits of memory (FP32). Beyond these two poles, researchers are developing computational approaches that use different levels of precision, depending on the level of detail needed for particular calculations within a simulation. These so-called mixed-precision algorithms can provide the best of all worlds when developed and implemented correctly. In fact, they can even go beyond the two precisions typical of HPC, down to “half precision,” or FP16.
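As a quick illustration of the trade-off, the sketch below stores the same value at each of the three precisions and prints the memory used and the rounding error incurred; the value chosen is arbitrary.

```python
# The same value stored in FP64, FP32, and FP16: each halving of memory
# costs some accuracy. The test value 1/3 is arbitrary.
import numpy as np

x = np.float64(1.0) / np.float64(3.0)     # "double precision" reference

for dtype in (np.float64, np.float32, np.float16):
    v = dtype(x)
    bits = np.dtype(dtype).itemsize * 8
    err = abs(float(v) - float(x))
    print(f"FP{bits}: value={float(v):.12f}  bytes={bits // 8}  "
          f"abs. error={err:.2e}")
```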
“With the advent of AI applications in 2012, there was enormous pressure for GPUs to support lower-precision computations more efficiently,” said Dr. Hatem Ltaief, Principal Investigator at the ECRC of KAUST. “We realized we had to get on that train or let it pass. We started working with application scientists to show how their applications could tolerate some selective loss of precision and still operate as if everything had been done with 64-bit precision. Identifying these places in complex workflows is not easy, and you need to involve scientists in the field to ensure that algorithm developers like us don’t just shoot in the dark.”
Keyes, Ltaief, Professor Marc Genton, Dr. Sameh Abdulah, and their team at KAUST began working with researchers at the University of Tennessee’s Innovative Computing Laboratory not only to integrate spatial statistics into climate models effectively, but also to identify the tiles in the matrix where they could adaptively increase or decrease the precision of their calculations throughout a simulation, using half, single, or double precision – essentially, identifying where lower-precision calculations would still give accurate results and preventing researchers from “over-solving” their simulations.
The method relies on the PaRSEC dynamic runtime system to orchestrate the scheduling of computational tasks and the on-the-fly conversion between precisions. This allows researchers to reduce the computational load of parts of a simulation that are less significant or less relevant to the outcome in question. “The key to our recent developments is that we don’t have to anticipate these application-dependent mathematical behaviors in advance. We can detect them on the fly and adapt our savings to what we find in the individual tiles of the simulations,” Keyes said.
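A greatly simplified sketch of the idea follows: a matrix is split into tiles, and each tile is assigned FP64, FP32, or FP16 depending on how much it contributes. The decision rule, thresholds, and tile size here are hypothetical stand-ins; the team’s actual system makes such choices dynamically on GPUs under the PaRSEC runtime.

```python
# Illustrative tile-wise precision selection: assign a precision to each
# tile based on its norm relative to the whole matrix. The rule and the
# thresholds are hypothetical, not the team's actual criteria.
import numpy as np

def choose_tile_precisions(A, tile=256, tol=1e-4):
    """Assign FP64/FP32/FP16 to each tile based on its relative magnitude."""
    n = A.shape[0]
    scale = np.linalg.norm(A)             # whole-matrix reference magnitude
    plan = {}
    for i in range(0, n, tile):
        for j in range(0, n, tile):
            weight = np.linalg.norm(A[i:i + tile, j:j + tile]) / scale
            if weight > tol:              # significant tile: full precision
                plan[(i, j)] = np.float64
            elif weight > tol * 1e-2:     # intermediate tile: single precision
                plan[(i, j)] = np.float32
            else:                         # nearly negligible tile: half precision
                plan[(i, j)] = np.float16
    return plan

# Covariance-like matrices decay away from the diagonal, so off-diagonal
# tiles tend to land in the lower-precision buckets.
n = 1024
idx = np.arange(n)
A = np.exp(-np.abs(idx[:, None] - idx[None, :]) / 32.0)
plan = choose_tile_precisions(A)
counts = {np.float64: 0, np.float32: 0, np.float16: 0}
for dtype in plan.values():
    counts[dtype] += 1
print({np.dtype(k).name: v for k, v in counts.items()})
```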
Through their work, the team discovered that they could achieve a 12x performance improvement over state-of-the-art traditional dense matrix computations. “By looking at the problem as a whole, addressing the entire matrix, and looking for opportunities in the data-sparse structure of the matrix where we could approximate and use mixed precision, we provide the solution for the community and beyond,” Ltaief said.
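The “data-sparse structure” Ltaief mentions refers to the fact that off-diagonal blocks of smooth covariance matrices have rapidly decaying singular values and therefore compress well. The sketch below demonstrates the general principle with a generic SVD truncation; it is not the team’s compression kernel, and the covariance kernel and tolerance are assumptions.

```python
# Off-diagonal blocks of a smooth covariance matrix can be replaced by
# low-rank factors with little loss of accuracy, shrinking both storage
# and data movement. SVD truncation is a generic stand-in here.
import numpy as np

n = 1024
x = np.linspace(0.0, 1.0, n)              # hypothetical 1-D site coordinates
d = np.abs(x[:, None] - x[None, :])
A = np.exp(-(d / 0.1) ** 2)               # smooth, covariance-like matrix

block = A[: n // 2, n // 2:]              # an off-diagonal block
U, s, Vt = np.linalg.svd(block, full_matrices=False)

tol = 1e-8
rank = int(np.sum(s > tol * s[0]))        # numerical rank at this tolerance
approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]
rel_err = np.linalg.norm(block - approx) / np.linalg.norm(block)
dense, low_rank = block.size, rank * sum(block.shape)
print(f"numerical rank {rank} / {block.shape[0]}, "
      f"relative error {rel_err:.1e}, storage {low_rank / dense:.1%} of dense")
```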
Do linear algebra; see the world
The algorithmic innovations of the KAUST team will benefit both computational scientists and HPC centers. New large supercomputers are built every year, and their collective demand for electricity continues to grow. As the world faces the twin challenges of reducing carbon emissions and an energy supply crisis, HPC centers continue to seek new ways to reduce their energy footprint. While some of these efforts focus on improving hardware efficiency – for example, reusing waste heat generated by the machine or using warmer water to cool the system – there is also a growing focus on ensuring that simulations run as efficiently as possible and that data movement is kept to a minimum.
“Data movement is very costly in terms of power consumption, and by compressing and reducing the overall accuracy of a simulation, we are also reducing the amount of data movement occurring throughout the simulation,” said Ltaief. “We are moving from a compute-bound simulation on a large dataset to a memory-bound simulation running on a dataset with a smaller memory footprint, and therefore there is a lot of gain in terms of energy consumption.”
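A back-of-the-envelope sketch of that footprint argument: the bytes that must move for one pass over an n-by-n matrix, dense FP64 versus a hypothetical mixed-precision tile mix. The matrix size and the precision fractions are assumptions chosen purely for illustration.

```python
# Rough data-movement estimate for one pass over an n-by-n matrix.
n = 100_000                                # illustrative matrix dimension
dense_fp64 = n * n * 8                     # 8 bytes per FP64 entry

# Hypothetical tile mix (fractions are assumptions, not measured values):
# 10% of entries kept in FP64, 30% in FP32, 60% in FP16.
mixed = n * n * (0.10 * 8 + 0.30 * 4 + 0.60 * 2)

print(f"dense FP64: {dense_fp64 / 1e9:.1f} GB")
print(f"mixed precision: {mixed / 1e9:.1f} GB "
      f"({mixed / dense_fp64:.0%} of the dense traffic)")
```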
HLRS has long been focused on finding ways to reduce and reuse the energy consumed by its supercomputers. So when Ltaief contacted HLRS Director Professor Michael Resch about the KAUST project, the team was able to test their method on Hawk’s architecture while the machine was undergoing acceptance testing.
“As an international research center, we are open to collaboration with world-renowned scientists and are happy to contribute to progress in our field of HPC,” Resch said. “Working with the KAUST team has been a pleasure and an honor.”
Keyes pointed out that the team’s approach holds promise for other fields of science as well, and that this versatility is one of the most rewarding aspects of being a computational scientist. “Do linear algebra; see the world,” he said. The team has already extended the approach to signal decoding in wireless telecommunications, adaptive optics for ground-based telescopes, subterranean imaging, and genotype–phenotype associations, and plans to extend it to materials science in the future.