How To Win The Nobel Prize Of Supercomputing

Winning the ACM Gordon Bell Prize in High Performance Computing requires a team effort, says Professor Fu Haohuan, a member of the 12-man team that won the competition in 2016.

AsianScientist (Sep. 5, 2017) – One week before Hurricane Sandy made landfall, it was the European Centre for Medium-Range Weather Forecasts (ECMWF) that raised the alarm, accurately predicting the hurricane’s trajectory: an unusual ‘left hook’ which would devastate the US, ultimately destroying over 650,000 homes and racking up an estimated US$60 billion in damages. The ECMWF prediction also showed up the shortcomings of their US counterpart, which took four days to come to the same conclusion.

Now, a team from China has developed a new weather simulation that could be a vast improvement over both systems. In 2016, the team was awarded the ACM Gordon Bell Prize in High Performance Computing for their work, becoming the first team from China to win the highly coveted award.

On the sidelines of the Supercomputing Frontiers conference held in Singapore from 13–16 March 2017, Supercomputing Asia caught up with Professor Fu Haohuan, a member of the winning team, to find out what led up to their historic achievement.

What makes weather and climate prediction so computationally intensive?

Fu Haohuan: Predictions only matter if they are accurate, and making accurate predictions about weather and climate systems is very challenging as they are complicated systems that are affected by many different factors.

To improve the accuracy of each element, you’d want to be running the respective sub-model at a higher resolution. For weather, we may have been using a resolution of 100 km or 20 km in the past, but now people are pushing the resolution down to 3 km or even 500 m.

In weather forecasting, people normally do not run a single model but instead use ensembles: different models with different initial conditions or parameters. Weather agencies typically run about 50 or 100 ensemble members, but if you increase the size of the ensemble you will get more accurate predictions.

As you can see, if we combine all these things, the computing requirement can be really high, easily filling up the largest systems in the world.
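To put rough numbers on this, the sketch below estimates how the computing cost grows as the grid is refined and the ensemble is enlarged. It is a back-of-envelope illustration, not the team's method. The function name `relative_cost` and its parameters are hypothetical, and it assumes that the number of horizontal grid points grows with the square of the refinement factor and that the time step shrinks in proportion to the grid spacing. Vertical levels, I/O and model physics are ignored.

```python
def relative_cost(old_km, new_km, ensemble_members=1, time_step_scales=True):
    """Estimate cost relative to one run at the old resolution.

    Assumptions (illustrative only):
    - horizontal grid points scale with (old_km / new_km) squared
    - if time_step_scales is True, the number of time steps scales
      linearly with the refinement factor as well
    - every ensemble member costs the same as a single run
    """
    refinement = old_km / new_km
    cost = refinement ** 2          # more grid points in two horizontal directions
    if time_step_scales:
        cost *= refinement          # shorter time steps, so more of them
    return cost * ensemble_members


if __name__ == "__main__":
    for old_km, new_km in [(100, 20), (20, 3), (100, 3)]:
        single = relative_cost(old_km, new_km)
        ensemble = relative_cost(old_km, new_km, ensemble_members=50)
        print(f"{old_km} km -> {new_km} km: x{single:,.0f} per run, "
              f"x{ensemble:,.0f} for 50 members")
```

Under these assumptions, going from 100 km to 3 km multiplies the cost of a single run by roughly 37,000 before any ensemble members are added.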

Your Gordon Bell Prize-winning work was for a fully implicit solver for nonhydrostatic atmospheric dynamics. Could you explain the difference between implicit solvers and explicit solvers?

FH: When we do the math, it’s all about the equation Ax=b. For explicit solvers, you know A and you know x, so you just need to do the multiplication to get b. Because it is a straightforward calculation, it is easier to parallelize and get good efficiency.

Implicit solvers also use Ax=b, but you know A and b and now you want to compute the x. It is a more complicated calculation, but of course has its advantages, which make it worth doing. Implicit solvers are usually more stable and enable long-term simulations.
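The difference can be sketched on a toy problem. The example below is not the prize-winning atmospheric solver; it uses one-dimensional heat diffusion on a small grid, chosen only because it is simple, and all function names are hypothetical. The explicit step computes the new state by direct multiplication, while the implicit step solves a tridiagonal system for it (here with the Thomas algorithm). The parameter `r` is the time step scaled by the diffusivity and the square of the grid spacing; for this toy problem the explicit scheme is only stable for `r` up to 0.5.

```python
def explicit_step(u, r):
    """Forward Euler: the new state is A times the old state, a direct multiplication."""
    new = [0.0] * len(u)                # end points are held at zero
    for i in range(1, len(u) - 1):
        new[i] = u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new


def implicit_step(u, r):
    """Backward Euler: solve A x = b for the new state x, where b is the old state."""
    m = len(u) - 2                      # number of interior unknowns
    sub, diag, sup = -r, 1.0 + 2.0 * r, -r
    c = [0.0] * m
    d = [0.0] * m
    # Forward sweep of the Thomas algorithm for tridiagonal systems.
    c[0] = sup / diag
    d[0] = u[1] / diag
    for i in range(1, m):
        denom = diag - sub * c[i - 1]
        c[i] = sup / denom
        d[i] = (u[i + 1] - sub * d[i - 1]) / denom
    # Back substitution.
    x = [0.0] * m
    x[-1] = d[-1]
    for i in range(m - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return [0.0] + x + [0.0]


def run(step, r, n_points=21, n_steps=50):
    """Advance an initial spike and return the largest magnitude in the final state."""
    u = [0.0] * n_points
    u[n_points // 2] = 1.0
    for _ in range(n_steps):
        u = step(u, r)
    return max(abs(v) for v in u)


if __name__ == "__main__":
    r = 2.0                             # deliberately too large for the explicit scheme
    print("explicit, r=2.0:", run(explicit_step, r))
    print("implicit, r=2.0:", run(implicit_step, r))
```

With the large step, the explicit result grows without bound while the implicit result stays bounded, which is the stability advantage described above. The price is that each implicit step requires solving a system rather than performing a single multiplication.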

What makes it difficult to scale a fully implicit solver on a large system like Sunway TaihuLight?

FH: One of the main problems is that the equation system in our problem, like many other scientific problems, is a sparse matrix rather than a dense matrix. For this kind of computation, it is not easy to divide it into parallel tasks, unlike explicit solvers.

Usually, you are also largely bound by the memory bandwidth, because for sparse matrix operations, you read a lot of data but you don’t do a lot of computation. This limits the performance quite a bit.
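A rough way to see why sparse operations are bound by memory bandwidth is to count the floating-point operations against the bytes that have to be moved. The sketch below does this for a sparse matrix-vector product stored in compressed sparse row (CSR) form. It is illustrative only and is not the solver described in the interview. The function names, the 8-byte values, the 4-byte indices, the square matrix and the idea that each vector entry is moved exactly once are all assumptions made for the example.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Multiply a CSR matrix by a dense vector x."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        # Only the stored nonzeros of this row are visited.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


def arithmetic_intensity(nnz, n_rows, value_bytes=8, index_bytes=4):
    """Estimate flops per byte for one CSR matrix-vector product.

    Assumes a square matrix, one multiply and one add per nonzero,
    and perfect reuse of x (each entry read from memory only once).
    """
    flops = 2 * nnz
    bytes_moved = nnz * (value_bytes + index_bytes)    # matrix values and column indices
    bytes_moved += (n_rows + 1) * index_bytes          # row pointers
    bytes_moved += 2 * n_rows * value_bytes            # read x once, write y once
    return flops / bytes_moved


if __name__ == "__main__":
    # 3x3 tridiagonal matrix [[2, -1, 0], [-1, 2, -1], [0, -1, 2]] in CSR form.
    values = [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0]
    col_idx = [0, 1, 0, 1, 2, 1, 2]
    row_ptr = [0, 2, 5, 7]
    print(csr_matvec(values, col_idx, row_ptr, [1.0, 2.0, 3.0]))
    print(arithmetic_intensity(nnz=len(values), n_rows=3))
```

Even under these generous assumptions the estimate comes out well below one floating-point operation per byte, so the processor spends most of its time waiting for data rather than computing.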

In our case, we had to solve both issues and try to make it scalable and also utilize the many-core architecture efficiently.

What was the biggest challenge you faced in developing this implicit solver?

FH: This was a difficult problem in pretty much every way. Thankfully, we have a strong team from many different backgrounds. The lead author for this study is Professor Yang Chao from the Chinese Academy of Sciences’ Institute of Software. He worked on the numerical methods, making it scalable and suitable for a heterogeneous system.

Professor Xue Wei from Tsinghua University worked on the message passing interface (MPI). He and I worked on the accelerators to find the most efficient way to run the different kernels on the many-core architecture of Sunway TaihuLight. Then we also have Professor Wang Lanning from Beijing Normal University, who is our climate scientist.

It is a really tough challenge. It is not a singles game but a team sport, like football or basketball; you need to have excellent players on different fronts to make it work.

How significant is this win for China?

FH: Winning the Gordon Bell Prize is a very big breakthrough for Chinese researchers and a good recognition of the efforts of those working on HPC software and applications in China. Three of the six finalists were from China, so I was confident that one of us would win; we had a 50 percent probability!

Although China has nice hardware like Sunway TaihuLight, we do not have the ecosystem present in the US where commercial companies like Intel or NVIDIA create new products and help to keep everything going. I think the gap [between China and the US] on the software side is even larger. I hope that this win can help to promote the importance of software development in HPC, especially for applications.

This article was first published in the print version of Supercomputing Asia, July 2017.


Copyright: Asian Scientist Magazine.
Disclaimer: This article does not necessarily reflect the views of AsianScientist or its staff.

Rebecca did her PhD at the National University of Singapore where she studied how macrophages integrate multiple signals from the toll-like receptor system. She was formerly the editor-in-chief of Asian Scientist Magazine.
