
In addition to the awards presentation, the ceremony day also featured a technical forum that gave students an opportunity to learn from academic and industry leaders, including one of the competition’s judges, Professor Dhabaleswar K. (D.K.) Panda of Ohio State University. Panda is known throughout the HPC community for his work on MVAPICH, open source software that powers some of the fastest supercomputers in the world.
The Message Passing Interface (MPI), a standardized way for different computing nodes to pass messages to one another, forms the very foundation of parallel computing. MPI libraries act as the middleware for HPC applications, allowing them to move seamlessly from platform to platform. However, different implementations of MPI work with different types of networks, and when the InfiniBand network entered the market in 2000, there were no MPI libraries available to take advantage of its high bandwidth and low latency.
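For readers new to MPI, the sketch below illustrates the kind of message passing the standard defines: one process (rank 0) sends an integer to another (rank 1). It is a minimal, generic example rather than anything MVAPICH-specific; any MPI implementation can compile and run it.

```c
/* A minimal sketch of MPI point-to-point messaging (illustrative only):
 * rank 0 sends an integer to rank 1. Compile with an MPI compiler
 * wrapper such as mpicc and launch with, for example, mpirun -np 2. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int value = 42;
    if (rank == 0) {
        /* Send one int to rank 1 with message tag 0. */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* Receive the matching message from rank 0. */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Rank 1 received %d from rank 0\n", value);
    }

    MPI_Finalize();
    return 0;
}
```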
Sensing an opportunity, Panda and his team immediately got to work on an MPI library tailored for InfiniBand and other high performance networks. Since it was first demonstrated at the Supercomputing Conference in 2002, MVAPICH has been downloaded over half a million times by more than 3,000 organizations in 89 countries, underscoring its importance to the global HPC community. We caught up with Panda on the sidelines of the awards ceremony to find out more about MVAPICH, as well as his opinions on the convergence of HPC and AI.
What motivated you to develop the MVAPICH open source software?
DK Panda: Direct memory access (DMA) allowed us to move data between input/output devices and memory without involving the CPU. InfiniBand brought us the concept of remote direct memory access (RDMA), allowing data to be transmitted directly into the memory of another node with very few CPU cycles. However, when InfiniBand first came out, nobody knew how to redesign an MPI library using RDMA. We were the very first ones to come up with MPI libraries that could run over InfiniBand networks with very low overheads. This is what gave the network performance and scalability.
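MPI’s one-sided (RMA) operations give a flavor of what RDMA makes possible: one process writes directly into a memory window exposed by another, without the target posting a matching receive. The sketch below is a generic illustration of those semantics, not MVAPICH internals; mapping such operations efficiently onto InfiniBand RDMA is the engineering challenge Panda describes.

```c
/* Illustrative sketch of MPI one-sided communication (RMA), whose
 * semantics map naturally onto RDMA hardware: rank 0 puts a value
 * directly into a memory window exposed by rank 1. Run with 2 ranks. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int buf = -1;                 /* window memory on every rank */
    MPI_Win win;
    MPI_Win_create(&buf, sizeof(int), sizeof(int), MPI_INFO_NULL,
                   MPI_COMM_WORLD, &win);

    MPI_Win_fence(0, win);        /* open an access epoch */
    if (rank == 0) {
        int value = 7;
        /* Write directly into rank 1's window; rank 1 posts no receive. */
        MPI_Put(&value, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
    }
    MPI_Win_fence(0, win);        /* close the epoch; data is now visible */

    if (rank == 1)
        printf("Rank 1's window now holds %d\n", buf);

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
```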
How has MVAPICH evolved since it was first launched?
DKP: We have continuously been enhancing MVAPICH over the last 18 years. Seven or eight years back, we worked on having a very tight integration with GPU clusters, combining RDMA with GPUs to deliver very good performance. We are keeping our eyes open for new developments, like when InfiniBand goes from 200 gigabits per second (Gbps) to 400 Gbps next year. Other than higher bandwidth, we are also developing support for features such as adaptive routing, which would allow the system to recover if a node fails.
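The GPU integration Panda refers to is commonly exposed to applications as ‘CUDA-aware’ MPI: device pointers can be passed straight to MPI calls and the library moves the data itself, for example over GPUDirect RDMA. The sketch below assumes a CUDA-aware MPI build such as MVAPICH2-GDR; with a host-only MPI library, the GPU buffer would first have to be staged through host memory.

```c
/* Hedged sketch of CUDA-aware MPI: a device buffer is handed directly
 * to MPI_Send/MPI_Recv. Requires a CUDA-aware MPI build (for example
 * MVAPICH2-GDR); compile with an MPI wrapper linking the CUDA runtime
 * and run on two ranks with one GPU each. */
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int n = 1 << 20;        /* one million floats */
    float *d_buf;
    cudaMalloc((void **)&d_buf, n * sizeof(float));

    if (rank == 0) {
        cudaMemset(d_buf, 0, n * sizeof(float));
        /* The GPU device pointer goes straight into the MPI call. */
        MPI_Send(d_buf, n, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(d_buf, n, MPI_FLOAT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    cudaFree(d_buf);
    MPI_Finalize();
    return 0;
}
```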
Why are exascale systems a challenge for existing MPI libraries?
DKP: You can think of MPI like a car with many internal components. Let’s say you want a regular car to become a Formula One race car. You would have to redesign the wheels, the transmission and so on. In the same way, when you take an MPI library from petascale to exascale, each and every component needs to be redesigned to take it to the next level. What runs on a 16-node system is not going to run the same way on an 8,000-node system.
We are continuously re-evaluating our designs to improve not just performance and scalability but also a very important third dimension: the memory footprint. Fat nodes are coming; the latest AMD ‘Rome’ CPU, for example, has 128 cores per node. But if you do the math, as the number of cores increases, the memory per core is decreasing, even though total memory is also increasing. When designing the MPI library, we don’t want it to consume too much memory. If the MPI library takes up 70 percent of the memory, then the application gets only 30 percent and the users will not be happy. So that is one big challenge we are working on right now.
What can HPC and AI practitioners learn from each other?
DKP: I think that there is good momentum in both fields right now, with each one trying to leverage the other. At the same time, there is also convergence taking place. The training phase of AI, in particular, is essentially an HPC problem, no different from weather forecasting, molecular dynamics or any other ‘traditional’ HPC problem.
In deep learning, the biggest challenge is to come up with solutions that make training faster, so that whatever used to take years can be reduced to months, days or even hours. If you can go down to minutes or seconds, the whole field will change because people can try different kinds of solutions. As deep learning models become larger, you have to use HPC systems to make them run faster. So AI is an HPC problem, and it is becoming dependent on HPC.
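A common pattern for using HPC to speed up training is data-parallel synchronous SGD: each rank computes gradients on its own shard of the data, an allreduce averages them, and every rank applies the same update. The sketch below shows that collective pattern schematically, with a hypothetical compute_gradients() placeholder standing in for the real forward and backward passes.

```c
/* Schematic of data-parallel training with MPI: each rank computes
 * gradients on its own mini-batch (a stand-in computation here), then
 * MPI_Allreduce sums them and every rank applies the averaged update.
 * compute_gradients() is a hypothetical placeholder, not a real API. */
#include <mpi.h>
#include <stdio.h>

#define NPARAMS 4

/* Placeholder: in a real job this would be a forward/backward pass. */
static void compute_gradients(int rank, double *grad) {
    for (int i = 0; i < NPARAMS; i++)
        grad[i] = (double)(rank + 1) * 0.01;
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double weights[NPARAMS] = {0}, grad[NPARAMS], grad_sum[NPARAMS];
    const double lr = 0.1;

    for (int step = 0; step < 3; step++) {
        compute_gradients(rank, grad);

        /* Sum gradients across all ranks, then divide to average. */
        MPI_Allreduce(grad, grad_sum, NPARAMS, MPI_DOUBLE, MPI_SUM,
                      MPI_COMM_WORLD);
        for (int i = 0; i < NPARAMS; i++)
            weights[i] -= lr * grad_sum[i] / size;
    }

    if (rank == 0)
        printf("weights[0] after 3 steps: %f\n", weights[0]);

    MPI_Finalize();
    return 0;
}
```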
On the other hand, people are also trying to use deep learning or AI to make HPC simulations faster. For example, a molecular dynamics application might have normally taken seven days, but if you can analyze the patterns with deep learning, you might be able to identify the most promising direction and cut the execution time down to two days. I think that we will see a big revolution in this field over the next few years; all the traditional scientific applications will be accelerated through this principle.
Please share with us more about your recent work on libraries for big data and deep learning.
DKP: About eight years ago, my research group started to look at commonly used big data libraries like Hadoop and Spark. As each code base is different, you cannot just copy and paste our high performance MPI libraries into those architectures. So we took high performance ideas to Hadoop, Spark and so on, with our project called HiBD, or High-Performance Big Data. Similarly, we are also trying to bring high performance to deep learning with HiDL. The objective of these projects is to exploit modern HPC solutions to scale out and accelerate big data and deep learning applications.
This article was first published in the print version of Supercomputing Asia, January 2020.
———
Copyright: Asian Scientist Magazine; Illustration: Shelly Liew/Supercomputing Asia.
Disclaimer: This article does not necessarily reflect the views of AsianScientist or its staff.