
Built to make a difference
It took ten years, about US$1 billion and the work of thousands of members of Japan’s HPC community to realize Fugaku. RIKEN began working with Fujitsu in 2011 to design a processor that would run applications 100 times faster than the K computer.
The result is the A64FX processor, based on the Armv8.2-A instruction set architecture with the Scalable Vector Extension (SVE) for supercomputers, plus Fujitsu extensions for hardware barrier, sector cache and prefetch. The A64FX has 48 calculation cores and either two or four assistant cores, with a theoretical peak performance of 3.3792 teraflops for double-precision floating-point calculations.
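The quoted peak can be reproduced with back-of-the-envelope arithmetic. The sketch below assumes details that this article does not state: two 512-bit fused multiply-add (FMA) pipelines per calculation core, running at the 2.2 GHz boost clock cited for the full system. The function and parameter names are illustrative only.

```python
def peak_flops(cores, clock_hz, fma_pipes, vector_bits, element_bits):
    """Theoretical peak FLOP/s for a CPU whose cores issue vector FMAs every cycle."""
    lanes = vector_bits // element_bits      # elements per vector register
    flops_per_cycle = fma_pipes * lanes * 2  # one FMA counts as 2 FLOPs (multiply + add)
    return cores * clock_hz * flops_per_cycle

# Assumed A64FX parameters (pipeline count and vector width are not from the article).
a64fx_dp = peak_flops(cores=48, clock_hz=2_200_000_000, fma_pipes=2,
                      vector_bits=512, element_bits=64)
print(a64fx_dp / 1e12, "teraflops")
```

Under these assumptions each core completes 32 double-precision operations per cycle, and 48 × 2.2 × 10^9 × 32 gives the 3.3792 teraflops figure. Only the calculation cores are counted; the assistant cores are left out.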
“It’s like a chimera,” said Matsuoka. “It has GPU-class speed but you can program it like a normal CPU, so it could run the same stuff as the CPU in your smartphone. Secondly, the power efficiency is also similar to that of a GPU for high-end applications. We were able to do that by coming up with new technologies that have not been tested on CPUs before.”
Another notable aspect of the processor is that it attempts to buck the trend in which compute power has increased at the expense of the ability to move data around. Fugaku has solid performance across the board because the A64FX emphasizes data movement, Matsuoka said, adding that he expects other HPC researchers and developers to follow the same approach.
“Most applications are limited by data movement rather than compute,” he said. “People have built machines while somewhat ignoring their utility and they get these artificial results which may not be telling the whole story with regards to the true performance of the chip or the machine across many applications. Fugaku’s objective is to break away from that.”
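One common way to picture this argument is the roofline model, in which a kernel's attainable speed is the smaller of the chip's compute peak and the rate at which memory can feed it. The sketch below is an illustration rather than anything Matsuoka describes: the 1,024 GB/s memory bandwidth is an assumed figure not given in this article, and the two kernels are hypothetical.

```python
def attainable_gflops(intensity_flop_per_byte, peak_gflops, bandwidth_gb_per_s):
    """Roofline bound: performance is capped by compute or by memory traffic."""
    return min(peak_gflops, intensity_flop_per_byte * bandwidth_gb_per_s)

PEAK_GFLOPS = 3379.2       # A64FX double-precision peak quoted in the article
BANDWIDTH_GB_S = 1024.0    # assumed memory bandwidth, not stated in the article

# Arithmetic intensity at which a kernel stops being memory-bound.
ridge_point = PEAK_GFLOPS / BANDWIDTH_GB_S

# Hypothetical kernels: name -> FLOPs performed per byte moved.
kernels = {"vector update (2 FLOPs per 24 bytes)": 2 / 24,
           "dense compute (10 FLOPs per byte)": 10.0}

for name, intensity in kernels.items():
    bound = attainable_gflops(intensity, PEAK_GFLOPS, BANDWIDTH_GB_S)
    limiter = "memory" if intensity < ridge_point else "compute"
    print(f"{name}: at most {bound:.1f} GFLOPS ({limiter}-bound)")
```

A kernel that does little arithmetic per byte stays far below the headline peak however fast the cores are, which is why a design that raises bandwidth can help a broad range of applications.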
The processors are arranged in 158,976 nodes across 432 racks, each node consisting of one CPU and 32 GiB of memory. In 2.2 GHz boost mode, the system provides a peak performance of 537 petaFLOPS for double precision (64-bit) and 1.07 exaFLOPS for single precision (32-bit).
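These system-level figures follow from the per-processor peak quoted earlier. A minimal sketch of the arithmetic, assuming that single-precision throughput is exactly twice the double-precision rate (the variable names are illustrative):

```python
NODES = 158_976
RACKS = 432
NODE_PEAK_TFLOPS_DP = 3.3792   # per-CPU double-precision peak quoted earlier

system_dp_pflops = NODES * NODE_PEAK_TFLOPS_DP / 1_000
# Assumption: halving the element width doubles the lanes per vector,
# so single-precision peak is twice the double-precision peak.
system_sp_eflops = 2 * system_dp_pflops / 1_000
avg_nodes_per_rack = NODES / RACKS

print(f"double precision: {system_dp_pflops:.0f} petaFLOPS")
print(f"single precision: {system_sp_eflops:.2f} exaFLOPS")
print(f"average nodes per rack: {avg_nodes_per_rack:.0f}")
```

The nodes-per-rack value is only an average; the article does not say how nodes are distributed across individual racks.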
“It’s large scale, uses no accelerators, is very good for a number of applications having very fast, very good interconnect between the processors, has very high bandwidth, has quite a bit of fast memory on this machine,” said Dongarra in a YouTube chat with Primeur Magazine. “We’re looking forward to see how it performs on real scientific problems.”
For Matsuoka, what distinguishes Fugaku from the K and other supercomputers is that it is designed to be general purpose. He compares developing a very high-performance ‘normal’ supercomputer, as he describes it, to the extremely difficult challenge that Toyota Motor faced in the 1990s when it developed the Prius, the world’s first mass-produced hybrid vehicle.
While fuel-efficient hybrids are normal now, developing the Prius involved overcoming technological, production and economic challenges. In so doing, however, Toyota helped establish a new normal.
“In order to satisfy the goal of being able to cover a broad range of applications, Fugaku had to be absolutely standard. It’s one thing to have this hardware, but people have to be able to program the software. It’s all about the so-called software ecosystem, but at the same time as being normal, it has to be a supercomputer,” said Matsuoka.
“It’s like a Prius but it’s as fast as a supercar, so you can drive it to the supermarket but it can compete against any other supercar in the world. It is ease of use and performance at the same time. I think that’s what makes Fugaku quite unique.”