
Hot iron: Knights Landing hits 100 gigaflops in plasma physics benchmark

Russian physicists give Chipzilla's HPC star the elephant stamp

By Richard Chirgwin, 9 Aug 2016

Russian researchers working with an Intel supercomputer have put its Knights Landing hardware through its paces, and are pleased with what they've found.

The boffins reckon the many-cored hot rod more than doubled their plasma simulation performance with nothing more than a simple recompile. Here's what they had to say at arXiv:

“A straightforward rebuilding of the code yields a 2.43 x speedup compared to the previous Knights Corner generation. Further code optimization results in an additional 1.89 x speedup.”
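To put the two quoted figures together, here is a minimal sketch of the arithmetic. It assumes the "additional" 1.89x compounds multiplicatively on top of the 2.43x from the recompile; that reading is ours, and the variable names are purely illustrative.

```python
# Speedup factors quoted from the paper's abstract.
recompile_speedup = 2.43  # Knights Landing vs Knights Corner, rebuild only
tuning_speedup = 1.89     # further gain from code optimization

# Assumption: the two gains multiply, since the second is described
# as "additional" to the first.
combined_speedup = recompile_speedup * tuning_speedup

print(f"Combined speedup over Knights Corner: {combined_speedup:.2f}x")
```

On that reading, the optimised code would run roughly four and a half times faster on Knights Landing than the unoptimised code did on Knights Corner.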

The researchers explain that they chose a particular kind of plasma simulation, “particle in cell”, because it's a well-studied problem on many-core architectures.

The tests ran on a node of the Endeavor supercomputer, which let the boffins run the code on Intel 14-core Haswell processors (Xeon E5-2697 v3 at 2.6 GHz with 36 MB cache), Knights Corner (Phi 7120 with 61 cores, 1.2 GHz, and 30.5 MB cache), and Knights Landing running in quadrant cluster mode (Phi 7250 with 68 cores, 1.4 GHz, and 16 GB MCDRAM).

The researchers used their own plasma simulation code, called PICADOR, with help from Intel's Zakhar Matveev. For HPC-and-plasma-physics nerds, the config used for the benchmark was a frozen plasma problem on a 40 x 40 x 40 grid with 50 particles per cell, run over 1,000 time steps. This, they write, provides a good baseline because it can be solved on a single CPU.
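For a sense of scale, the benchmark configuration can be turned into a particle count with a few lines of arithmetic. This is a back-of-the-envelope sketch, not anything from PICADOR itself: the variable names are ours, and it assumes every cell holds exactly 50 particles for the whole run.

```python
# Benchmark configuration as described in the article.
grid = (40, 40, 40)        # cells along x, y, z
particles_per_cell = 50
time_steps = 1000

# Total cells in the grid.
cells = grid[0] * grid[1] * grid[2]

# Assumption: particle count stays at exactly 50 per cell throughout.
particles = cells * particles_per_cell

# One update per particle per time step.
particle_updates = particles * time_steps

print(f"{cells:,} cells, {particles:,} particles, "
      f"{particle_updates:,} particle updates")
```

That works out to 64,000 cells and 3.2 million particles, or 3.2 billion particle updates across the run.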

Apart from the headline results, the benchmark produced some interesting side results. For example, performance peaked when the code ran as eight processes with 34 OpenMP threads each: “further increasing the number of processes results in performance degradation”.
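Eight processes of 34 threads adds up to 272 threads, which happens to match a 68-core chip running four hardware threads per core. The sketch below lists the other ways that thread budget could be split. Two things here are assumptions rather than claims from the article: the four-threads-per-core figure, and the idea that the run filled every hardware thread. The function name `decompositions` is hypothetical.

```python
# From the article: the Phi 7250 has 68 cores.
cores = 68
# Assumption (not stated in the article): four hardware threads per core.
hw_threads_per_core = 4
total_threads = cores * hw_threads_per_core


def decompositions(total):
    """Return every (processes, threads_per_process) pair that uses
    exactly `total` threads."""
    return [(p, total // p) for p in range(1, total + 1) if total % p == 0]


for processes, threads in decompositions(total_threads):
    # Mark the split the article reports as the best performer.
    marker = "  <- best reported in the article" if (processes, threads) == (8, 34) else ""
    print(f"{processes:3d} processes x {threads:3d} threads{marker}")
```

The reported sweet spot sits in the middle of the list, between a few processes with many threads each and many processes with few threads each.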

Overall, the paper says, Intel's Knights Landing was able to run suitably optimised PICADOR code at a not-too-shabby 100 gigaflops.

The full paper is at arXiv. ®

The Register - Independent news and views for the tech community. Part of Situation Publishing