Neighboring Particle Search Method for Molecular Dynamics on Long Vector Computers

Atsushi Ito (NIFS)

Abstract

A key issue in tuning molecular dynamics (MD) simulations is the rapid identification of interacting particle pairs, which are stored in a so-called neighbor list or pair list. Especially in MD with short-range interaction potentials, as in the case of solids, constructing the neighbor list accounts for a significant fraction of the computation time. A well-known approach is the cell linked-list method [1]. In this method, the simulation box is divided into cells whose width equals the cutoff length, and the particles in each cell are stored in a linked-list data structure. Each particle computes its distance to the particles in its own cell and in the neighboring cells, and a pair counts as a hit if that distance is shorter than the cutoff length. In a three-dimensional system, 3 x 3 x 3 = 27 cells must be searched: the center cell and its 26 adjacent cells. However, the linked-list data structure is poorly suited to computers with modern vector arithmetic, because the data are not stored contiguously in memory, and contiguous data are essential for vector arithmetic. To exploit vector arithmetic, the particles are therefore sorted by the ID of the cell to which they belong. The particles of 3 cells that are adjacent along one dimension can then be processed at once using vector operations. For a three-dimensional system, this simultaneous access to 3 cells is performed for 9 lanes. This method is very useful on SIMD architectures, especially AVX-512 with its ‘vcompress’ instruction. The vcompress instruction is particularly important because it packs and stores in memory only those particles judged to be hits, that is, those whose interparticle distance is shorter than the cutoff length. In MD simulations with short-range interactions, the number of particles in a cell is in practice between 0 and 10.
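The cell-sorted search described above can be sketched in plain Python as follows. This is only an illustrative sketch, not the author's implementation: the function names (`dist2`, `build_sorted_cells`, `neighbor_pairs`) are hypothetical, the box is assumed to be cubic with open (non-periodic) boundaries and coordinates in [0, box], and the list comprehension that filters hits merely stands in for a masked compare followed by a compress-store.

```python
from itertools import product


def dist2(a, b):
    """Squared distance between two 3D points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_sorted_cells(pos, box, rc):
    """Sort particle indices by cell ID; return (order, start, n)."""
    n = max(1, int(box // rc))  # cells per dimension, so cell width >= rc
    w = box / n

    def cell_id(p):
        ix, iy, iz = (min(int(c / w), n - 1) for c in p)
        return (ix * n + iy) * n + iz  # z varies fastest

    ids = [cell_id(p) for p in pos]
    order = sorted(range(len(pos)), key=ids.__getitem__)
    # Prefix sums: order[start[c]:start[c + 1]] holds the particles of cell c.
    start = [0] * (n ** 3 + 1)
    for c in ids:
        start[c + 1] += 1
    for c in range(n ** 3):
        start[c + 1] += start[c]
    return order, start, n


def neighbor_pairs(pos, box, rc):
    """Return the set of pairs (i, j), i < j, closer than the cutoff rc."""
    order, start, n = build_sorted_cells(pos, box, rc)
    rc2 = rc * rc
    pairs = set()
    for ix, iy, iz in product(range(n), repeat=3):
        c = (ix * n + iy) * n + iz
        home = order[start[c]:start[c + 1]]
        if not home:
            continue
        zlo, zhi = max(iz - 1, 0), min(iz + 1, n - 1)
        # Up to 9 lanes: one per (dx, dy) offset. Each lane covers up to
        # 3 cells adjacent along z, which are contiguous after sorting.
        for dx, dy in product((-1, 0, 1), repeat=2):
            jx, jy = ix + dx, iy + dy
            if not (0 <= jx < n and 0 <= jy < n):
                continue  # open boundaries: no cell outside the box
            base = (jx * n + jy) * n
            lane = order[start[base + zlo]:start[base + zhi + 1]]
            for i in home:
                # Stand-in for mask + compress: keep only the hits.
                hits = [j for j in lane
                        if j > i and dist2(pos[i], pos[j]) < rc2]
                pairs.update((i, j) for j in hits)
    return pairs
```

Calling `neighbor_pairs(pos, box=10.0, rc=2.5)` on a list of `(x, y, z)` tuples returns each interacting pair once. Because the cell ID varies fastest along z, each lane is a single contiguous slice of the sorted index array, which is the property that makes vector loads possible; with at most about 10 particles per cell, such a slice holds at most about 30 candidates.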
The 3 cells accessed in 1 lane therefore hold at most 30 particles, far fewer than the vector length of 256 of the long vector instructions of SX-AURORA TSUBASA. We therefore developed a new method for searching neighboring particles on the long vector architecture. In our method, cell division is performed four times; in return, the particles of the central and neighboring cells are accessed only once, as a single lane. In this way, long vector lengths can be utilized. The benchmark was performed on the Plasma Simulator SX-AURORA TSUBASA at NIFS, and the calculation speed was comparable to that of Intel Skylake systems with AVX-512 SIMD instructions.

[1] M. P. Allen and D. J. Tildesley, Computer Simulation of Liquids (Oxford University Press, New York, 1990).
