Performance

Here we present results on FEN ZI scalability and performance:

  • FEN ZI Performance for Different DMPC Membranes
  • FEN ZI Performance for Dihydrofolate Reductase (DHFR)
  • FEN ZI vs. CHARMM or GPU Performance vs. CPU Performance

FEN ZI Performance for Different Dimyristoyl Phosphatidylcholine (DMPC) Membranes

To study FEN ZI scalability, we compare the performance, in ns/day, of three DMPC lipid bilayer membrane systems of different sizes. Each system is four times larger than the previous one.

  • DMPC 1x1: 46.8 Å × 46.8 Å × 76.0 Å, 17,004 atoms (14,096 bonds, 19,108 angles, and 22,536 dihedrals), 2,836 explicit water molecules
  • DMPC 2x2: 93.6 Å × 93.6 Å × 76.0 Å, 68,484 atoms (56,696 bonds, 76,588 angles, and 90,144 dihedrals), 11,500 explicit water molecules
  • DMPC 4x4: 187.2 Å × 187.2 Å × 76.0 Å, 273,936 atoms (226,784 bonds, 306,352 angles, and 360,576 dihedrals), 46,863 explicit water molecules
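
The factor of four between consecutive systems can be checked directly from the numbers listed above. The following is a small Python sketch; the dictionary layout and the function name growth_factors are illustrative and are not part of FEN ZI.

```python
# System sizes copied from the list above: (x in Å, y in Å, atom count).
systems = {
    "DMPC 1x1": (46.8, 46.8, 17004),
    "DMPC 2x2": (93.6, 93.6, 68484),
    "DMPC 4x4": (187.2, 187.2, 273936),
}

def growth_factors(smaller, larger):
    """Return (membrane-area ratio, atom-count ratio) between two systems."""
    xs, ys, ns = systems[smaller]
    xl, yl, nl = systems[larger]
    return (xl * yl) / (xs * ys), nl / ns

for a, b in [("DMPC 1x1", "DMPC 2x2"), ("DMPC 2x2", "DMPC 4x4")]:
    area, atoms = growth_factors(a, b)
    print(f"{a} -> {b}: area x{area:.2f}, atoms x{atoms:.2f}")
```

The membrane area grows by exactly four at each step, while the atom count grows by roughly four (about 4.03 from 1x1 to 2x2, and 4.00 from 2x2 to 4x4).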


Figure: Simulations of three DMPC lipid bilayer membranes of different sizes, each four times larger than the previous one.

FEN ZI simulations were performed on a single C2050 GPU (Fermi) with 448 cores and 3 GB memory.

FEN ZI Performance for Dihydrofolate Reductase (DHFR)

To compare FEN ZI with other GPU codes, we measure the performance, in ns/day, of a DHFR system with 23,558 atoms, 16,569 bonds, 11,584 angles, and 6,701 dihedrals on different GPUs. MD simulations were performed using a cutoff of 8 Å with a buffer cutoff of 9.5 Å for the list updates. Time steps of 1 fs and 2 fs were considered.
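
For readers unfamiliar with the ns/day metric, the sketch below shows how it follows from the time step and the number of MD steps completed per wall-clock second. The function name and the throughput value are hypothetical and chosen only for illustration; they are not measured FEN ZI numbers.

```python
SECONDS_PER_DAY = 86_400
FS_PER_NS = 1_000_000

def ns_per_day(timestep_fs, steps_per_second):
    """Simulated nanoseconds per wall-clock day."""
    return timestep_fs * steps_per_second * SECONDS_PER_DAY / FS_PER_NS

# Hypothetical throughput, NOT a measured FEN ZI value.
rate = 100.0  # MD steps per wall-clock second

for dt in (1.0, 2.0):  # the two time steps considered in the text, in fs
    print(f"dt = {dt} fs: {ns_per_day(dt, rate):.2f} ns/day")
```

At a fixed number of steps per second, doubling the time step from 1 fs to 2 fs doubles the ns/day figure, which is why results for the two time steps should be read separately.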


Figure: Comparison of the performance of DHFR MD simulations with FEN ZI on GTX 480 and C2050 (Fermi chip).

FEN ZI simulations were run on a single GTX 480 GPU (Fermi) with 480 cores and 1.5 GB memory and a single C2050 GPU (Fermi) with 448 cores and 3 GB memory.

FEN ZI vs. CHARMM or GPU Performance vs. CPU Performance

To compare FEN ZI with traditional CPU MD codes, we measure FEN ZI performance in ns/day for the smallest DMPC membrane (DMPC 1x1) and compare it with the performance of the same MD simulation using CHARMM on a multi-core node of a cluster. MD simulations were performed using a cutoff of 8 Å with a buffer cutoff of 9.5 Å for the list updates. Each time step is 1 fs long.
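
An interaction cutoff combined with a larger list cutoff is the usual setup for a buffered (Verlet) neighbor list. The sketch below illustrates the general technique with the two distances quoted above. It is not FEN ZI's or CHARMM's implementation: the rebuild criterion (rebuild once any atom has moved more than half the buffer width) is a common choice assumed here, the function names are hypothetical, and periodic boundaries are ignored for brevity.

```python
import math

CUTOFF = 8.0        # interaction cutoff in Å (from the text)
LIST_CUTOFF = 9.5   # buffer cutoff for list updates in Å (from the text)
SKIN = LIST_CUTOFF - CUTOFF  # buffer width, 1.5 Å

def build_pair_list(positions):
    """All pairs (i, j), i < j, closer than LIST_CUTOFF.

    Brute-force O(N^2) search without periodic images (assumption for brevity).
    """
    pairs = []
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if math.dist(positions[i], positions[j]) < LIST_CUTOFF:
                pairs.append((i, j))
    return pairs

def needs_rebuild(reference, current):
    """True once any atom has moved more than SKIN / 2 since the list was built.

    Assumed criterion: two atoms that each move SKIN / 2 toward one another
    can close a gap of SKIN, so beyond that the list may miss a pair.
    """
    return any(math.dist(r, c) > SKIN / 2 for r, c in zip(reference, current))

# Tiny made-up configuration, coordinates in Å.
reference = [(0.0, 0.0, 0.0), (9.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
pairs = build_pair_list(reference)

small_move = [(0.5, 0.0, 0.0), (9.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
large_move = [(0.8, 0.0, 0.0), (9.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
print(pairs, needs_rebuild(reference, small_move), needs_rebuild(reference, large_move))
```

Forces are still evaluated only for listed pairs within the 8 Å cutoff. The 1.5 Å buffer lets the same list be reused for several steps before it has to be rebuilt.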


Figure: Comparison of performance, in ns/day, for CHARMM on 1, 2, 4, and 8 CPU cores versus FEN ZI on GTX 480 and C2050 (Fermi chip).

CHARMM simulations were run on 1, 2, 4, and 8 cores of a 2.6 GHz Intel Xeon with 8 GB of memory. The code was optimized for this platform. FEN ZI simulations were run on a single GTX 480 GPU (Fermi) with 480 cores and 1.5 GB memory and a single C2050 GPU (Fermi) with 448 cores and 3 GB memory.
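
When reading a comparison of this kind, two derived quantities are often useful: the speedup of one configuration over another, and the parallel efficiency of the multi-core CPU runs. The sketch below shows how both can be computed from ns/day values. All numbers in it are placeholders chosen for easy arithmetic, not measurements from the figure, and the function names are hypothetical.

```python
def speedup(baseline_ns_day, other_ns_day):
    """How many times faster 'other' is than 'baseline'."""
    return other_ns_day / baseline_ns_day

def parallel_efficiency(ns_day_one_core, ns_day_n_cores, n_cores):
    """Fraction of ideal linear scaling achieved on n_cores."""
    return ns_day_n_cores / (ns_day_one_core * n_cores)

# PLACEHOLDER values for illustration only; NOT results from this page.
cpu_ns_day = {1: 1.0, 2: 1.8, 4: 3.2, 8: 5.0}
gpu_ns_day = 20.0

for cores, rate in cpu_ns_day.items():
    eff = parallel_efficiency(cpu_ns_day[1], rate, cores)
    print(f"{cores} core(s): efficiency {eff:.2f}, "
          f"GPU speedup x{speedup(rate, gpu_ns_day):.1f}")
```

Reporting the GPU speedup against the best multi-core CPU run, and not only against a single core, gives a fairer picture of the gain.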