[AMBER] Cellulose NVE with pmemd.cuda.MPI

From: Nakashima, Yoshihisa <nakashima_y.jp.fujitsu.com>
Date: Tue, 18 Jun 2013 06:19:50 +0000

Dear Amber community,

I tried to run the Cellulose NVE benchmark included in the Amber12_GPU_BMT suite on a GPGPU (K20X).
The serial version (pmemd.cuda) and the parallel version (pmemd.cuda.MPI) with 2 processes + 2 GPUs both ran fine,
but with the parallel version (pmemd.cuda.MPI) using 1 process + 1 GPU,
the following message was displayed and the run failed.

***********
# mpiexec -np 1 pmemd.cuda.MPI -O -i mdin -p prmtop -c inpcrd -o mdout_intel_gpu1pro_0618

gpu_download_partial_forces: download failed unspecified launch failure

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= EXIT CODE: 255
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
**************

(For comparison, the following cases run without any problem.)
No problem: # mpiexec -np 2 pmemd.cuda.MPI -O -i mdin -p prmtop -c inpcrd -o mdout_intel_gpu2pro_0618
No problem: # pmemd.cuda -O -i mdin -p prmtop -c inpcrd -o mdout_intel_gpu1pro_0618
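
The three cases above can be compared side by side with a small wrapper. This is only a minimal sketch: `run_case`, the log file names and `mdout_serial` are hypothetical names chosen for illustration, and the runs are attempted only if `pmemd.cuda` and `mpiexec` are found on PATH.

```shell
#!/bin/sh
# Hypothetical helper (not part of Amber): run one command, send its
# stdout/stderr to a log file, and print the exit code next to a label.
run_case() {
    label=$1
    log=$2
    shift 2
    "$@" >"$log" 2>&1
    echo "$label: exit $?"
}

# Attempt the benchmark runs only when the executables are available.
if command -v pmemd.cuda >/dev/null 2>&1 && command -v mpiexec >/dev/null 2>&1; then
    # Serial GPU code (reported as working)
    run_case "serial" serial.log \
        pmemd.cuda -O -i mdin -p prmtop -c inpcrd -o mdout_serial
    # MPI code with 1 process (the failing case)
    run_case "mpi-np1" np1.log \
        mpiexec -np 1 pmemd.cuda.MPI -O -i mdin -p prmtop -c inpcrd -o mdout_intel_gpu1pro_0618
    # MPI code with 2 processes (reported as working)
    run_case "mpi-np2" np2.log \
        mpiexec -np 2 pmemd.cuda.MPI -O -i mdin -p prmtop -c inpcrd -o mdout_intel_gpu2pro_0618
fi
```

A non-zero exit code for `mpi-np1` only, together with the contents of `np1.log`, would confirm that the failure is specific to that configuration.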


This problem occurs only in this case (pmemd.cuda.MPI + 1 GPU).
The other 8 benchmarks in the suite (Cellulose NPT, TRPCage and so on) run without any problem.

I don't know why this problem occurs.
Could you give me some advice on how to solve it?



The following is information about my environment, input and output.

- Configuration
OS RHEL6.1
CPU 2x Xeon E5-2680
Amber version: 12 (Patched bugfix from 1 to 18)
AmberTools version: 13 (Patched bugfix from 1 to 9)
MPI: MPICH2-1.5
GNU: 4.4.5
GPU: 2x K20X
GPU Device Driver: 304.64
CUDA: 5.0
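
For anyone who wants to compare their own setup, most of the items listed above can be collected with standard commands. This is a rough sketch, not an Amber tool: `show_info` is a hypothetical helper, and paths such as `/etc/redhat-release` and `/proc/driver/nvidia/version` are assumed to exist only on a RHEL-like system with the NVIDIA driver loaded.

```shell
#!/bin/sh
# Hypothetical helper (not part of Amber): print a labelled block with the
# output of a command, or a note if the command is not on PATH.
show_info() {
    label=$1
    shift
    echo "== $label =="
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1
    else
        echo "(command not found: $1)"
    fi
}

show_info "OS release"   cat /etc/redhat-release
show_info "GNU compiler" gcc --version
show_info "CUDA toolkit" nvcc --version
show_info "GPU driver"   cat /proc/driver/nvidia/version
show_info "GPUs"         nvidia-smi -L
```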


- The input file is shown below. It is the same as the file described on Amber's web site
(http://ambermd.org/gpus/benchmarks.htm).

5) Cellulose NVE = 408,609 atoms
************
Typical Production MD NVE with
GOOD energy conservation.
 &cntrl
   ntx=5, irest=1,
   ntc=2, ntf=2, tol=0.000001,
   nstlim=10000,
   ntpr=1000, ntwx=1000,
   ntwr=10000,
   dt=0.002, cut=8.,
   ntt=0, ntb=1, ntp=0,
   ioutfm=1,
 /
 &ewald
  dsum_tol=0.000001,
 /
**************


- The last part of the output file is:

**************
--------------------------------------------------------------------------------
   4. RESULTS
--------------------------------------------------------------------------------

 ---------------------------------------------------
 APPROXIMATING switch and d/dx switch using CUBIC SPLINE INTERPOLATION
 using 5000.0 points per unit in tabled values
 TESTING RELATIVE ERROR over r ranging from 0.0 to cutoff
| CHECK switch(x): max rel err = 0.2738E-14 at 2.422500
| CHECK d/dx switch(x): max rel err = 0.8987E-11 at 2.875760
 ---------------------------------------------------
|---------------------------------------------------
| APPROXIMATING direct energy using CUBIC SPLINE INTERPOLATION
| with 50.0 points per unit in tabled values
| Relative Error Limit not exceeded for r .gt. 2.52
| APPROXIMATING direct force using CUBIC SPLINE INTERPOLATION
| with 50.0 points per unit in tabled values
| Relative Error Limit not exceeded for r .gt. 2.92
|---------------------------------------------------
************


Thank you for your support.

Best wishes,
Y. Nakashima




----
-----------------------------------------
Yoshihisa Nakashima
Tel: +81-44-754-3174
E-mail:(nakashima_y.jp.fujitsu.com)
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Mon Jun 17 2013 - 23:30:02 PDT