Dear Ross, dear Scott,
Thank you for your quick reply.
I understand the current status.
I also checked the discussion in the AMBER mailing list archive and understand that the performance with MPI is lower than without MPI because of the added overhead.
For context, I am using AMBER for benchmark testing, so I want to measure how large the performance difference is between the MPI and non-MPI cases.
I would also like to understand why only "Cellulose NVE" fails with MPI ("mpirun -np 1") + GPGPU (CUDA), while the other 8 BMT cases (TRPCage, Myoglobin, and so on) run fine with MPI.
If there is a workaround to run it, could you let me know?
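For reference, these are the two invocations that do work on our node and that I would like to compare. The CUDA_VISIBLE_DEVICES settings are only an example of how I pin the runs to our two K20X cards; the rest is the same as in my original report:
***********
# single GPU, non-MPI build (device 0 is an example choice)
export CUDA_VISIBLE_DEVICES=0
pmemd.cuda -O -i mdin -p prmtop -c inpcrd -o mdout_intel_gpu1pro_0618

# two GPUs, MPI build with one rank per K20X
export CUDA_VISIBLE_DEVICES=0,1
mpiexec -np 2 pmemd.cuda.MPI -O -i mdin -p prmtop -c inpcrd -o mdout_intel_gpu2pro_0618
***********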
Thanks and best wishes,
Y. Nakashima
----
-----------------------------------------
Yoshihisa Nakashima
XTC Development Unit
Fujitsu Ltd.
Tel: +81-44-754-3174
E-mail:(nakashima_y.jp.fujitsu.com)
From: Scott Le Grand [mailto:varelse2005.gmail.com]
Sent: Thursday, June 20, 2013 2:08 AM
To: AMBER Mailing List
Cc: Nakashima, Yoshihisa/中島 嘉久
Subject: Re: [AMBER] Cellulose NVE with pmemd.cuda.MPI
My current suggestion is the same as Ross's - do not run MPI with one process.
This is not a bug worth fixing IMO. And that's because I'm currently redesigning the multi-GPU architecture to reflect the tripling in AMBER performance since I wrote it. I'd rather put my time into the future than address a use case that's kinda useless.
If you see this with REMD runs, that's a whole different matter. Otherwise it's a "will not fix" because I'm building something to replace it.
Scott
On Wed, Jun 19, 2013 at 5:19 AM, Ross Walker <ross.rosswalker.co.uk> wrote:
Dear Yoshihisa,
Granted, this should not be failing in this way, but I have to question why
you would want to run with mpirun -np 1. All it does is add overhead and
slow the simulation down. It is also a configuration that is not tested,
which is why nobody had noticed that it was failing.
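If you do want to quantify the overhead, comparing the ns/day value in the final timing section of each mdout is usually enough, e.g. for the two runs that completed:
***********
# pull the reported throughput from the two finished benchmark outputs
grep "ns/day" mdout_intel_gpu1pro_0618 mdout_intel_gpu2pro_0618
***********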
We'll take a look.
All the best
Ross
On 6/17/13 11:19 PM, "Nakashima, Yoshihisa" <nakashima_y.jp.fujitsu.com> wrote:
>Dear Amber community
>
>Hello,
>
>I tried to run the Cellulose NVE benchmark included in the Amber12_GPU_BMT suite with
>a GPGPU (K20X).
>The serial version (pmemd.cuda) and the parallel version
>(pmemd.cuda.MPI) with 2 processes + 2 GPUs were OK,
>but in the case of the parallel version (pmemd.cuda.MPI) with 1 process +
>1 GPU,
>the following message was displayed and the test failed.
>
>***********
># mpiexec -np 1 pmemd.cuda.MPI -O -i mdin -p prmtop -c inpcrd -o mdout_intel_gpu1pro_0618
>
>gpu_download_partial_forces: download failed unspecified launch failure
>
>===================================================================================
>= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>= EXIT CODE: 255
>= CLEANING UP REMAINING PROCESSES
>= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>===================================================================================
>**************
>
>(e.g., the cases with no problem)
>No problem: # mpiexec -np 2 pmemd.cuda.MPI -O -i mdin -p prmtop -c inpcrd -o mdout_intel_gpu2pro_0618
>No problem: # pmemd.cuda -O -i mdin -p prmtop -c inpcrd -o mdout_intel_gpu1pro_0618
>
>
>This problem occurs only in this case (pmemd.cuda.MPI + 1 GPU).
>With the other 8 BMT cases (Cellulose NPT, TRPCage, and so on), there is no problem.
>
>I don't know why the problem occurs.
>Could you give me advice on how to solve this problem?
>
>
>
>The details of my setup follow.
>
>- Configuration
>OS: RHEL 6.1
>CPU: 2x Xeon E5-2680
>Amber version: 12 (patched with bugfixes 1 through 18)
>AmberTools version: 13 (patched with bugfixes 1 through 9)
>MPI: MPICH2-1.5
>Compiler: GNU (GCC) 4.4.5
>GPU: 2x K20X
>GPU device driver: 304.64
>CUDA: 5.0
>
>
>- The input file is below; it is the same as the file described
>on the AMBER web site
>(http://ambermd.org/gpus/benchmarks.htm).
>
>5) Cellulose NVE = 408,609 atoms
>************
>Typical Production MD NVE with
>GOOD energy conservation.
> &cntrl
> ntx=5, irest=1,
> ntc=2, ntf=2, tol=0.000001,
> nstlim=10000,
> ntpr=1000, ntwx=1000,
> ntwr=10000,
> dt=0.002, cut=8.,
> ntt=0, ntb=1, ntp=0,
> ioutfm=1,
> /
> &ewald
> dsum_tol=0.000001,
> /
>**************
>
>
>- The last part of the output file is:
>
>**************
>--------------------------------------------------------------------------------
> 4. RESULTS
>--------------------------------------------------------------------------------
>
> ---------------------------------------------------
> APPROXIMATING switch and d/dx switch using CUBIC SPLINE INTERPOLATION
> using 5000.0 points per unit in tabled values
> TESTING RELATIVE ERROR over r ranging from 0.0 to cutoff
>| CHECK switch(x): max rel err = 0.2738E-14 at 2.422500
>| CHECK d/dx switch(x): max rel err = 0.8987E-11 at 2.875760
> ---------------------------------------------------
>|---------------------------------------------------
>| APPROXIMATING direct energy using CUBIC SPLINE INTERPOLATION
>| with 50.0 points per unit in tabled values
>| Relative Error Limit not exceeded for r .gt. 2.52
>| APPROXIMATING direct force using CUBIC SPLINE INTERPOLATION
>| with 50.0 points per unit in tabled values
>| Relative Error Limit not exceeded for r .gt. 2.92
>|---------------------------------------------------
>************
>
>
>Thank you for your support.
>
>Best wishes,
>Y. Nakashima
>
>
>
>
>----
>-----------------------------------------
>Yoshihisa Nakashima
>Tel: +81-44-754-3174
>E-mail:(nakashima_y.jp.fujitsu.com)
>
>
>
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber