Hello Scott,
Is NVIDIA investigating the documented problems the GTX 400 series cards have when running pmemd, or should we kiss those cards goodbye and move on? A sincere answer would be appreciated.
Thanks,
Sergio Aragon
-----Original Message-----
From: Scott Le Grand [mailto:SLeGrand.nvidia.com]
Sent: Friday, November 19, 2010 9:38 AM
To: 'AMBER Mailing List'
Subject: Re: [AMBER] GPU related issues
I have a quick fix for the 10 Å cutoff issue. This only happens when there are 2 or fewer nonbond boxes along any axis. The kludge is to rebuild the neighbor list every step. The better solution will come with the fix that allows an arbitrary number of GPUs in multi-GPU PME runs.
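For anyone wondering whether their box falls into that regime, a rough back-of-the-envelope check is to divide each box edge by the cell size the pairlist builder needs (at least the cutoff plus the nonbond skin). This is only a sketch of the idea, not the exact decomposition logic in pmemd.cuda, and the 2 Å skin value is an assumption:

    import math

    def cells_per_axis(box_lengths, cut, skin=2.0):
        """Rough estimate of neighbor-list cells along each box axis.

        Assumes an orthorhombic box and a cell edge of at least (cut + skin);
        the real pmemd.cuda decomposition may differ in detail.
        """
        cell_edge = cut + skin
        return [int(length // cell_edge) for length in box_lengths]

    # Example: a smallish (hypothetical) box where a 10 A cutoff leaves only
    # 2 cells per axis, while an 8 A cutoff still gives 3.
    box = (35.0, 35.0, 35.0)              # box edge lengths in Angstroms
    print(cells_per_axis(box, cut=10.0))  # -> [2, 2, 2]  (problematic regime)
    print(cells_per_axis(box, cut=8.0))   # -> [3, 3, 3]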
The situation there is that I have relieved that constraint in two ways.
The first approach slows down small jobs but provides a 10% speed kick to larger jobs at 8 GPUs.
The second approach hit 58 ns/day for JAC on 8 GPUs but slowed larger molecules down by 1-2%.
Since the roadmap points to larger molecules (500K to 2M atoms), I am focused on fixing the shortcomings of the second approach.
Please send me the input file and I can verify you're hitting what I think you're hitting. If so, I'll check in the kludge for now.
-----Original Message-----
From: Ross Walker [mailto:ross.rosswalker.co.uk]
Sent: Friday, November 19, 2010 09:24
To: 'AMBER Mailing List'
Subject: Re: [AMBER] GPU related issues
Hi Ye,
> I have applied all the patches to AMBER 11 on my GPU machine, which has 4
> C2050 cards. But sometimes the jobs still fail. AMBER 11 is compiled with
Can you confirm specifically up to which patch number you applied, just
to be sure you have ALL of the latest patches?
> In the first job, the system has 12157 atoms only, and the simulation is under
> the NPT ensemble. If cut is set to 8 A, this job runs fine. But if cut is set
> to 10, it dies with a lot of NaNs in the energy terms and coordinates.
Can you confirm that you can run this simulation on the CPU with cut=10
without issue?
> In the second job, the system has 34116 atoms. The serial CUDA run is OK. But
> the parallel CUDA run dies with the error message "max pairlist cutoff must be
> less than unit cell max sphere radius". However, cut is set to 8 A, and the
> distance between the protein and the cell boundary is set to 10 A.
This error occurs when the system blows up; why it blows up is the real issue.
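For context on what that message is checking: the "unit cell max sphere radius" is the radius of the largest sphere that fits inside the periodic box, and the pairlist cutoff (cut plus the nonbond skin) must stay below it. A blown-up or anisotropically shrinking box is the usual way this trips even when cut=8 looked safe at the start. A rough sketch of the test for an orthorhombic box follows; the exact formula pmemd uses and the 2 Å skin are assumptions here:

    def max_sphere_radius(box_lengths):
        """Radius of the largest sphere inscribed in an orthorhombic box
        (half the shortest edge). Triclinic boxes use face-to-face distances."""
        return 0.5 * min(box_lengths)

    def pairlist_cutoff_ok(box_lengths, cut, skin=2.0):
        """Approximate mimic of the sanity check behind the error message."""
        return (cut + skin) < max_sphere_radius(box_lengths)

    # A box that started comfortably large but shrank/deformed during the run:
    print(pairlist_cutoff_ok((60.0, 60.0, 60.0), cut=8.0))  # True  -> fine
    print(pairlist_cutoff_ok((60.0, 60.0, 19.0), cut=8.0))  # False -> triggers the error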
> Can anyone help me out?
Can you please send your input files so we can try to reproduce this?
Thanks,
Ross
/\
\/
|\oss Walker
---------------------------------------------------------
| Assistant Research Professor                          |
| San Diego Supercomputer Center                        |
| Adjunct Assistant Professor                           |
| Dept. of Chemistry and Biochemistry                   |
| University of California San Diego                    |
| NVIDIA Fellow                                         |
| http://www.rosswalker.co.uk | http://www.wmd-lab.org/ |
| Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk  |
---------------------------------------------------------
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Nov 19 2010 - 10:30:03 PST