Re: [AMBER] multinode pmemd.cuda.MPI jac9999 behavior

From: David Case <david.case.rutgers.edu>
Date: Tue, 25 Apr 2017 13:27:45 -0400

On Tue, Apr 25, 2017, Scott Brozell wrote:
>
> On 1.:
> Perhaps it was not clear, but I showed both the small and large
> test results. In other words, Intel gives 4 energies for small
> and 2 energies for large; GNU gives 1 energy for small and
> 1 energy for large.
>
> There have also been multiple experiments over the whole cluster
> yielding exactly the same energies.
>
> This certainly seems like good GPUs and some issue with the Intel
> compilers. Perhaps it is time to contact our Intel colleagues
> if we have no explanation.

Agreed...but it may also be time to stop trying to support Intel compilers for
pmemd.cuda. People will try that (to no benefit) simply because it is an
available option and because they think the Intel compilers are "better".

(There are a very small number of people who somehow have Intel compilers
but not GNU compilers available, and for some reason cannot install the
latter. On the other side of the equation, 99+% of the real-world testing
of pmemd.cuda is done with the GNU compilers, and Intel keeps introducing
new bugs into their compiler every year, so it's a big headache to support
this combination.)

Have you tried putting the Intel compilers into debug mode (-O0), to see
what happens?
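
For reference, the check being discussed boils down to counting distinct
final energies across repeated runs of the same input. Below is a minimal
sketch (not an official Amber tool; it assumes the final Etot value from each
run has already been collected into a plain text file, one number per line):

# count_final_energies.py (hypothetical helper, name is my own)
import sys
from collections import Counter

# One final energy per line, e.g. pulled from repeated jac9999 mdout files.
with open(sys.argv[1]) as fh:
    energies = [float(line.split()[0]) for line in fh if line.strip()]

counts = Counter(energies)
print("distinct final energies: %d" % len(counts))
for value, n in sorted(counts.items()):
    print("  %.4f seen %d time(s)" % (value, n))

# Rule of thumb from this thread (an observation, not a formal criterion):
# GNU builds give 1 value, Intel builds may flip between 2; anything
# beyond that points at a bad GPU or a miscompiled executable.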

....dac

>
> On Tue, Apr 25, 2017 at 09:09:14AM -0400, Daniel Roe wrote:
> >
> > My experience with the GPU validation test is that with Intel
> > compilers I usually end up with final energies flipping between two
> > different values. With GNU compilers and the same GPUs I only get one
> > energy each time. This is why I only use GNU compilers for the CUDA
> > stuff. If there is more variation than that (i.e., beyond 2 values for
> > Intel or 1 for GNU), that indicates a "bad" GPU.
> >

_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Tue Apr 25 2017 - 10:30:02 PDT