Re: [AMBER] Amber 14 w/ CUDA - unclear "make test"-errors from Falko Jähnert on 2016-02-11 (Amber Archive Feb 2016)

From: Falko Jähnert <falko.jaehnert.biochemtech.uni-halle.de>
Date: Thu, 11 Feb 2016 14:37:32 +0100

Dear Jason,

Thanks a lot for the quick reply. So, since it is a deficiency in the testing procedure, it isn't a problem for using Amber with CUDA, right? I'm not entirely sure I understand your answer correctly. Where did the inserted lines in one of the compared files come from? I had assumed these extra lines came from a different logging format?

Thanks in advance,
Falko Jähnert

-----Original Message-----
From: Jason Swails [mailto:jason.swails.gmail.com]
Sent: Thursday, February 11, 2016 14:03
To: AMBER Mailing List <amber.ambermd.org>
Subject: Re: [AMBER] Amber 14 w/ CUDA - unclear "make test"-errors

On Thu, Feb 11, 2016 at 7:36 AM, Falko Jähnert <falko.jaehnert.biochemtech.uni-halle.de> wrote:

> Dear Amberlings,
>
>
>
> First of all, thanks a lot for helping me out with my last problem,
> "Howto cpptraj - multiple trajin-commands in one line". @Jean-Marc
> Billod: I did it your way and it works just fine!
>
>
>
> Now I have a small concern about the results of my installation of
> Amber 14. The make test procedure at the parallel installation level
> (both with 2 and 4 threads) went through without a single error, not
> even round-off differences. After that I compiled Amber 14 the usual
> way to get CUDA support. Now make test produces some round-off
> errors, which are okay (I hope), but also errors where lines are
> inserted into one of the compared files (*.diff) and thus produce a
> lot of differences. If one compares the numbers on the correctly
> aligned lines, then everything is fine (I hope; again with some
> round-off errors). To make my problem easier to understand, I attached
> the *log and *.diff files, shortened to show only the unclear diffs.
>
>
>
> Can I safely ignore these diffs? If not, could someone provide any
> information on handling this problem?
>

This is a known deficiency in the CUDA testing infrastructure. All of the larger failures (i.e., that are not round-off) arise from stochastic methods (ntt=2 or ntt=3) where the random number stream is different on every GPU.

While there is a way to fix it (and it is on the to-do list), it apparently hasn't been important enough to make it to the top yet.
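
To illustrate the point about random number streams, here is a minimal sketch. It is not Amber code: it assumes a toy 1D harmonic oscillator with a Langevin-type thermostat, and the function names, parameters, and seeds are all hypothetical. It only shows that the same stochastic integrator reproduces itself exactly with an identical random stream and diverges visibly with a different one, even though both runs are equally valid.

```python
import math
import random


def langevin_trajectory(seed, steps=1000, dt=0.01, gamma=1.0, kT=1.0):
    """Toy 1D harmonic oscillator (mass = force constant = 1) with a
    Langevin-type thermostat. Hypothetical illustration, not Amber's integrator."""
    rng = random.Random(seed)          # the "random number stream"
    x, v = 1.0, 0.0
    c1 = math.exp(-gamma * dt)         # velocity damping per step
    c2 = math.sqrt(kT * (1.0 - c1 * c1))  # matching noise amplitude
    positions = []
    for _ in range(steps):
        v += -x * dt * 0.5             # half kick from the harmonic force
        x += v * dt * 0.5              # half drift
        v = c1 * v + c2 * rng.gauss(0.0, 1.0)  # stochastic thermostat step
        x += v * dt * 0.5              # half drift
        v += -x * dt * 0.5             # half kick
        positions.append(x)
    return positions


def max_abs_diff(a, b):
    """Largest pointwise difference between two trajectories."""
    return max(abs(p - q) for p, q in zip(a, b))


# Same stream: the trajectories are identical.
same = max_abs_diff(langevin_trajectory(seed=1), langevin_trajectory(seed=1))
# Different stream: the trajectories differ by far more than round-off.
different = max_abs_diff(langevin_trajectory(seed=1), langevin_trajectory(seed=2))

print("same seed:     ", same)
print("different seed:", different)
```

A line-by-line comparison against a saved reference output can only pass in the first case, which is why such tests show large diffs when the stream differs from the one used to generate the reference.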

HTH,
Jason

--
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber

Received on Thu Feb 11 2016 - 06:00:03 PST