In reply to Jan-Philip,
No, I didn't test whether the error is more common with ECC off. It may
also be possible to increase the ECC-on error rate on an M2070 by using a
larger system size, which could allow for more reliable analysis of the
factors.
In reply to Dmitry,
Do you think the error is something that tends to occur early in a
simulation, and if you get past a certain critical point you are ALMOST
safe? I suppose I could test this by having restart files written every
step, for instance, and looking at the step-dependent distribution of the
error. My intuitive impression has been that the errors tend to show up
early, and if the simulation is stable for a few minutes it continues to
be stable, but that is not based on any hard evidence, just a gut
feeling.
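
A minimal sketch of how that step-dependent distribution could be tabulated,
assuming the last step reached before each crash has already been collected
by hand (for example from the last restart file written). The function names
and the bin width are hypothetical choices for illustration, not part of
AMBER or any existing tool.

```python
from collections import Counter


def failure_step_histogram(failure_steps, bin_width):
    """Count failures per bin of simulation steps.

    failure_steps: one integer per failed run, the last step reached
    before the error appeared (an assumed, manually collected input).
    Returns a dict mapping the first step of each bin to a failure count.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    counts = Counter((step // bin_width) * bin_width for step in failure_steps)
    return dict(sorted(counts.items()))


def fraction_failed_before(failure_steps, cutoff_step):
    """Fraction of the observed failures that occurred before cutoff_step."""
    if not failure_steps:
        return 0.0
    early = sum(1 for step in failure_steps if step < cutoff_step)
    return early / len(failure_steps)
```

If most failures fall in the first few bins and fraction_failed_before is
close to 1 for a small cutoff, that would support the "early or never"
impression; a flat histogram would argue against it.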
Currently I've just taken to using NAMD on the GTX580 and doing things that
really need AMBER on the more expensive cards. This is certainly not
ideal, as AMBER is generally faster on the GPU at present.
I know in previous discussions a lot of people claimed the problem was just
using the poorer-quality consumer graphics cards (GTX) compared with the
workstation cards, but as I have seen the error a few times on the
workstation cards, I'm not convinced that it is an all-or-nothing thing.
One thing to note is the clock speeds of these cards: I see the most
frequent problems in the order GTX580 > GTX570 > M2070, and the core clock
frequencies for those cards follow the same trend, with the GTX580
being ~1.55 GHz, the GTX570 being ~1.4 GHz, and the M2070 being ~1.1 GHz.
Maybe there is a problem with ramping up the clock speeds on these chips.
But who knows what else might be a factor: the M2070 is inside a similarly
expensive workstation with expensive components, whereas the GTX cards are
inside cheaply built desktops.
~Aron
On Mon, Feb 20, 2012 at 1:54 PM, Dmitry Mukha <dvmukha.gmail.com> wrote:
> 2012/2/20 Aron Broom <broomsday.gmail.com>
>
> >
> > 4) The error MAY be dependent on hardware settings, particularly voltage:
> >
> Actually, 'yes' and 'no'. This is true for ever-crashed system, where bug
> appearance shifts depending on hardware. But I have rerun successful
> trajectories that cover a long time of modeling (up to 500 ns) on GTX580,
> and bug did not appear. Input files were the same for both types of
> systems. Difference was only in the initial calculations, esp. heating,
>
> --
> Sincerely,
> Dmitry Mukha
> Institute of Bioorganic Chemistry, NAS, Minsk, Belarus
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
>
--
Aron Broom M.Sc
PhD Student
Department of Chemistry
University of Waterloo
Received on Mon Feb 20 2012 - 13:00:02 PST