Dear all,
we encountered strange behaviour during a pmemd restart on our cluster
(InfiniBand network; the job ran on 8 nodes with 4 processors each):
A restart from a certain rst file crashed because that file contained
numbers too large for the output format, so only asterisks appeared in
their place (a known problem). We then reran the MD calculation from the
previous rst file, i.e. the one that had served as input for the
trajectory that produced the corrupted rst file. This time pmemd wrote a
valid rst file at the end of the new trajectory. Thus, the same rst file
on the same (homogeneous) cluster with the same executable yielded a
different result. How can this be explained?
Is it possible that the parallel implementation is the reason for this
non-deterministic behaviour - or does this simply indicate a potential
hardware problem on some of the machines?
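To make clear what we mean by non-determinism from the parallel
implementation, here is a minimal sketch in Python. It is not pmemd code;
it only assumes that per-process partial sums are combined in an order
that can differ from a plain sequential sum. The function names and the
input values are hypothetical and chosen purely for illustration. Because
floating-point addition is not associative, the two summation orders give
different totals for the same numbers:

```python
def ordered_sum(values):
    """Add the values strictly left to right, as a single process would."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, n_chunks):
    """Add the values as n_chunks partial sums that are then combined.

    This mimics a parallel reduction in which each process first sums
    its own share (an assumption for illustration, not pmemd's scheme).
    """
    size = (len(values) + n_chunks - 1) // n_chunks
    partials = [ordered_sum(values[i:i + size])
                for i in range(0, len(values), size)]
    return ordered_sum(partials)


# Hypothetical values mixing very large and small magnitudes.
values = [1e16, 1.0, -1e16, 1.0]

sequential = ordered_sum(values)    # one summation order
parallel = chunked_sum(values, 2)   # same numbers, different grouping

print(sequential, parallel, sequential == parallel)
```

If something like this happens in the force accumulation, tiny rounding
differences could grow over a long trajectory, so two runs from the same
rst file need not end in the same state.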
Any hints are welcome.
Regards,
Anselm Horn
Bioinformatik
Emil-Fischer-Zentrum
Friedrich-Alexander-Universität Erlangen-Nürnberg
Received on Wed Dec 19 2007 - 06:07:14 PST