sander crash / linux cluster

From: C. Klein <cklein_at_pharma.ethz.ch>
Date: Mon 30 Apr 2001 17:06:58 +0200

Dear Colleagues,

I want to run a parallel version of sander (Amber 6) on a cluster of 10
dual-Pentium machines with LAM-MPI. The compilation (using Tru Huynh's
configure script, posted to this list in March) went smoothly, and some
first tests ran fine.

However, when I run longer simulations, the computer (i.e., one of the
participating nodes) hangs without providing any kind of diagnostic
information. I have not been able to correlate the crashes with
anything: the same job may run for 2 hours or for only 5 minutes.
Currently, I use SuSE Linux with the 2.2.18 SMP kernel. Upgrading to a
2.4 SMP kernel did not help, and neither did playing with the compiler
options. I also compiled an MPICH version of sander, which also crashes
in a random fashion. Applying bugfix #19 from the Amber 6 bug list did
not fix things either.

Do you have any suggestions on what to do next, or on how to approach
this problem in a rational manner? Any hints will be appreciated.

Best regards,

Chris Klein



-- 
Dr. C. Klein; cklein(at)pharma.ethz.ch
Dept. of Applied Biosciences, Pharmaceutical Chemistry
ETH Zuerich
Winterthurerstr. 190
CH-8057 Zuerich
Phone: +41 1 6356072
FAX  : +41 1 6356884
A neutron goes into a bar and asks the bartender, "How much for a beer?"
The bartender replies, "For you, no charge."
Received on Mon Apr 30 2001 - 08:06:58 PDT