RE: AMBER: problems for running sander.MPI

From: Ross Walker <ross.rosswalker.co.uk>
Date: Thu, 12 Oct 2006 10:41:19 -0700

Dear Christophe,
 
This is my first experience with openmpi. Which openmpi test suite are you
referring to? Where is it documented?

I have never used Openmpi myself either. I tend to use mpich2. There should
be some kind of test suite distributed with the source code though. Check
the install docs. Typically you do something like: ./configure; make; make
test; make install
 
It is the make test bit that you need to look up.
 
Unfortunately, the error is not always from the same node!
 
Hmm, then it could be the switch, but it could also be an issue with the
openmpi installation. Try downloading mpich2 and see if that works instead.
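
To check how the failures are spread across nodes, the error lines can be
tallied per host. The following is only a minimal Python sketch and is not
part of Amber or openmpi. The regular expression is an assumption based on
the single log line quoted in this thread, and the function name
tally_errors and the host enode07 are hypothetical. On Linux, errno 104 is
ECONNRESET (connection reset by peer).

```python
import re
from collections import Counter

# Assumed log shape, taken from the one line quoted in this thread:
#   [enode05:03662] mca_btl_tcp_frag_send: writev failed with errno=104
# The leading "[" is optional because it is sometimes lost in copy/paste.
LINE_RE = re.compile(
    r"\[?(?P<host>[\w.-]+):(?P<pid>\d+)\]\s+.*errno=(?P<errno>\d+)"
)


def tally_errors(lines):
    """Count occurrences of each (host, errno) pair in an iterable of lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            key = (match.group("host"), int(match.group("errno")))
            counts[key] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample input; in practice, pass the lines of the job's
    # stderr file instead.
    sample = [
        "[enode05:03662] mca_btl_tcp_frag_send: writev failed with errno=104",
        "[enode07:01234] mca_btl_tcp_frag_send: writev failed with errno=104",
        "[enode05:03670] mca_btl_tcp_frag_send: writev failed with errno=104",
    ]
    for (host, err), n in sorted(tally_errors(sample).items()):
        print(f"{host}: errno={err} seen {n} time(s)")
```

If one host dominates the tally, that node or its cable is the first thing
to look at. An even spread across hosts points more towards the switch or
the MPI installation.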
 
You could also try building pmemd in $AMBERHOME/src/pmemd and then testing
this. If you see similar problems then it is definitely an issue with the
openmpi installation or the hardware.
 
All the best
Ross

/\
\/
|\oss Walker

| HPC Consultant and Staff Scientist |
| San Diego Supercomputer Center |
| Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
| http://www.rosswalker.co.uk <http://www.rosswalker.co.uk/> | PGP Key
available on request |

Note: Electronic Mail is not secure, has no guarantee of delivery, may not
be read every day, and should not be used for urgent or sensitive issues.

 


  _____

From: owner-amber.scripps.edu [mailto:owner-amber.scripps.edu] On Behalf Of
Christophe Deprez
Sent: Thursday, October 12, 2006 06:55
To: amber.scripps.edu
Subject: Re: AMBER: problems for running sander.MPI


Ross Walker wrote:


Hi Qizhi

[enode05:03662] mca_btl_tcp_frag_send: writev failed with errno=104

(enode05 is one of the node names of the cluster.)

Normally, there is no problem for minimization and constant NVT steps.
The problems often occur during constant NPT and production runs.


Hi Ross, and thanks for your reply.
I'm working as sysadmin with Qizhi to troubleshoot this issue.


This looks like a hardware problem to me. Unfortunately a Google search
sheds little light. E.g.:

http://www.open-mpi.org/community/lists/users/2006/02/0684.php

Have you seen this with any other codes? Can you run the openmpi test suite
successfully?

This is my first experience with openmpi. Which openmpi test suite are you
referring to? Where is it documented?


I would check to see if the error is always from the same node. If you
unplug that node and use the remaining nodes, do you still see the problem?

Unfortunately, the error is not always from the same node!

I would also try compiling with g95 instead of gfortran. While it appears
that gfortran is now mature enough to compile Amber I don't know if it has
been thoroughly tested. You will probably have to recompile openmpi with g95
as well.

I'll give this a try.

Thanks for your suggestions

-- 
Christophe Deprez                         christophe.deprez.bri.nrc.ca
----------------------------------------------------------------------
Institut de Recherche en Biotechnologies / Biotech. Research Institute
6100 Royalmount, Montréal (QC) H4P 2R2, Canada     Tel: (514) 496-6164 
-----------------------------------------------------------------------
The AMBER Mail Reflector
To post, send mail to amber.scripps.edu
To unsubscribe, send "unsubscribe amber" to majordomo.scripps.edu
Received on Thu Oct 12 2006 - 20:36:15 PDT