Re: AMBER: scyld beowulf --amber10--openmpi

From: Rima Chaudhuri <rima.chaudhuri.gmail.com>
Date: Mon, 20 Oct 2008 20:46:13 -0500

hey,
I set it to export DO_PARALLEL='mpirun -no_local=1 -np=4'

thanks

On Mon, Oct 20, 2008 at 8:38 PM, Ross Walker <ross.rosswalker.co.uk> wrote:
> Hi Rima,
>
>> However, when I try to test the suite:
>> cd $AMBERHOME/test
>> make test.parallel
>
> What did you set the DO_PARALLEL environment variable to here? You don't
> mention it.
>
>> I get the following error:
>>
>> bash-2.05b# make test.parallel
>> export TESTsander=/home/rchaud/Amber10_openmpi/amber10/exe/sander.MPI;
>> make test.sander.BASIC
>> make[1]: Entering directory `/home/rchaud/Amber10_openmpi/amber10/test'
>> cd cytosine && ./Run.cytosine
>> [helios.structure.uic.edu:17718] [0,0,0] ORTE_ERROR_LOG: Not available
>> in file ras_bjs.c at line 247
>> --------------------------------------------------------------------------
>> Failed to find the following executable:
>>
>> Host: helios.structure.uic.edu
>> Executable: -o
>>
>> Cannot continue.
>> --------------------------------------------------------------------------
>
> The error is that the MPI process could not find an executable called '-o',
> which suggests that it is looking for the wrong thing. I suspect that
> DO_PARALLEL has not been set, or has been set incorrectly, hence the problems.
> Note that when running in parallel all nodes have to see the exact same file
> system in the same place, so you should make sure that this is the case.
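>
> As a rough illustration (the exact arguments vary from test to test), the
> Run scripts under $AMBERHOME/test prepend $DO_PARALLEL to the sander command
> line, along the lines of:
>
>   $DO_PARALLEL $TESTsander -O -i mdin -c inpcrd -p prmtop -o mdout
>
> So if DO_PARALLEL is empty or malformed, mpirun can end up treating one of
> sander's own flags (here '-o') as the program it is supposed to launch,
> which matches the error above.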
>
> To test sander.MPI interactively you would do something like
>
> unset TESTsander
> export DO_PARALLEL='mpirun -np 4 --machinefile mymachfile'
>
> cd $AMBERHOME/test
> make test.parallel
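>
> Here 'mymachfile' is just a hostfile listing the nodes to run on; a minimal
> OpenMPI example (hostnames are placeholders) would be:
>
>   node01 slots=2
>   node02 slots=2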
>
> To test pmemd you would do:
> unset TESTsander
> export DO_PARALLEL='mpirun -np 4 --machinefile mymachfile'
>
> cd $AMBERHOME/test
> make test.pmemd
>
> If you are running through a queuing system like PBS then you should request
> an interactive run following the instructions for your queuing system. Then
> do something like this (the details will vary wildly with the queuing system):
>
> export DO_PARALLEL='mpirun -np 4 --machinefile $PBS_NODEFILE'
> cd $AMBERHOME/test
> make test.pmemd
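>
> For example, with PBS/Torque an interactive session on 4 cores might be
> requested with something like this (resource names and limits are
> site-specific):
>
>   qsub -I -l nodes=1:ppn=4 -l walltime=01:00:00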
>
> Note that you need to ensure that your environment gets correctly exported to
> all of the nodes.
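>
> With OpenMPI one way to do that (assuming your paths are already correct on
> the node you launch from) is to export them explicitly on the mpirun command
> line, e.g.:
>
>   export DO_PARALLEL='mpirun -np 4 --machinefile mymachfile -x PATH -x LD_LIBRARY_PATH'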
>
> BTW, if this is a semi-decent interconnect like InfiniBand and you plan to run
> on more than a couple of nodes, I seriously suggest that you choose a good
> MPI implementation like MVAPICH or Intel MPI - OpenMPI's performance is
> pretty awful. E.g. for PMEMD on a dual quad-core Clovertown system with SDR
> InfiniBand:
>
> FactorIX benchmark (throughput in ps/day)
>
>  ncpus   OpenMPI   MVAPICH2
>      2    383.43
>      8   1136.84    1157.14
>     16   1963.64    2090.32
>     32   2817.39    3410.53
>     64   3600.00    5400.00
>    128   2945.45    8100.00
>
>> [helios.structure.uic.edu:17718] [0,0,0] ORTE_ERROR_LOG: Not found in
>> file rmgr_urm.c at line 462
>
> Note that the ORTE_ERROR_LOG messages ('Not available' / 'Not found' in the
> ORTE source files) suggest that you do not have the MPI environment set up
> correctly. Check the OpenMPI docs to make sure you are setting the paths /
> environment variables correctly.
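>
> For reference, a minimal setup in your shell startup files, using the
> install prefix you quote below (MPI_HOME is just a convenience name here),
> would look something like:
>
>   export MPI_HOME=/home/rchaud/openmpi-1.2.6/openmpi-1.2.6_ifort
>   export PATH=$MPI_HOME/bin:$PATH
>   export LD_LIBRARY_PATH=$MPI_HOME/lib:$LD_LIBRARY_PATH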
>
>> If I understand correctly, it cannot find the shared lib files? but I
>> have defined the LD_LIBRARY_PATH in both the .bashrc and
>> .bash_profile.
>
> No, I don't see this at all from the errors above. If this were the case it
> would say something like "Error loading shared library...". The error you
> are seeing is that it is trying to execute '-o' instead of
> $AMBERHOME/exe/sander.MPI.
>
>> I edited the config_amber.h to add
>> -L/home/rchaud/openmpi-1.2.6/openmpi-1.2.6_ifort/lib -lmpi_f90
>> -lmpi_f77 -lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl
>> -lutil -lm -ldl to LOADLIB, and then did 'make parallel' in
>> $AMBERHOME/src
>
>
> You shouldn't need to do any of this - it should all be taken care of by
> just calling the mpif90 script. If it appeared to compile properly and gave
> you sander.MPI in the exe directory, then don't mess with the config_amber.h
> file.
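>
> If you want to confirm that the wrapper already pulls in the MPI libraries
> for you, OpenMPI's compiler wrappers can print the underlying compile/link
> command they would run:
>
>   mpif90 --showme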
>
>> which mpirun
>> /home/rchaud/openmpi-1.2.6/openmpi-1.2.6_ifort/bin/mpirun
>> However, if I echo $LD_LIBRARY_PATH it gives me nothing (when logged
>> in as root); as a regular user, it echoes the path
>> fine (/home/rchaud/openmpi-1.2.6/openmpi-1.2.6_ifort/lib).
>
> Are you trying to run this as root!!?? OpenMPI will most likely not run as
> root - most MPI implementations won't run as root because they cannot rsh to
> each node to start the job.
>
> I would start with something simple - make sure you can run MPI jobs at all.
> Try something like 'mpirun -np 8 ls' and see if you get 8 copies of the
> directory listing. Try the OpenMPI tests - I haven't looked at OpenMPI myself
> but I assume it includes test cases.
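>
> A slightly more informative check (file and host names are placeholders) is
> to launch 'hostname' across your machinefile and confirm the processes land
> on the nodes you expect:
>
>   mpirun -np 4 --machinefile mymachfile hostname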
>
> Good luck,
> Ross
>
>
> /\
> \/
> |\oss Walker
>
> | Assistant Research Professor |
> | San Diego Supercomputer Center |
> | Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
> | http://www.rosswalker.co.uk | PGP Key available on request |
>
> Note: Electronic Mail is not secure, has no guarantee of delivery, may not
> be read every day, and should not be used for urgent or sensitive issues.



-- 
-Rima
-----------------------------------------------------------------------
The AMBER Mail Reflector
To post, send mail to amber.scripps.edu
To unsubscribe, send "unsubscribe amber" (in the *body* of the email)
      to majordomo.scripps.edu
Received on Wed Oct 22 2008 - 05:08:59 PDT