Re: AMBER: failure in testing amber8 parallel parts on IBM-SP4 machines

From: Nicolas Lux Fawzi <fawzin.berkeley.edu>
Date: Fri, 24 Nov 2006 11:16:01 -0800

Hi Rachel,

I previously got Amber 9 to build and pass its tests on AIX (version
5.something, on what I think were Power3 and Power5 machines) by
following the procedure that I believe was already described on this
list in response to one of your earlier emails. Ross Walker answered
this question then, but you had other problems to solve first.

In short, I haven't seen AIX set up to use "mpirun" to launch MPI
parallel programs; the systems I've used were configured with poe
instead. I've pasted Ross Walker's full answer below, and here is the
part that I think you still need to fix:

> The fact that you are seeing the error:
> mpirun: Command not found.
> is part of your problem. On IBM Power 4 machines running AIX (I assume
> you are running AIX) one typically uses IBM's parallel operating
> environment "POE".
> You would typically run a parallel job setting:
>  
> setenv MP_NODES 1
> setenv MP_TASKS_PER_NODE 8
> poe $AMBERHOME/exe/sander.MPI -O blah blah blah
>  
> poe reads the environment variables to determine the number of cpus to
> use rather than using a -np x option like mpirun would.
> Thus to run the parallel test cases in amber9 you would set the above
> environment variables to the combination of nodes and tasks you want
> and then run:
>  
> setenv DO_PARALLEL poe
> make test.sander.parallel

Those commands are for the C shell family (csh, tcsh, etc.). If you are
using sh, bash, etc., use the "export" command instead. Let us know if
you need help with that. Typing "env" at your shell prompt will show
whether DO_PARALLEL is set to poe or to mpirun. If "which poe" doesn't
print the path to poe, then either your machine is not configured for
it, or poe is installed but not in your path and you will need to
locate it.
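
For sh or bash, a rough equivalent of the setenv lines quoted above
might look like the following. This is only a sketch: the values 1 and
8 are copied from Ross's example and should be changed to suit your
machine, and I'm assuming your shell accepts the "export NAME=value"
form (bash and ksh do).

```shell
# sh/bash equivalents of the csh "setenv" lines in Ross's answer.
# The values 1 and 8 are taken from his example; adjust as needed.
export MP_NODES=1            # number of nodes POE should use
export MP_TASKS_PER_NODE=8   # MPI tasks per node
export DO_PARALLEL=poe       # launcher the AMBER test scripts will call

# Confirm the setting, then check whether poe is on the PATH.
env | grep '^DO_PARALLEL='
if command -v poe >/dev/null 2>&1; then
    echo "poe found at: $(command -v poe)"
else
    echo "poe not found in PATH"
fi
```

If poe is reported as found, you can then run "make
test.sander.parallel" from $AMBERHOME/test in the same shell session,
so that the exported variables are still in effect.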

Hope this is helpful!

-Nick




Begin forwarded message:

> From: "Ross Walker" <ross.rosswalker.co.uk>
> Date: November 22, 2006 9:55:45 AM PST
> To: <amber.scripps.edu>
> Subject: RE: AMBER: compile amber8 on IBM-sp4
> Reply-To: amber.scripps.edu
>
> Dear Rachel,
>  
> The fact that you are seeing the error:
>  
> mpirun: Command not found.
> is part of your problem. On IBM Power 4 machines running AIX (I assume
> you are running AIX) one typically uses IBM's parallel operating
> environment "POE".
>  
> You would typically run a parallel job setting:
>  
> setenv MP_NODES 1
> setenv MP_TASKS_PER_NODE 8
> poe $AMBERHOME/exe/sander.MPI -O blah blah blah
>  
> poe reads the environment variables to determine the number of cpus to
> use rather than using a -np x option like mpirun would.
>  
> Thus to run the parallel test cases in amber9 you would set the above
> environment variables to the combination of nodes and tasks you want
> and then run:
>  
> setenv DO_PARALLEL poe
> make test.sander.parallel
>  
> However, I believe the main issue you are having is with compiling and
> from your earlier emails this seemed to be an issue with the mpi
> installation:
>  
> #include file "mpif.h" not found.
> This could come from two situations. Firstly IBM's mpi might never
> have been installed on your machine or secondly your environment
> variables might not be configured correctly.
>  
> Try 'which mpxlf90' and see if it returns /usr/bin/mpxlf90
>  
> If it says 'mpxlf90: Command not found.' then I suspect mpi is not
> installed, or is possibly installed in a non-standard place. Check with
> whoever set up the machine to find out which mpi they installed and
> where.
>  
> For the moment you should be able to build and test the serial
> version. I would advise you to do this first to make sure things are
> working. Do:
>  
> cd $AMBERHOME/src
> ./configure -nopar xlf90_aix
> make clean
> make
> cd ../test
> make clean
> make
>  
> All the best
> Ross
>  
> /\
> \/
> |\oss Walker
>
> | HPC Consultant and Staff Scientist |
> | San Diego Supercomputer Center |
> | Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
> | http://www.rosswalker.co.uk | PGP Key available on request |
>
> Note: Electronic Mail is not secure, has no guarantee of delivery, may
> not be read every day, and should not be used for urgent or sensitive
> issues.
>  
>
>> From: owner-amber.scripps.edu [mailto:owner-amber.scripps.edu] On
>> Behalf Of Rachel
>> Sent: Wednesday, November 22, 2006 02:09
>> To: amber.scripps.edu
>> Subject: Re: AMBER: compile amber8 on IBM-sp4
>>
>> Hi, Scott,
>>  
>> Thanks very much for your reply. However, I am still not clear on how
>> to solve the problem. You suggested using 'make -n'? The results are
>> as follows:
>> ----------------------------------------------------------------------
>>         cd dmp; ./Run.dmp
>>         cd adenine; ./Run.adenine
>>         cd cytosine; ./Run.cytosine
>>         cd nonper; ./Run.nonper
>>         cd nonper; ./Run.nonper.belly
>>         cd nonper; ./Run.nonper.belly.mask
>>         cd nonper; ./Run.nonper.min
>>         cd nonper; ./Run.cap
>>         cd nonper; ./Run.nonper.nocut
>>         cd tip4p; ./Run.tip4p
>>         cd tip5p; ./Run.tip5p
>>         cd 4096wat; ./Run.pure_wat
>>         cd dhfr; ./Run.dhfr
>>         cd dhfr; ./Run.dhfr.noshake
>>         cd dhfr; ./Run.dhfr.min
>>         cd gb_rna; ./Run.gbrna
>>         cd gb_rna; ./Run.gbrna.min
>>         cd gb_rna; ./Run.gbrna.ln
>>         cd gbsa_xfin; ./Run.gbsa
>>         cd polarizable_water; ./Run.pol_wat
>>         cd ubiquitin; ./Run.ubiquitin
>>         cd dna_pol; ./Run.dna_pol
>>         cd aspash; ./Run.aspash
>>         cd circ_dna; ./Run.circdna
>>         cd gb2_trx; ./Run.trxox
>>         cd trx; ./Run.trx
>>         cd trx; ./Run.trx.cpln
>>         cd cnstph; ./Run.cnstph
>>         cd rdc; ./Run.dip
>>         cd tgtmd/change_target; ./Run.tgtmd
>>         cd tgtmd/change_target.rms; ./Run.tgtmd
>>         cd tgtmd/change_target.ntr; ./Run.tgtmd
>>         cd tgtmd/conserve_ene; ./Run.tgtmd
>>         cd tgtmd/minimize; ./Run.tgtmin
>>         cd tgtmd/PME; ./Run.tgtPME
>>         cd trajene; ./Run.trajene
>>         cd pheTI; ./Run.0; ./Run.1; ./Run.p0; ./Run.p1
>>         cd ion_wat; ./Run.ion_wat
>>         cd alp; ./Run.alp
>>         cd pb_pgb; ./Run.pbpgb
>>         cd umbrella; ./Run.umbrella
>>         cd LES_noPME; ./Run.LESmd
>>         cd LES_noPME; ./Run.LESmd.rdiel
>>         cd LES; ./Run.PME_LES
>>         cd LES_CUT; ./Run.LES
>>         cd LES_TEMP; ./Run.2temp
>>         cd LES_GB; ./Run.LES
>> make: 1254-002 Cannot find a rule to create target \ from
>> dependencies.
>> Stop.
>> ----------------------------------------------------------------------
>> ----------------------------------
>>  
>> You said the compilation line is broken; how can I make sure I
>> provide enough detail? And if I use 'make test.sander', etc.,
>> instead of 'make test.parallel', then I get:
>> ----------------------------------------------------------------------
>> ---------------------------------
>>  cd dmp; ./Run.dmp
>> This test not set up for parallel
>>  cannot run in parallel with #residues < #pes
>>         cd adenine; ./Run.adenine
>> This test not set up for parallel
>>  cannot run in parallel with #residues < #pes
>> ==============================================================
>>         cd cytosine; ./Run.cytosine
>> mpirun: Command not found.
>>   ./Run.cytosine:  Program error
>> make: 1254-004 The error code from the last command is 1.
>> ----------------------------------------------------------------------
>> ---------------------------------
>>  
>> This is the first time I have compiled amber myself, and since it is
>> an IBM-SP4 machine, I just followed Carlos Sosa's "Running Amber on
>> IBM systems"
>> (http://www.msi.umn.edu/~cpsosa/ChemApps/MolMech/amber/patches/amber8/INSTALL_ibm)
>> on the AMBER website. I would really appreciate it if anyone could
>> give me some more details on how to do it.
>>  
>> Best regards,
>> Rachel
>>
>>
>>  
>> On 11/21/06, Scott Brozell <sbrozell.scripps.edu> wrote:
>>> Hi,
>>>
>>> On Tue, 21 Nov 2006, Rachel  wrote:
>>>
>>> > Dear all,
>>> >
>>> > I am trying to compile amber8 on the IBM-SP4 machines, I used
>>> > './configure -mpi xlf90_aix', after i used 'make paralle', i got
>>> > the following error message:
>>>
>>> > "egb.f", line 131.12: 1506-296 (S) #include file "mpif.h" not found.
>>> > make: 1254-004 The error code from the last command is 1.
>>>
>>> It looks like the compilation line is broken.
>>> You didn't provide enough detail, so the full command
>>> invoked by make is not clear.
>>>
>>> make should be invoking something like, eg
>>>
>>>   /lib/cpp -traditional -I/usr/lpp/ppe.poe/include -P -I/usr/local/srb/9/src/include -DMPI  -DNMLEQ -DCLINK_PLAIN -Drs6000 -DPOE   sander.f > _sander.f
>>>   mpxlf90_r -bmaxdata:0x80000000 -c -qfixed -c   -qfree  -o sander.o _sander.f
>>>
>>> Compare this with the output of make -n
>>> What happens if you enter mpxlf90_r -show -help ?
>>>
>>> > and then if i go to the $AMBERHOME/test directory and use 'make
>>> > test.parallel', it says:
>>>
>>> See page 7 errata
>>> http://amber.scripps.edu/doc8/errata.html
>>>
>>> Scott
>>>
>>> ---------------------------------------------------------------------
>>> --
>>> The AMBER Mail Reflector
>>> To post, send mail to amber.scripps.edu
>>> To unsubscribe, send "unsubscribe amber" to majordomo.scripps.edu

On Nov 24, 2006, at 9:58 AM, Rachel wrote:

> Dear all,
>  
> When I run 'make test.sander' in $AMBERHOME/test for the amber8
> parallel parts on IBM-SP4 machines, I get the following error
> messages. I searched the mailing list and found that they
> should be related to "mpi installation and compilation". The system I
> am using is a standard IBM AIX operating system. Can anyone suggest
> what the reason for the errors is and how I can correctly test the
> parallel part? Thanks a lot.
>  
> ######################################################################
>         cd dmp; ./Run.dmp
> This test not set up for parallel
>  cannot run in parallel with #residues < #pes
>         cd adenine; ./Run.adenine
> This test not set up for parallel
>  cannot run in parallel with #residues < #pes
> ==============================================================
>         cd cytosine; ./Run.cytosine
> mpirun: Command not found.
>   ./Run.cytosine:  Program error
> make: 1254-004 The error code from the last command is 1.
>
>
> Stop.
> bash-2.05a$ make test.sander
>         cd dmp; ./Run.dmp
> This test not set up for parallel
>  cannot run in parallel with #residues < #pes
>         cd adenine; ./Run.adenine
> This test not set up for parallel
>  cannot run in parallel with #residues < #pes
> ==============================================================
>         cd cytosine; ./Run.cytosine
> mpirun: Command not found.
>   ./Run.cytosine:  Program error
> make: 1254-004 The error code from the last command is 1.
>
> Stop.
> ######################################################################
>  
> With my best regards,
> Rachel
>  
-----------------------------------------------------------------------
The AMBER Mail Reflector
To post, send mail to amber.scripps.edu
To unsubscribe, send "unsubscribe amber" to majordomo.scripps.edu
Received on Sun Nov 26 2006 - 06:07:27 PST