[AMBER] Amber with MPI (GNU and Intel) on Mac SL

From: Alan <alanwilter.gmail.com>
Date: Mon, 3 May 2010 18:56:13 +0200

Hi there,

First I must say that I am very, very happy with Amber11. It is much easier
to install on a Mac (at least with the GNU compilers).

That said, I must report some results I have so far.

1) AT 1.4 and Amber11 with GNU and OMPI
All went fine, no failures. The only detail here is that I use Fink, and the
openmpi package from Fink (v. 1.3.3) has an mpicc (/sw/bin/mpicc) that calls
Apple's system gcc (/usr/bin/gcc) instead of Fink's gcc-4 (/sw/bin/gcc-4),
which is WRONG in my opinion (and I believe MacPorts did the right thing
here); this will lead to several issues if you don't know what you are doing.
I have contacted the Fink developers, and there is an openmpi 1.4.1 package
in testing that addresses this issue.
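
To check which compiler a wrapper actually invokes, you can use Open MPI's
standard --showme options, for example:

mpicc --showme:command     # underlying C compiler; here it reports Apple's gcc
mpif90 --showme:command    # should report Fink's gfortran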

That said, if you want to compile the Amber package with the current openmpi
from Fink, then once you have run ./configure -mpi gnu you have to edit
config.h and replace mpicc with:

gcc-4 -D_REENTRANT -I/sw/include -I/sw/include/openmpi -L/sw/lib/openmpi
-lmpi -lopen-rte -lopen-pal -lutil

(mpif90 calls Fink's gfortran correctly, and it could hardly be otherwise,
since Apple does not ship OS X with a gfortran.)
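
An alternative that may avoid editing config.h at all (untested here, and
assuming Fink's Open MPI honours the usual OMPI_CC wrapper environment
variable) is to point mpicc at Fink's compiler before configuring:

export OMPI_CC=/sw/bin/gcc-4   # untested: make Fink's mpicc call gcc-4 instead of /usr/bin/gcc
mpicc --showme:command         # should now report gcc-4

Whether that is safe depends on which compiler Fink's Open MPI was really
built with, so the explicit config.h edit above is the conservative route.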

2) AT 1.4 and Amber11 with Intel Compilers (v. 11.1.088)

2.1) With Amber's OMPI 1.4.1 script:
Apart from some issues with the tests in serial, the parallel versions of
ptraj.MPI, nab, sander.MPI, sander.LES.MPI and pmemd.MPI all failed to run,
with something like:

otool -L ../../bin/sander.MPI
../../bin/sander.MPI:
 /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 125.0.1)
 libsvml.dylib (compatibility version 0.0.0, current version 0.0.0)
 /Users/alan/Programmes/amber11/AmberTools/lib/libmpi_f77.0.dylib (compatibility version 1.0.0, current version 1.0.0)
 /Users/alan/Programmes/amber11/AmberTools/lib/libmpi.0.dylib (compatibility version 1.0.0, current version 1.1.0)
 /Users/alan/Programmes/amber11/AmberTools/lib/libopen-rte.0.dylib (compatibility version 1.0.0, current version 1.0.0)
 /Users/alan/Programmes/amber11/AmberTools/lib/libopen-pal.0.dylib (compatibility version 1.0.0, current version 1.0.0)
 /usr/lib/libutil.dylib (compatibility version 1.0.0, current version 1.0.0)
 /usr/lib/libgcc_s.1.dylib (compatibility version 1.0.0, current version 438.0.0)
 /usr/lib/libmx.A.dylib (compatibility version 1.0.0, current version 315.0.0)
 libirc.dylib (compatibility version 1.0.0, current version 1.11.0)
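
Note that the Intel runtime libraries (libsvml.dylib, libirc.dylib) appear
with no absolute path, so they have to be found at run time through the
dynamic linker's search path (e.g. via DYLD_LIBRARY_PATH). I have not dug
into this, but a quick way to see which copies actually get loaded, assuming
the usual dyld behaviour on 10.6, is something like:

env DYLD_PRINT_LIBRARIES=1 ../../bin/sander.MPI 2>&1 | grep dylib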

amber11/test/4096wat% $AMBERHOME/bin/mpirun -np 2 ../../bin/sander.MPI -O -i
gbin -c eq1.x -o mdout.pure_wat
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source

libmpi.0.dylib 0000000105E7C957 Unknown Unknown Unknown
libmpi_f77.0.dyli 0000000105D70046 Unknown Unknown Unknown
sander.MPI 00000001001183AD Unknown Unknown Unknown
sander.MPI 000000010036C434 Unknown Unknown Unknown
sander.MPI 00000001000EC230 Unknown Unknown Unknown
sander.MPI 00000001000AC25E Unknown Unknown Unknown
sander.MPI 00000001000A2339 Unknown Unknown Unknown
sander.MPI 000000010000192C Unknown Unknown Unknown
sander.MPI 00000001000018C4 Unknown Unknown Unknown
--------------------------------------------------------------------------
mpirun has exited due to process rank 1 with PID 42746 on
node amadeus.local exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------

test/4096wat% $AMBERHOME/bin/mpirun -np 2 ../../bin/pmemd.MPI -O -i gbin -c
eq1.x -o mdout.pure_wat
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source

libmpi.0.dylib 0000000100702957 Unknown Unknown Unknown
libmpi_f77.0.dyli 00000001005F6046 Unknown Unknown Unknown
pmemd.MPI 0000000100050111 Unknown Unknown Unknown
pmemd.MPI 00000001000021E8 Unknown Unknown Unknown
pmemd.MPI 000000010000168C Unknown Unknown Unknown
pmemd.MPI 0000000100001624 Unknown Unknown Unknown
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source

libmpi.0.dylib 0000000100702957 Unknown Unknown Unknown
libmpi_f77.0.dyli 00000001005F6046 Unknown Unknown Unknown
pmemd.MPI 0000000100050111 Unknown Unknown Unknown
pmemd.MPI 00000001000021E8 Unknown Unknown Unknown
pmemd.MPI 000000010000168C Unknown Unknown Unknown
pmemd.MPI 0000000100001624 Unknown Unknown Unknown
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 42783 on
node amadeus.local exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------

For Run.lmod:
.../amber11/AmberTools/test/nab% ../../bin/nab -o tlmod tlmod.nab

otool -L tlmod
tlmod:
 /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 125.0.1)
 libifport.dylib (compatibility version 1.0.0, current version 1.11.0)
 libifcore.dylib (compatibility version 1.0.0, current version 1.11.0)
 libsvml.dylib (compatibility version 0.0.0, current version 0.0.0)
 /Users/alan/Programmes/amber11/AmberTools/lib/libmpi.0.dylib (compatibility version 1.0.0, current version 1.1.0)
 /Users/alan/Programmes/amber11/AmberTools/lib/libopen-rte.0.dylib (compatibility version 1.0.0, current version 1.0.0)
 /Users/alan/Programmes/amber11/AmberTools/lib/libopen-pal.0.dylib (compatibility version 1.0.0, current version 1.0.0)
 /usr/lib/libutil.dylib (compatibility version 1.0.0, current version 1.0.0)
 /usr/lib/libgcc_s.1.dylib (compatibility version 1.0.0, current version 438.0.0)
 /usr/lib/libmx.A.dylib (compatibility version 1.0.0, current version 315.0.0)
 libirc.dylib (compatibility version 1.0.0, current version 1.11.0)

$AMBERHOME/bin/mpirun -np 2 ./tlmod
[amadeus:06792] *** Process received signal ***
[amadeus:06792] Signal: Segmentation fault (11)
[amadeus:06792] Signal code: Address not mapped (1)
[amadeus:06792] Failing at address: 0x362f62b68
[amadeus:06792] *** End of error message ***
[amadeus:06793] *** Process received signal ***
[amadeus:06793] Signal: Segmentation fault (11)
[amadeus:06793] Signal code: Address not mapped (1)
[amadeus:06793] Failing at address: 0x362f62b68
[amadeus:06793] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 6792 on node amadeus.local
exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

2.2) Using Fink Openmpi (1.3.3) to wrap icc and ifort:
Using an approach similar to the one I used above to get around the
gcc-4/mpicc problem with Fink, I did:

./configure -mpi intel

Then I edited config.h and replaced mpicc and mpif90 with:

icc -D_REENTRANT -I/sw/include -I/sw/include/openmpi -L/sw/lib/openmpi -lmpi
-lopen-rte -lopen-pal -lutil
and
ifort -I/sw/include -I/sw/lib/openmpi -L/sw/lib/openmpi -lmpi_f90 -lmpi_f77
-lmpi -lopen-rte -lopen-pal -lutil

respectively, and compilation went very well.
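
For reference, the same two config.h edits can be scripted; this is a crude
sketch that assumes mpicc and mpif90 appear in config.h only as the compiler
names (sed keeps a config.h.bak copy):

sed -i.bak \
  -e 's|mpicc|icc -D_REENTRANT -I/sw/include -I/sw/include/openmpi -L/sw/lib/openmpi -lmpi -lopen-rte -lopen-pal -lutil|' \
  -e 's|mpif90|ifort -I/sw/include -I/sw/lib/openmpi -L/sw/lib/openmpi -lmpi_f90 -lmpi_f77 -lmpi -lopen-rte -lopen-pal -lutil|' \
  config.h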

Running the tests in parallel, I got the issues already reported for the
Intel compilers in serial, but all the MPI versions worked well (yes!) except
*pmemd.MPI*, which gives this error:

/sw/bin/om-mpirun -np 2 ../../bin/pmemd.MPI -O -i gbin -c eq1.x -o
mdout.pure_wat
[amadeus.local:42651] *** An error occurred in MPI_Waitany
[amadeus.local:42651] *** on communicator MPI_COMM_WORLD
[amadeus.local:42651] *** MPI_ERR_TRUNCATE: message truncated
[amadeus.local:42651] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
--------------------------------------------------------------------------
om-mpirun has exited due to process rank 0 with PID 42651 on
node amadeus.local exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by om-mpirun (as reported here).
--------------------------------------------------------------------------

To repeat: with sander.MPI, nab (MPI), etc., all tests passed or at least
ran.

I am wondering whether anyone else has got Amber working with Intel and
OpenMPI.

Cheers,

Alan


-- 
Alan Wilter S. da Silva, D.Sc. - CCPN Research Associate
Department of Biochemistry, University of Cambridge.
80 Tennis Court Road, Cambridge CB2 1GA, UK.
>>http://www.bio.cam.ac.uk/~awd28<<
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber