Re: [AMBER] Building parallel Amber11 on CRAY XD1

From: Ross Walker <ross.rosswalker.co.uk>
Date: Tue, 25 Oct 2011 11:57:00 -0700

Hi Dean,

This looks like something is badly wrong with the way MPI jobs are being
launched on your machine. I would start by checking that you can run a
simple MPI test program, such as a ping-pong or bandwidth test; these are
usually shipped with the MPI implementation. Get those working before
attempting to run pmemd.MPI or sander.MPI. From your output it looks as
if mpirun is simply starting 16 independent copies of pmemd.MPI outside
of any MPI environment, so each copy believes it is the only process. My
first guess would be that the MPI used to compile pmemd.MPI is not the
same MPI that your mpirun command belongs to, so I would check that. It
is also possible that your environment variables, in particular your
PATH, are not being inherited properly inside the qsub job.
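
As a quick way of checking all of this inside the batch environment you
could submit a cut-down job script that only probes the MPI setup before
touching pmemd at all. Something along these lines (untested on your
machine, so treat the file name mpitest.sh as illustrative; $NSLOTS and
$TMPDIR/machines are taken from your own script, and the mpirun options
may need adjusting for whichever MPI the XD1 provides):

#!/bin/bash
# Submit the same way as before:
#   qsub -cwd -pe am.mpi 16 -l pn=compute mpitest.sh
echo "PATH inside the job: $PATH"
which mpirun                                  # is this the mpirun you expect?
ldd /var/amber11/bin/pmemd.MPI | grep -i mpi  # MPI libraries pmemd.MPI links against (if built shared)
env | grep -i mpi                             # MPI-related variables the job actually sees
# If the next line prints 16 host names, basic process launching works;
# if it fails or prints nothing, fix the MPI setup before trying pmemd.MPI.
mpirun -np $NSLOTS -hostfile $TMPDIR/machines hostname

If processes launched this way still behave as if each one is on its own
(exactly the "must be used with 2 or more processors" message you are
getting from pmemd.MPI), that is the classic signature of a binary built
with one MPI being launched by a different MPI's mpirun.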

With regard to the serial pmemd error: PMEMD v11 and earlier does NOT
support vacuum simulations; only PME and GB simulations are supported,
which is presumably why your input falls through to the PME checks and
produces the nfft and box-dimension errors. The code should exit with a
clearer message in this case, though; I'll need to check where this is
tested for.
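
If you need pmemd for performance, one workaround (assuming an implicit
solvent description is acceptable for what you are studying) is to switch
the run from true vacuum to a GB model, which pmemd 11 does support. For
a genuine in-vacuo run with igb = 0, stick with sander, which you already
have working in serial. A rough sketch of your own input with only the
solvent model changed (igb = 1 is just one of the GB options, so check
the manual and the radii in your prmtop before using it in earnest):

GB simulation at 300 K, weak coupling
 &cntrl
  imin = 0, ntb = 0, ig = -1,
  igb = 1, ntpr = 10, ntwx = 10,
  ntt = 1, tautp = 0.5, gamma_ln = 0,
  tempi = 300.0, temp0 = 300.0,
  nstlim = 1000000, dt = 0.0001,
  cut = 999,
 /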

All the best
Ross

> -----Original Message-----
> From: Dean Cuebas [mailto:deancuebas.missouristate.edu]
> Sent: Tuesday, October 25, 2011 11:31 AM
> To: AMBER Mailing List
> Subject: [AMBER] Building parallel Amber11 on CRAY XD1
>
> Hi amber people,
>
> My IT guy has serial sander installed OK, and he says he just installed
> parallel Amber11.
>
> Serial sander runs fine at the command line with the input files shown
> below.
>
> My input script for pmemd.MPI is:
> _______________________________
> #!/bin/bash
> # qsub -cwd -pe am.mpi 16 -l pn=compute test.sh
> mpirun -np $NSLOTS -hostfile $TMPDIR/machines /var/amber11/bin/pmemd.MPI -O \
>   -i test.in \
>   -o test1.out -p ben57mp2.top -c test.crd \
>   -r test1.rst -x test1.mdcrd
> ___________________________________
>
>
> I submit the job on the command line as follows:
>
> > qsub -cwd -pe am.mpi 16 -l pn=compute test.sh
>
> The command line says the job was submitted.
>
> Standard error and output files are created, but the job does not run:
> _________________________________________________
> -catch_rsh
> /opt/gridengine/default/spool/hal9000-274-
> 3/active_jobs/11.1/pe_hostfile
> hal9000-274-3
> hal9000-274-3
> hal9000-274-3
> hal9000-274-3
> hal9000-274-4
> hal9000-274-4
> hal9000-274-4
> hal9000-274-4
> hal9000-274-5
> hal9000-274-5
> hal9000-274-5
> hal9000-274-5
> hal9000-274-6
> hal9000-274-6
> hal9000-274-6
> hal9000-274-6
> _______________________________________________
>
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
> ___________________________________________________________
>
> /opt/gridengine/bin/lx26-amd64/qrsh -inherit hal9000-274-3 cd
> /home/dcuebas/Documents/Amber; /usr/bin/env MPIRUN_HOST=hal9000-274-3
> MPIRUN_PORT=39164 MPIRUN_RANK=2 MPIRUN_NPROCS=16 MPIRUN_ID=6107
> RAIDEV_DEVICE=/dev/rai_hbx0 /var/amber11/bin/pmemd.MPI "-O" "-i"
> "test.in"
> "-o" "test1.out" "-p" "ben57mp2.top" "-c" "test.crd" "-r" "test1.rst"
> "-x"
> "test1.mdcrd"
> /opt/gridengine/bin/lx26-amd64/qrsh -inherit hal9000-274-4 cd
> /home/dcuebas/Documents/Amber; /usr/bin/env MPIRUN_HOST=hal9000-274-3
> MPIRUN_PORT=39164 MPIRUN_RANK=5 MPIRUN_NPROCS=16 MPIRUN_ID=6107
> RAIDEV_DEVICE=/dev/rai_hbx0 /var/amber11/bin/pmemd.MPI "-O" "-i"
> "test.in"
> "-o" "test1.out" "-p" "ben57mp2.top" "-c" "test.crd" "-r" "test1.rst"
> "-x"
> "test1.mdcrd"
> /opt/gridengine/bin/lx26-amd64/qrsh -inherit hal9000-274-6 cd
> /home/dcuebas/Documents/Amber; /usr/bin/env MPIRUN_HOST=hal9000-274-3
> MPIRUN_PORT=39164 MPIRUN_RANK=13 MPIRUN_NPROCS=16 MPIRUN_ID=6107
> RAIDEV_DEVICE=/dev/rai_hbx0 /var/amber11/bin/pmemd.MPI "-O" "-i"
> "test.in"
> "-o" "test1.out" "-p" "ben57mp2.top" "-c" "test.crd" "-r" "test1.rst"
> "-x"
> "test1.mdcrd"
> /opt/gridengine/bin/lx26-amd64/qrsh -inherit hal9000-274-5 cd
> /home/dcuebas/Documents/Amber; /usr/bin/env MPIRUN_HOST=hal9000-274-3
> MPIRUN_PORT=39164 MPIRUN_RANK=10 MPIRUN_NPROCS=16 MPIRUN_ID=6107
> RAIDEV_DEVICE=/dev/rai_hbx0 /var/amber11/bin/pmemd.MPI "-O" "-i"
> "test.in"
> "-o" "test1.out" "-p" "ben57mp2.top" "-c" "test.crd" "-r" "test1.rst"
> "-x"
> "test1.mdcrd"
> /opt/gridengine/bin/lx26-amd64/qrsh -inherit hal9000-274-4 cd
> /home/dcuebas/Documents/Amber; /usr/bin/env MPIRUN_HOST=hal9000-274-3
> MPIRUN_PORT=39164 MPIRUN_RANK=6 MPIRUN_NPROCS=16 MPIRUN_ID=6107
> RAIDEV_DEVICE=/dev/rai_hbx0 /var/amber11/bin/pmemd.MPI "-O" "-i"
> "test.in"
> "-o" "test1.out" "-p" "ben57mp2.top" "-c" "test.crd" "-r" "test1.rst"
> "-x"
> "test1.mdcrd"
> /opt/gridengine/bin/lx26-amd64/qrsh -inherit hal9000-274-6 cd
> /home/dcuebas/Documents/Amber; /usr/bin/env MPIRUN_HOST=hal9000-274-3
> MPIRUN_PORT=39164 MPIRUN_RANK=15 MPIRUN_NPROCS=16 MPIRUN_ID=6107
> RAIDEV_DEVICE=/dev/rai_hbx0 /var/amber11/bin/pmemd.MPI "-O" "-i"
> "test.in"
> "-o" "test1.out" "-p" "ben57mp2.top" "-c" "test.crd" "-r" "test1.rst"
> "-x"
> "test1.mdcrd"
> /opt/gridengine/bin/lx26-amd64/qrsh -inherit hal9000-274-4 cd
> /home/dcuebas/Documents/Amber; /usr/bin/env MPIRUN_HOST=hal9000-274-3
> MPIRUN_PORT=39164 MPIRUN_RANK=4 MPIRUN_NPROCS=16 MPIRUN_ID=6107
> RAIDEV_DEVICE=/dev/rai_hbx0 /var/amber11/bin/pmemd.MPI "-O" "-i"
> "test.in"
> "-o" "test1.out" "-p" "ben57mp2.top" "-c" "test.crd" "-r" "test1.rst"
> "-x"
> "test1.mdcrd"
> /opt/gridengine/bin/lx26-amd64/qrsh -inherit hal9000-274-4 cd
> /home/dcuebas/Documents/Amber; /usr/bin/env MPIRUN_HOST=hal9000-274-3
> MPIRUN_PORT=39164 MPIRUN_RANK=7 MPIRUN_NPROCS=16 MPIRUN_ID=6107
> RAIDEV_DEVICE=/dev/rai_hbx0 /var/amber11/bin/pmemd.MPI "-O" "-i"
> "test.in"
> "-o" "test1.out" "-p" "ben57mp2.top" "-c" "test.crd" "-r" "test1.rst"
> "-x"
> "test1.mdcrd"
> /opt/gridengine/bin/lx26-amd64/qrsh -inherit hal9000-274-6 cd
> /home/dcuebas/Documents/Amber; /usr/bin/env MPIRUN_HOST=hal9000-274-3
> MPIRUN_PORT=39164 MPIRUN_RANK=12 MPIRUN_NPROCS=16 MPIRUN_ID=6107
> RAIDEV_DEVICE=/dev/rai_hbx0 /var/amber11/bin/pmemd.MPI "-O" "-i"
> "test.in"
> "-o" "test1.out" "-p" "ben57mp2.top" "-c" "test.crd" "-r" "test1.rst"
> "-x"
> "test1.mdcrd"
> /opt/gridengine/bin/lx26-amd64/qrsh -inherit hal9000-274-3 cd
> /home/dcuebas/Documents/Amber; /usr/bin/env MPIRUN_HOST=hal9000-274-3
> MPIRUN_PORT=39164 MPIRUN_RANK=1 MPIRUN_NPROCS=16 MPIRUN_ID=6107
> RAIDEV_DEVICE=/dev/rai_hbx0 /var/amber11/bin/pmemd.MPI "-O" "-i"
> "test.in"
> "-o" "test1.out" "-p" "ben57mp2.top" "-c" "test.crd" "-r" "test1.rst"
> "-x"
> "test1.mdcrd"
> /opt/gridengine/bin/lx26-amd64/qrsh -inherit hal9000-274-5 cd
> /home/dcuebas/Documents/Amber; /usr/bin/env MPIRUN_HOST=hal9000-274-3
> MPIRUN_PORT=39164 MPIRUN_RANK=8 MPIRUN_NPROCS=16 MPIRUN_ID=6107
> RAIDEV_DEVICE=/dev/rai_hbx0 /var/amber11/bin/pmemd.MPI "-O" "-i"
> "test.in"
> "-o" "test1.out" "-p" "ben57mp2.top" "-c" "test.crd" "-r" "test1.rst"
> "-x"
> "test1.mdcrd"
> /opt/gridengine/bin/lx26-amd64/qrsh -inherit hal9000-274-6 cd
> /home/dcuebas/Documents/Amber; /usr/bin/env MPIRUN_HOST=hal9000-274-3
> MPIRUN_PORT=39164 MPIRUN_RANK=14 MPIRUN_NPROCS=16 MPIRUN_ID=6107
> RAIDEV_DEVICE=/dev/rai_hbx0 /var/amber11/bin/pmemd.MPI "-O" "-i"
> "test.in"
> "-o" "test1.out" "-p" "ben57mp2.top" "-c" "test.crd" "-r" "test1.rst"
> "-x"
> "test1.mdcrd"
> /opt/gridengine/bin/lx26-amd64/qrsh -inherit hal9000-274-3 cd
> /home/dcuebas/Documents/Amber; /usr/bin/env MPIRUN_HOST=hal9000-274-3
> MPIRUN_PORT=39164 MPIRUN_RANK=0 MPIRUN_NPROCS=16 MPIRUN_ID=6107
> RAIDEV_DEVICE=/dev/rai_hbx0 /var/amber11/bin/pmemd.MPI "-O" "-i"
> "test.in"
> "-o" "test1.out" "-p" "ben57mp2.top" "-c" "test.crd" "-r" "test1.rst"
> "-x"
> "test1.mdcrd"
> /opt/gridengine/bin/lx26-amd64/qrsh -inherit hal9000-274-5 cd
> /home/dcuebas/Documents/Amber; /usr/bin/env MPIRUN_HOST=hal9000-274-3
> MPIRUN_PORT=39164 MPIRUN_RANK=9 MPIRUN_NPROCS=16 MPIRUN_ID=6107
> RAIDEV_DEVICE=/dev/rai_hbx0 /var/amber11/bin/pmemd.MPI "-O" "-i"
> "test.in"
> "-o" "test1.out" "-p" "ben57mp2.top" "-c" "test.crd" "-r" "test1.rst"
> "-x"
> "test1.mdcrd"
> /opt/gridengine/bin/lx26-amd64/qrsh -inherit hal9000-274-3 cd
> /home/dcuebas/Documents/Amber; /usr/bin/env MPIRUN_HOST=hal9000-274-3
> MPIRUN_PORT=39164 MPIRUN_RANK=3 MPIRUN_NPROCS=16 MPIRUN_ID=6107
> RAIDEV_DEVICE=/dev/rai_hbx0 /var/amber11/bin/pmemd.MPI "-O" "-i"
> "test.in"
> "-o" "test1.out" "-p" "ben57mp2.top" "-c" "test.crd" "-r" "test1.rst"
> "-x"
> "test1.mdcrd"
> /opt/gridengine/bin/lx26-amd64/qrsh -inherit hal9000-274-5 cd
> /home/dcuebas/Documents/Amber; /usr/bin/env MPIRUN_HOST=hal9000-274-3
> MPIRUN_PORT=39164 MPIRUN_RANK=11 MPIRUN_NPROCS=16 MPIRUN_ID=6107
> RAIDEV_DEVICE=/dev/rai_hbx0 /var/amber11/bin/pmemd.MPI "-O" "-i"
> "test.in"
> "-o" "test1.out" "-p" "ben57mp2.top" "-c" "test.crd" "-r" "test1.rst"
> "-x"
> "test1.mdcrd"
> MPI version of PMEMD must be used with 2 or more processors!
> MPI version of PMEMD must be used with 2 or more processors!
> MPI version of PMEMD must be used with 2 or more processors!
> MPI version of PMEMD must be used with 2 or more processors!
> MPI version of PMEMD must be used with 2 or more processors!
> MPI version of PMEMD must be used with 2 or more processors!
> MPI version of PMEMD must be used with 2 or more processors!
> MPI version of PMEMD must be used with 2 or more processors!
> MPI version of PMEMD must be used with 2 or more processors!
> MPI version of PMEMD must be used with 2 or more processors!
> MPI version of PMEMD must be used with 2 or more processors!
> MPI version of PMEMD must be used with 2 or more processors!
> MPI version of PMEMD must be used with 2 or more processors!
> MPI version of PMEMD must be used with 2 or more processors!
> MPI version of PMEMD must be used with 2 or more processors!
> MPI version of PMEMD must be used with 2 or more processors!
> _______________________________________________________________
>
>
>
>
> Interestingly, running serial pmemd (NOT pmemd.MPI) on the command
> line:
>
> > /var/amber11/bin/pmemd -O \
> -i test.in \
> -o test1.out -p ben57mp2.top -c test.crd \
> -r test1.rst -x test1.mdcrd
>
> gives the following .out file:
> -------------------------------------------------------
> Amber 11 SANDER 2010
> -------------------------------------------------------
>
> | PMEMD implementation of SANDER, Release 11
>
> | Run on 10/25/2011 at 07:11:12
>
> [-O]verwriting output
>
> File Assignments:
> | MDIN: test.in
> | MDOUT: test1.out
> | INPCRD: test.crd
> | PARM: ben57mp2.top
> | RESTRT: test1.rst
> | REFC: refc
> | MDVEL: mdvel
> | MDEN: mden
> | MDCRD: test1.mdcrd
> | MDINFO: mdinfo
>
>
>
> Here is the input file:
>
> Vacuum simulation at 300°C, weak coupling
> &cntrl
> imin = 0, ntb = 0, ig=-1,
> igb = 0, ntpr = 10, ntwx = 10,
> ntt = 1, tautp=0.5, gamma_ln = 0,
> tempi = 300.0, temp0 = 300.0
> nstlim = 1000000, dt = 0.0001,
> cut = 999
> /
>
>
>
>
> | ERROR: nfft1 must be in the range of 6 to 512!
> | ERROR: nfft2 must be in the range of 6 to 512!
> | ERROR: nfft3 must be in the range of 6 to 512!
> | ERROR: a must be in the range of 0.10000E+01 to 0.10000E+04!
> | ERROR: b must be in the range of 0.10000E+01 to 0.10000E+04!
> | ERROR: c must be in the range of 0.10000E+01 to 0.10000E+04!
>
> Input errors occurred. Terminating execution.
> _____________________________________________________________
>
>
> Does anyone have any suggestions? I would greatly appreciate it!!!!!
>
> Thanks a million in advance.
>
> Dean
>
>
>


_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Tue Oct 25 2011 - 12:00:03 PDT