[AMBER] MMPBSA jobs on a cluster: CalcError

From: Vivek Shankar Bharadwaj <vbharadw.mymail.mines.edu>
Date: Wed, 6 Mar 2013 13:05:01 -0700

Hi Amber Users,

I am trying to run MMPBSA jobs (with decomposition) on our cluster
supercomputer with Amber12.
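
A typical MMPBSA.py input with per-residue decomposition enabled has roughly
the following shape (a sketch only; the frame range, salt concentration and
print_res selection here are illustrative values, not my actual settings):

    &general
       startframe=1, endframe=100, interval=1,
       verbose=2, keep_files=0,
    /
    &gb
       igb=5, saltcon=0.100,
    /
    &pb
       istrng=0.100,
    /
    &decomp
       idecomp=1, dec_verbose=0,
       print_res="1-120",
    /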

The job seems to run for about 3 hours without error and then suddenly
terminates with a CalcError (shown below).

I have attached the mmpbsa.in input file and the script file used for the
job. I also find that I am able to run this same job on my workstation
without any issues.
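
In case the attachments do not come through on the list, the job script
essentially launches MMPBSA.py.MPI under Open MPI along these lines (a
sketch; apart from complex.prmtop, which appears in the error below, the
file names and core count are placeholders):

    #!/bin/bash
    # launch the parallel MMPBSA run across 32 cores
    mpirun -np 32 $AMBERHOME/bin/MMPBSA.py.MPI -O -i mmpbsa.in \
           -o FINAL_RESULTS_MMPBSA.dat \
           -sp solvated_complex.prmtop -cp complex.prmtop \
           -rp receptor.prmtop -lp ligand.prmtop \
           -y prod.mdcrd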

I have also recompiled AmberTools 12 and Amber 12 with all bug fixes
downloaded and applied.
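
For completeness, the parallel rebuild amounted to the usual sequence
(a sketch, assuming the GNU compilers; the compiler choice is site-specific):

    cd $AMBERHOME
    ./configure -mpi gnu    # configure offers to download and apply bugfix patches
    make install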

Has anyone come across this error?

Thanks

ERROR:
--------------------------------------------------------------------------
An MPI process has executed an operation involving a call to the
"fork()" system call to create a child process.  Open MPI is currently
operating in a condition that could result in memory corruption or
other system errors; your MPI job may hang, crash, or produce silent
data corruption.  The use of fork() (or system() or other calls that
create child processes) is strongly discouraged.
The process that invoked fork was:
  Local host:          n102 (PID 3664)
  MPI_COMM_WORLD rank: 0
If you are *absolutely sure* that your application will successfully
and correctly survive a call to fork(), you may disable this warning
by setting the mpi_warn_on_fork MCA parameter to 0.
--------------------------------------------------------------------------
[n102:03654] 1 more process has sent help message help-mpi-runtime.txt /
mpi_init:warn-fork
[n102:03654] Set MCA parameter "orte_base_help_aggregate" to 0 to see all
help / error messages
[n102:03654] 30 more processes have sent help message help-mpi-runtime.txt
/ mpi_init:warn-fork
CalcError: /panfs/storage/sets/maupin/common/amber12/bin/sander failed with
prmtop complex.prmtop!
Error occured on rank 21.
Exiting. All files have been retained.
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 21 in communicator MPI_COMM_WORLD
with errorcode 1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec has exited due to process rank 21 with PID 3684 on
node n88 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpiexec (as reported here).
--------------------------------------------------------------------------
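
(For what it's worth, the fork() notice above appears to be only a warning -
MMPBSA.py spawns sander through system() calls - and, as the message itself
suggests, it can be silenced, and the help-message aggregation turned off so
that every rank's error is printed, by passing the MCA parameters on the
mpirun line, e.g.:

    mpirun --mca mpi_warn_on_fork 0 \
           --mca orte_base_help_aggregate 0 \
           -np 32 $AMBERHOME/bin/MMPBSA.py.MPI -O -i mmpbsa.in ...

The actual failure is still the CalcError from sander on rank 21; since all
files have been retained, the intermediate output from that rank can be
inspected for the underlying problem.)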
Vivek S. Bharadwaj
Graduate Student
Department of Chemical and Biological Engg.
Colorado School of Mines
Golden, Colorado




_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber

Received on Wed Mar 06 2013 - 12:30:02 PST