Re: [AMBER] MMPBSA jobs on a cluster CalcError

From: Jason Swails <jason.swails.gmail.com>
Date: Wed, 6 Mar 2013 15:25:39 -0500

Look at the _MMPBSA_complex_gb.mdout.0 file -- the error should be in there.
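If nothing jumps out on a quick read, grepping the retained intermediate
outputs usually surfaces the failure quickly. A rough sketch (the filename
comes from your run; the search patterns are just guesses at common sander
failure strings):

    # list the intermediate sander outputs that contain a failure message
    grep -il 'error\|nan\|failed' _MMPBSA_complex_gb.mdout.*

    # then inspect the end of any file that matches
    tail -n 40 _MMPBSA_complex_gb.mdout.0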

On Wed, Mar 6, 2013 at 3:05 PM, Vivek Shankar Bharadwaj <
vbharadw.mymail.mines.edu> wrote:

> Hi Amber Users,
>
> I am trying to run MMPBSA jobs (with decomposition) on our cluster
> supercomputer with Amber12.
>
> The job seems to run for about three hours without error, then suddenly
> terminates with a CalcError (shown below).
>
> I find that I am able to run this same job on my workstation without any
> issues. I have attached the mmpbsa.in input file and the job script used
> for this run.
>
> I also recompiled AmberTools12 and Amber12 with all bug fixes applied.
>
> Has anyone come across this error?
>
> Thanks
>
> ERROR:
> --------------------------------------------------------------------------
> An MPI process has executed an operation involving a call to the
> "fork()" system call to create a child process. Open MPI is currently
> operating in a condition that could result in memory corruption or
> other system errors; your MPI job may hang, crash, or produce silent
> data corruption. The use of fork() (or system() or other calls that
> create child processes) is strongly discouraged.
>
> The process that invoked fork was:
>
> Local host: n102 (PID 3664)
> MPI_COMM_WORLD rank: 0
>
> If you are *absolutely sure* that your application will successfully
> and correctly survive a call to fork(), you may disable this warning
> by setting the mpi_warn_on_fork MCA parameter to 0.
> --------------------------------------------------------------------------
> [n102:03654] 1 more process has sent help message help-mpi-runtime.txt /
> mpi_init:warn-fork
> [n102:03654] Set MCA parameter "orte_base_help_aggregate" to 0 to see all
> help / error messages
> [n102:03654] 30 more processes have sent help message help-mpi-runtime.txt
> / mpi_init:warn-fork
> CalcError: /panfs/storage/sets/maupin/common/amber12/bin/sander failed with
> prmtop complex.prmtop!
>
>
> Error occured on rank 21.
> Exiting. All files have been retained.
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 21 in communicator MPI_COMM_WORLD
> with errorcode 1.
>
> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> You may or may not see output from other processes, depending on
> exactly when Open MPI kills them.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpiexec has exited due to process rank 21 with PID 3684 on
> node n88 exiting without calling "finalize". This may
> have caused other processes in the application to be
> terminated by signals sent by mpiexec (as reported here).
> --------------------------------------------------------------------------
>
> Vivek S. Bharadwaj
> Graduate Student
> Department of Chemical and Biological Engg.
> Colorado School of Mines
> Golden Colorado
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
>
>
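As an aside, the fork() warning above is expected with MMPBSA.py.MPI: the
Python script launches sander as an external child process, which trips
Open MPI's fork() detection even when nothing is actually wrong. If you want
to silence the warning itself, the MCA parameter named in the message can be
set at launch time, e.g. (a sketch only; the process count and trajectory
name are placeholders, not your actual script):

    mpirun --mca mpi_warn_on_fork 0 -np 32 MMPBSA.py.MPI -O \
        -i mmpbsa.in -cp complex.prmtop -y traj.mdcrd

Note that this only suppresses the warning; the CalcError comes from sander
itself, and the real cause will be in the mdout file mentioned above.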


-- 
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
352-392-4032
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Mar 06 2013 - 12:30:03 PST