Thank you for your reply.
The mdout file contains the following header:
-------------------------------------------------------
Amber 24 PMEMD 2024
-------------------------------------------------------
The pmemd.cuda executable works correctly *on the DGX cluster.*
Additionally, the constant pH replica-exchange simulations run successfully
using pmemd.cuda.MPI *on the DGX cluster.*
However, the *REAF calculation does not work on the DGX cluster*, although
the same REAF calculation runs successfully on another cluster.
On Sat, Dec 13, 2025 at 9:53 PM David A Case <dacase1.gmail.com> wrote:
> On Fri, Dec 12, 2025, Dulal Mondal via AMBER wrote:
>
> >I submitted a REAF job using pmemd.cuda.MPI, but it fails with this error:
> > Primary job terminated normally, but 1 process returned
> >a non-zero exit code. Per user-direction, the job has been aborted.
> >--------------------------------------------------------------------------
> >--------------------------------------------------------------------------
> >mpirun detected that one or more processes exited with non-zero status,
> >thus causing the job to be terminated. The first process to do so was:
> >
> > Process name: [[41136,1],2]
> > Exit code: 255
> >--------------------------------------------------------------------------
> >and
> >*cudaMemcpyToSymbol: SetSim copy to cSim failed invalid device symbol*
>
> This message, and the MPI one, just indicate that some error occurred, but
> offer no real clues as to why.
>
> Is there anything in the mdout file that looks suspicious? Does the code
> work with the non-MPI version of pmemd.cuda? Is the fact that REAF is being
> used relevant? (That is, do non-REAF jobs work OK?) Does that system work
> OK with the CPU version of pmemd?
>
> I think you will have to do some trial and error debugging to try to
> localize the source of the problem.
>
> >
> >However, the Amber 24 installation using CUDA 11.7 and OpenMPI 4.1.2
> >completed successfully.
>
> Does this imply that you ran the test suite (e.g. 'make test.cuda.serial')
> successfully?
>
> ...good luck...dac
>
--
*With regards,*
*Dulal Mondal,*
*Research Scholar,*
*Department of Chemistry,*
*IIT Kharagpur, Kharagpur 721302.*
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Sat Dec 13 2025 - 23:00:02 PST