You may want to contact the REAF developers directly.
On Sun, Dec 21, 2025, 10:06 AM Dulal Mondal via AMBER <amber.ambermd.org>
wrote:
> Could anyone please respond?
>
> *With regards,*
> *Dulal Mondal,*
> *Research Scholar,*
> *Department of Chemistry,*
> *IIT Kharagpur, Kharagpur 721302.*
>
> On Sun, 14 Dec, 2025, 12:19 pm Dulal Mondal, <
> babunmondal.chem.kgpian.iitkgp.ac.in> wrote:
>
> > Thank you for your reply.
> >
> > The mdout file contains the following header:
> >
> > -------------------------------------------------------
> > Amber 24 PMEMD 2024
> > -------------------------------------------------------
> >
> > The pmemd.cuda executable works correctly *on the DGX cluster.*
> > Additionally, the constant pH replica-exchange simulations run
> > successfully using pmemd.cuda.MPI *on the DGX cluster.*
> >
> > However, the *REAF calculation does not work on the DGX cluster*,
> > although the same REAF calculation runs successfully on another cluster.
> >
> > On Sat, Dec 13, 2025 at 9:53 PM David A Case <dacase1.gmail.com> wrote:
> >
> >> On Fri, Dec 12, 2025, Dulal Mondal via AMBER wrote:
> >>
> >> >I submit a REAF job using pmemd.cuda.MPI. But the error is
> >> > Primary job terminated normally, but 1 process returned
> >> >a non-zero exit code. Per user-direction, the job has been aborted.
> >> >--------------------------------------------------------------------------
> >> >--------------------------------------------------------------------------
> >> >mpirun detected that one or more processes exited with non-zero status,
> >> >thus causing the job to be terminated. The first process to do so was:
> >> >
> >> > Process name: [[41136,1],2]
> >> > Exit code: 255
> >> >--------------------------------------------------------------------------
> >> >and
> >> >*cudaMemcpyToSymbol: SetSim copy to cSim failed invalid device symbol*
> >>
> >> This message, and the MPI one, just indicate that some error occurred, but
> >> offer no real clues as to why.
> >>
> >> Is there anything in the mdout file that looks suspicious? Does the code
> >> work with the non-MPI version of pmemd.cuda? Is the fact that REAF is being
> >> used relevant? (That is, do non-REAF jobs work OK?) Does that system work
> >> OK with the CPU version of pmemd?
> >>
> >> I think you will have to do some trial and error debugging to try to
> >> localize the source of the problem.
> >>
> >> >
> >> >However, the Amber 24 installation using CUDA 11.7 and OpenMPI 4.1.2
> >> >completed successfully.
> >>
> >> Does this imply that you ran the test suite (e.g. 'make test.cuda.serial')
> >> successfully?
> >>
> >> ...good luck...dac
> >>
> >
> >
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Sun Dec 21 2025 - 11:00:03 PST