Re: [AMBER] pmemd cuda error

From: Ross Walker <>
Date: Thu, 18 Sep 2014 10:28:24 -0700

Hi Kshatresh,

Unfortunately there is not enough information in this message to
understand what is going on here, although an illegal memory access is
not a common error.

First, why are you using mpirun to run these calculations? Are you
running on 2 GPUs (and does the mdout file say Peer to Peer is enabled?),
or are you running on more than 2 GPUs, or on 2 GPUs that do not support
peer to peer? In the latter two cases you are probably running slower
than if you just used one GPU at a time. Non-peer-to-peer parallel is
also so unlikely to give a performance improvement, due to the slowness
of the CPU chipset (and, even worse, a node-to-node interconnect), that
it is not heavily tested anymore. So this could be your problem,
although I am guessing here...
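
A quick way to answer the Peer to Peer question is to search the mdout
for that line. The following is only a minimal Python sketch, not an
Amber tool. The helper name peer_to_peer_status is made up for this
example, and the pattern assumes the mdout reports the status on a line
of the form "Peer to Peer support: ENABLED"; adjust it if your mdout is
worded differently.

```python
import re

# Assumption: the status appears as "Peer to Peer support: <WORD>".
# The match is case-insensitive and tolerant of extra spaces.
P2P_PATTERN = re.compile(
    r"peer\s+to\s+peer\s+support\s*:\s*(\w+)", re.IGNORECASE
)


def peer_to_peer_status(mdout_text):
    """Return the reported status in upper case, or None if no line matches."""
    match = P2P_PATTERN.search(mdout_text)
    return match.group(1).upper() if match else None
```

Read the mdout into a string and pass it to peer_to_peer_status. A
result of None only means that no matching line was found, which is
what one would expect from a single-GPU run.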

It could also be that one of your GPUs is faulty. What type of GPUs are
they and who built, burnt in and validated this system?

I would suggest validating the GPUs with this:

If all that works then take a more careful look at your simulation itself.
Maybe it blew up, maybe the parameters somewhere are inappropriate, maybe
the structure is bad - e.g. something sticking through a ring system etc.
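
As a crude first check for a simulation that blew up, one can scan the
mdout for NaN values or for fields filled with asterisks, which is what
Fortran formatted output prints when a number no longer fits its field.
This is a rough sketch only: suspicious_lines is a hypothetical helper,
the threshold of four asterisks is an arbitrary choice, and a clean
result does not prove the run is healthy.

```python
import re

# A run of asterisks means a value overflowed its output field;
# NaN means the arithmetic has already broken down.
SUSPICIOUS = re.compile(r"\*{4,}|\bnan\b", re.IGNORECASE)


def suspicious_lines(mdout_text):
    """Return (line_number, line) pairs containing NaN or a run of asterisks."""
    hits = []
    for number, line in enumerate(mdout_text.splitlines(), start=1):
        if SUSPICIOUS.search(line):
            hits.append((number, line.strip()))
    return hits
```

The first reported line number shows roughly where in the run things
started to go wrong, which helps when deciding which restart or
trajectory frame to inspect.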

It will likely take some more debugging to figure out what is going on.

All the best

On 9/18/14, 4:47 AM, "Kshatresh Dutta Dubey" <> wrote:

>Dear Users,
> I am using Amber GPU 14 for my simulations. I was successfully
>several jobs before sometime on same machine but now I am getting error
>like :
>Error: an illegal memory access was encountered launching kernel
>cudaIpcCloseMemHandle failed on gpu->pbPeerAccumulator an illegal memory
>access was encountered
>Primary job terminated normally, but 1 process returned
>a non-zero exit code.. Per user-direction, the job has been aborted.
>mpirun detected that one or more processes exited with non-zero status,
>thus causing
>the job to be terminated. The first process to do so was:
> Process name: [[5660,1],0]
> Exit code: 255
>Please help me to solve the issue.
>With best regards
>Dr. Kshatresh Dutta Dubey
>Post Doctoral Researcher,
>c/o Prof Sason Shaik,
>Hebrew University of Jerusalem, Israel
>Jerusalem, Israel
>AMBER mailing list

Received on Thu Sep 18 2014 - 10:30:04 PDT