[AMBER] cudaMemcpy GpuBuffer ERROR

From: Gerardo Zerbetto De Palma <g.zerbetto.gmail.com>
Date: Fri, 16 Jul 2021 13:01:56 -0300

Hi everyone.
We were trying to run simulations of a membrane protein on an NVIDIA
TITAN V and got stuck on cudaMemcpy errors that came in several flavors:

cudaMemcpy GpuBuffer::Upload failed unspecified launch failure
cudaMemcpy GpuBuffer::Download failed unspecified launch failure
cudaMemcpy GpuBuffer::Download failed an illegal instruction was encountered
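
To compare how often each flavor shows up across runs or GPUs, messages like the three above can be tallied with a short script. This is only a minimal sketch and not part of Amber: the name `tally_cuda_errors` is hypothetical, and the pattern assumes every message has the form `cudaMemcpy GpuBuffer::<Operation> failed <reason>`, as in the lines quoted above.

```python
import re
from collections import Counter

# Pattern inferred from the three messages quoted above (an assumption,
# not a documented Amber log format).
ERROR_RE = re.compile(r"cudaMemcpy GpuBuffer::(\w+) failed (.+)")


def tally_cuda_errors(lines):
    """Count (operation, reason) pairs found in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            operation = match.group(1)
            reason = match.group(2).strip()
            counts[(operation, reason)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "cudaMemcpy GpuBuffer::Upload failed unspecified launch failure",
        "cudaMemcpy GpuBuffer::Download failed unspecified launch failure",
        "cudaMemcpy GpuBuffer::Download failed an illegal instruction was encountered",
    ]
    for (operation, reason), n in tally_cuda_errors(sample).items():
        print(f"{n}x {operation}: {reason}")
```

Feeding it the lines of each job's stdout/stderr would give a per-run count of each error flavor.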

We first ran the simulation with Amber 18, restarting it every
5 nanoseconds to get consecutive 5 ns trajectories. After 25
nanoseconds, the program stopped at a random point. We then repeated the
segment that had failed (using the same random seed and initial
coordinates) and it succeeded, but the same error came up in a
subsequent segment. The errors kept appearing at random timesteps when
we restarted the simulations. Energies in the output seemed to be OK, and
simulations sometimes proceeded without errors when restarted. Hoping that
this was a bug in Amber 18, we compiled Amber 20 and ran the same
simulations, but got the same random cudaMemcpy errors. To check that the
simulated system itself is fine, we are also running it on an RTX 2080
with Amber 18, so far without problems.
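
The restart procedure described above (consecutive segments, repeating a segment from the same inputs when it fails) can be sketched as a small driver loop. This is an illustration under assumptions, not our actual scripts: `run_in_segments`, `run_segment`, and `max_retries` are hypothetical names, and `run_segment` stands in for whatever launches one 5 ns segment and reports success or failure.

```python
def run_in_segments(n_segments, run_segment, max_retries=2):
    """Run consecutive segments, retrying a failed segment from the same inputs.

    run_segment(index) is a hypothetical callable that launches one segment
    and returns True on success or False on failure (for example, when a
    cudaMemcpy error is reported). Returns a list with the number of
    attempts each segment needed.
    """
    attempts_per_segment = []
    for index in range(n_segments):
        # One initial attempt plus up to max_retries repeats.
        for attempt in range(1, max_retries + 2):
            if run_segment(index):
                attempts_per_segment.append(attempt)
                break
        else:
            # Every attempt failed; stop instead of continuing from bad state.
            raise RuntimeError(
                f"segment {index} failed {max_retries + 1} times"
            )
    return attempts_per_segment
```

Recording the attempts per segment makes it easy to see whether failures cluster on particular segments or are spread out at random, as they appear to be here.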

We are running out of ideas, so we are reaching out to the
community for help. We would appreciate any idea or
question that could help us solve this puzzle.

Gerardo Zerbetto De Palma

AMBER mailing list
Received on Fri Jul 16 2021 - 09:30:02 PDT