Re: [AMBER] GpuBuffer::Download failed for kNLBuildNeighborListOrthogonal16_kernel

From: Scott Le Grand <varelse2005.gmail.com>
Date: Fri, 9 Aug 2013 09:46:24 -0700

So, I have the fix. It "works" but it looks like you really need to adjust
the size of your solvent box.

I see coordinates of + or - 60 A or so in a solvent box that's 240 A on a side...

The entire system has roughly 5% of the expected density of a typical
solvent box, so I tried halving your box dimensions in the inpcrd file, and
everything works...
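
As a rough sanity check on that density figure, here is a minimal Python
sketch. It is not part of pmemd or of the fix. It assumes the box is cubic
with 240 A edges, takes the 80,000 atom count from the original report, and
uses about 0.1 atoms per cubic angstrom as the reference for a water-filled
box (liquid water at about 1 g/cm^3 is roughly 0.0334 molecules, so roughly
0.1 atoms, per cubic angstrom). All function and variable names are
hypothetical.

```python
# Rough number-density check for an orthogonal periodic box (illustrative only).
# ASSUMPTION: reference density of a water-filled box, in atoms per cubic angstrom.
TYPICAL_ATOMS_PER_A3 = 0.1


def number_density(n_atoms, box_lengths):
    """Atoms per cubic angstrom for a box with edge lengths in angstroms."""
    a, b, c = box_lengths
    return n_atoms / (a * b * c)


def density_fraction(n_atoms, box_lengths, reference=TYPICAL_ATOMS_PER_A3):
    """Density of the system as a fraction of the reference density."""
    return number_density(n_atoms, box_lengths) / reference


n_atoms = 80_000                          # atom count from the original report
full_box = (240.0, 240.0, 240.0)          # ASSUMPTION: cubic box, 240 A per edge
half_box = tuple(x / 2 for x in full_box)  # every edge halved

print(f"full box: {density_fraction(n_atoms, full_box):.1%} of reference")
print(f"half box: {density_fraction(n_atoms, half_box):.1%} of reference")
```

Under these assumptions the full box comes out at roughly 6% of the
reference density, in line with the "roughly 5%" estimate, and halving every
edge raises the density eightfold because the volume shrinks by a factor of
eight.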

(I also have a test for low density now, but this system then requires too
much memory for the card I currently have installed.)

Scott


On Thu, Aug 8, 2013 at 8:10 PM, Scott Le Grand <varelse2005.gmail.com> wrote:

> Figured this out, fix in a couple days... Or, ironically, just fill all
> that empty space up with atomic goodness and the problem will fix itself
> (for now)...
>
>
>
> On Thu, Aug 8, 2013 at 5:31 PM, Scott Le Grand <varelse2005.gmail.com> wrote:
>
>> Reproduced at my end, guessing it's somehow due to lots of empty space in
>> the simulation box, but that still shouldn't crash....
>>
>>
>>
>> On Thu, Aug 8, 2013 at 7:12 AM, Kyle Sutherland-Cash <khs26.cam.ac.uk> wrote:
>>
>>> Hello Scott,
>>>
>>> I've attached working and failing input (they differ only in the value
>>> of the cutoff). Let me know if there's anything else you need.
>>>
>>> Thanks,
>>>
>>> Kyle
>>>
>>>
>>> On 7 August 2013 18:28, Scott Le Grand <varelse2005.gmail.com> wrote:
>>>
>>>> Send me the input files?
>>>> On Aug 7, 2013 10:14 AM, "Kyle Sutherland-Cash" <khs26.cam.ac.uk>
>>>> wrote:
>>>>
>>>> > Hello,
>>>> >
>>>> > I had an issue that has been mentioned before at:
>>>> > http://archive.ambermd.org/201307/0514.html
>>>> >
>>>> > This occurs when running a minimisation for an 80,000 atom system with
>>>> > cutoffs over a certain size (10.0 and 11.0 work fine, but 12.0 and
>>>> > larger do not). The error is reported as "GpuBuffer::Download failed
>>>> > unspecified launch error" (i.e. probably a CUDA segfault).
>>>> >
>>>> > I ran pmemd.cuda through cuda-memcheck and it highlighted calls to
>>>> > kNLBuildNeighborListOrthogonal16_kernel. All of the errors were of the
>>>> > following form:
>>>> >
>>>> > ========= Invalid __global__ write of size 4
>>>> > ========= at 0x00000e38 in kNLBuildNeighborListOrthogonal16_kernel(void)
>>>> > ========= by thread (527,0,0) in block (0,0,0)
>>>> > ========= Address 0x70eaede3c is out of bounds
>>>> >
>>>> > The threads in question were threads (512-527,0,0) in block (0,0,0) and
>>>> > (192-207,0,0) in block (7,0,0). I tried looking at kBNL.h, but I don't
>>>> > really know CUDA well enough to work out where the indexing might have
>>>> > gone awry.
>>>> >
>>>> > If it helps, I can upload input files as well.
>>>> >
>>>> > The code was built yesterday with all the available bugfixes using CUDA
>>>> > 5.5, ifort and the Intel MKL.
>>>> >
>>>> > Thanks,
>>>> >
>>>> > Kyle
>>>> > _______________________________________________
>>>> > AMBER mailing list
>>>> > AMBER.ambermd.org
>>>> > http://lists.ambermd.org/mailman/listinfo/amber
>>>> >
>>>>
>>>
>>>
>>>
>>> --
>>> Kyle Sutherland-Cash
>>> PhD student, Wales group
>>>
>>> Department of Chemistry
>>> University of Cambridge
>>> Cambridge
>>> United Kingdom
>>> CB2 1DQ
>>>
>>
>>
>
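
For anyone triaging a similar cuda-memcheck report, a small Python sketch
like the one below can group the faulting threads by block. It only parses
text in the form quoted in this thread. It is not part of AMBER or of
cuda-memcheck, and the function names are hypothetical.

```python
import re
from collections import defaultdict

# One error record, copied from the report quoted in this thread.
SAMPLE = """\
========= Invalid __global__ write of size 4
========= at 0x00000e38 in kNLBuildNeighborListOrthogonal16_kernel(void)
========= by thread (527,0,0) in block (0,0,0)
========= Address 0x70eaede3c is out of bounds
"""

# Matches the "by thread (x,y,z) in block (x,y,z)" line of each record.
THREAD_RE = re.compile(
    r"by thread \((\d+),(\d+),(\d+)\) in block \((\d+),(\d+),(\d+)\)"
)


def threads_by_block(log_text):
    """Map each block index to the sorted x-indices of its faulting threads."""
    hits = defaultdict(set)
    for match in THREAD_RE.finditer(log_text):
        tx, _ty, _tz, bx, by, bz = map(int, match.groups())
        hits[(bx, by, bz)].add(tx)
    return {block: sorted(xs) for block, xs in hits.items()}


def contiguous_ranges(values):
    """Collapse a sorted list of ints into inclusive (start, end) ranges."""
    ranges = []
    for v in values:
        if ranges and v == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], v)
        else:
            ranges.append((v, v))
    return ranges


for block, xs in threads_by_block(SAMPLE).items():
    print(block, contiguous_ranges(xs))
```

Run over the full log, this would make patterns such as the two ranges
reported above (512-527 and 192-207, each 16 consecutive threads starting at
a multiple of 16) easy to spot.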
Received on Fri Aug 09 2013 - 10:00:03 PDT