Also try 255 and 256 for ReduceChargeGridBuffer...
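For anyone following along, here is a minimal, self-contained sketch of the
chunked-launch pattern being discussed. The kernel, buffer, and sizes below
are illustrative stand-ins, not the actual pmemd.cuda code:

    #include <algorithm>
    #include <cstdio>

    // Illustrative stand-in for a reduction kernel: "offset" shifts the
    // logical block index so that a launch needing more blocks than the
    // device's grid-x limit can be split across several smaller launches.
    __global__ void ReduceBuffer_kernel(float* buf, long long n, int offset)
    {
        long long pos = (long long)(blockIdx.x + offset) * blockDim.x
                        + threadIdx.x;
        if (pos < n)
            buf[pos] *= 0.5f;  // stand-in for the real per-point work
    }

    int main()
    {
        const long long n = 66150LL * 128;  // deliberately needs > 65,535 blocks
        float* buf;
        cudaMalloc(&buf, n * sizeof(float));

        long long blocks = (n + 127) >> 7;  // 128 threads per block
        int offset = 0;
        while (blocks > 0)
        {
            // 65,535 is the SM 2.0 grid-x limit; Kepler+ raises x to 2^31 - 1
            int lblocks = (int)std::min(blocks, 65535LL);
            ReduceBuffer_kernel<<<lblocks, 128>>>(buf, n, offset);
            offset += lblocks;
            blocks -= lblocks;
        }
        cudaDeviceSynchronize();
        cudaFree(buf);
        return 0;
    }

Stepping offset/blocks by lblocks rather than a fixed 65,535 makes the last
partial chunk explicit, though both forms terminate the loop correctly.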
On Thu, Oct 9, 2014 at 8:23 AM, Scott Le Grand <varelse2005.gmail.com>
wrote:
> What are nfft1, nfft2, and nfft3 for this system?
>
> On Thu, Oct 9, 2014 at 8:20 AM, Pablo Ródenas <pablo.rodenas.bsc.es>
> wrote:
>
>> Dear Scott,
>>
>> you are right, this is a messy change.
>> The only change we could see clearly was from 65536 to 65535.
>> But after making this change in the kPMEFillChargeGridBuffer function, we
>> got the error in kPMEReduceChargeGridBuffer. We saw that the value of the
>> block variable in kPMEReduceChargeGridBuffer is 66150, greater than our
>> card's limit, so we tried to solve it with a quick change to adapt it to
>> our card's maximum values. Please, could you try to fix this function too?
>>
>> FYI, the maximum values of our card are:
>> Maximum number of threads per block: 1024
>> Maximum sizes of each dimension of a block: 1024 x 1024 x 64
>> Maximum sizes of each dimension of a grid: 65535 x 65535 x 65535
>>
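>> For reference, these values come straight from the CUDA runtime; a minimal
>> sketch (assuming device 0) that queries them at run time instead of
>> hardcoding them:
>>
>>     #include <cstdio>
>>     #include <cuda_runtime.h>
>>
>>     int main()
>>     {
>>         cudaDeviceProp prop;
>>         cudaGetDeviceProperties(&prop, 0);  // query device 0
>>         printf("Max threads per block: %d\n", prop.maxThreadsPerBlock);
>>         printf("Max block dims: %d x %d x %d\n", prop.maxThreadsDim[0],
>>                prop.maxThreadsDim[1], prop.maxThreadsDim[2]);
>>         printf("Max grid dims:  %d x %d x %d\n", prop.maxGridSize[0],
>>                prop.maxGridSize[1], prop.maxGridSize[2]);
>>         return 0;
>>     }
>>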
>> Thank you very much for your answers!
>>
>> Best regards,
>> Pablo.
>>
>>
>> On 09/10/14 at #4, Scott Le Grand wrote:
>> > Also do *not* *mess* *with* LOADSIZE. You will wake the CUDAthulhu with
>> > such actions and no one wants that...
>> >
>> > The only change you should make here is changing 65536 to 65535.
>> >
>> > That said, thanks for hitting this corner case! Fix shortly...
>> >
>> > On Thu, Oct 9, 2014 at 8:06 AM, Scott Le Grand <varelse2005.gmail.com>
>> > wrote:
>> >
>> >> Looking closer at this thread, could you try using 65,535 instead of
>> >> 65,536?
>> >>
>> >> http://en.wikipedia.org/wiki/CUDA: specifically, the maximum grid x
>> >> dimension on SM 2.0 GPUs is 65,535...
>> >>
>> >> Missed it by one... Works fine on any Kepler or better class GPU
>> >> because this limit was raised to 2^31 - 1.
>> >>
>> >> Ironically, y and z are still limited to 65,535. I'll check in a fix
>> >> shortly...
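>> >>
>> >> A quick way to see the off-by-one on an SM 2.0 part (a hedged sketch,
>> >> not pmemd code):
>> >>
>> >>     #include <cstdio>
>> >>
>> >>     __global__ void noop_kernel() {}
>> >>
>> >>     int main()
>> >>     {
>> >>         // 65,536 exceeds the SM 2.0 grid-x limit by exactly one;
>> >>         // 65,535 is the largest legal value there.
>> >>         int tries[2] = {65536, 65535};
>> >>         for (int i = 0; i < 2; ++i)
>> >>         {
>> >>             noop_kernel<<<tries[i], 64>>>();
>> >>             cudaError_t err = cudaGetLastError();
>> >>             printf("%d blocks: %s\n", tries[i], cudaGetErrorString(err));
>> >>         }
>> >>         cudaDeviceSynchronize();
>> >>         return 0;
>> >>     }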
>> >>
>> >>
>> >>
>> >> On Thu, Oct 9, 2014 at 7:58 AM, Scott Le Grand <varelse2005.gmail.com>
>> >> wrote:
>> >>
>> >>> Broken.
>> >>>
>> >>> Do not use this.
>> >>>
>> >>> The threadblocks *have* to be 64 for this to work (4 x 4 x 4
>> >>> interpolation)...
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> On Thu, Oct 9, 2014 at 7:22 AM, Pablo Ródenas <pablo.rodenas.bsc.es>
>> >>> wrote:
>> >>>
>> >>>> Dear Jason and everybody,
>> >>>>
>> >>>> finally I found that this Amber bug comes from the file
>> >>>> $AMBERHOME/src/pmemd/src/cuda/kPMEInterpolation.cu.
>> >>>>
>> >>>> The function kPMEFillChargeGridBuffer contains the following hardcoded
>> >>>> value (instead of querying the card for it), which I replaced with the
>> >>>> limits of my card (Tesla M2090):
>> >>>> + (line ~400)
>> >>>>         int lblocks = min(blocks, 65535);
>> >>>>         kPMEFillChargeGridBuffer_kernel<<<lblocks, 64>>>(offset);
>> >>>>         LAUNCHERROR("kPMEFillChargeGridBuffer");
>> >>>>         offset += 65535;
>> >>>>         blocks -= 65535;
>> >>>> -
>> >>>>         int lblocks = min(blocks, 65536);
>> >>>>         kPMEFillChargeGridBuffer_kernel<<<lblocks, 64>>>(offset);
>> >>>>         LAUNCHERROR("kPMEFillChargeGridBuffer");
>> >>>>         offset += 65536;
>> >>>>         blocks -= 65536;
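>> >>>>
>> >>>> (Note that the kernel must fold the running offset back into its
>> >>>> block index, as the kPMEReduceChargeGridBuffer_kernel change below
>> >>>> does; otherwise every chunk after the first would redo the first
>> >>>> chunk's grid points.)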
>> >>>>
>> >>>>
>> >>>> After this change, Amber continues its execution until the next error,
>> >>>> in kPMEReduceChargeGridBuffer. I have also solved this error by
>> >>>> modifying the function kPMEReduceChargeGridBuffer and its CUDA kernel
>> >>>> kPMEReduceChargeGridBuffer_kernel. My changes are:
>> >>>> + (line ~166)
>> >>>>     kPMEReduceChargeGridBuffer_kernel(int offset)
>> >>>>     {
>> >>>>         unsigned int pos = blockIdx.x * blockDim.x + threadIdx.x
>> >>>>                            + offset * blockDim.x;
>> >>>> -
>> >>>>     kPMEReduceChargeGridBuffer_kernel()
>> >>>>     {
>> >>>>         unsigned int pos = blockIdx.x * blockDim.x + threadIdx.x;
>> >>>> and
>> >>>>
>> >>>> + (line ~209)
>> >>>>     long long blocks = (gpu->sim.nfft1 * gpu->sim.nfft2 *
>> >>>>                         gpu->sim.nfft3 + 127) >> 7;
>> >>>>     int offset = 0;
>> >>>>
>> >>>>     while (blocks > 0)
>> >>>>     {
>> >>>>         long long lblocks = min(blocks, 65535ll);
>> >>>>         kPMEReduceChargeGridBuffer_kernel<<<lblocks, 128>>>(offset);
>> >>>>         LAUNCHERROR("kPMEReduceChargeGridBuffer");
>> >>>>         offset += 65535;
>> >>>>         blocks -= 65535;
>> >>>>     }
>> >>>> -
>> >>>>     unsigned int blocks = (gpu->sim.nfft1 * gpu->sim.nfft2 *
>> >>>>                            gpu->sim.nfft3 + 127) >> 7;
>> >>>>     kPMEReduceChargeGridBuffer_kernel<<<blocks, 128>>>();
>> >>>>     LAUNCHERROR("kPMEReduceChargeGridBuffer");
>> >>>>
>> >>>> Now it seems to work and I got 0 errors in the Amber CUDA tests. But I
>> >>>> cannot guarantee that this code produces the right values for our
>> >>>> calculations; the execution simply no longer crashes.
>> >>>>
>> >>>> Please, can you check your pmemd.cuda code so that it works on cards
>> >>>> with lower grid and block size limits? We would be very glad if you
>> >>>> released an update with a tested patch solving these issues.
>> >>>>
>> >>>> Thank you for your attention.
>> >>>>
>> >>>> Best regards,
>> >>>> Pablo.
>> >>>>
>> >>>>
>> >>>> On 04/09/14 at #4, Jason Swails wrote:
>> >>>>> On Thu, Sep 4, 2014 at 2:17 AM, Pablo Ródenas <pablo.rodenas.bsc.es>
>> >>>>> wrote:
>> >>>>>> Good morning,
>> >>>>>>
>> >>>>>> could you reproduce the problem with the files provided?
>> >>>>>>
>> >>>>>>
>> >>>>> On my computer (GTX 680, 2 GB of memory), I get a memory allocation
>> >>>>> error because 2 GB is not enough for your system (ca. 700K+ atoms).
>> >>>>> When I move to a K20c (4 GB of memory), it runs fine for over 10
>> >>>>> minutes (after which I killed it, because your input files would have
>> >>>>> run for 10 hours on the K20c). That machine has the nVidia toolkit
>> >>>>> version 5.0 and the 331.38 drivers on it.
>> >>>>>
>> >>>>> I'm not sure why you're having problems... Have you tried running the
>> >>>>> GPU validation suite? I know Ross Walker posted a link to it on a
>> >>>>> previous post, but I can't seem to locate it right now...
>> >>>>>
>> >>>>> HTH,
>> >>>>> Jason
>> >>>>>
>> >>>
>>
>> --
>> Pablo Ródenas Barquero (pablo.rodenas.bsc.es)
>> BSC - Centro Nacional de Supercomputación
>> C/ Jordi Girona, 31 WWW: http://www.bsc.es
>> 08034 Barcelona, Spain Tel: +34-93-405 42 29
>> e-mail: support.bsc.es Fax: +34-93-413 77 21
>> -----------------------------------------------
>> CNAG - Centre Nacional Anàlisi Genòmica
>> C/ Baldiri Reixac, 4 WWW: http://www.cnag.cat
>> 08028 Barcelona, Spain Tel: +34-93-403 37 54
>> e-mail: cnag_support.bsc.es
>> -----------------------------------------------
>>
>>
>
>
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Thu Oct 09 2014 - 08:30:04 PDT