Sent you the patch code off-list to test out. If this works, I'll check it
into the tree...
Very minor changes...
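
For reference, here is a minimal sketch of the chunked-launch pattern under
discussion (kernel name and block size follow the quoted code below; this is
a sketch, not the actual patch):

    // Minimal sketch (not the committed patch): launch the reduce kernel in
    // chunks of at most 65,535 blocks so the grid x dimension stays within
    // the SM 2.x limit. 'offset' tells each launch how many blocks came
    // before it.
    #include <algorithm>

    __global__ void kPMEReduceChargeGridBuffer_kernel(int offset)
    {
        unsigned int pos = (blockIdx.x + offset) * blockDim.x + threadIdx.x;
        // ... reduce charge-grid element 'pos' here (omitted) ...
    }

    void reduceChargeGridBuffer(int nfft1, int nfft2, int nfft3)
    {
        int blocks = (nfft1 * nfft2 * nfft3 + 127) >> 7;  // 128 threads/block
        int offset = 0;
        while (blocks > 0)
        {
            int lblocks = std::min(blocks, 65535);        // SM 2.x grid.x limit
            kPMEReduceChargeGridBuffer_kernel<<<lblocks, 128>>>(offset);
            offset += lblocks;
            blocks -= lblocks;
        }
    }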
On Thu, Oct 9, 2014 at 11:36 PM, Pablo Ródenas <pablo.rodenas.bsc.es> wrote:
> Good morning Scott,
>
> after modifying the code with your suggestions I get the following results:
> with 255:
> Error: unspecified launch failure launching kernel kClearForces
> cudaFree GpuBuffer::Deallocate failed unspecified launch failure
>
> with 256:
> seems to be working.
>
> I will let you know how the execution ends.
>
> Thanks,
> Pablo.
>
>
> On 09/10/14 at #4, Scott Le Grand wrote:
> > You missed this part:
> >
> > unsigned int blocks = (gpu->sim.nfft1 * gpu->sim.nfft2 * gpu->sim.nfft3 + 255) >> 8;
> >
> >
> >
> > On Thu, Oct 9, 2014 at 8:54 AM, Pablo Ródenas <pablo.rodenas.bsc.es> wrote:
> >
> >> Dear Scott,
> >>
> >> I tried with both changes:
> >> kPMEReduceChargeGridBuffer_kernel<<<blocks, 255>>>();
> >> and
> >> kPMEReduceChargeGridBuffer_kernel<<<blocks, 256>>>();
> >>
> >> and the error arises again:
> >> BLOCKS: 66150
> >> NFFT: 360 196 120
> >> Error: invalid configuration argument launching kernel kPMEReduceChargeGridBuffer
> >>
> >> Is that where I should try the values 255 and 256 for
> >> ReduceChargeGridBuffer?
> >>
> >> Thanks,
> >> Pablo.
> >>
> >> On 09/10/14 at #4, Pablo Ródenas wrote:
> >>> Hi Scott,
> >>>
> >>> the nfft values are, respectively:
> >>> 360 196 120
> >>>
> >>> I just undid all my changes in ReduceChargeGridBuffer and I will try:
> >>> kPMEReduceChargeGridBuffer_kernel<<<blocks, 255>>>();
> >>> or
> >>> kPMEReduceChargeGridBuffer_kernel<<<blocks, 256>>>();
> >>>
> >>> I will let you know of any progress.
> >>>
> >>> Thank you very much.
> >>>
> >>> Best regards,
> >>> Pablo.
> >>>
> >>>
> >>> On 09/10/14 at #4, Scott Le Grand wrote:
> >>>> Also try 255 and 256 for ReduceChargeGridBuffer...
> >>>>
> >>>>
> >>>> On Thu, Oct 9, 2014 at 8:23 AM, Scott Le Grand <varelse2005.gmail.com> wrote:
> >>>>
> >>>>> What are nfft1, nfft2, and nfft3 for this system?
> >>>>>
> >>>>> On Thu, Oct 9, 2014 at 8:20 AM, Pablo Ródenas <pablo.rodenas.bsc.es>
> >>>>> wrote:
> >>>>>
> >>>>>> Dear Scott,
> >>>>>>
> >>>>>> you are right, this is a messy change.
> >>>>>> The only change that seemed clear to us was going from 65536 to 65535.
> >>>>>> But after making that change in the kPMEFillChargeGridBuffer function,
> >>>>>> we got the error in kPMEReduceChargeGridBuffer. We saw that the value of
> >>>>>> the blocks variable in kPMEReduceChargeGridBuffer is 66150, greater than
> >>>>>> the maximum for our card, so we tried to work around it with a quick
> >>>>>> change to stay within our card's limits. Please, could you fix this
> >>>>>> function too?
> >>>>>>
> >>>>>> FYI, the maximum values of our card are:
> >>>>>> Maximum number of threads per block: 1024
> >>>>>> Maximum sizes of each dimension of a block: 1024 x 1024 x 64
> >>>>>> Maximum sizes of each dimension of a grid: 65535 x 65535 x 65535
> >>>>>>
> >>>>>> Thank you very much for your answers!
> >>>>>>
> >>>>>> Best regards,
> >>>>>> Pablo.
> >>>>>>
> >>>>>>
> >>>>>> On 09/10/14 at #4, Scott Le Grand wrote:
> >>>>>>> Also do *not* *mess* *with* LOADSIZE. You will wake the CUDAThuhlu
> >>>>>>> with such actions and no one wants that...
> >>>>>>>
> >>>>>>> The only change you should make here is changing 65536 to 65535.
> >>>>>>>
> >>>>>>> That said, thanks for hitting this corner case! Fix shortly...
> >>>>>>>
> >>>>>>> On Thu, Oct 9, 2014 at 8:06 AM, Scott Le Grand <varelse2005.gmail.com> wrote:
> >>>>>>>
> >>>>>>>> Looking closer at this thread, could you try using 65,535 instead of
> >>>>>>>> 65,536?
> >>>>>>>>
> >>>>>>>> http://en.wikipedia.org/wiki/CUDA, specifically, the maximum x
> >>>>>>>> dimension on SM 2.0 GPUs is 65,535...
> >>>>>>>>
> >>>>>>>> Missed it by one... Works fine on any Kepler or better class GPU
> >>>>>>>> because this limit was raised to 2^31 - 1.
> >>>>>>>>
> >>>>>>>> Ironically, y and z are still limited to 65,535. I'll check in a fix
> >>>>>>>> shortly...
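> >>>>>>>>
> >>>>>>>> For what it's worth, a minimal sketch of how the limit could be read
> >>>>>>>> from the device at run time instead of being hard-coded (illustrative
> >>>>>>>> only, not the fix that will be checked in):
> >>>>>>>>
> >>>>>>>>     #include <cstdio>
> >>>>>>>>     #include <cuda_runtime.h>
> >>>>>>>>
> >>>>>>>>     // Illustrative only: query the per-device grid limits instead of
> >>>>>>>>     // hard-coding 65,535.
> >>>>>>>>     int main()
> >>>>>>>>     {
> >>>>>>>>         cudaDeviceProp prop;
> >>>>>>>>         cudaGetDeviceProperties(&prop, 0);
> >>>>>>>>         // 65,535 in x on SM 2.x; 2^31 - 1 in x on Kepler and later.
> >>>>>>>>         printf("max grid: %d x %d x %d\n",
> >>>>>>>>                prop.maxGridSize[0], prop.maxGridSize[1],
> >>>>>>>>                prop.maxGridSize[2]);
> >>>>>>>>         return 0;
> >>>>>>>>     }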
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Thu, Oct 9, 2014 at 7:58 AM, Scott Le Grand <varelse2005.gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> Broken.
> >>>>>>>>>
> >>>>>>>>> Do not use this.
> >>>>>>>>>
> >>>>>>>>> The threadblocks *have* to be 64 for this to work (4 x 4 x 4
> >>>>>>>>> interpolation)...
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Thu, Oct 9, 2014 at 7:22 AM, Pablo Ródenas <pablo.rodenas.bsc.es> wrote:
> >>>>>>>>>
> >>>>>>>>>> Dear Jason and everybody,
> >>>>>>>>>>
> >>>>>>>>>> Finally, I found that this Amber bug comes from the file
> >>>>>>>>>> $AMBERHOME/src/pmemd/src/cuda/kPMEInterpolation.cu.
> >>>>>>>>>>
> >>>>>>>>>> The function kPMEFillChargeGridBuffer contains the following
> >>>>>>>>>> hardcoded value (instead of querying the card for it), which I
> >>>>>>>>>> replaced with the limits of my card (Tesla M2090):
> >>>>>>>>>> + (line ~400)
> >>>>>>>>>>     int lblocks = min(blocks, 65535);
> >>>>>>>>>>     kPMEFillChargeGridBuffer_kernel<<<lblocks, 64>>>(offset);
> >>>>>>>>>>     LAUNCHERROR("kPMEFillChargeGridBuffer");
> >>>>>>>>>>     offset += 65535;
> >>>>>>>>>>     blocks -= 65535;
> >>>>>>>>>> -
> >>>>>>>>>>     int lblocks = min(blocks, 65536);
> >>>>>>>>>>     kPMEFillChargeGridBuffer_kernel<<<lblocks, 64>>>(offset);
> >>>>>>>>>>     LAUNCHERROR("kPMEFillChargeGridBuffer");
> >>>>>>>>>>     offset += 65536;
> >>>>>>>>>>     blocks -= 65536;
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> After this change, Amber continues its execution until the next
> >>>>>>>>>> error, in kPMEReduceChargeGridBuffer. I have also worked around this
> >>>>>>>>>> error by modifying the function kPMEReduceChargeGridBuffer and its
> >>>>>>>>>> CUDA kernel kPMEReduceChargeGridBuffer_kernel. My changes are:
> >>>>>>>>>> + (line ~166)
> >>>>>>>>>> kPMEReduceChargeGridBuffer_kernel(int offset)
> >>>>>>>>>> {
> >>>>>>>>>>     unsigned int pos = blockIdx.x * blockDim.x + threadIdx.x + offset * blockDim.x;
> >>>>>>>>>> -
> >>>>>>>>>> kPMEReduceChargeGridBuffer_kernel()
> >>>>>>>>>> {
> >>>>>>>>>>     unsigned int pos = blockIdx.x * blockDim.x + threadIdx.x;
> >>>>>>>>>>
> >>>>>>>>>> and
> >>>>>>>>>>
> >>>>>>>>>> + (line ~209)
> >>>>>>>>>>     long long blocks = (gpu->sim.nfft1 * gpu->sim.nfft2 * gpu->sim.nfft3 + 127) >> 7;
> >>>>>>>>>>     int offset = 0;
> >>>>>>>>>>
> >>>>>>>>>>     while (blocks > 0)
> >>>>>>>>>>     {
> >>>>>>>>>>         long long lblocks = min(blocks, 65535ll);
> >>>>>>>>>>         kPMEReduceChargeGridBuffer_kernel<<<lblocks, 128>>>(offset);
> >>>>>>>>>>         LAUNCHERROR("kPMEReduceChargeGridBuffer");
> >>>>>>>>>>         offset += 65535;
> >>>>>>>>>>         blocks -= 65535;
> >>>>>>>>>>     }
> >>>>>>>>>> -
> >>>>>>>>>>     unsigned int blocks = (gpu->sim.nfft1 * gpu->sim.nfft2 * gpu->sim.nfft3 + 127) >> 7;
> >>>>>>>>>>     kPMEReduceChargeGridBuffer_kernel<<<blocks, 128>>>();
> >>>>>>>>>>     LAUNCHERROR("kPMEReduceChargeGridBuffer");
> >>>>>>>>>>
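> >>>>>>>>>> As a sanity check on the indexing, here is a small host-only sketch
> >>>>>>>>>> (purely illustrative, the names are made up) that simulates the
> >>>>>>>>>> chunked launches above and verifies that every logical block index
> >>>>>>>>>> is visited exactly once:
> >>>>>>>>>>
> >>>>>>>>>>     #include <cassert>
> >>>>>>>>>>     #include <vector>
> >>>>>>>>>>
> >>>>>>>>>>     int main()
> >>>>>>>>>>     {
> >>>>>>>>>>         // 360 x 196 x 120 grid, 128 threads per block -> 66150 blocks.
> >>>>>>>>>>         long long blocks = (360LL * 196 * 120 + 127) >> 7;
> >>>>>>>>>>         std::vector<int> hits(blocks, 0);
> >>>>>>>>>>         long long remaining = blocks, offset = 0;
> >>>>>>>>>>         while (remaining > 0)
> >>>>>>>>>>         {
> >>>>>>>>>>             long long lblocks = remaining < 65535 ? remaining : 65535;
> >>>>>>>>>>             // The kernel computes pos = (blockIdx.x + offset) * blockDim.x
> >>>>>>>>>>             // + threadIdx.x, so block b of this launch handles logical
> >>>>>>>>>>             // block offset + b.
> >>>>>>>>>>             for (long long b = 0; b < lblocks; ++b)
> >>>>>>>>>>                 ++hits[offset + b];
> >>>>>>>>>>             offset    += lblocks;
> >>>>>>>>>>             remaining -= lblocks;
> >>>>>>>>>>         }
> >>>>>>>>>>         for (long long b = 0; b < blocks; ++b)
> >>>>>>>>>>             assert(hits[b] == 1);
> >>>>>>>>>>         return 0;
> >>>>>>>>>>     }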
> >>>>>>>>>>
> >>>>>>>>>> Now it seems to work and I get 0 errors in the Amber CUDA tests. But
> >>>>>>>>>> I cannot guarantee that this code will produce the right values for
> >>>>>>>>>> our calculations; the execution simply runs to completion.
> >>>>>>>>>>
> >>>>>>>>>> Please, could you check your pmemd.cuda code so that it works on
> >>>>>>>>>> cards with lower grid and block size limits? We would be very glad
> >>>>>>>>>> if you released an update with a tested patch solving these issues.
> >>>>>>>>>>
> >>>>>>>>>> Thank you for your attention.
> >>>>>>>>>>
> >>>>>>>>>> Best regards,
> >>>>>>>>>> Pablo.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On 04/09/14 at #4, Jason Swails wrote:
> >>>>>>>>>>> On Thu, Sep 4, 2014 at 2:17 AM, Pablo Ródenas <pablo.rodenas.bsc.es> wrote:
> >>>>>>>>>>>> Good morning,
> >>>>>>>>>>>>
> >>>>>>>>>>>> could you reproduce the problem with the files provided?
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>> On my computer (GTX 680, 2 GB of memory), I get a memory allocation
> >>>>>>>>>>> error because 2 GB is not enough for your system (ca. 700K+ atoms).
> >>>>>>>>>>> When I move to a K20c (4 GB of memory), it runs fine for over 10
> >>>>>>>>>>> minutes (after which I killed it because your input files would
> >>>>>>>>>>> have run for 10 hours on the K20c). That machine has the nVidia
> >>>>>>>>>>> toolkit version 5.0 and the 331.38 drivers on it.
> >>>>>>>>>>>
> >>>>>>>>>>> I'm not sure why you're having problems... Have you tried running
> >>>>>>>>>>> the GPU validation suite? I know Ross Walker posted a link to it on
> >>>>>>>>>>> a previous post, but I can't seem to locate it right now...
> >>>>>>>>>>>
> >>>>>>>>>>> HTH,
> >>>>>>>>>>> Jason
> >>>>>>>>>>>
>
> --
> Pablo Ródenas Barquero (pablo.rodenas.bsc.es)
> BSC - Centro Nacional de Supercomputación
> C/ Jordi Girona, 31 WWW: http://www.bsc.es
> 08034 Barcelona, Spain Tel: +34-93-405 42 29
> e-mail: support.bsc.es Fax: +34-93-413 77 21
> -----------------------------------------------
> CNAG - Centre Nacional Anàlisi Genòmica
> C/ Baldiri Reixac, 4 WWW: http://www.cnag.cat
> 08028 Barcelona, Spain Tel: +34-93-403 37 54
> e-mail: cnag_support.bsc.es
> -----------------------------------------------