Re: [AMBER] Error: invalid configuration argument launching kernel kPMEFillChargeGridBuffer

From: Pablo Ródenas <pablo.rodenas.bsc.es>
Date: Wed, 13 Aug 2014 13:11:39 +0200

Hi again,

after reinstalling Amber14 with mvapich2/2.0, the test results are as
follows:
CUDA --> 90 passed, 35 failures, 0 errors
CUDA_PARALLEL (do_parallel = 2) --> 52 passed, 35 failures, 0 errors.

However, the kPMEFillChargeGridBuffer error still appears after 10-20
seconds of execution with both pmemd.cuda and pmemd.cuda.MPI.
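
For reference, CUDA reports "invalid configuration argument" when a
kernel launch asks for a grid size, block size or amount of shared memory
beyond what the device supports. Whether that is what happens here is not
confirmed. The following is only a minimal sketch in Python, assuming the
documented limits for compute capability 2.0 devices such as the Tesla
M2090. The function name check_launch and the launch dimensions in the
usage lines are hypothetical and are not taken from the pmemd.cuda source.

```python
# Documented launch limits for compute capability 2.0 (e.g. Tesla M2090).
CC20_LIMITS = {
    "max_grid_dim": (65535, 65535, 65535),
    "max_block_dim": (1024, 1024, 64),
    "max_threads_per_block": 1024,
    "max_shared_mem_per_block": 48 * 1024,  # bytes
}


def check_launch(grid, block, shared_mem_bytes=0, limits=CC20_LIMITS):
    """Hypothetical helper: list the reasons a launch would be rejected.

    grid and block are (x, y, z) tuples. An empty list means the
    configuration fits within the given limits.
    """
    problems = []
    # Each grid dimension must lie within the per-axis limit.
    for axis, g, g_max in zip("xyz", grid, limits["max_grid_dim"]):
        if g < 1 or g > g_max:
            problems.append(f"grid.{axis}={g} outside 1..{g_max}")
    # Each block dimension must lie within the per-axis limit.
    for axis, b, b_max in zip("xyz", block, limits["max_block_dim"]):
        if b < 1 or b > b_max:
            problems.append(f"block.{axis}={b} outside 1..{b_max}")
    # The total number of threads in a block is limited separately.
    threads = block[0] * block[1] * block[2]
    if threads > limits["max_threads_per_block"]:
        problems.append(f"{threads} threads per block exceeds "
                        f"{limits['max_threads_per_block']}")
    # Shared memory requested per block is limited as well.
    if shared_mem_bytes > limits["max_shared_mem_per_block"]:
        problems.append(f"{shared_mem_bytes} bytes of shared memory exceeds "
                        f"{limits['max_shared_mem_per_block']}")
    return problems


# Hypothetical launch dimensions, for illustration only.
print(check_launch((1024, 1, 1), (256, 1, 1)))   # fits within the limits
print(check_launch((70000, 1, 1), (64, 1, 1)))   # grid.x is too large
```

On a real node the limits can be read with the deviceQuery sample from
the CUDA toolkit instead of being hard-coded as above.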

Best regards,
Pablo.


On 08/13/2014 08:59 AM, Pablo Ródenas wrote:
> Good morning Dan,
>
> thanks for your reply.
>
> The problem also appears with pmemd.cuda. I will try recompiling
> with mpich2, but I think the error is related to the CUDA side
> rather than the MPI side.
>
> Best regards,
> Pablo.
>
> On 08/12/2014 07:34 PM, Daniel Roe wrote:
>> Hi,
>>
>> It appears that the problem is only with pmemd.cuda.MPI; is that
>> right? I have often encountered issues when using openmpi. Can you try
>> compiling with mpich2 and see if the problem disappears?
>>
>> -Dan
>>
>> On Tue, Aug 12, 2014 at 12:00 AM, Pablo Ródenas <pablo.rodenas.bsc.es> wrote:
>>> Good morning,
>>>
>>> after testing the installation, the error still appears in every
>>> CUDA run of the user's input, but not with the CPU version of
>>> Amber14 or with previous versions of Amber with CUDA. Any hint about
>>> how to avoid it, or what else I can check?
>>>
>>> Many thanks,
>>> Pablo.
>>>
>>>
>>> On 08/07/2014 04:54 PM, Pablo Ródenas wrote:
>>>> Thanks for your answer Jason,
>>>>
>>>> I ran the CUDA and CUDA parallel tests and I obtained the following
>>>> results depending on the case, but all failures were due to small
>>>> differences in the numbers:
>>>> CUDA (installation was done with gcc/4.6.1, mkl/11.1 and cuda/5.0)
>>>> 90 passed, 35 failures (expected) and 0 errors
>>>> All tests were run with drivers 304.54 and 331.62, with the same results.
>>>>
>>>> CUDA PARALLEL (installation was done with gcc/4.6.1, mkl/11.1, cuda/5.0
>>>> and openmpi/1.7.3)
>>>> DO_PARALLEL=4 and driver 304.54 --> 50 passed, 37 failures and 0 errors
>>>> DO_PARALLEL=4 and driver 331.62 --> 47 passed, 40 failures and 0 errors
>>>> DO_PARALLEL=8 and driver 304.54 --> 23 passed, 61 failures and 10 errors
>>>> DO_PARALLEL=8 and driver 331.62 --> 22 passed, 62 failures and 9 errors
>>>>
>>>> I really appreciate your help.
>>>>
>>>> Best regards,
>>>> Pablo.
>>>>
>>>> On 07/28/2014 03:42 PM, Jason Swails wrote:
>>>>> On Mon, Jul 28, 2014 at 9:04 AM, Pablo Ródenas <pablo.rodenas.bsc.es> wrote:
>>>>>
>>>>>> Good afternoon,
>>>>>>
>>>>>> Ruben Perez (with Amber license Q2818013A) and we (as the user support
>>>>>> team) are trying to run pmemd.cuda from Amber14 on our cluster, and we
>>>>>> always get the error in the subject line with an input that works fine
>>>>>> with the GPU version of pmemd in Amber12 and with the CPU version of
>>>>>> pmemd in Amber14.
>>>>>>
>>>>>> Our copy of Amber14 was first updated and then compiled with
>>>>>> different options. For example, our latest tests used
>>>>>> OpenMPI/1.7.3, gcc/4.6.1 and CUDA/5.0, but we had the same error with
>>>>>> bullxmpi/1.1.11.1 + Intel compilers 14.0.1 and/or CUDA 6. The tests
>>>>>> were performed on 1 node with 1 and/or 2 GPUs, with the same result.
>>>>>>
>>>>>> The node specs are 12 CPUs (Intel(R) Xeon(R) CPU E5649 @ 2.53GHz) and 2
>>>>>> GPUs (Tesla M2090), Driver Version 304.54. We have also tested with
>>>>>> driver version 331.62 with no luck.
>>>>>>
>>>>>> Please do not hesitate to ask us for more information, and thank you
>>>>>> very much in advance for your help.
>>>>>>
>>>>>>
>>>>> This is a new error that I've never seen before. Have you run the CUDA
>>>>> tests? Do the tests pass? As a note, a lot of the tests will fail with
>>>>> either simple roundoff differences or with other differences stemming from
>>>>> different random number sequences when stochastic thermostats (ntt=2 or
>>>>> ntt=3) are used. So apart from those (expected) failures, it's important
>>>>> to make sure that all of the other tests pass to make sure your
>>>>> installation works.
>>>>>
>>>>> HTH,
>>>>> Jason
>>>>>
>>> --
>>> Pablo Ródenas Barquero (pablo.rodenas.bsc.es)
>>> BSC - Centro Nacional de Supercomputación
>>> C/ Jordi Girona, 31 WWW: http://www.bsc.es
>>> 08034 Barcelona, Spain Tel: +34-93-405 42 29
>>> e-mail: support.bsc.es Fax: +34-93-413 77 21
>>> -----------------------------------------------
>>> CNAG - Centre Nacional Anàlisi Genòmica
>>> C/ Baldiri Reixac, 4 WWW: http://www.cnag.cat
>>> 08028 Barcelona, Spain Tel: +34-93-403 37 54
>>> e-mail: cnag_support.bsc.es
>>> -----------------------------------------------
>>>
>>>
>>> WARNING / LEGAL TEXT: This message is intended only for the use of the
>>> individual or entity to which it is addressed and may contain
>>> information which is privileged, confidential, proprietary, or exempt
>>> from disclosure under applicable law. If you are not the intended
>>> recipient or the person responsible for delivering the message to the
>>> intended recipient, you are strictly prohibited from disclosing,
>>> distributing, copying, or in any way using this message. If you have
>>> received this communication in error, please notify the sender and
>>> destroy and delete any copies you may have received.
>>>
>>> http://www.bsc.es/disclaimer
>>>
>>> _______________________________________________
>>> AMBER mailing list
>>> AMBER.ambermd.org
>>> http://lists.ambermd.org/mailman/listinfo/amber
>>

-- 
Pablo Ródenas Barquero (pablo.rodenas.bsc.es)
BSC - Centro Nacional de Supercomputación
C/ Jordi Girona, 31    WWW: http://www.bsc.es
08034 Barcelona, Spain Tel: +34-93-405 42 29
e-mail: support.bsc.es Fax: +34-93-413 77 21
-----------------------------------------------
CNAG - Centre Nacional Anàlisi Genòmica
C/ Baldiri Reixac, 4   WWW: http://www.cnag.cat
08028 Barcelona, Spain Tel: +34-93-403 37 54
e-mail: cnag_support.bsc.es
-----------------------------------------------
Received on Wed Aug 13 2014 - 04:30:02 PDT