Dear Dan,
Thank you for investigating this bug. In our core dump we got exactly
the same values you pasted here (775042866, 3158064, ... at the beginning
of the Selected_ vector in the atom mask object).
I hope you will manage to find the cause of this memory corruption.
Andrzej
On 9.12.2022 14:56, Daniel Roe via AMBER wrote:
> OK - so I was able to reproduce the bug, and it does seem like it's a
> memory overwrite issue. I'm running an extensive valgrind memcheck to
> try to pinpoint the exact cause now.
>
> What is happening is that the selected atoms array (which contains the
> indices of each selected atom) in the atom mask in the RMS action is
> being corrupted somehow. Here you can see the first two elements are
> clearly incorrect (it should look like 0, 1, 2, 3...):
>
> (gdb) print tgtMask_.Selected_
> $12 = std::vector of length 9280, capacity 16384 = {775042866,
> 3158064, 2, 3, 4, 5, 6, 7, 8, 9,
>
> There is almost no way this could happen without some sort of memory
> corruption since the routine that sets up the selected array
> (Selected_) looks like this (AtomMask.cpp):
>
> Selected_.clear();
> for (int atom = 0; atom != Natom_; atom++) {
>   if (charmask[atom] == maskChar_)
>     Selected_.push_back( atom );
> }
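
To make the argument concrete, here is a minimal standalone sketch of the same loop. It is not the cpptraj class: the function name selectAtoms, the use of std::string for the character mask, and the 'T' marker in the usage note are assumptions made for illustration only.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the mask setup loop quoted above.
// Returns the indices of all atoms whose mask character equals maskChar.
std::vector<int> selectAtoms(const std::string& charmask, char maskChar) {
    std::vector<int> selected;
    const int natom = static_cast<int>(charmask.size());
    for (int atom = 0; atom != natom; atom++) {
        if (charmask[atom] == maskChar)
            selected.push_back(atom);
    }
    // Each stored value is a loop counter, so it must lie in [0, natom).
    assert(selected.empty() || selected.back() < natom);
    return selected;
}
```

For example, selectAtoms("TTFTT", 'T') yields 0, 1, 3, 4. Because only the loop counter is ever pushed back, a value such as 775042866 cannot come from this loop in a 9280 atom system.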
>
> When subsequent routines try to use this corrupted mask, they hit the
> huge first index, which is way out of range (in a 9280 atom system).
> That is what triggers the segfault that actually stops execution.
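
As an illustration of how such a corrupted mask could be detected before it is used, the following sketch scans the selected indices for values outside the valid atom range. The helper name firstBadIndex is hypothetical and is not part of cpptraj.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of cpptraj.
// Returns the position of the first selected index outside [0, natom),
// or -1 if every index is valid.
long firstBadIndex(const std::vector<int>& selected, int natom) {
    assert(natom >= 0);
    for (std::size_t i = 0; i < selected.size(); ++i) {
        if (selected[i] < 0 || selected[i] >= natom)
            return static_cast<long>(i);
    }
    return -1;
}
```

With the values from the gdb output above and natom = 9280, the check reports position 0, since 775042866 is far larger than any valid atom index. A check like this only reports the symptom; it does not identify what overwrote the memory.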
>
> Unfortunately one of the downsides to valgrind being thorough is that
> it is also slow. I've had the run going overnight and nothing has
> triggered yet. I'll keep you up to date with what I find.
>
> -Dan
>
> On Thu, Dec 8, 2022 at 10:17 AM Daniel Roe <daniel.r.roe.gmail.com> wrote:
>> Thanks, I'm downloading it now. I was able to run the given input with
>> cpptraj on the shorter trajectory you provided with no issues;
>> valgrind showed no memory errors. This is starting to feel like an
>> out-of-memory type issue, but I will keep digging.
>>
>> I'm already seeing some areas where quality of life improvements can
>> be made to cpptraj (e.g. every frame does not need to be printed to
>> stdout for 'onlyframes' etc).
>>
>> I'll report when/if I find anything. Thanks for the files.
>>
>> -Dan
>>
>> On Thu, Dec 8, 2022 at 8:51 AM Andrzej Dorobisz via AMBER
>> <amber.ambermd.org> wrote:
>>> Hi,
>>> I just uploaded the input data (22 GB) so you can download and test
>>> cpptraj on it.
>>>
>>> - file E81A.nc
>>> https://s3.cloud.cyfronet.pl/share/amber-cpptraj-issue/E81A.nc?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=71M2J3OGZ6O5J6K1WAFP%2F20221208%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20221208T134155Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&X-Amz-Signature=d01eba1bf5f637ccd55f1f28cdd0623ded099ad95a226a66108f8bc8cc1eeca9
>>>
>>> - all other files (E81A.top + input-cpptraj.txt)
>>> https://s3.cloud.cyfronet.pl/share/amber-cpptraj-issue/cpptraj-SIGSEGV-files.zip?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=71M2J3OGZ6O5J6K1WAFP%2F20221208%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20221208T134129Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&X-Amz-Signature=ca2814b488dbe69ed12261ad8b64b057870d155779cba7b147ce9d05af2f7f70
>>>
>>> The error occurs after about 1 hour and 30 minutes.
>>>
>>>
>>> Regards,
>>> Andrzej
>>>
>>>
>>> On 6.12.2022 16:43, Daniel Roe via AMBER wrote:
>>>> Hi,
>>>>
>>>> On Tue, Dec 6, 2022 at 10:21 AM Andrzej Dorobisz
>>>> <andrzej.dorobisz.cyfronet.krakow.pl> wrote:
>>>>> Thank you for the quick reply. I can send the topology file but I don't
>>>>> know how to extract frames from the trajectory file (../input/E81A.nc).
>>>> The input would be something like:
>>>>
>>>> parm E81A.top
>>>> trajin E81A.nc 1 10
>>>> trajout E81A.1-10.nc
>>>>
>>>> -Dan
>>>>
>>>> _______________________________________________
>>>> AMBER mailing list
>>>> AMBER.ambermd.org
>>>> http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Dec 09 2022 - 07:30:03 PST