Re: [AMBER] make test MPI don't continue

From: Gustaf Olsson <gustaf.olsson.lnu.se>
Date: Thu, 27 Feb 2020 13:15:40 +0000

During compilation, both the -mpi and the -openmp builds seem to utilise both CPU cores I have allocated, at around 60-100% each. When running the tests, this did not seem to be the case.

After completing the -openmp compilation on top of the -mpi compilation I ran the tests again. They progressed at 2 x 60-100%, using both cores, until:

    TEST: /home/me/amber18/AmberTools/src/cpptraj/test/Test_RotateDihedral

      CPPTRAJ: Rotate dihedral to target value.

Then CPU usage dropped to around 5% again. Which process triggers the drop seems somewhat random, sadly.
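
In case it helps anyone reproduce this, here is a rough sketch of how the per-process CPU usage could be logged while the tests run. It is only an illustration: the function names filter_procs and sample_once and the log file name are made up here, and ps reports %CPU averaged over the lifetime of the process rather than an instantaneous value, so top may give a sharper picture of a sudden drop.

```shell
#!/bin/sh
# Sketch only: helper names below are hypothetical, not part of AMBER.

# Filter "PID %CPU COMMAND" lines (as printed by `ps -eo pid,pcpu,comm`)
# down to the processes whose command name matches a pattern.
filter_procs() {
    # $1: extended regex matched against the command-name column.
    # NR > 1 skips the header line printed by ps.
    awk -v pat="$1" 'NR > 1 && $3 ~ pat { printf "%s %s %s\n", $1, $2, $3 }'
}

# Print one timestamped sample of the matching processes.
sample_once() {
    ps -eo pid,pcpu,comm | filter_procs "$1" | while read -r pid cpu name; do
        printf '%s pid=%s cpu=%s%% cmd=%s\n' "$(date +%T)" "$pid" "$cpu" "$name"
    done
}

# Usage (hypothetical log file name), sampling every 5 seconds:
#   while sleep 5; do sample_once 'sander|cpptraj'; done | tee cpu_usage.log
```

Leaving the loop running in a second terminal during "make test" should show which process was active when the usage dropped.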

// Gustaf

> On 27 Feb 2020, at 13:48, Gustaf Olsson <gustaf.olsson.lnu.se> wrote:
>
> Some comments and findings trying to investigate this further.
>
> Running the installation seems to work for both the serial and parallel versions, with “success” messages presented in both cases.
>
> Running the tests for the serial installation seems to “halt” between tests at random times; pressing the return/enter key then suddenly makes a few tests fly by in an instant.
>
> Running the tests for the parallel installation initially produced the same problem I have observed on the Mac: the firewall. I know very little about the details, though it seems that openmpi uses “localhost” for something during testing, potentially TCP traffic (?). This triggers a warning for every single test from Windows Defender Firewall when running under WSL, and from macOS when testing on the Mac.
>
> Turning off the firewall solved this issue under WSL. However, I noticed that the tests were not progressing as hoped. Looking at the task manager I realised that the Microsoft Real Time Protection engine was going nuts. Turning off real time protection seemingly allowed the tests to execute as intended. However, I am now looking at the sander.MPI process while running “./Run.tip4p_mcbar”. It is running, but using less than 5% of available CPU. This might have something to do with the warnings sprinkled between all MPI tests:
>
> WARNING: Linux kernel CMA support was requested via the
> btl_vader_single_copy_mechanism MCA variable, but CMA support is
> not available due to restrictive ptrace settings.
>
> The vader shared memory BTL will fall back on another single-copy
> mechanism if one is available. This may result in lower performance.
>
> Local host: DESKTOP-NQCITAE
> --------------------------------------------------------------------------
> [DESKTOP-NQCITAE:02992] 1 more process has sent help message help-btl-vader.txt / cma-permission-denied
> [DESKTOP-NQCITAE:02992] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>
> I suspect that this and the other tests will eventually finish this time around, though at only around 3% CPU utilisation it will likely take an enormous amount of time!
>
> Obviously, it is not optimal to run amber in a VM, and in particular to try to get MPI to work effectively in a VM inside another VM, though this is really below expectations.
>
> It seems that this problem might be caused by the ptrace settings; GitHub suggested switching a 1 to a 0 in this file:
>
> echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
>
> After a reboot I was allowed to set this as intended. This made the warnings go away, though I am still looking at <5% CPU usage for cpptraj.MPI (last run it was sander.MPI during some tip4p test), which will not be sufficient for most tasks that might benefit from being sped up through MPI.
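
For reference, a minimal sketch of how the Yama setting named in the warning could be inspected before changing it. The function names are made up for illustration; the value descriptions are my short paraphrase of the kernel's Yama documentation. The write has to happen as root, and piping through sudo tee is one common way to do that, since a shell redirect (>) is opened by the unprivileged shell before sudo runs. A value written this way does not survive a reboot.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical.

# Path of the Yama setting referred to in the Open MPI warning.
PTRACE_SCOPE_FILE=/proc/sys/kernel/yama/ptrace_scope

# Translate a ptrace_scope value into a short description.
describe_ptrace_scope() {
    case "$1" in
        0) echo "classic: any process of the same user may be traced" ;;
        1) echo "restricted: only descendants may be traced" ;;
        2) echo "admin-only: CAP_SYS_PTRACE required" ;;
        3) echo "disabled: no tracing, cannot be changed until reboot" ;;
        *) echo "unknown value: $1" ;;
    esac
}

# Show the current setting, if the file exists on this kernel.
show_ptrace_scope() {
    if [ -r "$PTRACE_SCOPE_FILE" ]; then
        value=$(cat "$PTRACE_SCOPE_FILE")
        echo "ptrace_scope=$value ($(describe_ptrace_scope "$value"))"
    else
        echo "ptrace_scope not available on this kernel"
    fi
}

# To change it until the next reboot (needs root):
#   echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
```

Calling show_ptrace_scope before and after the change confirms whether the new value actually took effect.
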
>
> // Gustaf
>
>
>
>
>> On 25 Feb 2020, at 14:51, Nicolas Feldman <nfeldman01.qub.ac.uk> wrote:
>>
>> I'm trying to install AmberTools 19 in WSL, and when I run the tests for the parallel installation, the tests stop at the point shown below.
>>
>> It looks as if nothing is being done, but there is no sign of an error or of the run having exited. It just stops there:
>>
>> make[3]: Entering directory '/home/nico/apps/amber18/AmberTools/test/nab'
>> Running test to do simple minimization
>> (this tests the molecular mechanics interface)
>> diffing ltest.out.check with ltest.out
>> PASSED
>> ==============================================================
>> Running test to do simple minimization with shake
>> (this tests the molecular mechanics interface)
>> diffing rattle_min.out.check with rattle_min.out
>> PASSED
>> ==============================================================
>> Running test to do simple minimization
>> (this tests the generalized Born implementation)
>> diffing gbrna.out.check with gbrna.out
>> PASSED
>> ==============================================================
>> Running test to do simple minimization
>> (this tests the generalized Born implementation)
>>
>> Nicolas Feldman
>> PhD Student
>> The Wellcome-Wolfson Institute for Experimental Medicine
>> Queen's University of Belfast; 97 Lisburn Rd. Belfast, UK BT9 7BL
>>
>> _______________________________________________
>> AMBER mailing list
>> AMBER.ambermd.org
>> http://lists.ambermd.org/mailman/listinfo/amber

Received on Thu Feb 27 2020 - 05:30:03 PST