Re: [AMBER] Testing AMBER16 installation in parallel; possible bug in testing scripts

From: Pratul Agarwal <pratul.agarwal-lab.org>
Date: Fri, 21 Dec 2018 23:04:38 +0000

By any chance, do you have more than one version of MPI/MPICH installed? One thing to check is that the mpirun being used comes from the same installation that was used when AMBER was compiled.

I have seen strange behavior when mpirun from a different installation was used accidentally. This may not be the case here, but it is something to check.
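
One way to run that check, sketched in Python with only the standard library: list every mpirun (and mpiexec) found on PATH, in search order. The first entry is the one the shell would pick up; more than one entry suggests several MPI installations. The function name find_on_path is made up for this illustration, and the script has no way of knowing which MPI AMBER was compiled against, so that comparison still has to be made by hand.

```python
import os


def find_on_path(name, path=None):
    """Return every executable called `name` on the search path, in order."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    seen = set()
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            real = os.path.realpath(candidate)
            if real not in seen:  # skip symlinks to an already-listed binary
                seen.add(real)
                hits.append(candidate)
    return hits


if __name__ == "__main__":
    # Hypothetical usage: report what the current environment would run.
    for name in ("mpirun", "mpiexec"):
        hits = find_on_path(name)
        print(name, "->", hits if hits else "not found on PATH")
```

If two different paths show up, the one listed first should be the one belonging to the MPI that was active when AMBER was configured and built.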

Pratul K. Agarwal, Ph.D.
(Editorial Board Member: PLoS ONE, Microbial Cell Factories)
Web: http://www.agarwal-lab.org/


On 12/21/2018 3:56 PM, Bellesis, Andrew G wrote:

Hello,


I have been installing AMBER16 in parallel on Arch Linux. All tests passed when I ran make test on 2 cores, but I got some errors and one failure when I used 4 cores. I want to report this in case there is a bug in the testing scripts. (As far as I can tell, AMBER is working on my machine when I run pmemd.) I have included the .diff file from the test run; at the bottom of the email I have also included a snippet of the .log file showing some segfault errors that were encountered.


The .diff file showed a "possible failure". Notice that the temperature cannot be calculated for some steps and blows up to over 400,000 K near the end of the run; this does not come from rounding errors.


possible FAILURE: check mdout.ips_sgmdg.dif
/home/agb53/amber16/test/gact_ips
100,101c100,101
<  NSTEP =        1   TIME(PS) =       0.00  TEMP(K) =   302.0  PRESS =     0.
<  Etot   =    -67871.633  EKtot   =     14613.503  EPtot      =    -82485.136
---
>  NSTEP =        1   TIME(PS) =       0.00  TEMP(K) =460927.7  PRESS =     0.
>  Etot   = **************  EKtot   =  22300678.231  EPtot      = **************
103c103
<  1-4 NB =       264.770  1-4 EEL =     -3290.571  VDWAALS    =     10292.333
---
>  1-4 NB =       264.770  1-4 EEL =     -3290.571  VDWAALS    = **************
105c105
<  SGLF =   1.000   400.0   21.203   15.902  0.900    -82485.136     0.
---
>  SGLF =   1.000   400.0   21.203   15.902  0.900 **************     0.
108,179c108,179
<  NSTEP =        2   TIME(PS) =       0.00  TEMP(K) =   299.7  PRESS =     0.
<  Etot   =    -67873.187  EKtot   =     14500.598  EPtot      =    -82373.785
<  BOND   =       242.456  ANGLE   =       538.201  DIHED      =       541.942
<  1-4 NB =       266.159  1-4 EEL =     -3293.839  VDWAALS    =     10303.687
<  EELEC  =    -90972.392  EHBOND  =         0.  RESTRAINT  =         0.
<  SGLF =   1.000   400.0   21.193   15.895  0.900    -82484.579     3.230
<  SGHF =  -0.100   1.095  278.806  284.104  1.000       110.794     0.
<  ------------------------------------------------------------------------------
<  NSTEP =        3   TIME(PS) =       0.00  TEMP(K) =   297.6  PRESS =     0.
<  Etot   =    -67874.555  EKtot   =     14399.578  EPtot      =    -82274.134
<  BOND   =       246.718  ANGLE   =       546.711  DIHED      =       540.814
<  1-4 NB =       267.117  1-4 EEL =     -3296.668  VDWAALS    =     10314.879
<  EELEC  =    -90893.708  EHBOND  =         0.  RESTRAINT  =         0.
<  SGLF =   1.000   400.0   21.184   15.888  0.900    -82483.527     5.801
<  SGHF =  -0.100   1.095  278.815  284.111  1.000       209.393     0.
<  ------------------------------------------------------------------------------
<  NSTEP =        4   TIME(PS) =       0.00  TEMP(K) =   295.9  PRESS =     0.
<  Etot   =    -67876.266  EKtot   =     14317.328  EPtot      =    -82193.594
<  BOND   =       248.318  ANGLE   =       546.948  DIHED      =       539.267
<  1-4 NB =       267.505  1-4 EEL =     -3298.796  VDWAALS    =     10325.748
<  EELEC  =    -90822.586  EHBOND  =         0.  RESTRAINT  =         0.
<  SGLF =   1.000   400.0   21.174   15.881  0.900    -82482.077     7.534
<  SGHF =  -0.100   1.095  278.825  284.118  1.000       288.483     0.
<  ------------------------------------------------------------------------------
<  NSTEP =        5   TIME(PS) =       0.00  TEMP(K) =   294.6  PRESS =     0.
<  Etot   =    -67878.295  EKtot   =     14257.047  EPtot      =    -82135.342
<  BOND   =       247.431  ANGLE   =       538.990  DIHED      =       537.415
<  1-4 NB =       267.252  1-4 EEL =     -3300.033  VDWAALS    =     10336.158
<  EELEC  =    -90762.558  EHBOND  =         0.  RESTRAINT  =         0.
<  SGLF =   1.000   400.0   21.165   15.874  0.900    -82480.344     8.393
<  SGHF =  -0.100   1.095  278.834  284.125  1.000       345.001     0.
<  ------------------------------------------------------------------------------
<  NSTEP =        6   TIME(PS) =       0.00  TEMP(K) =   293.8  PRESS =     0.
<  Etot   =    -67880.569  EKtot   =     14218.021  EPtot      =    -82098.591
<  BOND   =       244.911  ANGLE   =       525.212  DIHED      =       535.431
<  1-4 NB =       266.371  1-4 EEL =     -3300.276  VDWAALS    =     10345.991
<  EELEC  =    -90716.233  EHBOND  =         0.  RESTRAINT  =         0.
<  SGLF =   1.000   400.0   21.156   15.867  0.900    -82478.435     8.465
<  SGHF =  -0.100   1.095  278.843  284.132  1.000       379.844     0.
<  ------------------------------------------------------------------------------
<  NSTEP =        7   TIME(PS) =       0.00  TEMP(K) =   293.4  PRESS =     0.
<  Etot   =    -67882.956  EKtot   =     14196.853  EPtot      =    -82079.809
<  BOND   =       241.835  ANGLE   =       509.430  DIHED      =       533.506
<  1-4 NB =       264.963  1-4 EEL =     -3299.513  VDWAALS    =     10355.159
<  EELEC  =    -90685.190  EHBOND  =         0.  RESTRAINT  =         0.
<  SGLF =   1.000   400.0   21.147   15.860  0.900    -82476.442     7.917
<  SGHF =  -0.100   1.095  278.852  284.139  1.000       396.632     0.
<  ------------------------------------------------------------------------------
<  NSTEP =        8   TIME(PS) =       0.00  TEMP(K) =   293.2  PRESS =     0.
<  Etot   =    -67885.377  EKtot   =     14188.988  EPtot      =    -82074.365
<  BOND   =       239.081  ANGLE   =       495.674  DIHED      =       531.819
<  1-4 NB =       263.200  1-4 EEL =     -3297.812  VDWAALS    =     10363.628
<  EELEC  =    -90669.956  EHBOND  =         0.  RESTRAINT  =         0.
<  SGLF =   1.000   400.0   21.138   15.853  0.900    -82474.431     6.933
<  SGHF =  -0.100   1.095  278.861  284.146  1.000       400.066     0.
<  ------------------------------------------------------------------------------
<  NSTEP =        9   TIME(PS) =       0.00  TEMP(K) =   293.2  PRESS =     0.
<  Etot   =    -67887.774  EKtot   =     14190.225  EPtot      =    -82077.999
<  BOND   =       237.101  ANGLE   =       487.037  DIHED      =       530.504
<  1-4 NB =       261.286  1-4 EEL =     -3295.290  VDWAALS    =     10371.436
<  EELEC  =    -90670.075  EHBOND  =         0.  RESTRAINT  =         0.
<  SGLF =   1.000   400.0   21.129   15.847  0.900    -82472.449     5.677
<  SGHF =  -0.100   1.095  278.870  284.152  1.000       394.449     0.
<  ------------------------------------------------------------------------------
< wrapping first mol.:      -45.3230       32.0482       55.5091
<  NSTEP =       10   TIME(PS) =       0.01  TEMP(K) =   293.4  PRESS =     0.
<  Etot   =    -67890.099  EKtot   =     14197.615  EPtot      =    -82087.714
<  BOND   =       235.904  ANGLE   =       484.929  DIHED      =       529.637
<  1-4 NB =       259.415  1-4 EEL =     -3292.085  VDWAALS    =     10378.701
<  EELEC  =    -90684.217  EHBOND  =         0.  RESTRAINT  =         0.
<  SGLF =   1.000   400.0   21.121   15.840  0.900    -82470.525     4.261
<  SGHF =  -0.100   1.095  278.878  284.159  1.000       382.811     0.
---
>  NSTEP =        2   TIME(PS) =       0.00  TEMP(K) =*********  PRESS =     0.
>  Etot   = **************  EKtot   =  89183552.646  EPtot      = **************
>  BOND   =       236.305  ANGLE   =       523.804  DIHED      =       542.605
>  1-4 NB =       264.770  1-4 EEL =     -3290.571  VDWAALS    = **************
>  EELEC  =    -91054.383  EHBOND  =         0.  RESTRAINT  =         0.
>  SGLF =   1.000   400.0   21.216   15.912  0.900 **************     0.
>  SGHF =  -0.100      NaN  278.783  284.087  1.000         0.     0.
>  ------------------------------------------------------------------------------
>  NSTEP =        3   TIME(PS) =       0.00  TEMP(K) =*********  PRESS =     0.
>  Etot   = **************  EKtot   =  89179092.073  EPtot      = **************
>  BOND   =       236.305  ANGLE   =       523.804  DIHED      =       542.605
>  1-4 NB =       264.770  1-4 EEL =     -3290.571  VDWAALS    = **************
>  EELEC  =    -91054.383  EHBOND  =         0.  RESTRAINT  =         0.
>  SGLF =   1.000   400.0   21.298   15.974  0.900 **************     0.
>  SGHF =  -0.100      NaN  278.701  284.026  1.000         0.     0.
>  ------------------------------------------------------------------------------
>  NSTEP =        4   TIME(PS) =       0.00  TEMP(K) =*********  PRESS =     0.
>  Etot   = **************  EKtot   =  89179092.519  EPtot      = **************
>  BOND   =       236.305  ANGLE   =       523.804  DIHED      =       542.605
>  1-4 NB =       264.770  1-4 EEL =     -3290.571  VDWAALS    = **************
>  EELEC  =    -91054.383  EHBOND  =         0.  RESTRAINT  =         0.
>  SGLF =   1.000   400.0   21.494   16.120  0.900 **************     0.
>  SGHF =  -0.100      NaN  278.505  283.879  1.000         0.     0.
>  ------------------------------------------------------------------------------
>  NSTEP =        5   TIME(PS) =       0.00  TEMP(K) =*********  PRESS =     0.
>  Etot   = **************  EKtot   =  89179092.519  EPtot      = **************
>  BOND   =       236.305  ANGLE   =       523.804  DIHED      =       542.605
>  1-4 NB =       264.770  1-4 EEL =     -3290.571  VDWAALS    = **************
>  EELEC  =    -91054.383  EHBOND  =         0.  RESTRAINT  =         0.
>  SGLF =   1.000   400.0   21.847   16.385  0.900 **************     0.
>  SGHF =  -0.100      NaN  278.152  283.614  1.000         0.     0.
>  ------------------------------------------------------------------------------
>  NSTEP =        6   TIME(PS) =       0.00  TEMP(K) =*********  PRESS =     0.
>  Etot   = **************  EKtot   =  89179092.519  EPtot      = **************
>  BOND   =       236.305  ANGLE   =       523.804  DIHED      =       542.605
>  1-4 NB =       264.770  1-4 EEL =     -3290.571  VDWAALS    = **************
>  EELEC  =    -91054.383  EHBOND  =         0.  RESTRAINT  =         0.
>  SGLF =   1.000   400.0   22.401   16.801  0.900 **************     0.
>  SGHF =  -0.100      NaN  277.598  283.198  1.000         0.     0.
>  ------------------------------------------------------------------------------
>  NSTEP =        7   TIME(PS) =       0.00  TEMP(K) =*********  PRESS =     0.
>  Etot   = **************  EKtot   =  89179092.519  EPtot      = **************
>  BOND   =       236.305  ANGLE   =       523.804  DIHED      =       542.605
>  1-4 NB =       264.770  1-4 EEL =     -3290.571  VDWAALS    = **************
>  EELEC  =    -91054.383  EHBOND  =         0.  RESTRAINT  =         0.
>  SGLF =   1.000   400.0   23.200   17.400  0.900 **************     0.
>  SGHF =  -0.100      NaN  276.799  282.599  1.000         0.     0.
>  ------------------------------------------------------------------------------
>  NSTEP =        8   TIME(PS) =       0.00  TEMP(K) =*********  PRESS =     0.
>  Etot   = **************  EKtot   =  89179092.519  EPtot      = **************
>  BOND   =       236.305  ANGLE   =       523.804  DIHED      =       542.605
>  1-4 NB =       264.770  1-4 EEL =     -3290.571  VDWAALS    = **************
>  EELEC  =    -91054.383  EHBOND  =         0.  RESTRAINT  =         0.
>  SGLF =   1.000   400.0   24.285   18.213  0.900 **************     0.
>  SGHF =  -0.100      NaN  275.714  281.786  1.000         0.     0.
>  ------------------------------------------------------------------------------
>  NSTEP =        9   TIME(PS) =       0.00  TEMP(K) =*********  PRESS =     0.
>  Etot   = **************  EKtot   =  89179092.519  EPtot      = **************
>  BOND   =       236.305  ANGLE   =       523.804  DIHED      =       542.605
>  1-4 NB =       264.770  1-4 EEL =     -3290.571  VDWAALS    = **************
>  EELEC  =    -91054.383  EHBOND  =         0.  RESTRAINT  =         0.
>  SGLF =   1.000   400.0   25.697   19.273  0.900 **************     0.
>  SGHF =  -0.100      NaN  274.302  280.726  1.000         0.     0.
>  ------------------------------------------------------------------------------
> wrapping first mol.:       22.6615       32.0482       55.5091
>  NSTEP =       10   TIME(PS) =       0.01  TEMP(K) =*********  PRESS =     0.
>  Etot   = **************  EKtot   =  89179092.519  EPtot      = **************
>  BOND   =       236.305  ANGLE   =       523.804  DIHED      =       542.605
>  1-4 NB =       264.770  1-4 EEL =     -3290.571  VDWAALS    = **************
>  EELEC  =    -91054.383  EHBOND  =         0.  RESTRAINT  =         0.
>  SGLF =   1.000   400.0   27.478   20.608  0.900 **************     0.
>  SGHF =  -0.100      NaN  272.521  279.391  1.000         0.     0.
182,188c182,188
<  NSTEP =       10   TIME(PS) =       0.01  TEMP(K) =   295.7  PRESS =     0.
<  Etot   =    -67880.071  EKtot   =     14307.975  EPtot      =    -82188.047
<  BOND   =       242.006  ANGLE   =       519.693  DIHED      =       536.294
<  1-4 NB =       264.804  1-4 EEL =     -3296.488  VDWAALS    =     10338.772
<  EELEC  =    -90793.130  EHBOND  =         0.  RESTRAINT  =         0.
<  SGLF =   1.000   400.0   21.161   15.871  0.900    -82478.795     5.821
<  SGHF =  -0.100   1.095  278.838  284.128  1.000       290.747     0.
---
>  NSTEP =       10   TIME(PS) =       0.01  TEMP(K) =*********  PRESS =     0.
>  Etot   = **************  EKtot   =  82491697.058  EPtot      = **************
>  BOND   =       236.305  ANGLE   =       523.804  DIHED      =       542.605
>  1-4 NB =       264.770  1-4 EEL =     -3290.571  VDWAALS    = **************
>  EELEC  =    -91054.383  EHBOND  =         0.  RESTRAINT  =         0.
>  SGLF =   1.000   400.0   23.012   17.259  0.900 **************     0.
>  SGHF =  -0.100      NaN  276.987  282.740  1.000         0.     0.
191,197c191,197
<  NSTEP =       10   TIME(PS) =       0.01  TEMP(K) =     2.9  PRESS =     0.
<  Etot   =         6.027  EKtot   =       142.184  EPtot      =       137.039
<  BOND   =         4.500  ANGLE   =        22.786  DIHED      =         4.567
<  1-4 NB =         2.587  1-4 EEL =         3.256  VDWAALS    =        27.811
<  EELEC  =       131.166  EHBOND  =         0.  RESTRAINT  =         0.
<  SGLF =   0.     0.    0.026    0.019  0.         4.925     2.554
<  SGHF =   0.   0.    0.026    0.019  0.       132.971     0.
---
>  NSTEP =       10   TIME(PS) =       0.01  TEMP(K) =414691.5  PRESS =     0.
>  Etot   =            NaN  EKtot   =  20063672.986  EPtot      =            NaN
>  BOND   =         0.  ANGLE   =         0.  DIHED      =         0.
>  1-4 NB =         0.  1-4 EEL =         0.000  VDWAALS    =            NaN
>  EELEC  =         0.001  EHBOND  =         0.  RESTRAINT  =         0.
>  SGLF =   0.     0.    2.056    1.542  0.            NaN     0.
>  SGHF =   0.      NaN    2.056    1.542  0.         0.     0.
---------------------------------------
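
The damaged fields in the diff above share two signatures: runs of asterisks, which Fortran prints when a number does not fit its output field, and NaN. A minimal Python sketch for flagging such lines in an mdout or .dif file follows. The function name find_bad_fields and the "three or more asterisks" threshold are assumptions made for this illustration; nothing here is part of the AMBER test scripts.

```python
import re

# Three or more asterisks in a row (a Fortran field overflow), or a NaN token.
BAD_FIELD = re.compile(r"\*{3,}|\bNaN\b")


def find_bad_fields(lines):
    """Return (line_number, text) pairs for lines holding overflowed or NaN fields."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if BAD_FIELD.search(line)
    ]


if __name__ == "__main__":
    # Three lines copied from the diff above, used as sample input.
    sample = [
        " NSTEP =        2   TIME(PS) =       0.00  TEMP(K) =*********  PRESS =     0.",
        " BOND   =       236.305  ANGLE   =       523.804  DIHED      =       542.605",
        " SGHF =  -0.100      NaN  278.783  284.087  1.000         0.     0.",
    ]
    for number, text in find_bad_fields(sample):
        print(number, text)
```

Passing an open file object instead of a list works the same way, since the function only iterates over lines.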
Further, some segfault errors appeared on related tests that passed. Below is a snippet of one of the error messages:
./Run.ips:  Program error
make[3]: [Makefile:148: test.sander.BASIC] Error 1 (ignored)
cd gact_ips && ./Run.ipsnve
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
------------------------------------
Best regards,
Andrew Bellesis
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Dec 21 2018 - 15:30:02 PST