[AMBER] Parallel Tests after compilation

From: FyD via AMBER <amber.ambermd.org>
Date: Tue, 08 Apr 2025 09:12:57 +0200

Dear All,

My system:
no mkl
CentOS 7.6 + devtoolset8
cmake-3.25.2 compiled with devtoolset8
openmpi-4.1.8 compiled with devtoolset8
Amber24.tar.bz2 + Amber24.tar.bz2 compiled with devtoolset8
Compilation date: April 7, 2025

Here is a part of my run_cmake file:
[...]
# Assume this is Linux:
   export CMAKE_BIN="/opt/cmake-3.25.2/bin/cmake"
   $CMAKE_BIN $AMBER_PREFIX/amber24_src \
     -DCMAKE_INSTALL_PREFIX=$AMBER_PREFIX/amber24 \
     -DCOMPILER=GNU \
     -DMPI=TRUE \
     -DOPENMP=TRUE \
     -DINSTALL_TESTS=TRUE \
     -DDOWNLOAD_MINICONDA=TRUE \
     -DCUDA=FALSE \
     2>&1 | tee z-cmake_April-7-2024.log
[...]

I get: Miniconda version 3.12 is used.
-- Features:
-- MPI: ON
-- MVAPICH2-GDR for GPU-GPU comm.: OFF
-- OpenMP: ON
-- CUDA: OFF
-- Build Shared Libraries: ON
-- Build GUI Interfaces: ON
-- Build Python Programs: ON
-- -Python Interpreter: Internal Miniconda (version 3.12)
-- Build Perl Programs: ON
-- Build configuration: RELEASE
-- Target Processor: x86_64
-- Build Documentation: ON
-- Sander Variants: normal LES API LES-API MPI LES-MPI QUICK-MPI
-- Install location: /usr/local/amber24/
-- Installation of Tests: ON

-- Compilers:
-- C: GNU 8.3.1 (/opt/rh/devtoolset-8/root/usr/bin/gcc)
-- CXX: GNU 8.3.1 (/opt/rh/devtoolset-8/root/usr/bin/g++)
-- Fortran: GNU 8.3.1 (/opt/rh/devtoolset-8/root/usr/bin/gfortran)

And at the end I get the following report:

Finished parallel test suite for Amber 24 at lun. avril 7 17:01:02 CEST 2025.
Tests ran with DO_PARALLEL="mpirun -np 4".
Some tests require 4 or 8 threads to run, while some will not
run with more than 2. Please run further parallel tests with the
appropriate number of processors. See /usr/local/amber24/test/README.

make[2]: Leaving directory `/usr/local/amber24/test'
187 file comparisons passed
6 file comparisons failed (5 of which can be ignored)
2 tests experienced an error
Test log file saved as /usr/local/amber24/logs/test_amber_parallel/2025-04-07_16-55-06.log
Test diffs file saved as /usr/local/amber24/logs/test_amber_parallel/2025-04-07_16-55-06.diff
make[1]: *** [test.parallel] Error 1
make[1]: Leaving directory `/usr/local/amber24/test'

I have questions about the Amber tests, based on the information given
on page 21 of the Amber24.pdf manual:
With export DO_PARALLEL="mpirun -np 2" I get no errors.
With export DO_PARALLEL="mpirun -np 4" I get 2 errors:

1)
==============================================================
export TESTsander='/usr/local/amber24///bin/pmemd.MPI' && cd hrem_wat_pv && ./Run.rem

  Running multipmemd version of pmemd Amber24
     Total processors = 4
     Number of groups = 2
[...]
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
[master1:03611] *** An error occurred in MPI_Bcast
[master1:03611] *** reported by process [2495676417,3]
[master1:03611] *** on communicator MPI COMMUNICATOR 3 SPLIT FROM 0
[master1:03611] *** MPI_ERR_TRUNCATE: message truncated
[master1:03611] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[master1:03611] *** and potentially your MPI job)
[...]
./Run.rem: Program error
make[2]: [test.pmemd.REM] Error 1 (ignored)
export TESTsander='/usr/local/amber24///bin/pmemd.MPI' && cd rem_gb_4rep && ./Run.rem
This test case requires 8, 12, or 16 MPI threads!
[...]

2)
==============================================================
if [ -n "testrism" ]; then cd rism3d/4lzta && ./Run.4lzta.kh.pme; fi
  DO_PARALLEL set to mpirun -np 4; max is 2
if [ -n "testrism" ]; then cd rism3d/1ahoa && ./Run.1ahoa.kh.pme; fi
  DO_PARALLEL set to mpirun -np 4; max is 2
if [ -n "testrism" ]; then cd rism3d/PH4+_triclinic && ./Run.PH4+_triclinic.kh.pme.center0; fi
  DO_PARALLEL set to mpirun -np 4; max is 2
if [ -n "testrism" ]; then cd rism3d/PH4+_triclinic && ./Run.PH4+_triclinic.kh.pme.center1; fi
  DO_PARALLEL set to mpirun -np 4; max is 2
if [ -n "testrism" ]; then cd rism3d/PH4+_triclinic && ./Run.PH4+_triclinic.kh.pme.center2; fi
  DO_PARALLEL set to mpirun -np 4; max is 2
cd bar_pbsa && ./Run.bar_pbsa
Running BAR/PBSA with 4 threads
Skipping complex runs with 4 threads
[...]
Namespace(command='calc', calc_input='1C5X_inputs/ligands_run_input.yaml')
{'dest_path': '1C5X', 'epsin': 1.0, 'radiscale': 1.0, 'protscale': 1.0, 'ligcom': 'ligands', 'del_traj': True}
[...]
parsing /usr/local/amber24/AmberTools/test/bar_pbsa/1C5X/param_sweep_ligands/e_1.0_r_1.0_p_1.0/bar_out/1.000_1.000.out
1C5X ligands
bar_total: 0
possible FAILURE: file 1C5X/concat_ligands/0.000/ti001.nc does not exist.
[...]
==============================================================
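
For reference, the two runs compared above (-np 2 and -np 4) can be
scripted roughly as follows. This is only a minimal sketch: it assumes
that AMBERHOME points to the install prefix (/usr/local/amber24 here)
and that the suite is started with "make test.parallel" from
$AMBERHOME/test, as the make output above suggests. The helper names
do_parallel_for and run_parallel_tests are made up for illustration.

```shell
#!/bin/sh
# Sketch: run the parallel test suite once per MPI process count.
# Assumption: AMBERHOME is set and $AMBERHOME/test has a test.parallel target.

# Build the DO_PARALLEL value for a given process count.
do_parallel_for() {
    printf 'mpirun -np %s' "$1"
}

# Hypothetical helper: run the suite for each count given as an argument.
run_parallel_tests() {
    for np in "$@"; do
        DO_PARALLEL="$(do_parallel_for "$np")"
        export DO_PARALLEL
        echo "Running parallel tests with DO_PARALLEL=\"$DO_PARALLEL\""
        # Run in a subshell so the caller's working directory is unchanged.
        (cd "$AMBERHOME/test" && make test.parallel) \
            || echo "np=$np: make reported errors"
    done
}

# Example call (commented out so that sourcing this file runs nothing):
# run_parallel_tests 2 4
```

Each run writes its own time-stamped log and diff file, so the results
for 2 and 4 processes can be compared afterwards.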

About "DO_PARALLEL set to mpirun -np 4; max is 2" in 2):
Does the latter message mean that the test crashed because 4 cores were
requested instead of 2? If so, why is this test included when
'export DO_PARALLEL="mpirun -np 4"' is set?

And why include a test that requires 8 cores when
'export DO_PARALLEL="mpirun -np 4"' is set?
Surely it will crash...
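
To make the question concrete, here is how I imagine such a check could
work. This is a hypothetical sketch, NOT taken from the Amber Run
scripts; the function names count_procs and check_max are my own. If
the real scripts behave like this, the message would mean that the test
was skipped, not that it crashed.

```shell
#!/bin/sh
# Hypothetical guard, not copied from the Amber test scripts.

# Extract the number that follows "-np" in a DO_PARALLEL string.
# Prints 1 if no "-np N" pair is found.
count_procs() {
    set -- $1            # split the string into words (intentionally unquoted)
    while [ $# -gt 0 ]; do
        if [ "$1" = "-np" ] && [ $# -ge 2 ]; then
            echo "$2"
            return 0
        fi
        shift
    done
    echo 1
}

# Print a message and return 1 when the process count exceeds the limit.
# $1 = DO_PARALLEL string, $2 = maximum number of processes for the test.
check_max() {
    nprocs=$(count_procs "$1")
    if [ "$nprocs" -gt "$2" ]; then
        echo "  DO_PARALLEL set to $1; max is $2"
        return 1
    fi
    return 0
}

# Usage inside a test limited to 2 processes: skip instead of crashing.
# check_max "$DO_PARALLEL" 2 || exit 0
```

With such a guard, a run with "mpirun -np 4" would print the message
and exit before starting the program.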

Thank you.
Best regards, Francois



_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Tue Apr 08 2025 - 00:30:03 PDT