Re: [AMBER] Error building Amber22 with cuda

From: Dhariwal, Rohit via AMBER <amber.ambermd.org>
Date: Wed, 13 Mar 2024 15:26:08 +0000

Hi Andy,

I just wanted to update you: I was able to build amber22 with cuda/11 support after disabling QUICK. I ran a test case on our HPC cluster and got it to run on A100 nodes, but I get the following error when running it on H100 nodes:

cudaMemcpyToSymbol: SetSim copy to cSim failed invalid device symbol
cudaMemcpyToSymbol: SetSim copy to cSim failed invalid device symbol

Could this be happening because this amber22 build is not configured for H100 GPUs? I see the following output in cmake.log:

-- CUDA version 11.0 detected
-- Configuring for SM3.5, SM5.0, SM5.2, SM5.3, SM6.0, SM6.1, SM7.0, SM7.5 and SM8.0

Can you please advise how I can fix this issue so that I can run amber22 on H100 nodes?
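
One way to check this directly is to look for the target architecture in the configure output. H100 GPUs report compute capability 9.0, which does not appear in the list above. The following is a minimal sketch, assuming GNU grep and a cmake.log in the format shown; `log_has_sm` is a hypothetical helper name, not part of Amber:

```shell
# Hypothetical helper (not part of Amber): succeed if the cmake log
# lists the given SM architecture on its "Configuring for" line.
log_has_sm() {
    # $1 = path to cmake.log, $2 = architecture, e.g. 9.0
    grep -E '^-- Configuring for SM' "$1" \
        | grep -oE 'SM[0-9]+\.[0-9]+' \
        | grep -qxF "SM$2"
}

# Example: log_has_sm cmake.log 9.0 || echo "SM9.0 not configured"
```

With the log quoted above, the check succeeds for 8.0 and fails for 9.0. That suggests, though nothing in this thread confirms it, that a build against a CUDA toolkit with SM9.0 support would be needed for the H100 nodes.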

Best,
Rohit
________________________________
From: Goetz, Andreas <awgoetz.ucsd.edu>
Sent: Saturday, March 2, 2024 3:00 PM
To: Dhariwal, Rohit <rohit.dhariwal.wsu.edu>
Cc: AMBER Mailing List <amber.ambermd.org>; Manathunga Mudiyanselage, Madushanka <manathun.msu.edu>
Subject: Re: [AMBER] Error building Amber22 with cuda


Hi Rohit,

Thank you for providing this information.

The issue shows up with Intel compilers. I think we identified the underlying problem. We will work on a bugfix.

All the best,
Andy


Dr. Andreas W. Goetz
Associate Research Scientist
San Diego Supercomputer Center
Tel: +1-858-822-4771
Email: agoetz.sdsc.edu
Web: www.awgoetz.de

On Mar 1, 2024, at 11:35 AM, Dhariwal, Rohit <rohit.dhariwal.wsu.edu> wrote:

Hi Andy,

Thanks a lot for your reply. Below are the compilers and their versions (from cmake.log) that I am currently using:

-- Setting C compiler to icc
-- Setting CXX compiler to icpc
-- Setting Fortran compiler to ifort
-- Amber source found, building AmberTools and Amber
-- The C compiler identification is Intel 19.1.2.20200623
-- The CXX compiler identification is Intel 19.1.2.20200623
-- The Fortran compiler identification is Intel 19.1.2.20200623
-- CUDA version 11.0 detected

I'm also providing the cmake options that I'm using for your reference:

export INSTALL_DIR=/opt/apps/amber
export PYTHON_EXEC=/opt/apps/anaconda3/22.10.0/bin/python3
export CUDA_HOME=/opt/apps/cuda/11.0.3

  cmake $AMBER_PREFIX/amber22_src \
    -DCMAKE_INSTALL_PREFIX=$INSTALL_DIR/amber22 \
    -DCOMPILER=INTEL -DPYTHON_EXECUTABLE=$PYTHON_EXEC \
    -DMPI=FALSE -DCUDA=TRUE -DINSTALL_TESTS=TRUE \
    -DDOWNLOAD_MINICONDA=FALSE -DMKL_HOME=$MKLROOT \
    -DCUDA_TOOLKIT_ROOT_DIR=$CUDA_HOME \
    2>&1 | tee cmake.log

I'll also try different versions of compilers and let you know how it goes.

Best,
Rohit

________________________________
From: Goetz, Andreas <awgoetz.ucsd.edu>
Sent: Thursday, February 29, 2024 1:14 PM
To: Dhariwal, Rohit <rohit.dhariwal.wsu.edu>; AMBER Mailing List <amber.ambermd.org>
Subject: Re: [AMBER] Error building Amber22 with cuda

Hi Rohit,

This is probably a bug in QUICK that may show up with certain compilers.

Please let us know which compilers and compiler versions you are using (C, Fortran, CUDA) so we can test and fix this as required.

In the meantime, there are two ways forward: 1) use different compilers, or 2) if you do not plan to use QUICK, disable it by passing -DDISABLE_TOOLS=quick to cmake.
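
For illustration, option 2 applied to the cmake options quoted earlier in this thread might look like the sketch below. Only -DDISABLE_TOOLS=quick is new; the environment variables (AMBER_PREFIX, INSTALL_DIR, PYTHON_EXEC, MKLROOT, CUDA_HOME) are assumed to be set as in that message, and the bash array `cmake_args` is just a hypothetical name used to keep the options in one place:

```shell
# Options from the earlier message plus the switch that disables QUICK.
# Requires bash (arrays); cmake_args is an illustrative name.
cmake_args=(
    "$AMBER_PREFIX/amber22_src"
    -DCMAKE_INSTALL_PREFIX="$INSTALL_DIR/amber22"
    -DCOMPILER=INTEL -DPYTHON_EXECUTABLE="$PYTHON_EXEC"
    -DMPI=FALSE -DCUDA=TRUE -DINSTALL_TESTS=TRUE
    -DDOWNLOAD_MINICONDA=FALSE -DMKL_HOME="$MKLROOT"
    -DCUDA_TOOLKIT_ROOT_DIR="$CUDA_HOME"
    -DDISABLE_TOOLS=quick
)

# Print the options for inspection before configuring.
printf '%s\n' "${cmake_args[@]}"

# Then configure from an empty build directory:
# cmake "${cmake_args[@]}" 2>&1 | tee cmake.log
```

Keeping the options in an array makes it easy to confirm that the new flag is present before rerunning the configure step.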

Thanks,
Andy


Dr. Andreas W. Goetz
Associate Research Scientist
San Diego Supercomputer Center
Tel: +1-858-822-4771
Email: agoetz.sdsc.edu
Web: www.awgoetz.de

On Feb 28, 2024, at 11:11 AM, Dhariwal, Rohit via AMBER <amber.ambermd.org> wrote:

Dear all,

I am getting the following error while building Amber22 with CUDA support on our HPC cluster, using cuda/11.0.3. I was able to install the serial and parallel (MPI) versions of Amber successfully. I would really appreciate your help in resolving this issue.

===============
[100%] Building Fortran object AmberTools/src/quick/src/CMakeFiles/libquick_cuda.dir/getMol.f90.o
[100%] Building Fortran object AmberTools/src/quick/src/CMakeFiles/libquick_cuda.dir/read_job_and_atom.f90.o
[100%] Linking CXX shared library libquick_cuda.so
CMakeFiles/libquick_cuda.dir/modules/quick_eri_module.f90.o: In function `quick_cshell_eri_module_mp_aoint_':
quick_eri_module.f90:(.text+0x4563): undefined reference to `gpu_aoint_'
CMakeFiles/libquick_cuda.dir/modules/oshell_quick_eri_module.f90.o: In function `quick_oshell_eri_module_mp_aoint_':
oshell_quick_eri_module.f90:(.text+0x40e3): undefined reference to `gpu_aoint_'
make[2]: *** [AmberTools/src/quick/src/libquick_cuda.so] Error 1
make[1]: *** [AmberTools/src/quick/src/CMakeFiles/libquick_cuda.dir/all] Error 2
make: *** [all] Error 2
===============


Best,
Rohit

_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber

Received on Wed Mar 13 2024 - 09:00:02 PDT