Re: [AMBER] CUDA single gpu usage issue

From: Ray Luo <rluo.uci.edu>
Date: Thu, 5 Mar 2020 22:33:43 -0800

Hi Abhilash,

I've just tested CUDA 10.2 with both Amber 19 and the pre-release
version of Amber 20, and all test cases (both AmberTools and pmemd)
passed without any problem. Unfortunately, I can't easily install CUDA
9.2 any more, so I can't test it.

These tests were run on CentOS 7 on a Rocks 7.0 cluster. CUDA was installed
with the RPM installer.

All the best,
Ray

--
Ray Luo, Ph.D.
Professor of Structural Biology/Biochemistry/Biophysics,
Chemical Physics, Biomedical Engineering, and Chemical Engineering
Department of Molecular Biology and Biochemistry
University of California, Irvine, CA 92697-3900
On Wed, Mar 4, 2020 at 10:03 AM Ray Luo <rluo.uci.edu> wrote:
>
> Hi Abhilash,
>
> We'll need to reinstall cuda 9.2 to reproduce what you are seeing. For
> now, please stay with cuda 10.0 for your amber installation.
>
> All the best,
> Ray
> --
> Ray Luo, Ph.D.
> Professor of Structural Biology/Biochemistry/Biophysics,
> Chemical Physics, Biomedical Engineering, and Chemical Engineering
> Department of Molecular Biology and Biochemistry
> University of California, Irvine, CA 92697-3900
>
> On Wed, Mar 4, 2020 at 9:37 AM Abhilash J <md.scfbio.gmail.com> wrote:
> >
> > Hi Everyone,
> >
> >     I tried what Ruxi suggested, but I could not find *libcublas* in
> > /usr/lib64 or /usr/lib for CUDA 10.2. So I tried using the *libcublas*
> > files from my CUDA 9.2 installation instead, in case they were compatible
> > and useful.
> >     That did not go well, and I ended up with the same error (/usr/bin/ld:
> > cannot find -lcublas). I did add the path to the *libcublas* files to my
> > LD_PATH_LIBRARY. I am using CentOS 7.
> >     I also tried what David suggested earlier in the thread, compiling
> > only pmemd.cuda with CUDA 10.2, and that build succeeded. I did not see any
> > appreciable change in speed, and pmemd.cuda is what I use most of the
> > time.
> >     I was wondering what performance gain to expect if I manage to get
> > everything compiled with 10.2. Should I wait for AMBER20, given that it is
> > only a few months away?
> >     It would be great if someone could share some expected numbers.
> >
> > Regards
> >
> > Abhilash
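
A note on the "cannot find -lcublas" error reported above. As the GNU toolchain documentation describes it, -lcublas is resolved at link time: ld looks for libcublas.so or libcublas.a in the directories given with -L, and gcc/gfortran also consult LIBRARY_PATH. LD_LIBRARY_PATH mainly affects how shared libraries are located at run time, so adding a directory there may not be enough on its own. A directory that holds only versioned files such as libcublas.so.10 would not satisfy -lcublas either. The sketch below checks for a linkable name; the function name has_linkable_cublas and the /usr/lib64 path in the usage comment are illustrative assumptions, not part of the Amber build system.

```shell
# Sketch: succeed if directory $1 holds a file that "-lcublas" could resolve to.
# The linker needs the unversioned name (libcublas.so) or a static archive.
has_linkable_cublas() {
    [ -e "$1/libcublas.so" ] || [ -e "$1/libcublas.a" ]
}

# Hypothetical usage, assuming the library was installed under /usr/lib64:
#   if has_linkable_cublas /usr/lib64; then
#       export LIBRARY_PATH=/usr/lib64${LIBRARY_PATH:+:$LIBRARY_PATH}
#   fi
```

If the check fails for every directory that "find" reports, the development symlink is probably missing, which would explain the error persisting even with the path exported.
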
> >
> > On Wed, Mar 4, 2020 at 11:25 AM Nicholas Moyer <nmoyer.broadinstitute.org>
> > wrote:
> >
> > > Good Morning,
> > >
> > > So I tried all the recommendations that were suggested, and I was able to
> > > install with CUDA 9.2, but when I run make test I get the following
> > > errors, which all look pretty similar. Any suggestions on why I'm able to
> > > build and install Amber with this, yet all the tests fail? I also
> > > reinstalled Amber, and everything looks fine as far as I can tell, but
> > > clearly something is wrong. Thanks again for all the help!
> > >
> > > [image: image.png]
> > >
> > > On Wed, Mar 4, 2020 at 4:22 AM Ruxi Qi <ruxiq.uci.edu> wrote:
> > >
> > > > Sorry, typo, should be /usr/lib/x86_64-linux-gnu.
> > > >
> > > > BTW, we will add handling for this in the AmberTools 20 release.
> > > >
> > > > Ruxi
> > > > On 3/4/20 5:05 PM, Ruxi Qi wrote:
> > > >
> > > > Hi Nicholas,
> > > >
> > > > The issue arises because, starting with CUDA 10.1, some libraries,
> > > > including cuBLAS, are installed in the standard system locations rather
> > > > than in the Toolkit installation directory. Depending on the distribution,
> > > > the install location can be /usr/lib/x84_64-linux-gnu (as on Ubuntu
> > > > 18.04), /usr/lib64 (as on CentOS 7), or /usr/lib. You can check this by
> > > > executing:
> > > >
> > > > sudo find /usr -name 'libcublas*'
> > > >
> > > > So the solution is either to create a symlink in the CUDA Toolkit library
> > > > path to the missing library file or, more conveniently, to add the
> > > > location to your library search path in your .bashrc:
> > > >
> > > > export LD_LIBRARY_PATH=/usr/lib/x84_64-linux-gnu:$LD_LIBRARY_PATH
> > > >
> > > > Hope it helps.
> > > >
> > > > Best,
> > > >
> > > > Ruxi
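
The two options described above (locating the libraries, then either symlinking them into the Toolkit directory or extending the search path) can be sketched as a pair of shell functions. This is only a sketch under stated assumptions: the function names find_cublas_dirs and link_cublas_into are hypothetical, the candidate directories are the ones named in this thread (using the corrected x86_64 spelling), and the Toolkit library directory /usr/local/cuda-10.2/lib64 in the usage comment is a guess based on the nvcc path quoted elsewhere in the thread.

```shell
# Sketch: print each directory (given as arguments) that contains a libcublas file.
find_cublas_dirs() {
    for dir in "$@"; do
        for f in "$dir"/libcublas*; do
            # An unmatched glob stays literal, so confirm the path exists.
            if [ -e "$f" ]; then
                printf '%s\n' "$dir"
                break
            fi
        done
    done
}

# Sketch: symlink every libcublas* file from directory $1 into directory $2,
# leaving any file that is already present in $2 untouched.
link_cublas_into() {
    src=$1
    dest=$2
    for f in "$src"/libcublas*; do
        [ -e "$f" ] || continue
        target="$dest/$(basename "$f")"
        [ -e "$target" ] || ln -s "$f" "$target"
    done
}

# Hypothetical usage (paths are assumptions, adjust to your system):
#   find_cublas_dirs /usr/lib/x86_64-linux-gnu /usr/lib64 /usr/lib
#   sudo sh -c '. ./this_file.sh; link_cublas_into /usr/lib64 /usr/local/cuda-10.2/lib64'
```

Symlinking changes the Toolkit tree for every user of the machine, whereas exporting a path only affects the current shell, so the export is the less invasive option to try first.
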
> > > > On 3/4/20 6:04 AM, Nicholas Moyer wrote:
> > > >
> > > > This is the closest error I've seen to date. I will have to try CUDA
> > > > 9.2; I had been trying to get 10 to work so I wouldn't have to reinstall
> > > > on a bunch of machines, but that looks like my only option. Thank you
> > > > very much. I will let you know tomorrow if that works!
> > > >
> > > > On Tue, Mar 3, 2020 at 4:58 PM Abhilash J <md.scfbio.gmail.com> wrote:
> > > >
> > > >
> > > > Hi Everyone,
> > > >
> > > >     I am also having a similar (but not identical) issue installing AMBER
> > > > 18. The compile seems to complete without a glitch if I use CUDA 9.2; the
> > > > error occurs if I use CUDA 10.2. Did you give CUDA 9.2 a shot?
> > > >     The error I am dealing with is as follows.
> > > >     Any comments would be useful.
> > > >
> > > > ============error file======================
> > > > Warning: Deleted feature: ASSIGN statement at (1)
> > > > /bin/ld: cannot find -lcublas
> > > > collect2: error: ld returned 1 exit status
> > > > make[2]: *** [pbsa.cuda] Error 1
> > > > make[1]: *** [cuda_serial] Error 2
> > > > make: *** [install] Error 2
> > > > ===========================================
> > > >
> > > > ========out file====================
> > > > /usr/local/cuda-10.2/bin/nvcc -gencode arch=compute_30,code=sm_30 -gencode
> > > > arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode
> > > > arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode
> > > > arch=compute_53,code=sm_53 -gencode arch=compute_60,code=sm_60 -gencode
> > > > arch=compute_61,code=sm_61 -gencode arch=compute_60,code=sm_70, -gencode
> > > > arch=compute_61,code=sm_70 -Wno-deprecated-declarations -use_fast_math -O3
> > > > -ccbin g++ -I../cusplibrary-cuda9 -o cusp_LinearSolvers.o -c
> > > > cusp_LinearSolvers.cu -DDIA
> > > > [PBSA]  FC pbsa.cuda
> > > > make[2]: Leaving directory `/home33/ajayaraj/AMBER/amber18/AmberTools/src/pbsa'
> > > > make[1]: Leaving directory `/home33/ajayaraj/AMBER/amber18/AmberTools/src'
> > > > =================================
> > > >
> > > >
> > > > On Tue, Mar 3, 2020 at 1:27 PM Ray Luo <rluo.uci.edu> wrote:
> > > >
> > > >
> > > > Nicholas,
> > > >
> > > > Any CUDA version higher than 10.0 could be a problem. We only tested
> > > > version 10 and older releases when the amber19 was released earlier
> > > > last year.
> > > >
> > > > Right now we are testing the current release of 10.2 on one of our GPU
> > > > boxes.
> > > >
> > > > All the best,
> > > > Ray
> > > > --
> > > > Ray Luo, Ph.D.
> > > > Professor of Structural Biology/Biochemistry/Biophysics,
> > > > Chemical Physics, Biomedical Engineering, and Chemical Engineering
> > > > Department of Molecular Biology and Biochemistry
> > > > University of California, Irvine, CA 92697-3900
> > > >
> > > > On Tue, Mar 3, 2020 at 10:20 AM Nicholas Moyer <nmoyer.broadinstitute.org> wrote:
> > > >
> > > > Good Afternoon,
> > > >
> > > > I have tried make install in src/pmemd, and it seems to install, but
> > > > then if I run make test, every test fails. I am currently re-installing
> > > > from scratch, as you suggested, because there were still several cuda.mpi
> > > > files in some of the folders after the clean command. Here is some info I
> > > > forgot to add in my original email. Thank you for all the suggestions so far!
> > > >
> > > > OS: Ubuntu 18.04
> > > > shell: bash
> > > > compiler: gnu
> > > > CUDA toolkit: 10.1.243
> > > > amber: 18
> > > > amber-toolkit: 19
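
The details listed above (OS, compiler, CUDA toolkit) can be collected in one step when reporting a build problem. The following is a minimal sketch, not an Amber utility: the function name report_env is hypothetical, and the choice of gcc, gfortran and nvcc as the tools to query is an assumption based on the "compiler: gnu" entry and the nvcc command line quoted in this thread.

```shell
# Sketch: print the OS plus the version banner of each build tool found on PATH.
report_env() {
    printf 'OS: %s\n' "$(uname -sr)"
    for tool in gcc gfortran nvcc; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf -- '--- %s --version ---\n' "$tool"
            "$tool" --version
        else
            # A missing tool is reported rather than treated as an error.
            printf '%s: not found on PATH\n' "$tool"
        fi
    done
}

# Hypothetical usage: report_env > env_report.txt
```

The Amber and AmberTools versions still need to be added by hand, since this sketch does not try to detect them.
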
> > > >
> > > >
> > > > On Tue, Mar 3, 2020 at 10:09 AM David A Case <david.case.rutgers.edu> wrote:
> > > >
> > > > On Tue, Mar 03, 2020, Nicholas Moyer wrote:
> > > >
> > > >
> > > > So I've been having an error with CUDA. I have been dealing with an
> > > > annoying CUDA issue: I have Amber18/AmberTools19 installed and set up
> > > > with multiple GPUs, but when I configure Amber for single-GPU usage, it
> > > > compiles, yet when I run make install it gives me a huge error block.
> > > >
> > > > The error involves pbsa.cuda, which may not be the program you really
> > > > want (most people are more eager to run pmemd.cuda).  If that is the
> > > > case, after the configure step, do this:
> > > >
> > > >      cd src/pmemd
> > > >      make install
> > > >
> > > > That will install just pmemd.cuda, whose installation is better tested.
> > > >
> > > > (Of course, you may still have problems, since for most people, building
> > > > pbsa.cuda gives no problems.  If things still don't work, provide some
> > > > details about your OS, compiler and CUDA toolkit versions.  Also, since
> > > > you apparently previously installed the cuda.MPI versions, and are now
> > > > trying to get the cuda serial codes, start from a completely fresh
> > > > directory tree, just in case something left over from the MPI install
> > > > is causing problems.)
> > > >
> > > > I'm cc-ing this to Ray Luo, in case he may have a better handle on
> > > > recognizing the problem.
> > > >
> > > > ...good luck...dac
> > > >
> > > >
> > > >
> > > > /usr/bin/ld: warning: libcublasLt.so.10, needed by /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so, not found (try using -rpath or -rpath-link)
> > > > /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so: undefined reference to `cublasLtShutdownCtx@libcublasLt.so.10'
> > > > /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so: undefined reference to `cublasLtGetProperty@libcublasLt.so.10'
> > > > /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so: undefined reference to `init_gemm_select@libcublasLt.so.10'
> > > > /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so: undefined reference to `runGemmShortApi@libcublasLt.so.10'
> > > > /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so: undefined reference to `cublasLtMatmulAlgoGetHeuristic@libcublasLt.so.10'
> > > > /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so: undefined reference to `gemm_utilization@libcublasLt.so.10'
> > > > /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so: undefined reference to `cublasLtMatmul@libcublasLt.so.10'
> > > > /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so: undefined reference to `free_gemm_select@libcublasLt.so.10'
> > > > /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so: undefined reference to `cublasLtCtxInit@libcublasLt.so.10'
> > > > /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so: undefined reference to `cublasLtMatmulAlgoInit@libcublasLt.so.10'
> > > > /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so: undefined reference to `runGemmApi@libcublasLt.so.10'
> > > > /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so: undefined reference to `cublasLtGetCudartVersion@libcublasLt.so.10'
> > > > /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so: undefined reference to `cublasLtGetVersion@libcublasLt.so.10'
> > > > collect2: error: ld returned 1 exit status
> > > > Makefile:156: recipe for target 'pbsa.cuda' failed
> > > > make[2]: *** [pbsa.cuda] Error 1
> > > > make[2]: Leaving directory '/opt/amber18/AmberTools/src/pbsa'
> > > > Makefile:447: recipe for target 'cuda_serial' failed
> > > > make[1]: *** [cuda_serial] Error 2
> > > > make[1]: Leaving directory '/opt/amber18/AmberTools/src'
> > > > Makefile:7: recipe for target 'install' failed
> > > > make: *** [install] Error 2
> > > >
> > > > _______________________________________________
> > > > AMBER mailing list
> > > > AMBER.ambermd.org
> > > > http://lists.ambermd.org/mailman/listinfo/amber
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Thu Mar 05 2020 - 23:00:02 PST