Re: [AMBER] CUDA single gpu usage issue

From: Nicholas Moyer <nmoyer.broadinstitute.org>
Date: Fri, 6 Mar 2020 09:15:43 -0500

Good Morning,

I just wanted to share how I solved my problem of not being able to install
Amber18/AmberTools19 with CUDA 10.2. I reinstalled Amber from scratch and
installed CUDA 10 instead. After that I was able to configure and install,
and eventually got it to pass the tests. There were two issues with my
install. The first was permissions: I would export a variable, but the
definition would not persist. I solved this by installing as full root. The
second was that LD_LIBRARY_PATH pointed to lib instead of lib64, and lib did
not exist in that directory. After I declared the lib64 path in
/etc/environment it worked, and Amber now passes all tests. Thank you for
the support here, and best of luck to anyone with other issues.
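
For anyone checking for the same lib versus lib64 problem, here is a minimal
sketch (an illustration only, not part of the Amber or CUDA tooling) that
lists LD_LIBRARY_PATH entries which do not exist as directories. The function
name check_ld_paths is made up for this example, and it assumes a POSIX sh or
bash.

```shell
# Report entries of a colon-separated path list that are not directories.
# With no argument, the current LD_LIBRARY_PATH is checked.
# Prints one "missing: <dir>" line per bad entry; exit status is 1 if any
# entry is missing, 0 otherwise.
# The body runs in a subshell, so the IFS change does not leak to the caller.
check_ld_paths() (
    IFS=:
    found_missing=0
    for dir in ${1-$LD_LIBRARY_PATH}; do
        # Skip empty fields such as those produced by "a::b".
        [ -n "$dir" ] || continue
        if [ ! -d "$dir" ]; then
            echo "missing: $dir"
            found_missing=1
        fi
    done
    exit "$found_missing"
)
```

Calling check_ld_paths with no argument checks the current LD_LIBRARY_PATH.
A list can also be passed explicitly, for example
check_ld_paths "/usr/local/cuda/lib:/usr/local/cuda/lib64" (those two paths
are only examples and may not match your install).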

Thanks again.

The Best,
Nicholas Moyer.

On Fri, Mar 6, 2020 at 1:34 AM Ray Luo <rluo.uci.edu> wrote:

> Hi Abhilash,
>
> I've just tested cuda 10.2 with both amber 19 and the pre-release
> version of Amber 20 and all test cases (both ambertools and pmemd)
> passed without any problem. Unfortunately, I can't easily install cuda
> 9.2 any more, so can't test it.
>
> These were done on centos 7 on a rocks 7.0 cluster. Cuda was installed
> with the rpm installer.
>
> All the best,
> Ray
>
> --
> Ray Luo, Ph.D.
> Professor of Structural Biology/Biochemistry/Biophysics,
> Chemical Physics, Biomedical Engineering, and Chemical Engineering
> Department of Molecular Biology and Biochemistry
> University of California, Irvine, CA 92697-3900
> On Wed, Mar 4, 2020 at 10:03 AM Ray Luo <rluo.uci.edu> wrote:
> >
> > Hi Abhilash,
> >
> > We'll need to reinstall cuda 9.2 to reproduce what you are seeing. For
> > now, please stay with cuda 10.0 for your amber installation.
> >
> > All the best,
> > Ray
> > --
> > Ray Luo, Ph.D.
> > Professor of Structural Biology/Biochemistry/Biophysics,
> > Chemical Physics, Biomedical Engineering, and Chemical Engineering
> > Department of Molecular Biology and Biochemistry
> > University of California, Irvine, CA 92697-3900
> >
> > On Wed, Mar 4, 2020 at 9:37 AM Abhilash J <md.scfbio.gmail.com> wrote:
> > >
> > > Hi Everyone,
> > >
> > > > I tried what Ruxi had suggested, but I could not find *libcublas* in
> > > > /usr/lib64 or /usr/lib for CUDA 10.2. I then tried using the
> > > > *libcublas* files from my CUDA 9.2 installation (just in case they
> > > > were compatible and useful). That did not go well, and I ended up with
> > > > the same error (/usr/bin/ld: cannot find -lcublas). I did add the path
> > > > to the *libcublas* files to my LD_PATH_LIBRARY. I am using CentOS 7.
> > > >
> > > > I also tried what David suggested earlier in the thread, compiling only
> > > > pmemd.cuda with CUDA 10.2, and that worked. I did not get any
> > > > appreciable change in speed, and pmemd.cuda is what I use most of the
> > > > time.
> > > >
> > > > I was wondering what the performance gain would be if I managed to get
> > > > it compiled with 10.2. Should I wait for AMBER20, given that it is only
> > > > a few months away? It would be great if someone could share some
> > > > expected numbers.
> > >
> > > Regards
> > >
> > > Abhilash
> > >
> > > > On Wed, Mar 4, 2020 at 11:25 AM Nicholas Moyer <nmoyer.broadinstitute.org> wrote:
> > >
> > > > Good Morning,
> > > >
> > > > > So I tried all the recommendations that were suggested, and I was able
> > > > > to install with CUDA 9.2, but when I run make test I get the following
> > > > > errors, which all look pretty similar. Any suggestions on why I am able
> > > > > to build and install Amber with this but all tests fail? I also
> > > > > reinstalled Amber and everything looks fine as far as I can tell, but
> > > > > clearly something is wrong. Thanks again for all the help!
> > > >
> > > > [image: image.png]
> > > >
> > > > On Wed, Mar 4, 2020 at 4:22 AM Ruxi Qi <ruxiq.uci.edu> wrote:
> > > >
> > > > > Sorry, typo, should be /usr/lib/x86_64-linux-gnu.
> > > > >
> > > > > > BTW, we will add handling for this in the AmberTools 20 release.
> > > > >
> > > > > Ruxi
> > > > > On 3/4/20 5:05 PM, Ruxi Qi wrote:
> > > > >
> > > > > Hi Nicholas,
> > > > >
> > > > > > The issue arises because, starting with CUDA 10.1, some libraries,
> > > > > > including cuBLAS, are installed in the standard system locations
> > > > > > rather than in the Toolkit installation directory. Depending on the
> > > > > > distribution, the install location can be /usr/lib/x84_64-linux-gnu
> > > > > > (as with Ubuntu 18.04), /usr/lib64 (as with CentOS 7), or /usr/lib.
> > > > > > You can check this by executing:
> > > > > >
> > > > > > sudo find /usr -name 'libcublas*'
> > > > > >
> > > > > > So the solution is either to create a symlink in the CUDA Toolkit
> > > > > > library path to the missing library file or, more conveniently, to
> > > > > > add the location to your library search path in your .bashrc:
> > > > >
> > > > > export LD_LIBRARY_PATH=/usr/lib/x84_64-linux-gnu:$LD_LIBRARY_PATH
> > > > >
> > > > > Hope it helps.
> > > > >
> > > > > Best,
> > > > >
> > > > > Ruxi
> > > > > On 3/4/20 6:04 AM, Nicholas Moyer wrote:
> > > > >
> > > > > > This is the closest error to mine that I've seen to date. I will have
> > > > > > to try CUDA 9.2. I had been trying to get 10 to work so I wouldn't
> > > > > > have to reinstall on a bunch of machines, but that looks like my only
> > > > > > option. Thank you very much, I will let you know tomorrow if that
> > > > > > works!
> > > > >
> > > > > > On Tue, Mar 3, 2020 at 4:58 PM Abhilash J <md.scfbio.gmail.com> wrote:
> > > > >
> > > > >
> > > > > Hi Everyone,
> > > > >
> > > > > > I am also having a similar (but not identical) issue installing AMBER
> > > > > > 18. The compile seems to complete without a glitch if I use CUDA 9.2.
> > > > > > The error occurs if I use CUDA 10.2. Did you give CUDA 9.2 a shot?
> > > > > > The error I am dealing with is as follows.
> > > > > > Any comments will be useful.
> > > > >
> > > > > ============error file======================
> > > > > Warning: Deleted feature: ASSIGN statement at (1)
> > > > > /bin/ld: cannot find -lcublas
> > > > > collect2: error: ld returned 1 exit status
> > > > > make[2]: *** [pbsa.cuda] Error 1
> > > > > make[1]: *** [cuda_serial] Error 2
> > > > > make: *** [install] Error 2
> > > > > ===========================================
> > > > >
> > > > > ========out file====================
> > > > > > /usr/local/cuda-10.2/bin/nvcc -gencode arch=compute_30,code=sm_30
> > > > > > -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37
> > > > > > -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52
> > > > > > -gencode arch=compute_53,code=sm_53 -gencode arch=compute_60,code=sm_60
> > > > > > -gencode arch=compute_61,code=sm_61 -gencode arch=compute_60,code=sm_70,
> > > > > > -gencode arch=compute_61,code=sm_70 -Wno-deprecated-declarations
> > > > > > -use_fast_math -O3 -ccbin g++ -I../cusplibrary-cuda9
> > > > > > -o cusp_LinearSolvers.o -c cusp_LinearSolvers.cu -DDIA
> > > > > > [PBSA] FC pbsa.cuda
> > > > > > make[2]: Leaving directory `/home33/ajayaraj/AMBER/amber18/AmberTools/src/pbsa'
> > > > > > make[1]: Leaving directory `/home33/ajayaraj/AMBER/amber18/AmberTools/src'
> > > > > =================================
> > > > >
> > > > >
> > > > > > On Tue, Mar 3, 2020 at 1:27 PM Ray Luo <rluo.uci.edu> wrote:
> > > > >
> > > > >
> > > > > Nicholas,
> > > > >
> > > > > > Any CUDA version higher than 10.0 could be a problem. We only tested
> > > > > > version 10 and older releases when amber19 was released earlier last
> > > > > > year.
> > > > > >
> > > > > > Right now we are testing the current release, 10.2, on one of our GPU
> > > > > > boxes.
> > > > >
> > > > > All the best,
> > > > > Ray
> > > > > --
> > > > > Ray Luo, Ph.D.
> > > > > Professor of Structural Biology/Biochemistry/Biophysics,
> > > > > Chemical Physics, Biomedical Engineering, and Chemical Engineering
> > > > > Department of Molecular Biology and Biochemistry
> > > > > University of California, Irvine, CA 92697-3900
> > > > >
> > > > > > On Tue, Mar 3, 2020 at 10:20 AM Nicholas Moyer <nmoyer.broadinstitute.org> wrote:
> > > > >
> > > > > Good Afternoon,
> > > > >
> > > > > > I have tried make install in src/pmemd and it seems to install, but
> > > > > > then if I run make test every test fails. I am currently re-installing
> > > > > > from scratch as you suggested, since some of the folders still
> > > > > > contained several cuda.mpi leftovers after the clean command. Here is
> > > > > > some info I forgot to add in my original email. Thank you for all the
> > > > > > suggestions so far!
> > > > >
> > > > > OS: Ubuntu 18.04
> > > > > shell: bash
> > > > > compiler: gnu
> > > > > CUDA toolkit: 10.1.243
> > > > > amber: 18
> > > > > amber-toolkit: 19
> > > > >
> > > > >
> > > > > > On Tue, Mar 3, 2020 at 10:09 AM David A Case <david.case.rutgers.edu> wrote:
> > > > >
> > > > > On Tue, Mar 03, 2020, Nicholas Moyer wrote:
> > > > >
> > > > >
> > > > > > So I've been dealing with an annoying CUDA issue. I have
> > > > > > Amber18/AmberTools19 installed and set up with multiple GPUs, but when
> > > > > > I configure Amber for single-GPU usage, the configure step goes
> > > > > > through, and then make install gives me a huge error block.
> > > > >
> > > > > > The error involves pbsa.cuda, which may not be the program you really
> > > > > > want (most people are more eager to run pmemd.cuda). If that is the
> > > > > > case, after the configure step, do this:
> > > > >
> > > > > cd src/pmemd
> > > > > make install
> > > > >
> > > > > > That will install just pmemd.cuda, whose installation is better tested.
> > > > > >
> > > > > > (Of course, you may still have problems, since for most people,
> > > > > > building pbsa.cuda gives no problems. If things still don't work,
> > > > > > provide some details about your OS, compiler and CUDA toolkit
> > > > > > versions. Also, since you apparently previously installed the cuda.MPI
> > > > > > versions, and are now trying to get the cuda serial codes, start from
> > > > > > a completely fresh directory tree, just in case something left over
> > > > > > from the MPI install is causing problems.)
> > > > >
> > > > > I'm cc-ing this to Ray Luo, in case he may have a better handle on
> > > > > recognizing the problem.
> > > > >
> > > > > ...good luck...dac
> > > > >
> > > > >
> > > > >
> > > > > > /usr/bin/ld: warning: libcublasLt.so.10, needed by /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so, not found (try using -rpath or -rpath-link)
> > > > > > /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so: undefined reference to `cublasLtShutdownCtx@libcublasLt.so.10'
> > > > > > /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so: undefined reference to `cublasLtGetProperty@libcublasLt.so.10'
> > > > > > /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so: undefined reference to `init_gemm_select@libcublasLt.so.10'
> > > > > > /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so: undefined reference to `runGemmShortApi@libcublasLt.so.10'
> > > > > > /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so: undefined reference to `cublasLtMatmulAlgoGetHeuristic@libcublasLt.so.10'
> > > > > > /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so: undefined reference to `gemm_utilization@libcublasLt.so.10'
> > > > > > /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so: undefined reference to `cublasLtMatmul@libcublasLt.so.10'
> > > > > > /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so: undefined reference to `free_gemm_select@libcublasLt.so.10'
> > > > > > /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so: undefined reference to `cublasLtCtxInit@libcublasLt.so.10'
> > > > > > /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so: undefined reference to `cublasLtMatmulAlgoInit@libcublasLt.so.10'
> > > > > > /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so: undefined reference to `runGemmApi@libcublasLt.so.10'
> > > > > > /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so: undefined reference to `cublasLtGetCudartVersion@libcublasLt.so.10'
> > > > > > /opt/CUDA/cuda-toolkit/targets/x86_64-linux/lib/libcublas.so: undefined reference to `cublasLtGetVersion@libcublasLt.so.10'
> > > > > collect2: error: ld returned 1 exit status
> > > > > Makefile:156: recipe for target 'pbsa.cuda' failed
> > > > > make[2]: *** [pbsa.cuda] Error 1
> > > > > make[2]: Leaving directory '/opt/amber18/AmberTools/src/pbsa'
> > > > > Makefile:447: recipe for target 'cuda_serial' failed
> > > > > make[1]: *** [cuda_serial] Error 2
> > > > > make[1]: Leaving directory '/opt/amber18/AmberTools/src'
> > > > > Makefile:7: recipe for target 'install' failed
> > > > > make: *** [install] Error 2
> > > > >
> > > > > > _______________________________________________
> > > > > > AMBER mailing list
> > > > > > AMBER.ambermd.org
> > > > > > http://lists.ambermd.org/mailman/listinfo/amber
> > > > >
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Mar 06 2020 - 06:30:03 PST