Hi Jonathan,
I can't see anything obviously wrong with your setup, so I'm not sure why
this would be failing. One thing to try:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64
Try swapping these so the CUDA lib paths are at the beginning of your
LD_LIBRARY_PATH, just in case there is an older CUDA library lying around
somewhere. A prepended form would look like this:
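export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/cuda/lib:$LD_LIBRARY_PATH

To check whether a stray copy really is being picked up ahead of the CUDA
4.2 one, a plain find along these lines (standard tooling, nothing
AMBER-specific) will list every libcudart on the system:

find / -name 'libcudart*' 2>/dev/null

Then try: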
cd $AMBERHOME
make clean
./configure -cuda gnu
make >& compile_log.txt
Then if it fails please post (gzipped) the compile_log.txt file.
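For example, plain grep and gzip along these lines will pull out the first
few linker errors to paste inline and compress the full log for attaching:

grep 'undefined reference' compile_log.txt | head -20
gzip compile_log.txt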
All the best
Ross
> -----Original Message-----
> From: Jonathan Gough [mailto:jonathan.d.gough@gmail.com]
> Sent: Wednesday, May 30, 2012 5:26 AM
> To: amber
> Subject: [AMBER] Trouble compiling cuda, need help.
> 
> First, a big thanks to the amber team, Jason Swails and his wiki (and of
> course google), which helped me successfully compile amber in serial and
> parallel (without needing to reach out for help).
> 
> That being said, I have finally hit a wall.
> 
> After doing the configuration, make install seems to fall apart and I get
> a full series of "undefined reference to ' ** '" errors right after the
> following command:
> 
> gfortran -O3 -mtune=native -DCUDA -o pmemd.cuda gbl_constants.o
> gbl_datatypes.o state_info.o file_io_dat.o mdin_ctrl_dat.o
> mdin_ewald_dat.o mdin_debugf_dat.o prmtop_dat.o inpcrd_dat.o
> dynamics_dat.o img.o nbips.o parallel_dat.o parallel.o gb_parallel.o
> pme_direct.o pme_recip_dat.o pme_slab_recip.o pme_blk_recip.o
> pme_slab_fft.o pme_blk_fft.o pme_fft_dat.o fft1d.o bspline.o
> pme_force.o pbc.o nb_pairlist.o nb_exclusions.o cit.o dynamics.o
> bonds.o angles.o dihedrals.o extra_pnts_nb14.o runmd.o loadbal.o
> shake.o prfs.o mol_list.o runmin.o constraints.o axis_optimize.o
> gb_ene.o veclib.o gb_force.o timers.o pmemd_lib.o runfiles.o
> file_io.o bintraj.o binrestart.o pmemd_clib.o pmemd.o random.o
> degcnt.o erfcfun.o nmr_calls.o nmr_lib.o get_cmdline.o
> master_setup.o pme_alltasks_setup.o pme_setup.o ene_frc_splines.o
> gb_alltasks_setup.o nextprmtop_section.o angles_ub.o dihedrals_imp.o
> cmap.o charmm.o charmm_gold.o findmask.o remd.o multipmemd.o
> remd_exchg.o amd.o \
>       -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib -lcurand -lcufft
> -lcudart ./cuda/cuda.a -L/home/jonathan/amber12/lib
> -L/home/jonathan/amber12/lib -lnetcdf
> 
> I thought I had everything set up correctly... but who knows. What
> follows are, as best as I can tell, the relevant details of my setup.
> 
> uname -m && cat /etc/*release
> x86_64
> DISTRIB_ID=Ubuntu
> DISTRIB_RELEASE=12.04
> DISTRIB_CODENAME=precise
> DISTRIB_DESCRIPTION="Ubuntu 12.04 LTS"
> 
> Intel i7
> 
> gcc --version
> gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
> 
> gfortran --version
> GNU Fortran (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
> 
> nvcc --version
> nvcc: NVIDIA (R) Cuda compiler driver
> Copyright (c) 2005-2012 NVIDIA Corporation
> Built on Thu_Apr__5_00:24:31_PDT_2012
> Cuda compilation tools, release 4.2, V0.2.1221
> 
> cat /proc/driver/nvidia/version
> NVRM version: NVIDIA UNIX x86_64 Kernel Module  295.41  Fri Apr  6 23:18:58 PDT 2012
> GCC version:  gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)
> 
> .bashrc has the following:
> 
> export AMBERHOME=/home/jonathan/amber12
> export PATH=$PATH:/usr/local/cuda/bin
> export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib
> export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64
> export CUDA_HOME=/usr/local/cuda
> export PATH=$PATH:/$CUDA_HOME/bin
> 
> /usr/local/cuda/
> has the following:
> bin/  doc/  extras/  include/  lib/  lib64/  libnvvp/  nvvm/  open64/  src/  tools/
> 
> 
> the SDK is at:
> /NVIDIA_GPU_Computing_SDK$ pwd
> /home/jonathan/NVIDIA_GPU_Computing_SDK
> 
> amber12 is at:
> /amber12$ pwd
> /home/jonathan/amber12
> jonathan@jonathan-M-601A:~/amber12$ ls
> AmberTools  benchmarks  bin  config.h  configure  dat  doc
> GNU_LGPL_v2  include  lib  lib64  logs  Makefile  patch_amber.py
> README  share  src  test
> 
> deviceQuery gives:
> 
> ./deviceQuery
> [deviceQuery] starting...
> 
> ./deviceQuery Starting...
> 
>  CUDA Device Query (Runtime API) version (CUDART static linking)
> 
> Found 2 CUDA Capable device(s)
> 
> Device 0: "Tesla C1060"
>   CUDA Driver Version / Runtime Version          4.2 / 4.2
>   CUDA Capability Major/Minor version number:    1.3
>   Total amount of global memory:                 4096 MBytes (4294770688 bytes)
>   (30) Multiprocessors x (  8) CUDA Cores/MP:    240 CUDA Cores
>   GPU Clock rate:                                1296 MHz (1.30 GHz)
>   Memory Clock rate:                             800 Mhz
>   Memory Bus Width:                              512-bit
>   Max Texture Dimension Size (x,y,z)             1D=(8192), 2D=(65536,32768), 3D=(2048,2048,2048)
>   Max Layered Texture Size (dim) x layers        1D=(8192) x 512, 2D=(8192,8192) x 512
>   Total amount of constant memory:               65536 bytes
>   Total amount of shared memory per block:       16384 bytes
>   Total number of registers available per block: 16384
>   Warp size:                                     32
>   Maximum number of threads per multiprocessor:  1024
>   Maximum number of threads per block:           512
>   Maximum sizes of each dimension of a block:    512 x 512 x 64
>   Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
>   Maximum memory pitch:                          2147483647 bytes
>   Texture alignment:                             256 bytes
>   Concurrent copy and execution:                 Yes with 1 copy engine(s)
>   Run time limit on kernels:                     No
>   Integrated GPU sharing Host Memory:            No
>   Support host page-locked memory mapping:       Yes
>   Concurrent kernel execution:                   No
>   Alignment requirement for Surfaces:            Yes
>   Device has ECC support enabled:                No
>   Device is using TCC driver mode:               No
>   Device supports Unified Addressing (UVA):      No
>   Device PCI Bus ID / PCI location ID:           2 / 0
>   Compute Mode:
>      < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
> 
> Device 1: "GeForce 9500 GT"
>   CUDA Driver Version / Runtime Version          4.2 / 4.2
>   CUDA Capability Major/Minor version number:    1.1
>   Total amount of global memory:                 1024 MBytes (1073414144 bytes)
>   ( 4) Multiprocessors x (  8) CUDA Cores/MP:    32 CUDA Cores
>   GPU Clock rate:                                1350 MHz (1.35 GHz)
>   Memory Clock rate:                             400 Mhz
>   Memory Bus Width:                              128-bit
>   Max Texture Dimension Size (x,y,z)             1D=(8192), 2D=(65536,32768), 3D=(2048,2048,2048)
>   Max Layered Texture Size (dim) x layers        1D=(8192) x 512, 2D=(8192,8192) x 512
>   Total amount of constant memory:               65536 bytes
>   Total amount of shared memory per block:       16384 bytes
>   Total number of registers available per block: 8192
>   Warp size:                                     32
>   Maximum number of threads per multiprocessor:  768
>   Maximum number of threads per block:           512
>   Maximum sizes of each dimension of a block:    512 x 512 x 64
>   Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
>   Maximum memory pitch:                          2147483647 bytes
>   Texture alignment:                             256 bytes
>   Concurrent copy and execution:                 Yes with 1 copy engine(s)
>   Run time limit on kernels:                     Yes
>   Integrated GPU sharing Host Memory:            No
>   Support host page-locked memory mapping:       Yes
>   Concurrent kernel execution:                   No
>   Alignment requirement for Surfaces:            Yes
>   Device has ECC support enabled:                No
>   Device is using TCC driver mode:               No
>   Device supports Unified Addressing (UVA):      No
>   Device PCI Bus ID / PCI location ID:           3 / 0
>   Compute Mode:
>      < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
> 
> deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 4.2, CUDA Runtime Version = 4.2, NumDevs = 2, Device = Tesla C1060, Device = GeForce 9500 GT
> [deviceQuery] test results...
> PASSED
> 
> exiting in 3 seconds: 3...2...1...done!
> 
> 
> 
> and config.h has the following:
> 
> #  Amber configuration file, created with: ./configure -cuda gnu
> 
> ###############################################################################
> 
> # (1)  Location of the installation
> 
> BASEDIR=/home/jonathan/amber12
> BINDIR=/home/jonathan/amber12/bin
> LIBDIR=/home/jonathan/amber12/lib
> INCDIR=/home/jonathan/amber12/include
> DATDIR=/home/jonathan/amber12/dat
> LOGDIR=/home/jonathan/amber12/logs
> 
> ###############################################################################
> 
> 
> #  (2) If you want to search additional libraries by default, add them
> #      to the FLIBS variable here.  (External libraries can also be linked
> #      into NAB programs simply by including them on the command line;
> #      libraries included in FLIBS are always searched.)
> 
> FLIBS=  -lsff -lpbsa   -larpack -llapack -lblas  -L$(BASEDIR)/lib -lnetcdf -lgfortran -w
> FLIBS_PTRAJ= -larpack -llapack -lblas   -lgfortran -w
> FLIBSF= -larpack -llapack -lblas
> FLIBS_FFTW3=
> ###############################################################################
> 
> #  (3)  Modify any of the following if you need to change, e.g. to use
> #        gcc rather than cc, etc.
> 
> SHELL=/bin/sh
> INSTALLTYPE=cuda
> BUILDAMBER=amber
> 
> #  Set the C compiler, etc.
> 
> #  The configure script should be fine, but if you need to hand-edit,
> #  here is some info:
> 
> #  Example:  CC-->gcc; LEX-->flex; YACC-->yacc (built in byacc)
> #     Note: If your lexer is "really" flex, you need to set
> #     LEX=flex below.  For example, on some distributions,
> #     /usr/bin/lex is really just a pointer to /usr/bin/flex,
> #     so LEX=flex is necessary.  In general, gcc seems to need flex.
> 
> #   The compiler flags CFLAGS and CXXFLAGS should always be used.
> #   By contrast, *OPTFLAGS and *NOOPTFLAGS will only be used with
> #   certain files, and usually at compile-time but not link-time.
> #   Where *OPTFLAGS and *NOOPTFLAGS are requested (in Makefiles,
> #   makedepend and depend), they should come before CFLAGS or
> #   CXXFLAGS; this allows the user to override *OPTFLAGS and
> #   *NOOPTFLAGS using the BUILDFLAGS variable.
> #
> CC=gcc
> CFLAGS= -DSYSV -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -DBINTRAJ $(CUSTOMBUILDFLAGS)
> CNOOPTFLAGS=
> COPTFLAGS=-O3 -mtune=native -DBINTRAJ -DHASGZ -DHASBZ2
> AMBERCFLAGS= $(AMBERBUILDFLAGS)
> 
> CXX=g++
> CPLUSPLUS=g++
> CXXFLAGS=  $(CUSTOMBUILDFLAGS)
> CXXNOOPTFLAGS=
> CXXOPTFLAGS=-O3
> AMBERCXXFLAGS= $(AMBERBUILDFLAGS)
> 
> NABFLAGS=
> PBSAFLAG=
> 
> LDFLAGS= $(CUSTOMBUILDFLAGS)
> AMBERLDFLAGS=$(AMBERBUILDFLAGS)
> 
> LEX=   flex
> YACC=  $(BINDIR)/yacc
> AR=    ar rv
> M4=    m4
> RANLIB=ranlib
> 
> #  Set the C-preprocessor.  Code for a small preprocessor is in
> #    ucpp-1.3;  it gets installed as $(BINDIR)/ucpp;
> #    this can generally be used (maybe not on 64-bit machines like altix).
> 
> CPP=    ucpp -l
> 
> #  These variables control whether we will use compiled versions of BLAS
> #  and LAPACK (which are generally slower), or whether those libraries are
> #  already available (presumably in an optimized form).
> 
> LAPACK=install
> BLAS=install
> F2C=skip
> 
> #  These variables determine whether builtin versions of certain
> #  components can be used, or whether we need to compile our own versions.
> 
> UCPP=install
> C9XCOMPLEX=skip
> 
> #  For Windows/cygwin, set SFX to ".exe"; for Unix/Linux leave it empty:
> #  Set OBJSFX to ".obj" instead of ".o" on Windows:
> 
> SFX=
> OSFX=.o
> MV=mv
> RM=rm
> CP=cp
> 
> #  Information about Fortran compilation:
> 
> FC=gfortran
> FFLAGS=  $(LOCALFLAGS) $(CUSTOMBUILDFLAGS) -I$(INCDIR) $(NETCDFINC)
> FNOOPTFLAGS= -O0
> FOPTFLAGS= -O3 -mtune=native
> AMBERFFLAGS=$(AMBERBUILDFLAGS)
> FREEFORMAT_FLAG= -ffree-form
> LM=-lm
> FPP=cpp -traditional -P
> FPPFLAGS= -DBINTRAJ  $(CUSTOMBUILDFLAGS)
> AMBERFPPFLAGS=$(AMBERBUILDFLAGS)
> FCREAL8=-fdefault-real-8
> 
> XHOME= /usr
> XLIBS= -L/usr/lib/x86_64-linux-gnu -L/usr/lib64 -L/usr/lib
> MAKE_XLEAP=install_xleap
> 
> NETCDF=$(BASEDIR)/include/netcdf.mod
> NETCDFLIB=-L$(BASEDIR)/lib -lnetcdf
> NETCDFINC=-I$(BASEDIR)/include
> PNETCDF=
> PNETCDFLIB=
> FFTWLIB=
> 
> ZLIB=-lz
> BZLIB=-lbz2
> 
> HASFC=yes
> MTKPP=
> XBLAS=
> FFTW3=
> MDGX=no
> 
> COMPILER=gnu
> MKL=
> MKL_PROCESSOR=
> 
> #CUDA Specific build flags
> NVCC=$(CUDA_HOME)/bin/nvcc -use_fast_math -O3 -gencode arch=compute_13,code=sm_13 -gencode arch=compute_20,code=sm_20
> PMEMD_CU_INCLUDES=-I$(CUDA_HOME)/include -IB40C -IB40C/KernelCommon
> PMEMD_CU_LIBS=-L$(CUDA_HOME)/lib64 -L$(CUDA_HOME)/lib -lcurand -lcufft -lcudart ./cuda/cuda.a
> PMEMD_CU_DEFINES=-DCUDA
> 
> #PMEMD Specific build flags
> PMEMD_F90=gfortran   -DBINTRAJ -DDIRFRC_EFS -DDIRFRC_COMTRANS -DDIRFRC_NOVEC -DFFTLOADBAL_2PROC -DPUBFFT
> PMEMD_FOPTFLAGS=-O3 -mtune=native
> PMEMD_CC=gcc
> PMEMD_COPTFLAGS=-O3 -mtune=native -DSYSV -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -DBINTRAJ
> PMEMD_FLIBSF=
> PMEMD_LD= gfortran
> LDOUT= -o
> 
> #for NAB:
> MPI=
> 
> #1D-RISM
> RISM=no
> 
> #3D-RISM NAB
> RISMSFF=
> SFF_RISM_INTERFACE=
> TESTRISMSFF=
> 
> #3D-RISM SANDER
> RISMSANDER=
> SANDER_RISM_INTERFACE=
> FLIBS_RISMSANDER=
> TESTRISMSANDER=
> 
> #PUPIL
> PUPILLIBS=-lrt -lm -lc -L${PUPIL_PATH}/lib -lPUPIL -lPUPILBlind
> 
> #Python interpreter we are using
> PYTHON=/usr/bin/python2.7
> 
> 
> If you need any other info, please let me know...
_______________________________________________
AMBER mailing list
amber@ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed May 30 2012 - 09:30:03 PDT