Hi Jonathan,
I can't see anything obviously wrong with your setup, so I'm not sure why
this would be failing. One thing to try:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64
Try swapping these so the CUDA lib paths are at the beginning of your
LD_LIBRARY_PATH, just in case there is an older CUDA library lying around
somewhere. A prepended form would look like this:
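export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/cuda/lib:$LD_LIBRARY_PATH

To check whether a stray copy really is being picked up ahead of the CUDA
4.2 one, a plain find along these lines (standard tooling, nothing
AMBER-specific) will list every libcudart on the system:

find / -name 'libcudart*' 2>/dev/null

Then try: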
cd $AMBERHOME
make clean
./configure -cuda gnu
make >& compile_log.txt
Then if it fails please post (gzipped) the compile_log.txt file.
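For example, plain grep and gzip along these lines will pull out the first
few linker errors to paste inline and compress the full log for attaching:

grep 'undefined reference' compile_log.txt | head -20
gzip compile_log.txt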
All the best
Ross
> -----Original Message-----
> From: Jonathan Gough [mailto:jonathan.d.gough@gmail.com]
> Sent: Wednesday, May 30, 2012 5:26 AM
> To: amber
> Subject: [AMBER] Trouble compiling cuda, need help.
> 
> First, a big thanks to the amber team, Jason Swails and his wiki (and of
> course google), which helped me successfully compile amber in serial and
> parallel (without needing to reach out for help).
> 
> That being said, I have finally hit a wall.
> 
> After doing the configuration, make install seems to fall apart and I get
> a full series of "undefined reference to ' ** '" errors right after the
> following command:
> 
> gfortran -O3 -mtune=native -DCUDA -o pmemd.cuda gbl_constants.o
> gbl_datatypes.o state_info.o file_io_dat.o mdin_ctrl_dat.o
> mdin_ewald_dat.o mdin_debugf_dat.o prmtop_dat.o inpcrd_dat.o
> dynamics_dat.o img.o nbips.o parallel_dat.o parallel.o gb_parallel.o
> pme_direct.o pme_recip_dat.o pme_slab_recip.o pme_blk_recip.o
> pme_slab_fft.o pme_blk_fft.o pme_fft_dat.o fft1d.o bspline.o
> pme_force.o pbc.o nb_pairlist.o nb_exclusions.o cit.o dynamics.o
> bonds.o angles.o dihedrals.o extra_pnts_nb14.o runmd.o loadbal.o
> shake.o prfs.o mol_list.o runmin.o constraints.o axis_optimize.o
> gb_ene.o veclib.o gb_force.o timers.o pmemd_lib.o runfiles.o
> file_io.o bintraj.o binrestart.o pmemd_clib.o pmemd.o random.o
> degcnt.o erfcfun.o nmr_calls.o nmr_lib.o get_cmdline.o
> master_setup.o pme_alltasks_setup.o pme_setup.o ene_frc_splines.o
> gb_alltasks_setup.o nextprmtop_section.o angles_ub.o dihedrals_imp.o
> cmap.o charmm.o charmm_gold.o findmask.o remd.o multipmemd.o
> remd_exchg.o amd.o \
>       -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib -lcurand -lcufft
> -lcudart ./cuda/cuda.a -L/home/jonathan/amber12/lib
> -L/home/jonathan/amber12/lib -lnetcdf
> 
> I thought I had everything set up correctly... but who knows. What
> follows are, as best as I can tell, the relevant details of my setup.
> 
> uname -m && cat /etc/*release
> x86_64
> DISTRIB_ID=Ubuntu
> DISTRIB_RELEASE=12.04
> DISTRIB_CODENAME=precise
> DISTRIB_DESCRIPTION="Ubuntu 12.04 LTS"
> 
> Intel i7
> 
> gcc --version
> gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
> 
> gfortran --version
> GNU Fortran (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
> 
> nvcc --version
> nvcc: NVIDIA (R) Cuda compiler driver
> Copyright (c) 2005-2012 NVIDIA Corporation
> Built on Thu_Apr__5_00:24:31_PDT_2012
> Cuda compilation tools, release 4.2, V0.2.1221
> 
> cat /proc/driver/nvidia/version
> NVRM version: NVIDIA UNIX x86_64 Kernel Module  295.41  Fri Apr  6 23:18:58 PDT 2012
> GCC version:  gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)
> 
> .bashrc has the following:
> 
> export AMBERHOME=/home/jonathan/amber12
> export PATH=$PATH:/usr/local/cuda/bin
> export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib
> export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64
> export CUDA_HOME=/usr/local/cuda
> export PATH=$PATH:/$CUDA_HOME/bin
> 
> /usr/local/cuda/
> has the following:
> bin/  doc/  extras/  include/  lib/  lib64/  libnvvp/  nvvm/  open64/  src/  tools/
> 
> 
> the SDK is at:
> /NVIDIA_GPU_Computing_SDK$ pwd
> /home/jonathan/NVIDIA_GPU_Computing_SDK
> 
> amber12 is at:
> /amber12$ pwd
> /home/jonathan/amber12
> jonathan@jonathan-M-601A:~/amber12$ ls
> AmberTools  benchmarks  bin  config.h  configure  dat  doc
> GNU_LGPL_v2  include  lib  lib64  logs  Makefile  patch_amber.py
> README  share  src  test
> 
> deviceQuery gives:
> 
> ./deviceQuery
> [deviceQuery] starting...
> 
> ./deviceQuery Starting...
> 
>  CUDA Device Query (Runtime API) version (CUDART static linking)
> 
> Found 2 CUDA Capable device(s)
> 
> Device 0: "Tesla C1060"
>   CUDA Driver Version / Runtime Version          4.2 / 4.2
>   CUDA Capability Major/Minor version number:    1.3
>   Total amount of global memory:                 4096 MBytes (4294770688 bytes)
>   (30) Multiprocessors x (  8) CUDA Cores/MP:    240 CUDA Cores
>   GPU Clock rate:                                1296 MHz (1.30 GHz)
>   Memory Clock rate:                             800 Mhz
>   Memory Bus Width:                              512-bit
>   Max Texture Dimension Size (x,y,z)             1D=(8192), 2D=(65536,32768), 3D=(2048,2048,2048)
>   Max Layered Texture Size (dim) x layers        1D=(8192) x 512, 2D=(8192,8192) x 512
>   Total amount of constant memory:               65536 bytes
>   Total amount of shared memory per block:       16384 bytes
>   Total number of registers available per block: 16384
>   Warp size:                                     32
>   Maximum number of threads per multiprocessor:  1024
>   Maximum number of threads per block:           512
>   Maximum sizes of each dimension of a block:    512 x 512 x 64
>   Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
>   Maximum memory pitch:                          2147483647 bytes
>   Texture alignment:                             256 bytes
>   Concurrent copy and execution:                 Yes with 1 copy engine(s)
>   Run time limit on kernels:                     No
>   Integrated GPU sharing Host Memory:            No
>   Support host page-locked memory mapping:       Yes
>   Concurrent kernel execution:                   No
>   Alignment requirement for Surfaces:            Yes
>   Device has ECC support enabled:                No
>   Device is using TCC driver mode:               No
>   Device supports Unified Addressing (UVA):      No
>   Device PCI Bus ID / PCI location ID:           2 / 0
>   Compute Mode:
>      < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
> 
> Device 1: "GeForce 9500 GT"
>   CUDA Driver Version / Runtime Version          4.2 / 4.2
>   CUDA Capability Major/Minor version number:    1.1
>   Total amount of global memory:                 1024 MBytes (1073414144 bytes)
>   ( 4) Multiprocessors x (  8) CUDA Cores/MP:    32 CUDA Cores
>   GPU Clock rate:                                1350 MHz (1.35 GHz)
>   Memory Clock rate:                             400 Mhz
>   Memory Bus Width:                              128-bit
>   Max Texture Dimension Size (x,y,z)             1D=(8192), 2D=(65536,32768), 3D=(2048,2048,2048)
>   Max Layered Texture Size (dim) x layers        1D=(8192) x 512, 2D=(8192,8192) x 512
>   Total amount of constant memory:               65536 bytes
>   Total amount of shared memory per block:       16384 bytes
>   Total number of registers available per block: 8192
>   Warp size:                                     32
>   Maximum number of threads per multiprocessor:  768
>   Maximum number of threads per block:           512
>   Maximum sizes of each dimension of a block:    512 x 512 x 64
>   Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
>   Maximum memory pitch:                          2147483647 bytes
>   Texture alignment:                             256 bytes
>   Concurrent copy and execution:                 Yes with 1 copy engine(s)
>   Run time limit on kernels:                     Yes
>   Integrated GPU sharing Host Memory:            No
>   Support host page-locked memory mapping:       Yes
>   Concurrent kernel execution:                   No
>   Alignment requirement for Surfaces:            Yes
>   Device has ECC support enabled:                No
>   Device is using TCC driver mode:               No
>   Device supports Unified Addressing (UVA):      No
>   Device PCI Bus ID / PCI location ID:           3 / 0
>   Compute Mode:
>      < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
> 
> deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 4.2, CUDA Runtime Version = 4.2, NumDevs = 2, Device = Tesla C1060, Device = GeForce 9500 GT
> [deviceQuery] test results...
> PASSED
> 
> exiting in 3 seconds: 3...2...1...done!
> 
> 
> 
> and config.h has the following:
> 
> #  Amber configuration file, created with: ./configure -cuda gnu
> 
> ###############################################################################
> 
> # (1)  Location of the installation
> 
> BASEDIR=/home/jonathan/amber12
> BINDIR=/home/jonathan/amber12/bin
> LIBDIR=/home/jonathan/amber12/lib
> INCDIR=/home/jonathan/amber12/include
> DATDIR=/home/jonathan/amber12/dat
> LOGDIR=/home/jonathan/amber12/logs
> 
> ###############################################################################
> 
> 
> #  (2) If you want to search additional libraries by default, add them
> #      to the FLIBS variable here.  (External libraries can also be linked
> #      into NAB programs simply by including them on the command line;
> #      libraries included in FLIBS are always searched.)
> 
> FLIBS=  -lsff -lpbsa   -larpack -llapack -lblas  -L$(BASEDIR)/lib -lnetcdf -lgfortran -w
> FLIBS_PTRAJ= -larpack -llapack -lblas   -lgfortran -w
> FLIBSF= -larpack -llapack -lblas
> FLIBS_FFTW3=
> ###############################################################################
> 
> #  (3)  Modify any of the following if you need to change, e.g. to use
> #        gcc rather than cc, etc.
> 
> SHELL=/bin/sh
> INSTALLTYPE=cuda
> BUILDAMBER=amber
> 
> #  Set the C compiler, etc.
> 
> #  The configure script should be fine, but if you need to hand-edit,
> #  here is some info:
> 
> #  Example:  CC-->gcc; LEX-->flex; YACC-->yacc (built in byacc)
> #     Note: If your lexer is "really" flex, you need to set
> #     LEX=flex below.  For example, on some distributions,
> #     /usr/bin/lex is really just a pointer to /usr/bin/flex,
> #     so LEX=flex is necessary.  In general, gcc seems to need flex.
> 
> #   The compiler flags CFLAGS and CXXFLAGS should always be used.
> #   By contrast, *OPTFLAGS and *NOOPTFLAGS will only be used with
> #   certain files, and usually at compile-time but not link-time.
> #   Where *OPTFLAGS and *NOOPTFLAGS are requested (in Makefiles,
> #   makedepend and depend), they should come before CFLAGS or
> #   CXXFLAGS; this allows the user to override *OPTFLAGS and
> #   *NOOPTFLAGS using the BUILDFLAGS variable.
> #
> CC=gcc
> CFLAGS= -DSYSV -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -DBINTRAJ $(CUSTOMBUILDFLAGS)
> CNOOPTFLAGS=
> COPTFLAGS=-O3 -mtune=native -DBINTRAJ -DHASGZ -DHASBZ2
> AMBERCFLAGS= $(AMBERBUILDFLAGS)
> 
> CXX=g++
> CPLUSPLUS=g++
> CXXFLAGS=  $(CUSTOMBUILDFLAGS)
> CXXNOOPTFLAGS=
> CXXOPTFLAGS=-O3
> AMBERCXXFLAGS= $(AMBERBUILDFLAGS)
> 
> NABFLAGS=
> PBSAFLAG=
> 
> LDFLAGS= $(CUSTOMBUILDFLAGS)
> AMBERLDFLAGS=$(AMBERBUILDFLAGS)
> 
> LEX=   flex
> YACC=  $(BINDIR)/yacc
> AR=    ar rv
> M4=    m4
> RANLIB=ranlib
> 
> #  Set the C-preprocessor.  Code for a small preprocessor is in
> #    ucpp-1.3;  it gets installed as $(BINDIR)/ucpp;
> #    this can generally be used (maybe not on 64-bit machines like altix).
> 
> CPP=    ucpp -l
> 
> #  These variables control whether we will use compiled versions of BLAS
> #  and LAPACK (which are generally slower), or whether those libraries are
> #  already available (presumably in an optimized form).
> 
> LAPACK=install
> BLAS=install
> F2C=skip
> 
> #  These variables determine whether builtin versions of certain
> #  components can be used, or whether we need to compile our own versions.
> 
> UCPP=install
> C9XCOMPLEX=skip
> 
> #  For Windows/cygwin, set SFX to ".exe"; for Unix/Linux leave it empty:
> #  Set OBJSFX to ".obj" instead of ".o" on Windows:
> 
> SFX=
> OSFX=.o
> MV=mv
> RM=rm
> CP=cp
> 
> #  Information about Fortran compilation:
> 
> FC=gfortran
> FFLAGS=  $(LOCALFLAGS) $(CUSTOMBUILDFLAGS) -I$(INCDIR) $(NETCDFINC)
> FNOOPTFLAGS= -O0
> FOPTFLAGS= -O3 -mtune=native
> AMBERFFLAGS=$(AMBERBUILDFLAGS)
> FREEFORMAT_FLAG= -ffree-form
> LM=-lm
> FPP=cpp -traditional -P
> FPPFLAGS= -DBINTRAJ  $(CUSTOMBUILDFLAGS)
> AMBERFPPFLAGS=$(AMBERBUILDFLAGS)
> FCREAL8=-fdefault-real-8
> 
> XHOME= /usr
> XLIBS= -L/usr/lib/x86_64-linux-gnu -L/usr/lib64 -L/usr/lib
> MAKE_XLEAP=install_xleap
> 
> NETCDF=$(BASEDIR)/include/netcdf.mod
> NETCDFLIB=-L$(BASEDIR)/lib -lnetcdf
> NETCDFINC=-I$(BASEDIR)/include
> PNETCDF=
> PNETCDFLIB=
> FFTWLIB=
> 
> ZLIB=-lz
> BZLIB=-lbz2
> 
> HASFC=yes
> MTKPP=
> XBLAS=
> FFTW3=
> MDGX=no
> 
> COMPILER=gnu
> MKL=
> MKL_PROCESSOR=
> 
> #CUDA Specific build flags
> NVCC=$(CUDA_HOME)/bin/nvcc -use_fast_math -O3 -gencode arch=compute_13,code=sm_13 -gencode arch=compute_20,code=sm_20
> PMEMD_CU_INCLUDES=-I$(CUDA_HOME)/include -IB40C -IB40C/KernelCommon
> PMEMD_CU_LIBS=-L$(CUDA_HOME)/lib64 -L$(CUDA_HOME)/lib -lcurand -lcufft -lcudart ./cuda/cuda.a
> PMEMD_CU_DEFINES=-DCUDA
> 
> #PMEMD Specific build flags
> PMEMD_F90=gfortran   -DBINTRAJ -DDIRFRC_EFS -DDIRFRC_COMTRANS -DDIRFRC_NOVEC -DFFTLOADBAL_2PROC -DPUBFFT
> PMEMD_FOPTFLAGS=-O3 -mtune=native
> PMEMD_CC=gcc
> PMEMD_COPTFLAGS=-O3 -mtune=native -DSYSV -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -DBINTRAJ
> PMEMD_FLIBSF=
> PMEMD_LD= gfortran
> LDOUT= -o
> 
> #for NAB:
> MPI=
> 
> #1D-RISM
> RISM=no
> 
> #3D-RISM NAB
> RISMSFF=
> SFF_RISM_INTERFACE=
> TESTRISMSFF=
> 
> #3D-RISM SANDER
> RISMSANDER=
> SANDER_RISM_INTERFACE=
> FLIBS_RISMSANDER=
> TESTRISMSANDER=
> 
> #PUPIL
> PUPILLIBS=-lrt -lm -lc -L${PUPIL_PATH}/lib -lPUPIL -lPUPILBlind
> 
> #Python interpreter we are using
> PYTHON=/usr/bin/python2.7
> 
> 
> If you need any other info, please let me know...
_______________________________________________
AMBER mailing list
amber@ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed May 30 2012 - 09:30:03 PDT