Re: [AMBER] error of amber16 mpi version using open-mpi

From: jacky zhao <jackyzhao010.gmail.com>
Date: Wed, 19 Oct 2016 11:35:10 +0800

Thank you for your suggestions. I am running this on a workstation over
ssh. I added a new user account to solve this problem. I have now compiled
amber16 successfully with MPICH3.1.4, and it passed all the tests.
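
For anyone who sees the same "Hangup" lines from make: that text means the
command was killed by SIGHUP, and one common source of SIGHUP is an ssh
session going away while a long job is running. Whether that was the cause
here is an assumption. The sketch below only demonstrates that a job started
under nohup survives a hangup signal; `sleep 30` is a stand-in for the real
build command.

```shell
#!/bin/sh
# Stand-in for a long build: a real run would replace `sleep 30`
# with the actual build command.
nohup sleep 30 >/dev/null 2>&1 &
pid=$!

# Give nohup a moment to start ignoring SIGHUP, then simulate the
# session hanging up.
sleep 1
kill -HUP "$pid"
sleep 1

# kill -0 sends no signal; it only tests whether the process still exists.
if kill -0 "$pid" 2>/dev/null; then
    survived=yes
else
    survived=no
fi
echo "survived SIGHUP: $survived"

# Clean up the stand-in job.
kill "$pid" 2>/dev/null
```

For an actual build the same pattern would look something like
`nohup make install > build.log 2>&1 &` run from $AMBERHOME (the log file
name is arbitrary), so the build keeps going if the ssh connection drops.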

Thank you for your help.

jacky

2016-10-19 0:34 GMT+08:00 Ross Walker <ross.rosswalker.co.uk>:

> Hi Jacky,
>
> I compile AMBER 16 with cuda 8.0 and mpich 3.1.4 almost every day. It works
> fine. I've never seen this behavior before; I built AMBER 16 with centos 7,
> mpich-3.1.4, gnu 4.8.5 and cuda 8.0.44 a total of 11 times yesterday.
>
> Hangup doesn't look like a compile error to me - more like someone doing a
> kill <pid> on the nvcc process. So this looks like something weird in your
> environment or some kind of interactive timeout limit (or maybe a memory
> limit). Are you running this on a local machine or on a cluster through
> some queueing system?
>
> All the best
> Ross
>
> > On Oct 18, 2016, at 9:03 AM, jacky zhao <jackyzhao010.gmail.com> wrote:
> >
> > Sorry to bother you again.
> >
> > MPICH3.1.4 works for the cpu-mpi version of amber16. However, the
> > cuda-mpi version of amber16 fails to compile. The error output is
> > listed below:
> >
> >
> > nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are
> > deprecated, and may be removed in a future release (Use
> > -Wno-deprecated-gpu-targets to suppress warning).
> >
> > make[5]: *** [kCalculateGBNonbondEnergy1.o] Hangup
> >
> > make[1]: make: *** [cuda_parallel] Hangup*** [install] Hangup
> >
> >
> > make[4]: *** [cuda_dpfp_libs] Hangup
> >
> > make[3]: *** [cuda_parallel_DPFP] Hangup
> >
> > make[2]: *** [cuda_parallel] Hangup
> >
> >
> > 2016-10-18 22:47 GMT+08:00 Daniel Roe <daniel.r.roe.gmail.com>:
> >
> >> Hi,
> >>
> >> On Tue, Oct 18, 2016 at 10:24 AM, jacky zhao <jackyzhao010.gmail.com>
> >> wrote:
> >>>
> >>> /home/user/amber16/lib/libfftw3_mpi.so: undefined reference to
> >>> `ompi_mpi_char'
> >>
> > >> These kinds of errors during the link step usually happen when you
> > >> change compiler types/versions and old libraries are left over. You may
> > >> want to try running a 'make uninstall' and then re-running configure.
> >>
> >> -Dan
> >>
> >> PS - The patch will be ready to go soon.
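
The `ompi_` prefix on the undefined symbols quoted in this thread is the one
Open MPI uses for its internal globals, so a library that still references
them was most likely built against Open MPI before the switch to MPICH. A
rough way to check for such leftovers is sketched below. The function name
`mpi_flavor` is made up for illustration, and it only looks at symbol-name
prefixes, so treat it as a hint rather than a definitive test.

```shell
#!/bin/sh
# mpi_flavor (hypothetical helper): read symbol names, one per line, on
# stdin and report whether any of them carry the Open MPI ompi_ prefix.
mpi_flavor() {
    if grep -q '^ompi_'; then
        echo "openmpi"
    else
        echo "no-openmpi-symbols"
    fi
}

# Example with two of the symbols from the link error in this thread.
printf '%s\n' ompi_mpi_char MPI_Comm_f2c | mpi_flavor
```

Against a real library the input could come from GNU nm, for example
`nm -D --undefined-only "$AMBERHOME/lib/libfftw3_mpi.so" | awk '{print $NF}' | mpi_flavor`.
If it reports openmpi while the current compiler wrappers are MPICH, the
library is stale and a 'make uninstall' followed by a fresh configure is
the safer route.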
> >>
> >>>
> >>> /home/user/amber16/lib/libfftw3_mpi.so: undefined reference to
> >>> `MPI_Comm_f2c'
> >>>
> >>> /home/user/amber16/lib/libfftw3_mpi.so: undefined reference to
> >>> `ompi_mpi_int'
> >>>
> >>> /home/user/amber16/lib/libfftw3_mpi.so: undefined reference to
> >>> `ompi_mpi_comm_null'
> >>>
> >>> /home/user/amber16/lib/libfftw3_mpi.so: undefined reference to
> >>> `ompi_mpi_op_max'
> >>>
> >>> /home/user/amber16/lib/libfftw3_mpi.so: undefined reference to
> >>> `ompi_mpi_op_lor'
> >>>
> >>> /home/user/amber16/lib/libfftw3_mpi.so: undefined reference to
> >>> `ompi_mpi_op_land'
> >>>
> >>> /home/user/amber16/lib/libfftw3_mpi.so: undefined reference to
> >>> `ompi_mpi_unsigned'
> >>>
> >>> /home/user/amber16/lib/libfftw3_mpi.so: undefined reference to
> >>> `ompi_mpi_double'
> >>>
> >>> /home/user/amber16/lib/libfftw3_mpi.so: undefined reference to
> >>> `ompi_mpi_unsigned_long'
> >>>
> >>> /home/user/amber16/lib/libfftw3_mpi.so: undefined reference to
> >>> `ompi_mpi_op_sum'
> >>>
> >>> collect2: error: ld returned 1 exit status
> >>>
> >>> make[2]: *** [/home/user/amber16/bin/mdgx.MPI] Error 1
> >>>
> >>> make[2]: Leaving directory `/home/user/amber16/AmberTools/src/mdgx'
> >>>
> >>> make[1]: *** [parallel] Error 2
> >>>
> >>> make[1]: Leaving directory `/home/user/amber16/AmberTools/src'
> >>>
> >>> make: *** [install] Error 2
> >>>
> >>>
> >>> 2016-10-18 22:02 GMT+08:00 jacky zhao <jackyzhao010.gmail.com>:
> >>>
> >>>> MPICH3.2:
> >>>>
> >>>>
> > >>>> mpicxx -Wall -Wno-unused-function -c -I/home/user/amber16/include -O3
> > >>>> -mtune=native -fPIC -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -DBINTRAJ
> > >>>> -DHASGZ -DHASBZ2 -D__PLUMED_HAS_DLOPEN -DMPI
> > >>>> -I/home/user/amber16/include -DUSE_SANDERLIB -o FileIO_Gzip.o
> > >>>> FileIO_Gzip.cpp
> > >>>>
> > >>>> mpicxx -Wall -Wno-unused-function -c -I/home/user/amber16/include -O3
> > >>>> -mtune=native -fPIC -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -DBINTRAJ
> > >>>> -DHASGZ -DHASBZ2 -D__PLUMED_HAS_DLOPEN -DMPI
> > >>>> -I/home/user/amber16/include -DUSE_SANDERLIB -o FileIO_Mpi.o
> > >>>> FileIO_Mpi.cpp
> > >>>>
> > >>>> FileIO_Mpi.cpp: In member function 'virtual int FileIO_Mpi::Seek(off_t)':
> >>>>
> >>>> FileIO_Mpi.cpp:33:28: error: 'SEEK_SET' was not declared in this scope
> >>>>
> >>>> if (pfile_.Fseek(offset, SEEK_SET)) return 1;
> >>>>
> >>>> ^
> >>>>
> >>>> FileIO_Mpi.cpp: In member function 'virtual int FileIO_Mpi::Rewind()':
> >>>>
> >>>> FileIO_Mpi.cpp:39:24: error: 'SEEK_SET' was not declared in this scope
> >>>>
> >>>> if (pfile_.Fseek(0L, SEEK_SET)) return 1;
> >>>>
> >>>> ^
> >>>>
> >>>> make[4]: *** [FileIO_Mpi.o] Error 1
> >>>>
> > >>>> make[4]: Leaving directory `/home/user/amber16/AmberTools/src/cpptraj/src'
> > >>>>
> > >>>> make[3]: *** [parallel] Error 2
> > >>>>
> > >>>> make[3]: Leaving directory `/home/user/amber16/AmberTools/src/cpptraj'
> >>>>
> >>>>
> >>>> 510,1
> >>>>
> >>>> 2016-10-18 22:01 GMT+08:00 jacky zhao <jackyzhao010.gmail.com>:
> >>>>
> > >>>>> Thank you, that is good news.
> > >>>>> Dr. Case, I have used MPICH3.2 to compile the cpu-mpi version of
> > >>>>> amber16, but the same error occurs. As you suggested, I am now using
> > >>>>> MPICH3.1.4 to recompile it.
> >>>>>
> >>>>>
> >>>>> 2016-10-18 21:22 GMT+08:00 Daniel Roe <daniel.r.roe.gmail.com>:
> >>>>>
> >>>>>> In addition to Dave's suggestions, I am currently working on a patch
> >>>>>> to AmberTools that addresses this. Should be out today or tomorrow.
> >>>>>>
> >>>>>> -Dan
> >>>>>>
> > >>>>>> On Tue, Oct 18, 2016 at 9:08 AM, David A Case <david.case.rutgers.edu>
> > >>>>>> wrote:
> >>>>>>> On Tue, Oct 18, 2016, jacky zhao wrote:
> >>>>>>>
> > >>>>>>>> The open-mpi version is 2.0.1, and the open-mpi example tests pass.
> > >>>>>>>> I have used open-mpi to compile the cuda and cuda.mpi versions of
> > >>>>>>>> amber16 without any problems, but an error was encountered in the
> > >>>>>>>> cpu-mpi version of amber16. The error output is listed below:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> if (pfile_.Fseek(offset, SEEK_SET)) return 1;
> >>>>>>>>
> > >>>>>>>> FileIO_Mpi.cpp: In member function 'virtual int FileIO_Mpi::Rewind()':
> > >>>>>>>> FileIO_Mpi.cpp:39:24: error: 'SEEK_SET' was not declared in this scope
> >>>>>>>> if (pfile_.Fseek(0L, SEEK_SET)) return 1;
> >>>>>>>> ^
> >>>>>>>
> >>>>>>> Option 1: use an older version of OpenMPI.
> > >>>>>>> Option 2: use the patch suggested below, or get the github version of
> > >>>>>>> OpenMPI
> >>>>>>> Option 3: use a different MPI, say mpich2
> >>>>>>>
> >>>>>>> See these previous posts:
> >>>>>>>
> >>>>>>> http://archive.ambermd.org/201610/0109.html
> >>>>>>>
> >>>>>>> ...dac
> >>>>>>>
> >>>>>>> _______________________________________________
> >>>>>>> AMBER mailing list
> >>>>>>> AMBER.ambermd.org
> >>>>>>> http://lists.ambermd.org/mailman/listinfo/amber
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> -------------------------
> >>>>>> Daniel R. Roe
> >>>>>> Laboratory of Computational Biology
> >>>>>> National Institutes of Health, NHLBI
> >>>>>> 5635 Fishers Ln, Rm T900
> >>>>>> Rockville MD, 20852
> >>>>>> https://www.lobos.nih.gov/lcb
> >>>>>>
> >>>>>> _______________________________________________
> >>>>>> AMBER mailing list
> >>>>>> AMBER.ambermd.org
> >>>>>> http://lists.ambermd.org/mailman/listinfo/amber
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Lei Zhao, Ph.D.
> >>>>> International Joint Cancer Institute of the Second Military Medical
> >>>>> University
> >>>>> National Engineering Research Center for Antibody Medicine
> >>>>> New Library Building 11th floor,800 Xiang Yin Road
> >>>>> Shanghai 200433
> >>>>> P.R.China
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Lei Zhao, Ph.D.
> >>>> International Joint Cancer Institute of the Second Military Medical
> >>>> University
> >>>> National Engineering Research Center for Antibody Medicine
> >>>> New Library Building 11th floor,800 Xiang Yin Road
> >>>> Shanghai 200433
> >>>> P.R.China
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Lei Zhao, Ph.D.
> >>> International Joint Cancer Institute of the Second Military Medical
> >>> University
> >>> National Engineering Research Center for Antibody Medicine
> >>> New Library Building 11th floor,800 Xiang Yin Road
> >>> Shanghai 200433
> >>> P.R.China
> >>> _______________________________________________
> >>> AMBER mailing list
> >>> AMBER.ambermd.org
> >>> http://lists.ambermd.org/mailman/listinfo/amber
> >>
> >>
> >>
> >> --
> >> -------------------------
> >> Daniel R. Roe
> >> Laboratory of Computational Biology
> >> National Institutes of Health, NHLBI
> >> 5635 Fishers Ln, Rm T900
> >> Rockville MD, 20852
> >> https://www.lobos.nih.gov/lcb
> >>
> >> _______________________________________________
> >> AMBER mailing list
> >> AMBER.ambermd.org
> >> http://lists.ambermd.org/mailman/listinfo/amber
> >>
> >
> >
> >
> > --
> > Lei Zhao, Ph.D.
> > International Joint Cancer Institute of the Second Military Medical
> > University
> > National Engineering Research Center for Antibody Medicine
> > New Library Building 11th floor,800 Xiang Yin Road
> > Shanghai 200433
> > P.R.China
> > _______________________________________________
> > AMBER mailing list
> > AMBER.ambermd.org
> > http://lists.ambermd.org/mailman/listinfo/amber
>
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
>



-- 
Lei Zhao, Ph.D.
International Joint Cancer Institute of the Second Military Medical
University
National Engineering Research Center for Antibody Medicine
New Library Building 11th floor,800 Xiang Yin Road
Shanghai 200433
P.R.China
Received on Tue Oct 18 2016 - 21:00:02 PDT