Re: [AMBER] Error in PMEMD run

From: Robert Duke <rduke.email.unc.edu>
Date: Wed, 6 May 2009 20:55:49 +0100

Heck, I am just now going back through the whole deal and note that it is a
shared library loading issue, not a compile-time issue. So absolutely, it
could be that the shared libraries for ifort are not in the right places on
the runtime nodes. Geez. At least in the past, there has been incredible
grief in trying to do a full static link with ifort (at one point the
threads libraries did not work properly), so I have just avoided it (in
general, you also get into a bunch of grief with the need to link in a bunch
of other libraries...). But I generally prefer static linkage if/when you
can get it to work, just to avoid this sort of mess. But you should check
and see if the shared libraries for ifort are available in the location
specified by IFORT_RPATH on all nodes that run the code if you do use shared
ifort libraries (basically I am reiterating what Ross says, and perhaps
coming down a bit more on the side of fixing the shared libs install on the
cluster instead of going to static linkage, due to (perhaps now outdated)
history).
- Bob
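
A minimal sketch of the check described above, assuming a POSIX shell with
awk and ldd available on each node. The helper name missing_libs and the node
names in the commented loop are hypothetical; the pmemd path is the one from
the error message quoted further down in this thread. Note that ldd resolves
libraries using the environment of the shell that runs it, so it should be
run in the same environment as the job itself.

```shell
#!/bin/sh
# Hypothetical helper: read `ldd` output on stdin and print the name of
# every shared library that the dynamic loader could not resolve.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example use on one node (path taken from the error message in this thread):
#   ldd /opt/amber/exe/pmemd | missing_libs
#
# Example use across nodes (node01 and node02 are placeholder host names):
#   for node in node01 node02; do
#       echo "== $node"
#       ssh "$node" ldd /opt/amber/exe/pmemd | missing_libs
#   done
```

An empty result only means every library was found somewhere; the full ldd
output still has to be read to see which file each library resolved to.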
----- Original Message -----
From: "Ross Walker" <ross.rosswalker.co.uk>
To: "'AMBER Mailing List'" <amber.ambermd.org>
Sent: Wednesday, May 06, 2009 3:41 PM
Subject: RE: [AMBER] Error in PMEMD run


Hi Marek,

Here is my take on what is going on here. It may be right, it may not be, but
this is my best guess.

1) When you compile PMEMD with MKL it always links in the static libraries.
Thus it doesn't matter what the environment is at run time, just at build
time.

2) When it links svml it links the shared version of svml. This is part of
the ifort compiler suite.

Thus you have statically linked mkl and dynamically linked svml.

My guess then would be that when you run the code your environment is
different in some way, either the shell is different, the paths are
different, or it is a different node with different versions of the intel
compiler installed. Either way this is messing up the dynamic link to svml.

To fix this you either need to find out what is wrong with your environment
(i.e. what is different between when you build and when you run) or build a
statically linked version of pmemd.

All the best
Ross
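
One way to compare the build and run environments, sketched under the
assumption of a POSIX shell: list, in search order, every directory on a
colon-separated path that contains a given library. The function name
find_lib_dirs is hypothetical. Running it at build time and again inside the
job script, then comparing the two outputs, shows whether the loader would
look in different places.

```shell
#!/bin/sh
# Hypothetical helper: print, in search order, each directory from a
# colon-separated path list that contains the named library file.
# Empty path entries (as produced by "::") are skipped.
find_lib_dirs() {
    lib_name=$1
    search_path=$2
    old_ifs=$IFS
    IFS=:
    for dir in $search_path; do
        IFS=$old_ifs
        if [ -n "$dir" ] && [ -e "$dir/$lib_name" ]; then
            printf '%s\n' "$dir"
        fi
    done
    IFS=$old_ifs
}

# Example use:
#   find_lib_dirs libsvml.so "$LD_LIBRARY_PATH"
```

Among the LD_LIBRARY_PATH entries, earlier directories are searched first, so
if more than one directory is printed, copies of the library from different
compiler versions may be shadowing each other.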

> -----Original Message-----
> From: amber-bounces.ambermd.org [mailto:amber-bounces.ambermd.org] On
> Behalf Of Robert Duke
> Sent: Wednesday, May 06, 2009 12:30 PM
> To: AMBER Mailing List
> Subject: Re: [AMBER] Error in PMEMD run
>
> Hi Marek,
> As an additional note, I did look at your config.h, and it looks to me
> like the ifort setup should be fine, so I am pretty puzzled as to what is
> going on. If it still doesn't work, please send some additional info about
> your hardware setup; I would note that I use ifort 10.1.021 all the time
> without problems, but don't know if there is something odd about 012 or not.
> Best Regards - Bob
>
> ----- Original Message -----
> From: "Marek Malý" <maly.sci.ujep.cz>
> To: "AMBER Mailing List" <amber.ambermd.org>
> Sent: Wednesday, May 06, 2009 2:42 PM
> Subject: Re: [AMBER] Error in PMEMD run
>
>
> Sorry, I have now realised that you were probably talking about "config.h",
> not about the configure file, so please find this pmemd config file
> attached - there is "-lsvml" present.
>
> So if it is necessary to modify this file, please tell me how, or please
> edit it and send it back.
>
> Thanks a lot !
>
> Best,
>
> Marek
>
>
> On Wed, 06 May 2009 20:30:43 +0200, Marek Malý <maly.sci.ujep.cz> wrote:
>
> > Dear Bob,
> >
> > I am definitely getting lost :))
> >
> > OK, first of all, neither the original nor your config file for pmemd
> > contains the "-lsvml" parameter. This string simply doesn't exist in
> > this file; please see the attached file "configure" (this is the last
> > version which you sent me). In the configuration file for Amber (please
> > see the attached file "configure_amber") there is one occurrence of this
> > parameter, in the part "IA32 Intel compilers".
> >
> > Here is the whole path to our ifort compiler:
> >
> > /opt/intel/fce/10.1.012/bin/ifort
> >
> > All the other paths are listed in my previous email (please see below);
> > there is a list produced by the "env" command.
> >
> > My config line for pmemd is this:
> >
> > ./configure linux_em64t ifort intelmpi pubfft bintraj
> >
> > If I can provide you with more useful information, please just let me know.
> >
> > For the moment, thank you very much for your time and effort!
> >
> > Best,
> >
> > Marek
> >
> >
> >
> >
> >
> >
> > On Wed, 06 May 2009 19:30:43 +0200, Robert Duke <rduke.email.unc.edu> wrote:
> >
> >> Hi Marek,
> >> Well, I have been plowing around in the intel MKL libraries, and the
> >> unresolved symbol you list is not defined in either MKL 8 or 10, so that
> >> is why trying to fix the mkl does not work. It is instead defined in
> >> libsvml.so (for shared libs) and libsvml.a (for static libs). Normally
> >> you get the shared lib linked in by including -lsvml in the link line,
> >> which should be happening in your config file (if you look at the config
> >> data files, this happens for everything except linux_p3_athlon.ifort,
> >> which is probably now broken, but also probably now completely unused
> >> (hence folks are not complaining - any chance you were using this
> >> one?)). So this is NOT an mkl problem, but a problem getting to an svml
> >> function, perhaps called by some other function. Okay, so first
> >> question - are you setting up the ifort environment in the manner
> >> specified by the compiler (you source something like
> >> /opt/intel/fc/10.whatever/bin/ifortvars.csh or ifortvars.sh depending on
> >> which shell you use)? You need to do an equivalent thing for MKL, by the
> >> way. Then if you did not specify linux_p3_athlon, what exactly did you
> >> use when you ran configure? We are finally narrowing it down... Sorry I
> >> did not pick up on this right away - so many math function linkage
> >> problems stem from the chaos surrounding the interface to MKL.
> >> Best Regards - Bob
> >>
> >> ----- Original Message -----
> >> From: "Marek Malý" <maly.sci.ujep.cz>
> >> To: "AMBER Mailing List" <amber.ambermd.org>
> >> Sent: Wednesday, May 06, 2009 11:58 AM
> >> Subject: Re: [AMBER] Error in PMEMD run
> >>
> >>
> >> Dear Bob,
> >>
> >> unfortunately your "configure patch" didn't help me.
> >>
> >> I tried just configuring pmemd with your new configure file and running
> >> the simulation (still with the same error); then I also recompiled
> >> pmemd after configuring it with the new configure file, but there is
> >> again the same error (undefined symbol: __svml_cos2).
> >>
> >> Anyway, regarding your question about the version of our ifort compiler,
> >> our current version is this: "Intel(R) 64, Version 10.1 Build 20080112
> >> Package ID: l_fc_p_10.1.012"
> >>
> >> If you have no other idea, the best solution for the moment will
> >> probably be to use pmemd without MKL. If pmemd uses MKL just for the
> >> implicit solvent calculations, that will be acceptable for me now since,
> >> as I wrote earlier, I am currently dealing just with explicit solvent
> >> calculations.
> >>
> >> So please tell me which lines I should delete from the configure file
> >> to prevent linking pmemd with MKL, and which configure file (the
> >> original or yours) I have to use now. I assume that in this situation
> >> it doesn't matter.
> >>
> >> Thank you very much in advance!
> >>
> >> Best,
> >>
> >> Marek
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> On Tue, 05 May 2009 06:08:37 +0200, Robert Duke <rduke.email.unc.edu> wrote:
> >>
> >>> Okay, attempt at a late-night fix. Attached is a tar ball for pmemd
> >>> configuration, basically with two files. If you untar this, it will
> >>> expand into a config_stuff dir. This then contains a new "configure"
> >>> and a new config_data/interconnect.intelmpi (which you maybe can use if
> >>> you really have intel mpi). So copy the two files into your existing
> >>> pmemd tree (saving old files first, just in case), and rerun
> >>> ./configure in the pmemd directory, and hopefully all will be well.
> >>> Regards - Bob
> >>> ----- Original Message -----
> >>> From: "Marek Malý" <maly.sci.ujep.cz>
> >>> To: "AMBER Mailing List" <amber.ambermd.org>
> >>> Sent: Monday, May 04, 2009 10:19 PM
> >>> Subject: Re: [AMBER] Error in PMEMD run
> >>>
> >>>
> >>> Dear Bob,
> >>>
> >>> actually we have MKL version 10.0.011 installed, as is clear from the
> >>> "env list" below. Currently I would like to use PMEMD just for the
> >>> explicit solvent simulations, but of course I would be happy to have
> >>> the possibility of using PMEMD for the implicit solvent calculations
> >>> as well. So I will appreciate any idea which can help to fix this
> >>> problem.
> >>>
> >>> Thanks in advance !
> >>>
> >>> Best,
> >>>
> >>> Marek
> >>>
> >>>
> >>> MANPATH=/opt/intel/mkl/10.0.011/man:/opt/intel/cce/9.1.043/man:/opt/intel/fce/10.1.012/man:/usr/local/share/man:/usr/share/man:/usr/share/binutils-data/x86_64-pc-linux-gnu/2.16.1/man:/usr/share/gcc-data/x86_64-pc-linux-gnu/4.1.1/man
> >>> INTEL_LICENSE_FILE=/opt/intel/fce/10.1.012/licenses:/opt/intel/licenses:/home/mmaly/intel/licenses:/Users/Shared/Library/Application Support/Intel/Licenses:/opt/intel/cce/9.1.043/licenses:/opt/intel/licenses:/home/mmaly/intel/licenses
> >>> TERM=xterm
> >>> SHELL=/bin/bash
> >>> SSH_CLIENT=192.168.0.15 37849 22
> >>> LIBRARY_PATH=/opt/intel/mkl/10.0.011/lib/em64t
> >>> SGE_CELL=default
> >>> FPATH=/opt/intel/mkl/10.0.011/include
> >>> SSH_TTY=/dev/pts/3
> >>> USER=mmaly
> >>> LD_LIBRARY_PATH=/opt/intel/impi/3.1/lib64:/opt/intel/mkl/10.0.011/lib/em64t:/opt/intel/cce/9.1.043/lib:/opt/intel/fce/10.1.012/lib::/opt/intel/impi/3.1/lib64
> >>> LS_COLORS=no=00:fi=00:di=01
> >>> CPATH=/opt/intel/mkl/10.0.011/include
> >>> PAGER=/usr/bin/less
> >>> CONFIG_PROTECT_MASK=/etc/fonts/fonts.conf /etc/terminfo
> >>> MAIL=/var/mail/mmaly
> >>> PATH=/opt/intel/impi/3.1/bin64:/opt/intel/cce/9.1.043/bin:/opt/intel/fce/10.1.012/bin:/usr/local/bin:/usr/bin:/bin:/opt/bin:/usr/x86_64-pc-linux-gnu/gcc-bin/4.1.1:/opt/intel/cce/9.1.043/bin:/opt/intel/fce/10.1.012/bin:/opt/intel/impi/3.1/bin:/opt/intel/idbe/9.1.043/bin:/opt/intel/idbe/10.1.012/bin:/opt/intel/etc:/opt/amber/exe:/opt/sge/bin/lx24-amd64
> >>> PWD=/home/mmaly
> >>> SGE_EXECD_PORT=537
> >>> EDITOR=/bin/nano
> >>> SGE_QMASTER_PORT=536
> >>> SGE_ROOT=/opt/sge
> >>> MKL_HOME=/opt/intel/mkl/10.0.011
> >>> INTEL_PATHS=/opt/intel/cce/9.1.043/bin:/opt/intel/fce/10.1.012/bin:/opt/intel/impi/3.1/bin:/opt/intel/idbe/9.1.043/bin:/opt/intel/idbe/10.1.012/bin:/opt/intel/etc
> >>> SHLVL=1
> >>> HOME=/home/mmaly
> >>> DYLD_LIBRARY_PATH=/opt/intel/fce/10.1.012/lib
> >>> PYTHONPATH=/usr/lib64/portage/pym
> >>> LESS=-R -M --shift 5
> >>> LOGNAME=mmaly
> >>> GCC_SPECS=
> >>> CVS_RSH=ssh
> >>> SSH_CONNECTION=192.168.0.15 37849 192.168.0.13 22
> >>> MPI_HOME=/opt/intel/impi/3.1
> >>> LESSOPEN=|lesspipe.sh %s
> >>> INFOPATH=/usr/share/info:/usr/share/binutils-data/x86_64-pc-linux-gnu/2.16.1/info:/usr/share/gcc-data/x86_64-pc-linux-gnu/4.1.1/info:/usr/share/info/emacs-22
> >>> INCLUDE=/opt/intel/mkl/10.0.011/include
> >>> AMBERHOME=/opt/amber
> >>> _=/usr/bin/env
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Tue, 05 May 2009 03:35:54 +0200, Robert Duke <rduke.email.unc.edu> wrote:
> >>>
> >>>> This looks to me like an MKL linkage problem. If you don't need
> >>>> generalized Born, you can make this go away by simply not choosing to
> >>>> use MKL when you run pmemd configure. Otherwise, we do have more
> >>>> recent directions that work with the latest versions of MKL. If you
> >>>> want to use this, let me know your version of MKL and I will dig up
> >>>> the appropriate new version of pmemd configure that should work (I
> >>>> think I have posted fixed versions to the list before; we should
> >>>> probably release a patch, but in the meantime I can dig out the last
> >>>> posting if you want GB support with MKL).
> >>>> Best Regards - Bob Duke
> >>>> ----- Original Message -----
> >>>> From: "Marek Malý" <maly.sci.ujep.cz>
> >>>> To: <amber.ambermd.org>
> >>>> Sent: Monday, May 04, 2009 9:23 PM
> >>>> Subject: [AMBER] Error in PMEMD run
> >>>>
> >>>>
> >>>> Dear amber users,
> >>>>
> >>>> I installed Amber10 on our cluster some time ago. Now I have started
> >>>> some calculations and I have a problem with PMEMD.
> >>>>
> >>>> When I try to switch (after the minimisation, heating and density
> >>>> equilibration phases) from SANDER to PMEMD, my calculation fails,
> >>>> starting with this error line:
> >>>>
> >>>> "symbol lookup error: /opt/amber/exe/pmemd: undefined symbol:
> >>>> __svml_cos2"
> >>>>
> >>>>
> >>>> Without switching to PMEMD everything is OK, i.e. SANDER works
> >>>> perfectly, but since I am working on big systems (hundreds of
> >>>> thousands of atoms), typically as 32-64 CPU jobs, I would like to use
> >>>> PMEMD for my equil/production runs.
> >>>>
> >>>> I would be grateful for any useful info.
> >>>>
> >>>> With the best wishes
> >>>>
> >>>> Marek
> >>>>
> >>>>
> >>>>
> >>>
> >>
> >
>
> --
> This message was created with Opera's revolutionary e-mail client:
> http://www.opera.com/mail/
>
>
> --------------------------------------------------------------------------------
>
>
> > _______________________________________________
> > AMBER mailing list
> > AMBER.ambermd.org
> > http://lists.ambermd.org/mailman/listinfo/amber
> >


Received on Wed May 20 2009 - 14:56:14 PDT