RE: [AMBER] Error in PMEMD run

From: Ross Walker <ross.rosswalker.co.uk>
Date: Wed, 6 May 2009 21:36:36 +0100

Hi Marek,

Sander's veclib library contains vdcos, vdtanh etc. while pmemd's does not.
Hence in your case the vectored cosine calls in sander go through the MKL
library. In pmemd, calls to cosine are instead vectorized through the ifort
compiler's vectored svml library, and it is here that the problem lies.
Possibly this library is corrupted in some way, or an incorrect version (one
that differs from the version used at compile time) is being picked up at
runtime, or something along those lines. If you can't work out what is wrong
with your ifort setup, there are some possible hacks that could work.
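
One way to narrow down the "wrong version picked up at runtime" case is to
look at which directories on the loader search path actually hold a
libsvml.so. The following is only a sketch: the function name find_svml is
made up for illustration, and it assumes the library file is named
libsvml.so, as mentioned elsewhere in this thread.

```shell
#!/bin/sh
# Sketch only: find_svml is a hypothetical helper, not part of Amber or ifort.

# Print, in search order, every directory of a colon-separated path list
# (first argument) that contains a file named libsvml.so.
# Empty entries (as produced by "::") are skipped.
find_svml() {
    printf '%s\n' "$1" | tr ':' '\n' | while IFS= read -r dir; do
        if [ -n "$dir" ] && [ -e "$dir/libsvml.so" ]; then
            printf '%s\n' "$dir"
        fi
    done
}

# Typical use, on the build node and again on a compute node:
#   find_svml "$LD_LIBRARY_PATH"
#   ldd /opt/amber/exe/pmemd | grep svml
```

The loader searches the LD_LIBRARY_PATH directories in order, so the first
directory printed is the one that would normally win. The env listing quoted
further down puts /opt/intel/cce/9.1.043/lib ahead of
/opt/intel/fce/10.1.012/lib, so if both directories ship a libsvml.so, that
ordering would be worth a look.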

1) Try linking without the svml library: just remove it from the config.h
file, run make clean and then rebuild. It then won't try to vectorize the
cos calls and it should work.

2) Modify pmemd to include vdcos in veclib.f and replace all calls to cos()
with calls to vdcos(); it will then use the MKL vector library.
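
As a rough sketch of hack 1, assuming that config.h in the pmemd directory
carries "-lsvml" as a separate, space-delimited token (the helper name
strip_svml is made up for illustration):

```shell
#!/bin/sh
# Sketch only: strip_svml is a hypothetical helper. It assumes -lsvml
# appears in the file as its own token, preceded by whitespace.

# Remove the -lsvml token from the file named in $1, keeping the
# untouched original next to it as $1.orig.
strip_svml() {
    cp "$1" "$1.orig" || return 1
    sed -e 's/[[:space:]]-lsvml[[:space:]]/ /g' \
        -e 's/[[:space:]]-lsvml$//' "$1.orig" > "$1"
}

# Then, from the pmemd directory:
#   strip_svml config.h && make clean && make
```

Keeping config.h.orig around makes it easy to go back if the rebuild does not
help.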

All the best
Ross

> -----Original Message-----
> From: amber-bounces.ambermd.org [mailto:amber-bounces.ambermd.org] On
> Behalf Of Marek Malý
> Sent: Wednesday, May 06, 2009 1:27 PM
> To: AMBER Mailing List
> Subject: Re: [AMBER] Error in PMEMD run
>
> Hi Bob,
>
> thanks again. What I do not understand, in the light of your last
> comment, is why the parallel version of Amber, which I compiled on the
> same node as pmemd, can be used on any node of our cluster without any
> problems. It works perfectly: I now have 3 calculations running with
> 32 CPUs per job (= each job on four 8-CPU nodes), and they have been
> running since Monday without problems (minimisation and MD, explicit
> solvent).
>
> So where could such a big difference come from? Is it that Amber uses
> different shared libraries than pmemd?
>
> Best
>
> Marek
>
>
> On Wed, 06 May 2009 22:08:56 +0200 Robert Duke <rduke.email.unc.edu>
> wrote:
>
> > Hi Marek,
> > I think you are probably going to need to get somebody involved on
> > your end that understands the intricacies of runtime loading of shared
> > libraries; it is probably best to go this route rather than hacking
> > the build, which is known to work if you don't mess with it (maybe not
> > on your setup at the moment, but that is because something is not
> > being set up correctly). The key here is being able to handle getting
> > at the ifort shared libraries from all nodes in the cluster. Sorry
> > again this has been so hard.
> > Regards - Bob
> > ----- Original Message ----- From: "Marek Malý" <maly.sci.ujep.cz>
> > To: "AMBER Mailing List" <amber.ambermd.org>
> > Sent: Wednesday, May 06, 2009 4:02 PM
> > Subject: Re: [AMBER] Error in PMEMD run
> >
> >
> > Hi Ross,
> >
> > I just tested pmemd on the same (calculation) node where I compiled
> > it, still with the same error.
> >
> > I also found in my personal notes that compilation with this -static
> > flag did not succeed:
> >
> > ./configure_amber -intelmpi ifort -static
> >
> > I can try again and send you the errors that appeared during
> > compilation ...
> >
> > But anyway thank you for your suggestions.
> >
> > Best,
> >
> > Marek
> >
> >
> >
> >
> >
> > On Wed, 06 May 2009 21:41:10 +0200 Ross Walker
> > <ross.rosswalker.co.uk> wrote:
> >
> >> Hi Marek,
> >>
> >> Here is my take on what is going on here. It may be right, it may not
> >> be, but this is what I guess it is.
> >>
> >> 1) When you compile PMEMD with MKL it always links in the static
> >> libraries. Thus it doesn't matter what the environment is at run
> >> time, just at build time.
> >>
> >> 2) When it links svml it links the shared version of svml. This is
> >> part of the ifort compiler suite.
> >>
> >> Thus you have statically linked mkl and dynamically linked svml.
> >>
> >> My guess then would be that when you run the code your environment is
> >> different in some way: either the shell is different, the paths are
> >> different, or it is a different node with different versions of the
> >> intel compiler installed. Either way this is messing up the dynamic
> >> link to svml.
> >>
> >> To fix this you either need to find out what is wrong with your
> >> environment (i.e. what is different between when you build and when
> >> you run) or build a statically linked version of pmemd.
> >>
> >> All the best
> >> Ross
> >>
> >>> -----Original Message-----
> >>> From: amber-bounces.ambermd.org [mailto:amber-bounces.ambermd.org]
> On
> >>> Behalf Of Robert Duke
> >>> Sent: Wednesday, May 06, 2009 12:30 PM
> >>> To: AMBER Mailing List
> >>> Subject: Re: [AMBER] Error in PMEMD run
> >>>
> >>> Hi Marek,
> >>> As an additional note, I did look at your config.h, and it looks to
> >>> me like the ifort setup should be fine, so I am pretty puzzled as to
> >>> what is going on. If it still doesn't work, please send some
> >>> additional info about your hardware setup; I would note that I use
> >>> ifort 10.1.021 all the time without problems, but don't know if
> >>> there is something odd about 012 or not.
> >>> Best Regards - Bob
> >>>
> >>> ----- Original Message -----
> >>> From: "Marek Malý" <maly.sci.ujep.cz>
> >>> To: "AMBER Mailing List" <amber.ambermd.org>
> >>> Sent: Wednesday, May 06, 2009 2:42 PM
> >>> Subject: Re: [AMBER] Error in PMEMD run
> >>>
> >>>
> >>> Sorry, now I realise that you were probably talking about "config.h",
> >>> not about the configure file, so please find this pmemd config file
> >>> attached - "-lsvml" is present there.
> >>>
> >>> So if it is necessary to modify this file, please tell me how, or
> >>> please edit it and send it back.
> >>>
> >>> Thanks a lot !
> >>>
> >>> Best,
> >>>
> >>> Marek
> >>>
> >>>
> >>> On Wed, 06 May 2009 20:30:43 +0200 Marek Malý <maly.sci.ujep.cz>
> >>> wrote:
> >>>
> >>> > Dear Bob,
> >>> >
> >>> > I am definitely getting lost :))
> >>> >
> >>> > OK, first of all, neither the original nor your config file for
> >>> > pmemd contains the "-lsvml" parameter. This string simply doesn't
> >>> > exist in this file; please see the attached file "configure" (this
> >>> > is the last version which you sent me). In the configuration file
> >>> > for Amber (please see the attached file "configure_amber") there is
> >>> > one occurrence of this parameter, in the part "IA32 Intel
> >>> > compilers".
> >>> >
> >>> > Here is the whole path to our ifort compiler:
> >>> >
> >>> > /opt/intel/fce/10.1.012/bin/ifort
> >>> >
> >>> > All the other paths are listed in my previous email (please see
> >>> > below); there is a list produced by the "env" command.
> >>> >
> >>> > My config line for pmemd is this:
> >>> >
> >>> > ./configure linux_em64t ifort intelmpi pubfft bintraj
> >>> >
> >>> > If I can provide you with more useful information, please just let
> >>> > me know.
> >>> >
> >>> > For the moment, thank you very much for your time and effort!
> >>> >
> >>> > Best,
> >>> >
> >>> > Marek
> >>> >
> >>> >
> >>> >
> >>> >
> >>> >
> >>> >
> >>> > On Wed, 06 May 2009 19:30:43 +0200 Robert Duke
> >>> > <rduke.email.unc.edu> wrote:
> >>> >
> >>> >> Hi Marek,
> >>> >> Well, I have been plowing around in the intel MKL libraries, and
> >>> >> the unresolved symbol you list is not defined in either MKL 8 or
> >>> >> 10, so that is why trying to fix the mkl does not work. It is
> >>> >> instead defined in libsvml.so (for shared libs) and libsvml.a (for
> >>> >> static libs). Normally you get the shared lib linked in by
> >>> >> including -lsvml in the link line, which should be happening in
> >>> >> your config file (if you look at the config data files, this
> >>> >> happens for everything except linux_p3_athlon.ifort, which is
> >>> >> probably now broken, but also probably now completely unused
> >>> >> (hence folks are not complaining - any chance you were using this
> >>> >> one?)). SO this is NOT an mkl problem, but a problem getting to an
> >>> >> svml function, perhaps called by some other function. Okay, so
> >>> >> first question - are you setting up the ifort environment in the
> >>> >> manner specified by the compiler (you source something like
> >>> >> /opt/intel/fc/10.whatever/bin/ifortvars.csh or ifortvars.sh
> >>> >> depending on which shell you use). You need to do an equivalent
> >>> >> thing for MKL, by the way. Then if you did not specify
> >>> >> linux_p3_athlon, what exactly did you use when you ran configure?
> >>> >> We are finally narrowing it down... Sorry I did not pick up on
> >>> >> this right away - so many math function linkage problems source
> >>> >> from the chaos surrounding the interface to MKL.
> >>> >> Best Regards - Bob
> >>> >>
> >>> >> ----- Original Message ----- From: "Marek Malý"
> <maly.sci.ujep.cz>
> >>> >> To: "AMBER Mailing List" <amber.ambermd.org>
> >>> >> Sent: Wednesday, May 06, 2009 11:58 AM
> >>> >> Subject: Re: [AMBER] Error in PMEMD run
> >>> >>
> >>> >>
> >>> >> Dear Bob,
> >>> >>
> >>> >> unfortunately your "configure patch" didn't help me.
> >>> >>
> >>> >> I tried just configuring pmemd with your new configure file and
> >>> >> running the simulation (still with the same error); then I also
> >>> >> recompiled pmemd after configuring with the new configure file,
> >>> >> but there is again the same error (undefined symbol: __svml_cos2).
> >>> >>
> >>> >> Anyway, regarding your question about the version of our ifort
> >>> >> compiler, our current version is this: "Intel(R) 64, Version 10.1
> >>> >> Build 20080112 Package ID: l_fc_p_10.1.012"
> >>> >>
> >>> >> If you have no other idea, probably the best solution for the
> >>> >> moment will be to use pmemd without MKL. If pmemd uses MKL just
> >>> >> for the implicit solvent calculations, that will be acceptable for
> >>> >> me now since, as I wrote earlier, I am currently dealing only with
> >>> >> explicit solvent calculations.
> >>> >>
> >>> >> So please tell me which lines I should delete from the configure
> >>> >> file to prevent linking pmemd with MKL, and which configure file
> >>> >> (the original or yours) I have to use now. I assume that in this
> >>> >> situation it doesn't matter.
> >>> >>
> >>> >> Thank you very much in advance !
> >>> >>
> >>> >> Best,
> >>> >>
> >>> >> Marek
> >>> >>
> >>> >>
> >>> >>
> >>> >>
> >>> >>
> >>> >>
> >>> >>
> >>> >> On Tue, 05 May 2009 06:08:37 +0200 Robert Duke
> >>> >> <rduke.email.unc.edu> wrote:
> >>> >>
> >>> >>> Okay, attempt at a late-night fix. Attached is a tar ball for
> >>> >>> pmemd configuration, basically with two files. If you untar
> >>> >>> this, it will expand into a config_stuff dir. This then contains
> >>> >>> a new "configure" and a new config_data/interconnect.intelmpi
> >>> >>> (which you maybe can use if you really have intel mpi). So copy
> >>> >>> the two files into your existing pmemd tree (saving old files
> >>> >>> first, just in case), and rerun ./configure in the pmemd
> >>> >>> directory, and hopefully all will be well.
> >>> >>> Regards - Bob
> >>> >>> ----- Original Message -----
> >>> >>> From: "Marek Malý" <maly.sci.ujep.cz>
> >>> >>> To: "AMBER Mailing List" <amber.ambermd.org>
> >>> >>> Sent: Monday, May 04, 2009 10:19 PM
> >>> >>> Subject: Re: [AMBER] Error in PMEMD run
> >>> >>>
> >>> >>>
> >>> >>> Dear Bob,
> >>> >>>
> >>> >>> actually we have MKL version 10.0.011 installed, as is clear
> >>> >>> from the "env list" below. At the moment I would like to use
> >>> >>> PMEMD just for explicit solvent simulations, but of course I
> >>> >>> would be happy to be able to use PMEMD also for implicit solvent
> >>> >>> calculations. So I will appreciate any idea which can help to fix
> >>> >>> this problem.
> >>> >>>
> >>> >>> Thanks in advance !
> >>> >>>
> >>> >>> Best,
> >>> >>>
> >>> >>> Marek
> >>> >>>
> >>> >>>
> >>> >>> MANPATH=/opt/intel/mkl/10.0.011/man:/opt/intel/cce/9.1.043/man:/opt/intel/fce/10.1.012/man:/usr/local/share/man:/usr/share/man:/usr/share/binutils-data/x86_64-pc-linux-gnu/2.16.1/man:/usr/share/gcc-data/x86_64-pc-linux-gnu/4.1.1/man
> >>> >>> INTEL_LICENSE_FILE=/opt/intel/fce/10.1.012/licenses:/opt/intel/licenses:/home/mmaly/intel/licenses:/Users/Shared/Library/Application Support/Intel/Licenses:/opt/intel/cce/9.1.043/licenses:/opt/intel/licenses:/home/mmaly/intel/licenses
> >>> >>> TERM=xterm
> >>> >>> SHELL=/bin/bash
> >>> >>> SSH_CLIENT=192.168.0.15 37849 22
> >>> >>> LIBRARY_PATH=/opt/intel/mkl/10.0.011/lib/em64t
> >>> >>> SGE_CELL=default
> >>> >>> FPATH=/opt/intel/mkl/10.0.011/include
> >>> >>> SSH_TTY=/dev/pts/3
> >>> >>> USER=mmaly
> >>> >>> LD_LIBRARY_PATH=/opt/intel/impi/3.1/lib64:/opt/intel/mkl/10.0.011/lib/em64t:/opt/intel/cce/9.1.043/lib:/opt/intel/fce/10.1.012/lib::/opt/intel/impi/3.1/lib64
> >>> >>> LS_COLORS=no=00:fi=00:di=01
> >>> >>> CPATH=/opt/intel/mkl/10.0.011/include
> >>> >>> PAGER=/usr/bin/less
> >>> >>> CONFIG_PROTECT_MASK=/etc/fonts/fonts.conf /etc/terminfo
> >>> >>> MAIL=/var/mail/mmaly
> >>> >>> PATH=/opt/intel/impi/3.1/bin64:/opt/intel/cce/9.1.043/bin:/opt/intel/fce/10.1.012/bin:/usr/local/bin:/usr/bin:/bin:/opt/bin:/usr/x86_64-pc-linux-gnu/gcc-bin/4.1.1:/opt/intel/cce/9.1.043/bin:/opt/intel/fce/10.1.012/bin:/opt/intel/impi/3.1/bin:/opt/intel/idbe/9.1.043/bin:/opt/intel/idbe/10.1.012/bin:/opt/intel/etc:/opt/amber/exe:/opt/sge/bin/lx24-amd64
> >>> >>> PWD=/home/mmaly
> >>> >>> SGE_EXECD_PORT=537
> >>> >>> EDITOR=/bin/nano
> >>> >>> SGE_QMASTER_PORT=536
> >>> >>> SGE_ROOT=/opt/sge
> >>> >>> MKL_HOME=/opt/intel/mkl/10.0.011
> >>> >>> INTEL_PATHS=/opt/intel/cce/9.1.043/bin:/opt/intel/fce/10.1.012/bin:/opt/intel/impi/3.1/bin:/opt/intel/idbe/9.1.043/bin:/opt/intel/idbe/10.1.012/bin:/opt/intel/etc
> >>> >>> SHLVL=1
> >>> >>> HOME=/home/mmaly
> >>> >>> DYLD_LIBRARY_PATH=/opt/intel/fce/10.1.012/lib
> >>> >>> PYTHONPATH=/usr/lib64/portage/pym
> >>> >>> LESS=-R -M --shift 5
> >>> >>> LOGNAME=mmaly
> >>> >>> GCC_SPECS=
> >>> >>> CVS_RSH=ssh
> >>> >>> SSH_CONNECTION=192.168.0.15 37849 192.168.0.13 22
> >>> >>> MPI_HOME=/opt/intel/impi/3.1
> >>> >>> LESSOPEN=|lesspipe.sh %s
> >>> >>> INFOPATH=/usr/share/info:/usr/share/binutils-data/x86_64-pc-linux-gnu/2.16.1/info:/usr/share/gcc-data/x86_64-pc-linux-gnu/4.1.1/info:/usr/share/info/emacs-22
> >>> >>> INCLUDE=/opt/intel/mkl/10.0.011/include
> >>> >>> AMBERHOME=/opt/amber
> >>> >>> _=/usr/bin/env
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> On Tue, 05 May 2009 03:35:54 +0200 Robert Duke
> >>> >>> <rduke.email.unc.edu> wrote:
> >>> >>>
> >>> >>>> This looks to me like an MKL linkage problem. If you don't need
> >>> >>>> generalized Born, you can make this go away by simply not
> >>> >>>> choosing to use MKL when you run pmemd configure. Otherwise, we
> >>> >>>> do have more recent directions that work with the latest
> >>> >>>> versions of MKL. If you want to use this, let me know your
> >>> >>>> version of MKL and I will dig up the appropriate new version of
> >>> >>>> pmemd configure that should work (I think I have posted fixed
> >>> >>>> versions to the list before; we should probably release a patch,
> >>> >>>> but in the meantime I can dig out the last posting if you want
> >>> >>>> GB support with MKL).
> >>> >>>> Best Regards - Bob Duke
> >>> >>>> ----- Original Message ----- From: "Marek Malý"
> <maly.sci.ujep.cz>
> >>> >>>> To: <amber.ambermd.org>
> >>> >>>> Sent: Monday, May 04, 2009 9:23 PM
> >>> >>>> Subject: [AMBER] Error in PMEMD run
> >>> >>>>
> >>> >>>>
> >>> >>>> Dear amber users,
> >>> >>>>
> >>> >>>> I installed Amber10 on our cluster some time ago. Now I have
> >>> >>>> started some calculations and I have a problem with PMEMD.
> >>> >>>>
> >>> >>>> When I try to switch (after the minimisation, heating and
> >>> >>>> density equilibration phases) from SANDER to PMEMD, my
> >>> >>>> calculation breaks, starting with this error line:
> >>> >>>>
> >>> >>>> "symbol lookup error: /opt/amber/exe/pmemd: undefined symbol:
> >>> >>>> __svml_cos2"
> >>> >>>>
> >>> >>>>
> >>> >>>> Without switching to PMEMD everything is OK, meaning SANDER
> >>> >>>> works perfectly, but since I am working on big systems (hundreds
> >>> >>>> of thousands of atoms), typically as 32-64 CPU jobs, I would
> >>> >>>> like to use PMEMD for my equil/production runs.
> >>> >>>>
> >>> >>>> I would be grateful for any useful info.
> >>> >>>>
> >>> >>>> With the best wishes
> >>> >>>>
> >>> >>>> Marek
> >>> >>>>
> >>> >>>>
> >>> >>>>
> >>> >>>
> >>> >>
> >>> >
> >>>
> >>> > _______________________________________________
> >>> > AMBER mailing list
> >>> > AMBER.ambermd.org
> >>> > http://lists.ambermd.org/mailman/listinfo/amber
> >>> >
> >
>


_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed May 20 2009 - 14:56:40 PDT