[AMBER] Fwd: amber14 parallel build problems

From: Ada Sedova <ada.a.sedova.gmail.com>
Date: Wed, 18 Jan 2017 14:44:38 -0500

This was sent a few days ago with no response.

Today I was told this info was not given.

Here it is again.


AS


---------- Forwarded message ----------
From: Ada Sedova <ada.a.sedova.gmail.com>
Date: Tue, Jan 17, 2017 at 11:19 AM
Subject: Re: [AMBER] amber14 parallel build problems
To: david.case.rutgers.edu


Yes, I would like to continue debugging this. Getting OLCF to update to
amber16 may be difficult, since it requires a purchase and therefore a
fair amount of paperwork and bureaucratic steps.

The stdout and stderr logs from the failed MPI build are attached.

The complete output from mpicc -show was given earlier in this thread, but
I will repeat it here for convenience:

-bash-4.1$ mpicc -show
gcc -I/sw/rhea/openmpi/1.8.4/rhel6.6_gcc4.8.2/include -pthread -L/usr/lib64 -Wl,-rpath -Wl,/usr/lib64 -Wl,-rpath -Wl,/sw/rhea/openmpi/1.8.4/rhel6.6_gcc4.8.2/lib -Wl,--enable-new-dtags -L/sw/rhea/openmpi/1.8.4/rhel6.6_gcc4.8.2/lib -lmpi

serial build:

-bash-4.1$ ldd ./yacc
linux-vdso.so.1 => (0x00007ffe30bfd000)
libc.so.6 => /lib64/libc.so.6 (0x00007f3b7805c000)
/lib64/ld-linux-x86-64.so.2 (0x00007f3b7841b000)


parallel build:

-bash-4.1$ ldd ./yacc
linux-vdso.so.1 => (0x00007ffe86cf6000)
libmpi.so.1 => /sw/rhea/openmpi/1.8.4/rhel6.6_gcc4.8.2/lib/libmpi.so.1 (0x00007f29e89ac000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f29e8764000)
libc.so.6 => /lib64/libc.so.6 (0x00007f29e83d0000)
librdmacm.so.1 => /usr/lib64/librdmacm.so.1 (0x00007f29e81bb000)
libosmcomp.so.3 => /usr/lib64/libosmcomp.so.3 (0x00007f29e7fad000)
libibverbs.so.1 => /usr/lib64/libibverbs.so.1 (0x00007f29e7d9b000)
libpsm_infinipath.so.1 => /usr/lib64/libpsm_infinipath.so.1 (0x00007f29e7b47000)
libopen-rte.so.7 => /sw/rhea/openmpi/1.8.4/rhel6.6_gcc4.8.2/lib/libopen-rte.so.7 (0x00007f29e7856000)
libtorque.so.2 => /usr/lib64/libtorque.so.2 (0x00007f29e6f65000)
libxml2.so.2 => /usr/lib64/libxml2.so.2 (0x00007f29e6c12000)
libz.so.1 => /lib64/libz.so.1 (0x00007f29e69fb000)
libcrypto.so.10 => /usr/lib64/libcrypto.so.10 (0x00007f29e6617000)
libssl.so.10 => /usr/lib64/libssl.so.10 (0x00007f29e63aa000)
libopen-pal.so.6 => /sw/rhea/openmpi/1.8.4/rhel6.6_gcc4.8.2/lib/libopen-pal.so.6 (0x00007f29e60b9000)
libnuma.so.1 => /usr/lib64/libnuma.so.1 (0x00007f29e5eae000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f29e5caa000)
librt.so.1 => /lib64/librt.so.1 (0x00007f29e5aa1000)
libm.so.6 => /lib64/libm.so.6 (0x00007f29e581d000)
libutil.so.1 => /lib64/libutil.so.1 (0x00007f29e561a000)
/lib64/ld-linux-x86-64.so.2 (0x00007f29e8ec1000)
libibumad.so.3 => /usr/lib64/libibumad.so.3 (0x00007f29e5412000)
libgcc_s.so.1 => /ccs/compilers/gcc/rhel6-x86_64/4.8.2/lib64/libgcc_s.so.1 (0x00007f29e51fc000)
libnl.so.1 => /lib64/libnl.so.1 (0x00007f29e4faa000)
libinfinipath.so.4 => /usr/lib64/libinfinipath.so.4 (0x00007f29e4d9b000)
libuuid.so.1 => /lib64/libuuid.so.1 (0x00007f29e4b97000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007f29e4891000)
libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2 (0x00007f29e464c000)
libkrb5.so.3 => /lib64/libkrb5.so.3 (0x00007f29e4365000)
libcom_err.so.2 => /lib64/libcom_err.so.2 (0x00007f29e4161000)
libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x00007f29e3f34000)
libkrb5support.so.0 => /lib64/libkrb5support.so.0 (0x00007f29e3d29000)
libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x00007f29e3b25000)
libresolv.so.2 => /lib64/libresolv.so.2 (0x00007f29e390b000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f29e36ec000)

Thanks for your continued assistance.


Ada

On Tue, Jan 17, 2017 at 9:10 AM, David Case <david.case.rutgers.edu> wrote:

> On Sun, Jan 15, 2017, Ada Sedova wrote:
>
> > Here are the results of the checks you suggested:
> >
> > 1) mpicc -show shows the correct gcc (4.8.2) that I used for serial
> > compile
>
> I'm not sure if you still want to try to debug this problem or not. If you
> do, please save the complete log of the "make install" run that fails, and
> send it as an attachment. So far, I think we have only seen snippets of
> the end of the error messages.
>
> Also: run the serial install, cd $AMBERHOME/bin and report the output
> from "ldd ./yacc". Do the same after the (failed) parallel install.
> Are there any differences in what libraries are loaded?
>
> Finally, please provide the full output from "mpicc -show", and let us know
> which MPI version you are using, and how you installed it.
>
> As best I can understand things, when you compile byacc with gcc, the
> resulting executable works, but when you later compile it with mpicc, the
> resulting executable tries to load a library that is not in your
> LD_LIBRARY_PATH. If this is correct, it is something we should be able to
> track down.
>
> [An alternative workaround is to comment out the line in the Makefile
> that (re-)makes yacc during the parallel compilation step. But if your
> mpicc is failing with yacc, it seems likely to fail at some other step
> as well.]
>
> Trying the whole procedure again with AmberTools16 is probably a good
> sanity check (removes some possible sources of problems).
>
> Sorry for all the problems you are seeing: I don't recall having ever seen
> this particular problem before.
>
> ...dac
>
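
The hypothesis in the quoted message (an executable built with mpicc trying
to load a library that the loader cannot find) can be checked against saved
ldd output, since ldd marks unresolved libraries with "not found". A
minimal sketch, assuming the output was saved to a file; the function name
missing_libs and the file name ldd_parallel.txt are hypothetical.

```shell
# missing_libs FILE
# Print the names of libraries that saved `ldd` output FILE reports as
# unresolved, i.e. lines of the form "libname => not found".
missing_libs() {
    awk '$2 == "=>" && $3 == "not" && $4 == "found" { print $1 }' "$1"
}
```

Possible usage: "ldd ./yacc > ldd_parallel.txt" in the same shell
environment that runs the build, then "missing_libs ldd_parallel.txt".
Empty output means every listed library resolved in that environment; it
does not rule out a different LD_LIBRARY_PATH being in effect during
"make install".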


_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber

Received on Wed Jan 18 2017 - 12:00:05 PST