Re: [AMBER] mpi problems

From: Marcelo Andrade Chagas <andrade.mchagas.gmail.com>
Date: Mon, 18 Apr 2016 16:16:22 -0300

Dear Gard,

Is the version of Open MPI you are trying to install the same one that works
on your other hosts?
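
A quick way to check which MPI installation each host actually picks up
(just a sketch; ompi_info ships with Open MPI, and mpirun --version works
for both Open MPI and MPICH):

    which mpirun
    mpirun --version
    ompi_info | head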

I recently had a problem with the installation when trying to use
openmpi-1.10, because a library was renamed in that newer version.

I installed the older version (openmpi-1.6) manually, and everything works.
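
In case it helps, this is roughly how I build an older Open MPI from source
into its own prefix (a sketch only; the exact tarball version and install
prefix are up to you):

    tar xzf openmpi-1.6.5.tar.gz
    cd openmpi-1.6.5
    ./configure --prefix=$HOME/opt/openmpi-1.6.5
    make -j4 && make install
    export PATH=$HOME/opt/openmpi-1.6.5/bin:$PATH
    export LD_LIBRARY_PATH=$HOME/opt/openmpi-1.6.5/lib:$LD_LIBRARY_PATH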

Check this by downloading the newest version and comparing it with what has
already been installed. That could be it!

My suggestion: the latest version I mentioned ships different libraries than
the ones the Amber MPI build was developed against.
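
One way to compare is to look at which MPI library pmemd.MPI actually links
at runtime (a sketch, assuming Amber is installed under $AMBERHOME):

    ldd $AMBERHOME/bin/pmemd.MPI | grep -i mpi

If the library shown there does not belong to the same installation as the
mpirun you launch with, that mismatch by itself can produce errors like the
MPI_ERR_COMM / invalid communicator ones in your test log.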

Best regards,

Marcelo A. Chagas

Marcelo Andrade Chagas, MSc
(PhD student)
Laboratório de Química Computacional e Modelagem Molecular - LQC-MM
* http://lqcmm.qui.ufmg.br/
Departamento de Química da Universidade Federal de Minas Gerais - UFMG
Tel:(31)3409-5776

2016-04-18 15:46 GMT-03:00 Gard Nelson <Gard.Nelson.nantbio.com>:

> Hi all,
>
> I’m trying to install pmemd.MPI on a CPU cluster. The standard procedure
> (configure_openmpi, configure -mpi, make …) has worked fine on other
> machines but I’m having no luck on this one. Here’s a summary of what’s
> happening:
>
> configure_openmpi fails with an error from ld about mca_io_romio.la.
> However, I successfully installed mpich 3.1.4 from source (this is the
> installation I use on a different machine). After installation, I run
> mpich's test using the command “mpirun -n 2 examples/cpi” and it passes
> both on the head and compute nodes.
>
> Amber installs without error; however, almost all of the AmberTools tests
> fail or have errors, and every single Amber14 test has an error. This
> happens regardless of whether I run it on the headnode or on a compute node.
>
> Now, a bit about our setup – the headnode has an infiniband network card
> installed, but it is not in use. All nodes are connected via ethernet. I
> want to run pmemd.MPI on all the cores of just one node so the interconnect
> shouldn’t matter. However, the errors I get in the log files seem to be
> related to the lack of fast interconnect. (IB ports on the head node and
> RDMA devices on the compute node) I’ve attached the test logfile from the
> compute node and I’ve copied the output from the first test at the end of
> the email.
>
> Has anyone seen this before? Any idea what I’m missing? I haven’t found
> anything on the mailing list, but that could be my fault. I’ve looked for a
> way to either compile MPI so that it does not look for IB interconnects, or a
> runtime option to the same effect, but haven’t found anything (although I’ve
> never needed to resort to that before).
>
> Thanks,
> Gard
>
> Output from test:
>
> make[2]: Entering directory `/home/gard/Code/amber14/test'
> export TESTsander='../../bin/pmemd.MPI'; cd 4096wat && ./Run.pure_wat
> librdmacm: Fatal: no RDMA devices found
> librdmacm: Fatal: no RDMA devices found
> --------------------------------------------------------------------------
> [[4828,1],0]: A high-performance Open MPI point-to-point messaging module
> was unable to find any relevant network interfaces:
>
> Module: OpenFabrics (openib)
> Host: node11
>
> Another transport will be used instead, although this may result in
> lower performance.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> [[4829,1],0]: A high-performance Open MPI point-to-point messaging module
> was unable to find any relevant network interfaces:
>
> Module: OpenFabrics (openib)
> Host: node11
>
> Another transport will be used instead, although this may result in
> lower performance.
> --------------------------------------------------------------------------
> [node11:22213] *** An error occurred in MPI_Comm_size
> [node11:22213] *** reported by process [316473345,0]
> [node11:22213] *** on communicator MPI_COMM_WORLD
> [node11:22213] *** MPI_ERR_COMM: invalid communicator
> [node11:22213] *** MPI_ERRORS_ARE_FATAL (processes in this communicator
> will now abort,
> [node11:22213] *** and potentially your MPI job)
> [node11:22212] *** An error occurred in MPI_Comm_size
> [node11:22212] *** reported by process [316407809,0]
> [node11:22212] *** on communicator MPI_COMM_WORLD
> [node11:22212] *** MPI_ERR_COMM: invalid communicator
> [node11:22212] *** MPI_ERRORS_ARE_FATAL (processes in this communicator
> will now abort,
> [node11:22212] *** and potentially your MPI job)
>
>
> ===================================================================================
> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> = PID 22212 RUNNING AT node11
> = EXIT CODE: 5
> = CLEANING UP REMAINING PROCESSES
> = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>
> ===================================================================================
> ./Run.pure_wat: Program error
>
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Mon Apr 18 2016 - 12:30:03 PDT