Re: [AMBER] Multi GPU performance

From: David Cerutti <dscerutti.gmail.com>
Date: Fri, 22 Jun 2018 11:00:01 -0400

Tsk, tsk... Dave Case is not the only Dave C replying on this board.

On Jun 22, 2018 10:54 AM, "James Kress" <jimkress_58.kressworks.org> wrote:

> Your problem has nothing to do with GPUs. For some reason you have
> invoked InfiniBand as an interconnect without actually having an InfiniBand
> system installed or properly configured.
>
> Check with your system administrator or IT personnel to find out why your
> run was invoking InfiniBand.
>
> This does not reflect on David Case's comments. Multi-GPU setups are an
> illusory 'solution' being sold by Nvidia to sell more GPUs that provide
> money to them but no performance increase to you.
>
> Jim Kress
>
> -----Original Message-----
> From: 陈金峰 <201612095.mail.sdu.edu.cn>
> Sent: Friday, June 22, 2018 7:59 AM
> To: amber <amber.ambermd.org>
> Subject: [AMBER] Multi GPU performance
>
>
> Hello amber users:
> I am starting to use a workstation with 3 GPUs. I used the following
> command:
> nohup mpirun -np 3 pmemd.cuda.MPI -O -i nvt.in -o nvt.out -r...
> It did work, but it ran even slower than on a single GPU, and the
> output was as follows:
>
> [[9951,1],2]: A high-performance Open MPI point-to-point messaging module
> was unable to find any relevant network interfaces:
> Module: OpenFabrics (openib)
> Host: hj191
> Another transport will be used instead, although this may result in
> lower performance.
> NOTE: You can disable this warning by setting the MCA parameter
> btl_base_warn_component_unused to 0.
> --------------------------------------------------------------------------
> [hj191:04235] 2 more processes have sent help message help-mpi-btl-base.txt / btl:no-nics
> [hj191:04235] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
> Note: The following floating-point exceptions are signalling:
> IEEE_UNDERFLOW_FLAG IEEE_DENORMAL
> Note: The following floating-point exceptions are signalling:
> IEEE_UNDERFLOW_FLAG IEEE_DENORMAL
> Note: The following floating-point exceptions are signalling:
> IEEE_UNDERFLOW_FLAG IEEE_DENORMAL
> So what's the problem? What should I do to get proper performance?
> Any suggestions will be appreciated!
> Thank you!
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
>
>
>
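For reference, the warning quoted above names its own off switch (the MCA parameter btl_base_warn_component_unused). A minimal sketch of setting it through the environment follows. It relies on Open MPI reading MCA parameters from variables prefixed with OMPI_MCA_. The GPU indices 0,1,2 are an assumption about this machine, and the pmemd arguments after nvt.out were truncated in the original post, so they are left out here. This only silences the message; it is not expected to change throughput by itself.

```shell
#!/bin/sh
# Sketch only: silence the "unused openib component" warning on a
# workstation that has no InfiniBand hardware.

# Named in the warning text itself: setting this MCA parameter to 0
# disables the message. Open MPI picks up OMPI_MCA_<name> variables.
export OMPI_MCA_btl_base_warn_component_unused=0

# Assumption: the three GPUs are devices 0, 1 and 2 on this host.
export CUDA_VISIBLE_DEVICES=0,1,2

# The remaining pmemd arguments were elided in the original post;
# append them before running.
cmd="mpirun -np 3 pmemd.cuda.MPI -O -i nvt.in -o nvt.out"

# Dry run: print the command instead of executing it.
echo "$cmd"
```

To run for real, replace the final echo with the command itself once the missing arguments have been added.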
Received on Fri Jun 22 2018 - 08:30:03 PDT