Re: [AMBER] Sufficient CPU cores/GPU ratio ?

From: Scott Le Grand <varelse2005.gmail.com>
Date: Tue, 29 Nov 2011 13:16:01 -0800

Not surprised! Trying this with 4 580s however would be an adventure.
They'd be faster and cheaper, but they'd also be hotter and more finicky.

Scott

2011/11/29 Jodi Ann Hadden <jodih.uga.edu>

> Just a quick update for anyone who was interested... We've had this
> machine back for about 2 months now and have encountered no problems with
> running AMBER on all 4 C2070s at once for days at a time. This hardware
> configuration seems to be working out for us, though I'm still keeping my
> fingers crossed just in case... ;-)
>
> Jodi
>
> On Sep 21, 2011, at 2:10 PM, Marek Maly wrote:
>
> > OK,
> > I wish your new machine a long life!
> >
> > It would be interesting to hear from you once this machine has been
> > sufficiently tested (I mean longer Amber calculations with all 4 GPUs
> > fully "loaded"), to learn whether everything is OK and you are fully
> > satisfied with this hardware combination.
> >
> > Best wishes,
> >
> > Marek
> >
> >
> >
> > On Wed, 21 Sep 2011 19:54:08 +0200, Jodi Ann Hadden <jodih.uga.edu>
> > wrote:
> >
> >> Just an update on our GPU machine with the motherboard socket burn:
> >>
> >> I received an email from Microway today saying the machine is being
> >> shipped back to us. They have installed a new motherboard of the same
> >> model and a 1350W PSU and claim to have stress tested it with the 4
> >> C2070s according to their protocol for testing a newly assembled
> >> machine. The only thing they had to say with regard to what went wrong
> >> after examining the old motherboard and PSU was that "results were
> >> inconclusive". I have lots of work lined up for the machine upon its
> >> return, so we should find out pretty quickly if there is some
> >> fundamental insufficiency in this particular grouping of hardware
> >> components... I'm still hoping for the lemon explanation and that all
> >> will be well now, but I'll be saying a prayer to Glycon, patron deity of
> >> the GLYCAM lab, before booting it up just in case...
> >>
> >> Jodi
> >>
> >> On Sep 18, 2011, at 10:34 AM, Marek Maly wrote:
> >>
> >>> Hello Ross,
> >>> thanks for the in-depth analysis! Let's see what answer/solution is
> >>> offered to Jodi.
> >>>
> >>> As for me, I have definitely decided not to go beyond 3 x GTX 580 (3GB)
> >>> on a common single-socket motherboard (like the "Asus P6T7 WS
> >>> SuperComputer" mentioned earlier).
> >>>
> >>> I will also buy a 1400W or 1500W PSU (even for "just" 3 x GTX 580) to
> >>> ensure safe, long-term operation of these machines.
> >>>
> >>> Best wishes,
> >>>
> >>> Marek
> >>>
> >>>
> >>>
> >>>
> >>> On Sun, 18 Sep 2011 02:01:49 +0200, Ross Walker <ross.rosswalker.co.uk>
> >>> wrote:
> >>>
> >>>> Hi Marek,
> >>>>
> >>>>> However, I would assume that an insufficient PSU would cause just
> >>>>> GPU/CPU errors, not the melting of the PSU connector ... but I am
> >>>>> definitely not an expert here.
> >>>>
> >>>> Yes, BUT there should be no way an overloaded power supply would short
> >>>> out and melt the motherboard power connector. In principle it should
> >>>> just trip the power supply. However, I suppose it is possible that the
> >>>> current draw through the PCI-E slots from the 4 C2070s was too high,
> >>>> causing a motherboard voltage regulator to fail and subsequently short
> >>>> out. This would be a fundamental design flaw in the motherboard, but
> >>>> seems unlikely.
> >>>>
> >>>> The other possibility is the power supply was up against the limit and
> >>>> shorted in some way and put too much voltage on the motherboard power
> >>>> connector and that caused it to burn out.
> >>>>
> >>>> Either way, it is simply not a failure that should be possible even if
> >>>> the power supply got overloaded. But then one only has to look at San
> >>>> Diego's power company trying to blame some lowly technician for
> >>>> blacking out the whole of southern California, Arizona and New Mexico
> >>>> to realize that everybody takes shortcuts and does not bother to make
> >>>> things fail-safe.
> >>>>
> >>>> Oh well.
> >>>>
> >>>> Let's wait to see what happens when the machine comes back.
> >>>>
> >>>> BTW, people should note that all the 4-GPU boxes I have built, which
> >>>> are 2-socket with 4 x C2070 or 4 x M2090 and Supermicro boards, ALL
> >>>> have dual 1.4KW redundant power supplies. Trying to run 4 GPUs (4
> >>>> GTX580s would be crazy) off a single power supply is really pushing
> >>>> the envelope. Especially in the US, where the voltage is an
> >>>> appallingly low 110V so you can only get 2.2KW total off a single
> >>>> circuit. 2 x 1.4KW independent power supplies plugged into independent
> >>>> circuits (but sharing the same earth) is probably the correct way to
> >>>> go for building such systems. In Europe you can plug both power
> >>>> supplies into the same circuit. :-)
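
The circuit arithmetic in the paragraph above can be sketched in a few
lines of Python. This is only a rough illustration, not a wiring guide:
the 20 A breaker rating (which reproduces the quoted 2.2KW at 110V) and
the 230 V / 16 A European circuit are assumptions, the function names are
made up for this example, and PSU efficiency losses are ignored.

```python
from typing import Sequence


def circuit_capacity_watts(volts: float, amps: float) -> float:
    """Maximum power one circuit can deliver, using P = V * I."""
    return volts * amps


def fits_on_one_circuit(psu_watts: Sequence[float], volts: float, amps: float) -> bool:
    """True if the summed PSU ratings stay within a single circuit's capacity."""
    return sum(psu_watts) <= circuit_capacity_watts(volts, amps)


# Assumption: a 20 A breaker, which matches the 2.2KW figure quoted for 110V.
US_VOLTS, US_AMPS = 110.0, 20.0
# Assumption: a typical European circuit of 230 V at 16 A.
EU_VOLTS, EU_AMPS = 230.0, 16.0

# Two independent 1.4KW supplies, as in the builds described above.
psus = [1400.0, 1400.0]

print(circuit_capacity_watts(US_VOLTS, US_AMPS))     # 2200.0
print(fits_on_one_circuit(psus, US_VOLTS, US_AMPS))  # False
print(fits_on_one_circuit(psus, EU_VOLTS, EU_AMPS))  # True
```

Under these assumptions, two 1.4KW supplies exceed a single 110V circuit
but fit within one 230V circuit, which is consistent with the advice to
use independent circuits in the US.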
> >>>>
> >>>> All the best
> >>>> Ross
> >>>>
> >>>> /\
> >>>> \/
> >>>> |\oss Walker
> >>>>
> >>>> ---------------------------------------------------------
> >>>> | Assistant Research Professor |
> >>>> | San Diego Supercomputer Center |
> >>>> | Adjunct Assistant Professor |
> >>>> | Dept. of Chemistry and Biochemistry |
> >>>> | University of California San Diego |
> >>>> | NVIDIA Fellow |
> >>>> | http://www.rosswalker.co.uk | http://www.wmd-lab.org/ |
> >>>> | Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
> >>>> ---------------------------------------------------------
> >>>>
> >>>> Note: Electronic Mail is not secure, has no guarantee of delivery, may
> >>>> not
> >>>> be read every day, and should not be used for urgent or sensitive
> >>>> issues.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> AMBER mailing list
> >>>> AMBER.ambermd.org
> >>>> http://lists.ambermd.org/mailman/listinfo/amber
> >>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>>
> >>
> >>
> >>
> >>
> >>
> >>
> >
> >
> >
> >
>
>
>
>
Received on Tue Nov 29 2011 - 13:30:04 PST