Re: [AMBER] Sufficient CPU cores/GPU ratio ?

From: Marek Maly <marek.maly.ujep.cz>
Date: Wed, 21 Sep 2011 20:10:06 +0200

OK,
I wish your new machine a long life!

It would in any case be interesting to hear from you once
this machine has been sufficiently tested (I mean longer Amber calculations
with all 4 GPUs fully "loaded"), to learn whether everything
is OK and whether you are fully satisfied with this hardware combination.

    Best wishes,

        Marek



On Wed, 21 Sep 2011 19:54:08 +0200, Jodi Ann Hadden <jodih.uga.edu>
wrote:

> Just an update on our GPU machine with the motherboard socket burn:
>
> I received an email from Microway today saying the machine is being
> shipped back to us. They have installed a new motherboard of the same
> model and a 1350W PSU and claim to have stress tested it with the 4
> C2070s according to their protocol for testing a newly assembled
> machine. The only thing they had to say with regard to what went wrong
> after examining the old motherboard and PSU was that "results were
> inconclusive". I have lots of work lined up for the machine upon its
> return, so we should find out pretty quickly if there is some
> fundamental insufficiency in this particular grouping of hardware
> components... I'm still hoping for the lemon explanation and that all
> will be well now, but I'll be saying a prayer to Glycon, patron deity of
> the GLYCAM lab, before booting it up just in case...
>
> Jodi
>
> On Sep 18, 2011, at 10:34 AM, Marek Maly wrote:
>
>> Hello Ross,
>> thanks for the deep analysis! So let's see what answer/solution will be
>> offered to Jodi.
>>
>> As for me, I have definitely decided not to go beyond 3 x GTX 580 (3GB)
>> on a common (single-socket) motherboard (like the already mentioned
>> "Asus P6T7 WS SuperComputer").
>>
>> I will also buy a 1400W or 1500W PSU (even for "just" 3 x GTX 580) to
>> ensure safe, long-term operation of these machines.
>>
>> Best wishes,
>>
>> Marek
>>
>>
>>
>>
>> On Sun, 18 Sep 2011 02:01:49 +0200, Ross Walker <ross.rosswalker.co.uk>
>> wrote:
>>
>>> Hi Marek,
>>>
>>>> However I would assume that an insufficient PSU will cause just GPU/CPU
>>>> errors but not the melting of the PSU connector ... but I am definitely
>>>> not an expert here.
>>>
>>> Yes, BUT there should be no way an overloaded power supply would short out
>>> and melt the motherboard power connector. In principle it should just trip
>>> the power supply. However, I suppose it is possible that the current
>>> draw through the PCI-E slots from the 4 C2070s was too high, causing a
>>> motherboard voltage regulator to fail and subsequently short out. This would
>>> be a fundamental design flaw in the motherboard but seems unlikely.
>>>
>>> The other possibility is the power supply was up against the limit and
>>> shorted in some way and put too much voltage on the motherboard power
>>> connector and that caused it to burn out.
>>>
>>> Either way it is simply not a failure that should be possible even if the
>>> power supply got overloaded. But then one only has to look at San Diego's
>>> power company trying to blame some lowly technician for blacking out the
>>> whole of southern California, Arizona and New Mexico to realize that
>>> everybody takes shortcuts, not bothering to make things fail-safe.
>>>
>>> Oh well.
>>>
>>> Let's wait to see what happens when the machine comes back.
>>>
>>> BTW, people should note that all the 4-GPU boxes I have built, which are
>>> 2-socket with 4 x C2070 or 4 x M2090 and Supermicro boards, ALL have dual
>>> 1.4 kW redundant power supplies. Trying to run 4 GPUs (4 GTX580s would be
>>> crazy) off a single power supply is really pushing the envelope, especially
>>> in the US, where the voltage is an appallingly low 110 V so you can only get
>>> 2.2 kW total off a single circuit. 2 x 1.4 kW independent power supplies
>>> plugged into independent circuits (but sharing the same earth) is probably
>>> the correct way to go for building such systems. In Europe you can plug both
>>> power supplies into the same circuit. :-)
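
The circuit arithmetic in Ross's note can be checked in a few lines. This is only an illustrative sketch: the 20 A breaker is inferred from the quoted 2.2 kW at 110 V, while the 230 V / 16 A European circuit, the 240 W per-GPU figure and the 300 W allowance for the rest of the system are assumptions that do not come from the thread. All function names are hypothetical.

```python
def circuit_capacity_w(volts, amps):
    """Maximum power one wall circuit can deliver, in watts (P = V * I)."""
    return volts * amps


def fits_on_one_circuit(psu_ratings_w, volts, amps):
    """True if the summed PSU ratings stay within a single circuit's capacity."""
    return sum(psu_ratings_w) <= circuit_capacity_w(volts, amps)


def estimated_load_w(n_gpus, gpu_w, rest_of_system_w):
    """Rough DC load: GPUs plus everything else (CPUs, disks, fans)."""
    return n_gpus * gpu_w + rest_of_system_w


dual_psus = [1400, 1400]  # two 1.4 kW supplies, as in the message above

# US case: 110 V, 20 A assumed (this reproduces the quoted 2.2 kW limit).
print("US circuit capacity (W):", circuit_capacity_w(110, 20))
print("Both PSUs on one US circuit:", fits_on_one_circuit(dual_psus, 110, 20))

# European case: 230 V / 16 A is an assumed typical circuit, not from the thread.
print("EU circuit capacity (W):", circuit_capacity_w(230, 16))
print("Both PSUs on one EU circuit:", fits_on_one_circuit(dual_psus, 230, 16))

# Assumed per-component figures, for comparison against a single PSU rating.
print("Estimated 4-GPU load (W):", estimated_load_w(4, 240, 300))
```

PSU ratings describe DC output, and the draw at the wall is higher because of conversion losses, so real margins are tighter than this sketch suggests.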
>>>
>>> All the best
>>> Ross
>>>
>>> /\
>>> \/
>>> |\oss Walker
>>>
>>> ---------------------------------------------------------
>>> | Assistant Research Professor |
>>> | San Diego Supercomputer Center |
>>> | Adjunct Assistant Professor |
>>> | Dept. of Chemistry and Biochemistry |
>>> | University of California San Diego |
>>> | NVIDIA Fellow |
>>> | http://www.rosswalker.co.uk | http://www.wmd-lab.org/ |
>>> | Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
>>> ---------------------------------------------------------
>>>
>>> Note: Electronic Mail is not secure, has no guarantee of delivery, may not
>>> be read every day, and should not be used for urgent or sensitive issues.
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> AMBER mailing list
>>> AMBER.ambermd.org
>>> http://lists.ambermd.org/mailman/listinfo/amber
>>>
>>> __________ Information from ESET NOD32 Antivirus, database version 6472
>>> (20110917) __________
>>>
>>> This message was checked by ESET NOD32 Antivirus.
>>>
>>> http://www.eset.cz
>>>
>>>
>>>
>>
>>
>> --
>> This message was created with Opera's revolutionary e-mail client:
>> http://www.opera.com/mail/
>>
>>
>>
>
>
>


Received on Wed Sep 21 2011 - 12:00:02 PDT