Re: [AMBER] AMBER GPU Cooling/usage Question

From: ET <sketchfoot.gmail.com>
Date: Fri, 10 May 2013 16:49:23 +0100

Thanks very much for the replies, guys. Very informative. I think the "push
till they break within the warranty period" approach is probably the way to
go on this, as any potential issues should crop up under load.

Thanks again! :)
g


On 9 May 2013 14:27, Ross Walker <ross.rosswalker.co.uk> wrote:

> Hi ET,
>
>
> >I've installed AMBER GPU on several different PCs with varying case
> >designs. Generally I've noticed that when the GPU is under load the
> >temperature gets to about 76-82 degrees C depending on case design, other
> >components, etc.
> >
> >My questions are:
> >
> >1) Has anyone managed to get their GPUs running under load at a temperature
> >below 76 degrees C? I've noticed that the well-ventilated cases I've
> >used seem to level off at this temperature under load, but the difference is
> >that they cool back to baseline temperature very rapidly (30-60 seconds).
>
> You can if you water cool it - here's a 'ghetto' example from my lab that
> works and stays below 75C.
>
> http://www.brightsideofnews.com/Data/2013_4_29/Take-a-Tour-of-San-Diegos-Supercomputer-Center/SDSC13%20(40%20of%2041)_689.jpg
>
>
> That said, it doesn't really matter. The cards are designed to run that hot
> (CPUs typically run upwards of 90C), so I wouldn't worry about it. I've
> been running boxes with 4 GTX-680s, all at 85C plus, flat out for months
> on end with no problems. I've had a few infant deaths among the cards, but
> once that settled, things have been pretty stable.
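
For anyone who wants to keep an eye on the temperatures discussed above, here is a minimal sketch, not something from Ross's setup. It assumes the NVIDIA driver's `nvidia-smi` tool is on the PATH and uses its CSV query output; the function names and the 85C default limit are illustrative choices only.

```python
import subprocess
from typing import Dict, List

# nvidia-smi CSV query: one "index, temperature" line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,temperature.gpu",
    "--format=csv,noheader,nounits",
]


def parse_temperatures(csv_text: str) -> Dict[int, int]:
    """Parse lines such as '0, 76' into {gpu_index: temperature_in_C}."""
    temps: Dict[int, int] = {}
    for line in csv_text.strip().splitlines():
        index, temp = (field.strip() for field in line.split(","))
        temps[int(index)] = int(temp)
    return temps


def hot_gpus(temps: Dict[int, int], limit_c: int = 85) -> List[int]:
    """Return indices of GPUs above limit_c (85 is an arbitrary example)."""
    return [index for index, temp in sorted(temps.items()) if temp > limit_c]


def read_temperatures() -> Dict[int, int]:
    """Run nvidia-smi once and return the current temperature per GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_temperatures(result.stdout)
```

Calling `hot_gpus(read_temperatures())` from a cron job or a loop alongside the simulation would flag any card running above the chosen limit.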
>
> >2) If you are running a long production run, e.g. 6 repeats of a 100ns
> >simulation in series, do people tend to segment the production run? I.e.
> >simulate in 5ns segments, then give a "sleep" period of 5 minutes before
> >starting the next segment?
>
> I would definitely segment your simulation - purely to avoid heartache if
> the machine crashes, or your disk gets corrupted etc. Typically I try to
> space my runs to be around 2 to 4 hours and figure that I can always
> repeat a 2 hour run without too much trouble. It can also make analysis
> easier later as you don't end up with multi-terabyte single trajectory
> files. Just make sure you are using a new random seed (or set ig=-1) for
> each restart.
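
The segmentation described above can be sketched roughly as follows. This is only an illustration: the file names (`system.prmtop`, `equil.rst7`, `prod.in`, the `prod_NNN` prefix) are hypothetical, and the mdin file is assumed to already contain `irest=1, ntx=5, ig=-1` so that each segment continues from the previous restart file with a fresh random seed. The `pmemd.cuda` flags used are the standard `-O -i -p -c -o -r -x` ones.

```python
from typing import List


def steps_per_segment(segment_ns: float, dt_ps: float = 0.002) -> int:
    """Number of MD steps (nstlim) for a segment, assuming a dt_ps timestep."""
    return round(segment_ns * 1000.0 / dt_ps)


def segment_commands(n_segments: int,
                     prmtop: str = "system.prmtop",    # hypothetical name
                     start_rst: str = "equil.rst7",    # hypothetical name
                     mdin: str = "prod.in",            # assumed to set ig=-1
                     prefix: str = "prod") -> List[List[str]]:
    """Build one pmemd.cuda command per segment, chaining restart files."""
    commands: List[List[str]] = []
    previous_rst = start_rst
    for i in range(1, n_segments + 1):
        tag = f"{prefix}_{i:03d}"
        commands.append([
            "pmemd.cuda", "-O",
            "-i", mdin,
            "-p", prmtop,
            "-c", previous_rst,      # start from the previous segment
            "-o", f"{tag}.out",
            "-r", f"{tag}.rst7",     # restart file for the next segment
            "-x", f"{tag}.nc",       # one trajectory file per segment
        ])
        previous_rst = f"{tag}.rst7"
    return commands
```

For example, a 100ns run in 5ns pieces would be `segment_commands(20)` with `nstlim` set to `steps_per_segment(5)`; each command can then be run in order, e.g. with `subprocess.run(cmd, check=True)`, so a crash costs at most one segment.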
>
> In terms of the 'sleep', I wouldn't bother - probably the worst thing you
> can do to hardware is keep heating it up and cooling it down; that just
> leads to metal fatigue and probably (pure speculation) shortens the life
> of fans etc. Probably better just to leave it running flat out.
>
> >The logic being not to run the card under a continuous load, as I
> >understand the consumer-grade GeForce 680s & Titans we use are not really
> >rated for 24/7 usage. Thus they could potentially burn out under
> >protracted, heavy use. On the other hand, would lots of rapid heating then
> >cooling reduce the lifespan of the electronics? There must be some
> >level of inbuilt tolerance for this, though.
>
> They come with a 3 year warranty - if they break, get them replaced. To be
> honest though, I haven't seen any real difference in reliability between
> the gaming cards and the Tesla cards. A few of the gaming cards die early,
> probably because the QC is not as good, but after that they tend to run
> just as well. They are the same physical chip underneath (and if you buy
> from EVGA they are made by the same company in the same factory), so there
> is no real argument that I know of for why the gaming card should be less
> tolerant of being run continuously than a Tesla card - even the fans are
> the same. So really I think it is just a case of how much they are tested
> when leaving the factory, and that is related to infant death of the card
> and not its long-term reliability, as far as I can figure.
>
> BTW, if you buy from Exxact they will provide you with fully warrantied
> machines (desktops and rack mount clusters) with 3 year+ warranties on
> GeForce-equipped systems. http://ambermd.org/gpus/recommended_hardware.htm#exxact
>
> Hope that helps.
>
> All the best
> Ross
>
> /\
> \/
> |\oss Walker
>
> ---------------------------------------------------------
> | Assistant Research Professor |
> | San Diego Supercomputer Center |
> | Adjunct Assistant Professor |
> | Dept. of Chemistry and Biochemistry |
> | University of California San Diego |
> | NVIDIA Fellow |
> | http://www.rosswalker.co.uk | http://www.wmd-lab.org |
> | Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
> ---------------------------------------------------------
>
> Note: Electronic Mail is not secure, has no guarantee of delivery, may not
> be read every day, and should not be used for urgent or sensitive issues.
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
>
Received on Fri May 10 2013 - 09:00:03 PDT