Re: [AMBER] Anyone running machines with Quad GPU setups

From: Ross Walker <ross.rosswalker.co.uk>
Date: Mon, 24 Jun 2013 16:58:57 -0700

Hi Kevin,

Thanks for the detailed info. The GTX680s have been rock-solid stable in our
experience, and one infant-mortality failure out of 20 cards sounds about
right. You should be able to just RMA that card and be good to go. Note that
you can also build such systems with AMD processors. Attached is a PDF with an
Amazon shopping list for a 4x GTX680 system we have built many of for $3,200;
it might be even cheaper now with the 680s coming down in price. This
motherboard takes all 4 GPUs without anything colliding, as long as you don't
try to connect up a bunch of external USB connectors.

This same system should work great for GTX780s as well; we just need to make
sure they are giving the correct answers, which is looking more positive by
the day.

I've not seen any major issues with 4-GPU cooling in these systems. As long as
you have plenty of rear airflow, as you do, they should be fine. 90 C is a
normal temperature for a GTX680, and I've run several flat out for months on
end at this temperature.

I second the choice of Slurm. It's certainly far from "Simple", but it does
seem to understand GPUs better than any of the other queuing systems out
there. Indeed, the 'certified' clusters I designed with Exxact use Slurm, and
Rocks with the Slurm roll works great for GPU clusters.
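
A minimal sketch of what that looks like in practice (assuming GPUs are
declared as gres/gpu resources in slurm.conf/gres.conf, that pmemd.cuda and
the usual prmtop/inpcrd/mdin files sit in each run directory, and that the
directory names below are just placeholders) would be something like:

import subprocess

RUNS = ["run0", "run1", "run2", "run3"]  # hypothetical per-GPU run directories

BATCH_TEMPLATE = """#!/bin/bash
#SBATCH --job-name={name}
#SBATCH --gres=gpu:1
#SBATCH --output={name}.slurm.log
cd {name}
# Slurm sets CUDA_VISIBLE_DEVICES to the card it granted, so pmemd.cuda
# simply sees "its" GPU as device 0.
pmemd.cuda -O -i mdin -o mdout -p prmtop -c inpcrd -r restrt -x mdcrd
"""

for name in RUNS:
    # sbatch reads the job script from stdin when no file argument is given
    subprocess.run(["sbatch"], input=BATCH_TEMPLATE.format(name=name),
                   text=True, check=True)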

Again, thanks for this info; it should be useful to lots of people here.

All the best
Ross


On 6/24/13 4:40 PM, "Kevin Hauser" <84hauser.gmail.com> wrote:

>A good bit late to the discussion, but we've been having success with a
>relatively cheap setup: roughly $3,200 per machine, five machines total
>(about 3 weeks of production burn-in so far; acceptable temperatures; see
>below).
>
>Bad news first:
>We lost one GPU (out of 20). Not bad, considering our expectations from
>commodity kit.
>
>Briefly, we got five quad-GPU boxes running EVGA GTX680 FTW cards and Intel
>i7-3820 CPUs on the Gigabyte GA-X79-UD3 (LGA 2011) mobo. # See slide one of
>the attached PDF for an overview.
>
>Discussion on our kit:
>The Antec P280 cases are great because all the metal is rubber-coated for
>quiet running, three fans are included, there's a quick mount for a fourth
>120mm fan, and there's space for a half-dozen HDDs. It's pretty heavy, though
>(22 lbs, dry). --$110
>
>The CPU is what it is; from our vendor, a CPU cooler is not included, so we
>got the Cooler Master Hyper 212. --$300 + $33
>
>The mobos appear to be well manufactured, especially given the price (half
>that of ET's Asus kit). BUT I needed to take apart the power and reset
>terminals connecting the case to the mobo so that the last GPU would fully
>seat into the board. --$230 # See slide two of the attached PDF.
>
>Good news last:
>I took the DHFR test case (mdin below) and ran it for 100 ps. I left ntpr=1
>to see when and where things got funky... Every single GPU in all five
>"nodes" produced identical mdouts. Afterwards, that one GPU died, though. I'm
>getting double the speed we were getting on our center's mega-expensive
>server (Tesla M2070s).
>custom short md
> &cntrl
> nstlim=100000, ig=11,ioutfm=1,ntxo=2,
> ntx=5, irest=1,
> ntc=2, ntf=2,
> ntpr=1, ntwr=10000,
> dt=0.002,
> ntt=1, tautp=10.0,
> temp0=300.0,
> ntb=2,ntp=1,taup=10.0,
> /
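>
>A minimal sketch of one way to check that the mdouts really are identical
>(assuming the per-GPU runs sit in directories like gpu0/ ... gpu3/ and each
>wrote a standard mdout; the layout and file names are just placeholders): it
>pulls every energy record out of each mdout and compares it to the first one.
>
>import glob
>
>def energy_lines(path):
>    """Return the energy records from an AMBER mdout (lines containing 'Etot')."""
>    with open(path) as fh:
>        return [line.rstrip() for line in fh if "Etot" in line]
>
>mdouts = sorted(glob.glob("gpu*/mdout"))  # hypothetical layout: gpu0/mdout, ...
>reference = energy_lines(mdouts[0])
>for path in mdouts[1:]:
>    status = "identical" if energy_lines(path) == reference else "DIFFERS"
>    print(f"{path}: {status} vs {mdouts[0]}")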
>
>
>Burn test info:
>Overall, our GPUs have not exceeded 90 C yet. The max sustained temperature
>we've seen is 87 C. They're in our old, cold server room.
>
>The only downtime in the last three weeks or so was while we were sorting out
>our NFS, PXE, and Slurm setup (only a few hours, really). Slurm is actually
>quite nice, simple, and very free.
>
>Of course, the intake fans for the GPUs (save the bottom one) suck air right
>off the heat sink of the GPU below. The GPUs are cleverly tapered right where
>the intake fans face the heat sink, leaving a whopping 2 or so millimeters
>for air. On slide one, we installed a massive fan to ram fresh air into the
>intakes (mod_1). Tests show that three GPUs heat up to 85 +/- 2 C (CVD=0,1,3)
>and one GPU to 76 C (CVD=2). Cooler Master R4-MFJR-07FK-R1 200mm MegaFlow
>--$19 # CVD = CUDA_VISIBLE_DEVICES
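>
>A minimal sketch of one way to log per-card temperatures during a burn like
>this (it only assumes nvidia-smi is on the PATH; the 60-second interval and
>log file name are arbitrary choices):
>
>import subprocess, time
>
>while True:  # stop with Ctrl-C
>    out = subprocess.run(
>        ["nvidia-smi", "--query-gpu=index,temperature.gpu",
>         "--format=csv,noheader,nounits"],
>        capture_output=True, text=True, check=True).stdout
>    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
>    with open("gpu_temps.log", "a") as log:
>        for line in out.strip().splitlines():
>            idx, temp = [field.strip() for field in line.split(",")]
>            log.write(f"{stamp} GPU{idx} {temp} C\n")
>    time.sleep(60)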
>
>On slide 3, you can see I ghetto-fabricated a cardboard box that ducts air
>from a 120mm fan directly into the GPUs' intake-tapered section (mod_2).
>Tests show that CVD=0 hits 84 C, CVD=1 hits 83 C, CVD=2 hits 74 C, and CVD=3
>hits 82 C. Benefit: the case has a very simple clip-on mount for a 120mm fan.
>Downside: we needed the duct to realize the benefit. We yanked a fan from the
>top of the case that had been needlessly serving the CPU.
>
>
>HTH,
>kevin
>
>
>On Mon, Jun 24, 2013 at 2:48 PM, ET <sketchfoot.gmail.com> wrote:
>
>> Thanks for the further info, Ross! :)
>>
>> Decided in the end to go for an Asus P8Z77 WS board with an Intel i7-3770K.
>> Slightly overkill, but I needed to future-proof it in the event of resale
>> or finding another use for it.
>>
>> br,
>> g
>>
>>
>> On 25 June 2013 03:44, <deeptinayar.gmail.com> wrote:
>>
>> >
>> > Sent from my BlackBerry® smartphone from !DEA
>> >
>> > -----Original Message-----
>> > From: ET <sketchfoot.gmail.com>
>> > Date: Sat, 22 Jun 2013 20:45:20
>> > To: AMBER Mailing List<amber.ambermd.org>
>> > Reply-To: AMBER Mailing List <amber.ambermd.org>
>> > Subject: Re: [AMBER] Anyone running machines with Quad GPU setups
>> >
>> > Looks like the Asus P9X79-E WS is for you then, Scott! :) I haven't seen
>> > many (if any!) boards with that amount of bandwidth so far.
>> >
>> > br,
>> > g
>> >
>> >
>> > On 22 June 2013 19:31, Scott Le Grand <varelse2005.gmail.com> wrote:
>> >
>> > > It may be overkill now, but I'm planning to revisit the multi-GPU code
>> > > in the near future, and that's why I need a motherboard that can really
>> > > take advantage of it.
>> > > On Jun 22, 2013 10:12 AM, "ET" <sketchfoot.gmail.com> wrote:
>> > >
>> > > > Hi,
>> > > >
>> > > > Thanks very much for your specs, Divi. :) I've been debating with
>> > > > myself about it as my board, as it looks good and has a very nice
>> > > > spec. From what I've read, the only problem with it is the
>> > > > higher-than-average power draw.
>> > > >
>> > > > Scott: I believe the board runs x8/x8/x8/x8 in a 4-GPU config, so
>> > > > effectively a PCIe 2.0 x16 rate. Would this present any problems if
>> > > > you were running the serial GPU code? From what I read on the AMBER
>> > > > GPU hardware page, this matters more for the parallel GPU code.
>> > > > Though I imagine having 4x serial runs going simultaneously would
>> > > > also tax the GPU-to-CPU interface, though how much I'm not sure.
>> > > >
>> > > > Apparently, if you are going Intel, you can only achieve PCIe 3.0
>> > > > using at least a Sandy Bridge-E or Ivy Bridge CPU in socket 1155.
>> > > > Please correct me if I have understood this incorrectly, though.
>> > > >
>> > > > http://www.enthusiastpc.net/articles/00003/3.aspx
>> > > >
>> > > >
>> > > > A socket 2011 proposition would be the Asus P9X79-E WS, which has 2x
>> > > > PLX PEX 8747 chips and so can run at x16/x16/x16/x16 with four GPUs.
>> > > >
>> > > > https://www.asus.com/Motherboards/P9X79E_WS/#specifications
>> > > >
>> > > > However, I'm unsure whether this is overkill for running 4x GPUs
>> > > > doing AMBER serial code.
>> > > >
>> > > > What do you guys think?
>> > > >
>> > > > br,
>> > > > g
>> > > >
>> > > >
>> > > >
>> > > > On 22 June 2013 16:15, Scott Le Grand <varelse2005.gmail.com> wrote:
>> > > >
>> > > > > Does this MB support full P2P at x16 PCIe Gen 3 speeds between all
>> > > > > 4 GPUs?
>> > > > > On Jun 21, 2013 4:09 PM, "Divi/GMAIL" <dvenkatlu.gmail.com> wrote:
>> > > > >
>> > > > > >
>> > > > > > ET:
>> > > > > > I am using the GA-Z77X-UP7, which has a PLX chipset and supports
>> > > > > > 3rd-gen CPUs in the LGA 1155 socket. I bought it together with 2
>> > > > > > TITANs sometime in March, and it has been running pretty stably
>> > > > > > 24/7 since then. I thought of buying two more TITANs later to
>> > > > > > fill all four slots, but with so much mess going on with the
>> > > > > > TITANs, I put off that plan until the dust settles. You might
>> > > > > > want to check the new 4th-gen CPUs and supporting motherboards,
>> > > > > > as the hardware keeps changing pretty rapidly these days.
>> > > > > >
>> > > > > > I have an i5 processor with 16 GB RAM and a 256 GB SSD. All four
>> > > > > > PCIe slots are x16. It also has a native x16 link "hardwired"
>> > > > > > directly to the CPU lanes that bypasses the PLX chipset in case
>> > > > > > you run a single GPU. This might reduce latency a bit, but not
>> > > > > > much. I get 35 ns/day on the FIX/NVE benchmark bypassing the PLX
>> > > > > > chipset, but about 34 ns/day going through it (on a TITAN, of
>> > > > > > course!!). Not a deal breaker.
>> > > > > >
>> > > > > > Link below:
>> > > > > >
>> > > > > >
>> > > > > > http://www.gigabyte.com/products/product-page.aspx?pid=4334#ov
>> > > > > >
>> > > > > > HTH
>> > > > > > Divi
>> > > > > >
>> > > > > > -----Original Message-----
>> > > > > > From: ET
>> > > > > > Sent: Thursday, June 20, 2013 8:18 PM
>> > > > > > To: AMBER Mailing List
>> > > > > > Subject: [AMBER] Anyone running machines with Quad GPU setups
>> > > > > >
>> > > > > > Hi all,
>> > > > > >
>> > > > > > I was looking at getting a new mobo to run a quad-GPU system. I
>> > > > > > was wondering if anyone has done this. If you could post the
>> > > > > > model & make of:
>> > > > > >
>> > > > > > 1) motherboard
>> > > > > > 2) CPU
>> > > > > > 3) RAM
>> > > > > > 4) Case
>> > > > > > 5) the aggregate estimate of ns of simulation you have run on
>> > > > > > your setup without issue,
>> > > > > >
>> > > > > > I would be much obliged! :)
>> > > > > >
>> > > > > > br,
>> > > > > > g
>>
>
>
>
>--
>-- - -
>HK
>
>
>════════════════════════════════════════════
>Kevin E. Hauser, Ph.D. Candidate
>NRSA Fellow, National Institutes of Health
>Carlos Simmerling Laboratory
>Miguel Garcia-Diaz Laboratory
>100 Laufer Center for Physical and Quantitative Biology
>Stony Brook, New York 11794-5252
>Phone: (631) 632.5394 Email: 84hauser.gmail.com
>════════════════════════════════════════════
>



_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber

Received on Mon Jun 24 2013 - 17:00:04 PDT