Re: [AMBER] The system size limitations for Tesla C2050 ?

From: Scott Le Grand
Date: Tue, 13 Jul 2010 15:13:14 -0700

No, I'm only talking about multi-GPU. Single-GPU jobs run more or less independently unless you're doing something silly like dumping coordinates on every single step.

-----Original Message-----
From: Marek Maly
Sent: Tuesday, July 13, 2010 14:47
To: AMBER Mailing List
Cc: massimo maiolo
Subject: Re: [AMBER] The system size limitations for Tesla C2050 ?

If this were also true for single-GPU jobs on such a machine,
then of course a hybrid solution would really be a waste of potential
performance and money.

Thanks a lot for this valuable and not so obvious information!

Best wishes,


On Tue, 13 Jul 2010 23:37:19 +0200, Scott Le Grand wrote:

> Without a lot of load-balancing work, a hybrid solution will have the
> performance of the slowest part...
> -----Original Message-----
> From: Marek Maly
> Sent: Tuesday, July 13, 2010 09:46
> To: AMBER Mailing List
> Cc: massimo maiolo
> Subject: Re: [AMBER] The system size limitations for Tesla C2050 ?
> OK,
> many thanks to Ross and Scott for their explanation!
> I hope that many C2050 users were as pleased as I was :))
> In fact, my question was motivated by the fact that my colleagues
> are about to buy a small GPU machine (just a 4-GPU "workstation"),
> so we were thinking about the proper combination of C1060 and C2050:
> the C1060 is slower but able to calculate big systems (400K atoms and
> somewhat more), while the C2050 is pretty fast and suitable for small
> and medium-sized systems (let's say up to 300K atoms).
> After this new information about the "magic patch", it seems that it will
> be sufficient to go with a combination of 3xC2050 + 1xC1060 or even
> with 4xC2050.
> The question of whether a hybrid solution is worth considering can be
> answered fully only once it is clear what the difference is in the
> maximum system size computable on the C2050 and the C1060 after the
> patch is applied.
> This will probably still be determined by the difference in the GPUs'
> built-in memory (4 GB on the C1060 and 3 GB on the C2050), but maybe
> not only by that: apart from the differences in architecture, which can
> affect how this memory is used, it is also possible that these GPUs make
> different use of some additional part of the RAM of the machine in which
> they are installed ...
> The final answer can probably be given once the new "after-patch"
> tests/benchmarks are done.
> Thanks again for the very good news and for the effort and time of all
> the developers!
> Best wishes,
> Marek
> On Mon, 12 Jul 2010 20:02:19 +0200, Ross Walker wrote:
>> Hi Marek,
>>> Could you please send some more information (the proper web link is
>>> enough) about the patch with which a single C2050 was able to
>>> calculate a 400K-atom system (i.e. the cellulose benchmark)?
>> Please be patient here. The patch will come as part of the 'monster' patch
>> to add parallel support. This all needs to be extensively tested before
>> release to make sure the code is giving the correct answers, that we have
>> found as many bugs as we can etc.
>> I would like to avoid people being given 'experimental' or 'partial' patches
>> since it will just make support a complete disaster down the line. Given
>> people ultimately want to publish the results from their simulations it is
>> also critical that others be able to reproduce their work and this is
>> difficult if there are multiple versions of AMBER out there, especially with
>> something as new as the CUDA GPU support.
>>> You are right, the speedup (speaking about explicit solvent Amber
>>> calculations) is from circa 40 to 100% according
>>> to the relevant benchmark:
>>> From that benchmark it is evident that the speedup is strongly dependent
>>> on system size (the speedup decreases as the size increases).
>> Yes this will ALWAYS be the case. The interesting thing about the GPU
>> situation is that the speedup for small systems such as JAC is greater than
>> for large systems such as FactorIX. The reasons for this, as with all
>> benchmarks, are hopelessly complex and a function of the way memory access
>> is done on the GPU but also the fact that on the CPU the larger test case
>> scales better to the 8 cores of the test machine than the smaller one. This
>> is often what is missing when people just talk about speedup since there are
>> MANY degrees of freedom. However, the key point is that the AMBER GPU code
>> gets better speedup with smaller systems than larger ones. This of course
>> breaks down if you go too small. Probably JAC is the sweet spot although I've
>> never had time to characterize it properly. Note this is the complete
>> reverse of MPI where the larger the system the better the scaling.
>> So, in summary with regards to the patch, please be patient. I wish things
>> could be done a lot faster but ultimately funding is the limitation which
>> limits the number of people that can work on this. I'm sure NVIDIA would
>> love to chuck out the patch to you right now etc but that is because they
>> ultimately don't have to support this when things go wrong. Plus I
>> appreciate the need for the science to be correct! So just give us a while
>> to get things properly tested and then the patch will be posted on the amber
>> website.
>> All the best
>> Ross
>> /\
>> \/
>> |\oss Walker
>> ---------------------------------------------------------
>> | Assistant Research Professor |
>> | San Diego Supercomputer Center |
>> | Adjunct Assistant Professor |
>> | Dept. of Chemistry and Biochemistry |
>> | University of California San Diego |
>> | | |
>> | Tel: +1 858 822 0854 | EMail:- |
>> ---------------------------------------------------------
>> Note: Electronic Mail is not secure, has no guarantee of delivery, may not
>> be read every day, and should not be used for urgent or sensitive issues.
>> _______________________________________________
>> AMBER mailing list

Received on Tue Jul 13 2010 - 15:30:03 PDT
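
Scott's remark in this thread that, without load-balancing work, a hybrid multi-GPU run has the performance of its slowest part can be illustrated with a toy model. This is only a sketch under stated assumptions: the relative speeds (C1060 = 1.0, C2050 = 1.5), the function names, and the even-split model are all hypothetical and are not taken from AMBER or from any benchmark.

```python
# Toy model of one synchronous multi-GPU MD step. All numbers are
# illustrative assumptions, not AMBER measurements.

def step_time_even_split(total_work, speeds):
    """Time per step when every GPU gets the same share of the work.

    The step finishes only when the slowest GPU finishes its share.
    """
    share = total_work / len(speeds)
    return max(share / speed for speed in speeds)


def step_time_balanced(total_work, speeds):
    """Time per step when work is split in proportion to GPU speed,
    so that all GPUs finish at the same moment (ideal load balancing)."""
    return total_work / sum(speeds)


# Hypothetical relative speeds (assumed for illustration only).
C1060, C2050 = 1.0, 1.5
hybrid = [C2050, C2050, C2050, C1060]
all_c1060 = [C1060] * 4
all_c2050 = [C2050] * 4

work = 4.0  # arbitrary units of work per step
t_hybrid_even = step_time_even_split(work, hybrid)
t_c1060_even = step_time_even_split(work, all_c1060)
t_c2050_even = step_time_even_split(work, all_c2050)
t_hybrid_balanced = step_time_balanced(work, hybrid)

print(t_hybrid_even, t_c1060_even, t_c2050_even, t_hybrid_balanced)
```

In this model the evenly split 3xC2050 + 1xC1060 run takes exactly as long per step as a 4xC1060 run, so the faster cards bring nothing; only the proportional split recovers part of their speed, and even then it stays slower than 4xC2050.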