Re: [AMBER] The system size limitations for Tesla C2050 ?

From: Scott Le Grand <>
Date: Tue, 13 Jul 2010 14:37:19 -0700

Without a lot of load-balancing work, a hybrid solution will have the performance of the slowest part...
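
A minimal sketch of the point above, assuming each step is synchronized (every card must finish its share before the next step can start). The relative speeds are made-up illustrative numbers, not benchmarks, and `step_time` is a hypothetical helper, not part of AMBER:

```python
def step_time(work_shares, speeds):
    """Time for one synchronized step: the slowest device sets the pace."""
    return max(w / s for w, s in zip(work_shares, speeds))

# Hypothetical relative speeds: three fast cards plus one slow card.
speeds = [2.0, 2.0, 2.0, 1.0]
total_work = 1.0

# Naive split: every card gets the same share of the work.
equal = [total_work / len(speeds)] * len(speeds)
# Load-balanced split: each card's share is proportional to its speed.
balanced = [total_work * s / sum(speeds) for s in speeds]

t_equal = step_time(equal, speeds)
t_balanced = step_time(balanced, speeds)
# Reference: four slow cards with an equal split.
t_all_slow = step_time(equal, [1.0] * len(speeds))

print(f"equal split:    {t_equal:.4f}")
print(f"balanced split: {t_balanced:.4f}")
print(f"4 slow cards:   {t_all_slow:.4f}")
```

With an equal split, the mixed machine in this toy model is no faster than four slow cards; only the speed-proportional split recovers the benefit of the faster cards, and that is the load-balancing work referred to above.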

-----Original Message-----
From: Marek Maly []
Sent: Tuesday, July 13, 2010 09:46
To: AMBER Mailing List
Cc:; massimo maiolo
Subject: Re: [AMBER] The system size limitations for Tesla C2050 ?


Many thanks to Ross and Scott for their explanation!
I hope many C2050 users were as pleased as I am :))

In fact, my question was motivated by the fact that my colleagues are about
to buy a small GPU machine (just a 4-GPU "workstation"), so we were thinking
about the right combination of C1060 and C2050 cards.

The C1060 is slower but able to handle big systems (400K atoms and somewhat
more), while the C2050 is pretty fast and suitable for small and medium-sized
systems (say, up to 300K atoms).

Given this new information about the "magic patch", it seems it will be
sufficient to go with a combination of 3xC2050 + 1xC1060, or even with
4xC2050.

Whether a hybrid solution is worth considering can be answered fully only
once it is clear how the maximum computable system size differs between the
C2050 and the C1060 after the patch is applied.

This will probably still be determined by the difference in the GPUs'
built-in memory (4 GB on the C1060, 3 GB on the C2050), but perhaps not only
by that. Apart from architectural differences, which can affect how this
memory is used, it is also possible that these GPUs make different use of
some additional part of the RAM of the machine in which they are installed ...

The final answer can probably be given only after the new post-patch
tests/benchmarks are done.

Thanks again for the very good news, and to all the developers for their
time and effort!

   Best wishes,


On Mon, 12 Jul 2010 20:02:19 +0200, Ross Walker <> wrote:

> Hi Marek,
>> Could you please send some more information (the proper web link is
>> enough) about the patch after which a single C2050 was able to calculate
>> a 400K-atom system (i.e. the cellulose benchmark)?
> Please be patient here. The patch will come as part of the 'monster' patch
> to add parallel support. This all needs to be extensively tested before
> release to make sure the code is giving the correct answers, that we have
> found as many bugs as we can etc.
> I would like to avoid people being given 'experimental' or 'partial' patches
> since it will just make support a complete disaster down the line. Given
> people ultimately want to publish the results from their simulations it is
> also critical that others be able to reproduce their work and this is
> difficult if there are multiple versions of AMBER out there, especially with
> something as new as the CUDA GPU support.
>> You are right, the speedup (speaking about the explicit solvent Amber
>> calculations) is approx. 40 to 100% according
>> to the relevant benchmark:
>> From that benchmark it is evident that the speedup depends strongly on
>> system size (the speedup decreases as the size increases).
> Yes this will ALWAYS be the case. The interesting thing about the GPU
> situation is that the speedup for small systems such as JAC is greater than
> for large systems such as FactorIX. The reasons for this, as with all
> benchmarks, are hopelessly complex and a function of the way memory access
> is done on the GPU but also the fact that on the CPU the larger test case
> scales better to the 8 cores of the test machine than the smaller one. This
> is often what is missing when people just talk about speedup since there are
> MANY degrees of freedom. However, the key point is that the AMBER GPU code
> gets better speedup with smaller systems than larger ones. This of course
> breaks down if you go too small. Probably JAC is the sweetspot although I've
> never had time to characterize it properly. Note this is the complete
> reverse of MPI where the larger the system the better the scaling.
> So, in summary with regards to the patch, please be patient. I wish things
> could be done a lot faster but ultimately funding is the limitation which
> limits the number of people that can work on this. I'm sure NVIDIA would
> love to chuck out the patch to you right now etc but that is because they
> ultimately don't have to support this when things go wrong. Plus I
> appreciate the need for the science to be correct! So just give us a while
> to get things properly tested and then the patch will be posted on the amber
> website.
> All the best
> Ross
> /\
> \/
> |\oss Walker
> ---------------------------------------------------------
> | Assistant Research Professor |
> | San Diego Supercomputer Center |
> | Adjunct Assistant Professor |
> | Dept. of Chemistry and Biochemistry |
> | University of California San Diego |
> | | |
> | Tel: +1 858 822 0854 | EMail:- |
> ---------------------------------------------------------
> Note: Electronic Mail is not secure, has no guarantee of delivery, may not
> be read every day, and should not be used for urgent or sensitive issues.
> _______________________________________________
> AMBER mailing list

Received on Tue Jul 13 2010 - 15:00:03 PDT