Re: [AMBER] Problem with two jobs on one GPU

From: Marek Maly <marek.maly.ujep.cz>
Date: Wed, 08 Jun 2011 12:46:21 +0200

Hi Ross and Scott,

thanks a lot for your analyses and the final suggestion.

Just for completeness, here is the earlier exchange that I mentioned in
the current discussion:

http://archive.ambermd.org/201102/0425.html

Of course, I am not able to say anything about the martinis you
mentioned :))

   Best wishes,

        Marek







On Mon, 06 Jun 2011 18:23:08 +0200, Ross Walker <ross.rosswalker.co.uk>
wrote:

> Hi Marek,
>
>> Some time ago I asked a question here, as part of a discussion, about
>> the possibility of overloading a GPU card (here a Tesla 2050) with the
>> calculations of two simultaneous jobs. If I remember correctly, Ross
>> answered that it should be OK, although performance would decrease
>> significantly compared to running just one job.
>
> I VERY MUCH doubt I said it like that. If I did, I was probably answering
> emails after one too many martinis. Yes, you can run 2 jobs at the same
> time on a single GPU. Do I recommend it? NO, I CERTAINLY DO NOT. It is
> pointless for anything other than testing. The performance will
> completely suck; each job will run at maybe 10% of the speed it normally
> would. Thus, if you run 2 jobs at once, you get about 20% of the total
> throughput you would get running a single job. So suppose you want to run
> 2 jobs, each 5 ns long, and you get 10 ns/day for each job on its own. If
> you run:
>
> Job 1 to completion followed by job 2 to completion: Time = 12 hours for
> job 1 + 12 hours for job 2 = 24 hours
>
> However, if you run both together then you get about 10% performance for
> each, so job 1 will get about 1 ns/day, and the same for job 2. But they
> run concurrently, so the time needed is that for a single job = 5 days!
>
> So you are ALWAYS better off running the jobs sequentially.
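
The arithmetic above can be checked with a few lines of Python. This is only
a sketch of the estimate quoted here: the function names are made up for
illustration, and the 10% figure is the rough guess given above, not a
measured slowdown.

```python
def sequential_hours(job_ns, ns_per_day, n_jobs):
    """Wall-clock hours to run n_jobs one after another at full speed."""
    return n_jobs * job_ns / ns_per_day * 24.0


def concurrent_hours(job_ns, ns_per_day, speed_fraction):
    """Wall-clock hours when all jobs share the GPU at the same time.

    speed_fraction is the assumed fraction of stand-alone speed each job
    keeps (0.10 is the rough estimate from the message, not a measurement).
    The jobs finish together, so the total time is that of a single job.
    """
    return job_ns / (ns_per_day * speed_fraction) * 24.0


# Two 5 ns jobs, each reaching 10 ns/day when it has the GPU to itself.
seq = sequential_hours(5.0, 10.0, 2)
con = concurrent_hours(5.0, 10.0, 0.10)
print(f"sequential: {seq:.0f} h, concurrent: {con:.0f} h ({con / 24:.0f} days)")
```

With these numbers the sequential schedule takes 24 hours and the concurrent
one 120 hours, i.e. the 5 days mentioned above.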
>
>> Today I tried to put 2 jobs on one Tesla 2050, and when I launched the
>> second job it crashed with this error message:
>>
>> -----------------------------------------------------------------------
>> cudaMemcpyToSymbol: SetSim copy to cSim failed all CUDA-capable devices
>> are busy or unavailable
>> -----------------------------------------------------------------------
>>
>> From the memory point of view it should be OK (enough for both jobs).
>
> It is possible NVIDIA changed the way things work in the driver, meaning
> it won't let you do this anymore. Or maybe it lets you, but you have to
> look for busy signals in your code and retry. This is something the AMBER
> code does not have and is something that will NOT be added. So, my simple
> suggestion is just to run 1 job at a time per GPU.
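
For readers wondering what "look for busy signals and retry" would mean in
general, here is a generic Python sketch of that pattern. It is not AMBER
code and does not call CUDA: `DeviceBusyError`, `run_with_retry` and
`fake_launch` are hypothetical names invented for this illustration, and as
stated above AMBER does not do this.

```python
import time


class DeviceBusyError(RuntimeError):
    """Hypothetical stand-in for a 'device busy or unavailable' failure."""


def run_with_retry(launch, attempts=5, wait_seconds=1.0):
    """Call launch(); if it reports a busy device, wait and try again.

    Re-raises DeviceBusyError if the last attempt also fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return launch()
        except DeviceBusyError:
            if attempt == attempts:
                raise
            time.sleep(wait_seconds)


# Illustration only: a fake launcher that is "busy" on its first two calls.
calls = {"count": 0}


def fake_launch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise DeviceBusyError("device busy")
    return "started"


result = run_with_retry(fake_launch, attempts=5, wait_seconds=0.0)
print(result, "after", calls["count"], "attempts")
```

Even with such a retry loop, two jobs sharing one GPU would still run far
slower than the same jobs run one after another.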
>
> All the best
> Ross
>
>
> /\
> \/
> |\oss Walker
>
> ---------------------------------------------------------
> | Assistant Research Professor |
> | San Diego Supercomputer Center |
> | Adjunct Assistant Professor |
> | Dept. of Chemistry and Biochemistry |
> | University of California San Diego |
> | NVIDIA Fellow |
> | http://www.rosswalker.co.uk | http://www.wmd-lab.org/ |
> | Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
> ---------------------------------------------------------
>
> Note: Electronic Mail is not secure, has no guarantee of delivery, may not
> be read every day, and should not be used for urgent or sensitive issues.
>
>
>
>
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber


Received on Wed Jun 08 2011 - 04:30:02 PDT