Hi Jan-Philip,
Both Torque and Slurm are known to work. I have set up Slurm to allow
shared use of nodes, which lets you hand out GPUs on an individual basis
rather than as a whole node. Slurm's documentation is pretty lacking,
though: there's lots of it, but it tends to focus on stupidly
complicated and arcane examples, so it can be a pain to figure out.
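
As a rough illustration of per-GPU allocation in Slurm, here is a minimal
sketch of the relevant configuration. The node names (gpunode[01-04]), the
GPU and CPU counts, and the device paths are hypothetical placeholders, and
the select plugin name may differ depending on your Slurm version, so treat
this as a starting point under those assumptions rather than a tested setup.

```
# slurm.conf (excerpt) -- hypothetical node names and counts
GresTypes=gpu                      # declare "gpu" as a generic resource type
SelectType=select/cons_res         # consumable resources, so jobs can share a node
SelectTypeParameters=CR_Core       # allocate individual cores rather than whole nodes
NodeName=gpunode[01-04] Gres=gpu:4 CPUs=16 State=UNKNOWN
PartitionName=gpu Nodes=gpunode[01-04] Default=YES State=UP

# gres.conf on each GPU node -- the device paths are an assumption
Name=gpu File=/dev/nvidia[0-3]
```

A job would then request a single GPU with something like
"sbatch --gres=gpu:1 job.sh", leaving the other GPUs on that node free for
other jobs.
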
I have not used Torque, but others have reported success with it. There is
a minor issue in that the Maui scheduler doesn't properly understand GPUs
even though Torque does, so you can't run with Maui unless you are happy
allocating at node rather than GPU granularity.
Hope that helps. Others can hopefully give you more detailed info.
All the best
Ross
On 3/18/13 6:35 AM, "Jan-Philip Gehrcke" <jgehrcke.googlemail.com> wrote:
>Hello,
>
>I am aiming to set up a free GPU job scheduling solution in order to
>distribute (Amber) GPU computing jobs among various nodes containing
>CUDA devices. However, the corresponding resources on the web are still
>scarce. When searching the web for the term "GPU job scheduling",
>this short list hosted by Nvidia is the most informative result:
>https://developer.nvidia.com/job-scheduling
>
>I am currently looking into setting up Torque which "supports" NVIDIA
>GPUs:
>http://docs.adaptivecomputing.com/torque/4-0-2/Content/topics/3-nodes/NVIDIAGPGPUs.htm
>
>However, before proceeding, I would be very interested to hear about the
>experiences others have had.
>
>So, if you have set up a GPU cluster with proper job management based on
>free software, then I would be happy to read about the scheduler
>software of your choice, complications you ran into, and other
>experiences you find worth mentioning. I'm sure this would help not
>only me!
>
>
>Thanks a lot,
>
>Jan-Philip
>
>
>
>_______________________________________________
>AMBER mailing list
>AMBER.ambermd.org
>http://lists.ambermd.org/mailman/listinfo/amber
Received on Mon Mar 18 2013 - 08:30:03 PDT