Re: [AMBER] Chained AMBER jobs crash on Dual GPU compute node.

From: Jason Swails <jason.swails.gmail.com>
Date: Wed, 29 May 2013 14:16:24 -0400

On Wed, May 29, 2013 at 12:39 PM, ET <sketchfoot.gmail.com> wrote:

> Hi Jason,
>
> Thanks very much for the information! In the present situation it is only
> me running the jobs, but I'm planning on getting TORQUE installed on some
> of the department machines. In that situation, from what has been said, it
> seems wise not to mention the PBS_GPUFILE and to let the system allocate
> resources accordingly. This should be ok as all the GPUs are the same and
> thus I don't see a reason why someone would prefer one over another.
>

One thing to be wary of is whether or not a specific application uses the
Torque API and will attempt to assign a GPU based on the contents of
PBS_GPUFILE (or whether the Torque API even supports robust GPU selection).
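
For applications that do not read PBS_GPUFILE themselves, a job script could
translate it into CUDA_VISIBLE_DEVICES before launching the program. The
following is only a rough sketch, not anything Torque or AMBER ships. It
assumes each line of the file has the form '<host>-gpu<N>' and that comparing
short host names is good enough on your cluster; the function names are made
up for illustration.

```python
import os
import re
import socket

# Assumed line format of the file named by PBS_GPUFILE: "<host>-gpu<N>".
_GPU_LINE = re.compile(r"^(?P<host>.+)-gpu(?P<idx>\d+)$")


def gpu_ids_from_pbs_gpufile(path, hostname):
    """Return the sorted GPU indices listed for `hostname` in `path`.

    Hypothetical helper: hosts are compared by short name (text before the
    first dot), which is an assumption about how the nodes are named.
    """
    short = hostname.split(".")[0]
    ids = set()
    with open(path) as fh:
        for line in fh:
            match = _GPU_LINE.match(line.strip())
            if match and match.group("host").split(".")[0] == short:
                ids.add(int(match.group("idx")))
    return sorted(ids)


def cuda_visible_devices(environ, hostname):
    """Build a CUDA_VISIBLE_DEVICES value, or None if nothing was assigned."""
    gpufile = environ.get("PBS_GPUFILE")
    if not gpufile or not os.path.isfile(gpufile):
        return None  # not running under the scheduler, or no GPUs requested
    ids = gpu_ids_from_pbs_gpufile(gpufile, hostname)
    return ",".join(str(i) for i in ids) if ids else None


if __name__ == "__main__":
    value = cuda_visible_devices(os.environ, socket.gethostname())
    if value is None:
        print("No GPU assignment found; leaving CUDA_VISIBLE_DEVICES alone")
    else:
        # Print a line the job script can eval before starting the real job.
        print("export CUDA_VISIBLE_DEVICES=" + value)
```

When no assignment is found the sketch deliberately changes nothing, so the
application falls back to whatever GPU selection it would do on its own.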

I'm not trying to convince you to use a different strategy (it's the same
one I would probably employ). I'm just pointing out that no solution is
'ideal' for automating GPU load balancing in a scheduler context (unlike
CPU allocation, which the kernel handles automagically), and you should be
ready for any problems that may arise because of it.

All the best,
Jason

-- 
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
352-392-4032
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed May 29 2013 - 11:30:02 PDT