RE: Amber+Mosix

From: Ross Walker <ross_at_rosswalker.co.uk>
Date: Sun 11 Aug 2002 22:33:04 +0100

Dear Jian,

> Do people run Amber jobs on Linux clusters with Mosix? As Mosix can
> migrate a job to other, faster nodes automatically, it is quite
> convenient to use. You submit the job on one node and get the result
> without worrying about (or even knowing) which nodes do the work.

I run Mosix both on dedicated clusters and on workstations / teaching
machines that are not in use, and it works perfectly; it is very
impressive. As regards Amber, it is fine as long as you are running
single-processor (non-MPI) jobs. On a cluster of 26 CPUs connected with
dual (channel-bonded) 100 Mbps Ethernet I can run 26 copies of Amber and
Mosix will happily migrate them. There are very few networking issues,
since Amber does not write huge amounts of data to disk. Even when it
writes to the MDCRD file every step, it is still well within the network
capacity.

You can get things even faster by using the new parallel distributed
filesystem coupled with MFS. If you specify that the files be written
to /mfs/nodeid/home/mydir/, where nodeid is the ID of a node in your
cluster, then Mosix will try to migrate the job to that node so that it
can do local I/O. Hence cycling the scratch directory for each job by
changing the nodeid results in local I/O for most jobs rather than
network I/O.
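
As a rough illustration of the scratch-directory cycling described
above, here is a minimal Python sketch. It only builds the command
lines and does not run anything. The node IDs, the input file names
(md.in, prmtop, inpcrd) and the job naming are hypothetical; only the
/mfs/nodeid/home/mydir/ layout comes from the text, and the sander
options shown are the usual -O/-i/-o/-p/-c/-r/-x ones.

```python
from itertools import cycle


def scratch_dirs(node_ids, n_jobs, subdir="home/mydir"):
    """Return one MFS scratch directory per job, cycling through node_ids."""
    nodes = cycle(node_ids)
    return [f"/mfs/{next(nodes)}/{subdir}/" for _ in range(n_jobs)]


def sander_command(scratch, job):
    """Build a single-processor sander command writing its output to scratch.

    Input file names are placeholders (assumptions), not from the post.
    """
    prefix = f"{scratch}job{job}"
    return [
        "sander", "-O",
        "-i", "md.in",            # hypothetical input control file
        "-p", "prmtop",           # hypothetical topology file
        "-c", "inpcrd",           # hypothetical starting coordinates
        "-o", f"{prefix}.out",    # output log on the chosen node's disk
        "-r", f"{prefix}.rst",    # restart file
        "-x", f"{prefix}.mdcrd",  # trajectory (the MDCRD file)
    ]


if __name__ == "__main__":
    # Six jobs spread round-robin over four (hypothetical) node IDs.
    for job, scratch in enumerate(scratch_dirs([1, 2, 3, 4], 6), start=1):
        print(" ".join(sander_command(scratch, job)))
```

Each job gets a different nodeid in its output path, so whichever node
Mosix settles a job on is likely to be the one holding its files.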
 
> What worries me is that the migrations may make the interconnect
> latency issue more severe. Am I right? Do you use Amber with Mosix
> nodes? Are you happy with the performance in terms of scalability?

The performance is fine: 26 single-CPU jobs run just as fast as if I ran
them manually without Mosix. The problem comes when you want to use MPI
to run multi-CPU jobs. The MPI processes will NOT be migrated by Mosix,
so doing "mpirun -np 8 sander" on your head node will simply fire up
the 8 processes on the nodes specified in the MPI definition file.
However, all is not lost, since Mosix will then detect that these nodes
are in use and migrate single-processor jobs that are running on those
nodes to ones that are less loaded. Hence, while you still have to
manually specify a node list for MPI jobs, users of single-processor
jobs need not worry, since their jobs will automatically be moved "out
of the way" of the MPI jobs.
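
To make the manual node list concrete, here is a small Python sketch
that produces a machine file body and the matching mpirun command line.
It assumes an MPICH-style mpirun that accepts -machinefile with one host
name per line; the host names and the file name machines.mpi are
hypothetical, and other MPI implementations may use a different option
or file format.

```python
def machine_file_text(hosts):
    """Return machine file contents: one host name per line."""
    return "".join(f"{host}\n" for host in hosts)


def mpirun_command(n_procs, machine_file, program="sander"):
    """Build an MPICH-style mpirun command (the -machinefile flag is assumed)."""
    return ["mpirun", "-np", str(n_procs),
            "-machinefile", machine_file, program]


if __name__ == "__main__":
    # Eight hypothetical nodes reserved by hand for the MPI job.
    hosts = [f"node{i:02d}" for i in range(1, 9)]
    print(machine_file_text(hosts), end="")
    print(" ".join(mpirun_command(8, "machines.mpi")))
```

The eight processes stay on the listed hosts for the life of the run;
only the single-processor jobs are moved by Mosix.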

I hope this helps.
All the best
Ross

/\
\/
|\oss Walker

| Imperial College of Science, Technology & Medicine |
| Department of Chemistry | Theoretical Division |
| Tel:- +44 20 759(45851) |
| EMail:- http://www.rosswalker.co.uk/ |
| PGP Key available on request |

Please note:
The above message is not necessarily the views of the author, Imperial
College or The Department of Chemistry. The author of this message
accepts no responsibility for any offence caused. If this message is
forwarded to anyone not on the original contact list then the
responsibility lies solely with the person forwarding the email and not
with the author.
Received on Sun Aug 11 2002 - 14:33:04 PDT