Hi Alessandro,
> We are considering buying a new 'low-cost' computer cluster dedicated
> to MD simulations. We already have two clusters interconnected by
> gigabit ethernet. One of these is composed of Intel quad-core
> processors and, as discussed many times on this list, suffers from
> scaling problems when running Amber on more than two nodes.
>
> That said, I would like to ask users and developers about good
> choices to consider when buying gigabit Ethernet cards and
> switches. My dream is to convince my PI to upgrade to Myrinet, but
> price is also an important issue.
In the 'good old' days, when gigabit ethernet was still somewhat usable as an
interconnect, I would have advised you to go for good-quality cards and a
top-notch, fully non-blocking switch. That advice doesn't really apply anymore,
since even in that scenario it doesn't really help with running in parallel.
It can help a little with NFS, say, but to be honest most cheap gigabit
switches, even those that are not non-blocking, will still outstrip disk
performance, so I would suggest going for a fairly cheap switch and cards that
are compatible with Linux. That is probably the key here - making sure they
are supported under Linux. I would recommend the Intel e1000 chip based cards,
which are effectively guaranteed to work under Linux.
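If you want to double-check that a card really is being driven by e1000 under
Linux, something like the following should tell you (just a quick sketch - the
eth1 interface name is only an example and will differ on your machines):

  # List the Ethernet controllers the kernel can see
  # (look for an Intel PRO/1000 class device)
  lspci | grep -i ethernet

  # Check that the e1000 (or e1000e) module is actually loaded
  lsmod | grep -i e1000

  # Report link speed / duplex on a specific interface
  # (eth1 here is just an example name)
  ethtool eth1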
As for trying to at least get some scalability - so you can run 2-node jobs -
something you could try (given that extra network cards are cheap) is to buy
2 cards for each machine. Get yourself 1 cheap switch and hook up one set of
the cards to give you NFS, ssh access / management facilities etc. Then, with
the second card in each machine, use crossover cables to hook up each pair
of machines. E.g.
Mach 1 eth0 --------|
                    |
Mach 2 eth0 --------|
                    |------ Switch ------- File Server / login node
Mach 3 eth0 --------|
                    |
Mach 4 eth0 --------|

Mach 1 eth1 ---X--- Mach 2 eth1
Mach 3 eth1 ---X--- Mach 4 eth1
Where X is a crossover cable. Then you just need to assign a different
private subnet to the eth1 cards and make sure your machine files are
modified as necessary to send MPI traffic over the crossover cables. This
way you should be able to run 8-way jobs (as 2 x quad-core) for only a small
amount of extra outlay, and I think the performance would be better than
trying to run 2-node jobs over the switch. You would of course need to tweak
your queuing software to hand out nodes in the correct pairs and also create
the correct machine files, but that shouldn't be too hard.
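Just to illustrate what I mean, the addressing and machine file could look
something like the sketch below. Everything here is made up by way of example
- the 192.168.10.x subnet, the mach1-mpi style hostnames and the MPICH-style
machine file syntax are assumptions, not anything specific to your setup:

  # On Mach 1, put the crossover NIC on its own private subnet
  # (use 192.168.10.2 on Mach 2, and e.g. 192.168.20.1/.2 for the
  #  Mach 3 / Mach 4 pair so each crossover link has its own subnet)
  ifconfig eth1 192.168.10.1 netmask 255.255.255.0 up

  # /etc/hosts entries on both machines so MPI resolves the crossover
  # addresses by name rather than going via the switch
  192.168.10.1   mach1-mpi
  192.168.10.2   mach2-mpi

  # MPICH-style machine file for an 8-way job on the Mach 1 / Mach 2
  # pair (2 nodes x 4 cores), pointing the MPI traffic at eth1
  mach1-mpi:4
  mach2-mpi:4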
Note, though, that I have not actually tried this (not for AMBER anyway, just
for doing private NFS mounts / backups etc.), but I figure it should work
quite well. The thing I don't know is how well things will scale over the
crossover cable, and whether it is worth it on that basis, so you might want
to just get a crossover cable and try it quickly on some of your existing
machines.
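For that quick test, once the eth1 interfaces on the two test machines are set
up as above, something along these lines should be enough (again just a sketch
- the machine file name, cpu count and input files are placeholders, and the
mpirun flags assume an MPICH-style MPI):

  # Run a short benchmark over the crossover link
  # (crossover_hosts lists the two *-mpi hostnames from the example above)
  mpirun -np 8 -machinefile ./crossover_hosts \
      $AMBERHOME/exe/sander.MPI -O -i md.in -p prmtop -c inpcrd -o md.out

  # Compare the timing summary at the end of md.out with the same job run
  # over the gigabit switch to see whether the crossover link actually helps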
Anyway, just my 3c.
Good luck,
Ross
/\
\/
|\oss Walker
| Assistant Research Professor |
| San Diego Supercomputer Center |
| Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
| http://www.rosswalker.co.uk | PGP Key available on request |
Note: Electronic Mail is not secure, has no guarantee of delivery, may not
be read every day, and should not be used for urgent or sensitive issues.