Re: AMBER: parallel computation time vs serial computation time

From: Robert Duke <>
Date: Thu, 14 Dec 2006 09:16:22 -0500

Cenk -
I have a bunch of information out on the amber website that you should read. Setting this stuff up, if you are not basically a computer geek, is not something that you will automatically get right. So look at the sections entitled "configuring amber x for various architectures" - the material I wrote for amber 8 applies to 9 in terms of mpich. Several other points: 1) using plain old mpich is actually somewhat simpler than mpich2 because you don't have to dink with the daemons, 2) for dual cpu's in one box, be sure you build the shared-memory version of mpich or mpich2, and 3) if you happen to be writing to an nfs share somewhere, that can be the real bottleneck.

Still, you should have mpi profiling info at the end of the job, showing a bit about communication costs and what have you, and even for a big problem I think you should get at least 70-80% efficiency in this sort of setup. As you move to multiple boxes connected via gigabit ethernet or (hopefully) infiniband, getting everything right gets even more complicated. The unfortunate truth about all this open source cluster stuff is that it is not at all user friendly or installer friendly.

Anyway, looking at our benchmarks should give you an idea what to expect from equivalent hardware - run the factor ix and jac benchmarks from the amber distribution. Read the mpich or mpich2 manual, or get somebody who understands it to set up the environment. Good luck! (And you may want to consider building pmemd for better performance too, but it won't solve all the other problems - if sander does not give you any parallel gain, pmemd won't either.)
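As a rough sketch of the shared-memory build Bob describes (the configure flag, install prefix, and benchmark paths below are illustrative assumptions for MPICH2 of that era - check the MPICH2 installation guide and the Amber configure notes for your exact versions):

```shell
# Hypothetical sketch: build MPICH2 with its shared-memory channel for a
# single dual-CPU box. (--with-device=ch3:shm and the paths are assumptions;
# consult the MPICH2 manual for your release.)
./configure --prefix=/opt/mpich2 --with-device=ch3:shm
make && make install

# Put this MPI first on the path, then rebuild sander.MPI / pmemd against it.
export PATH=/opt/mpich2/bin:$PATH

# Run the standard Amber benchmarks to compare with published numbers
# (directory layout assumed; look under $AMBERHOME for jac and factor_ix).
cd $AMBERHOME/benchmarks/jac
mpiexec -n 2 $AMBERHOME/exe/sander.MPI -O -i mdin -o mdout.2cpu -p prmtop -c inpcrd
```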
- Bob Duke
  ----- Original Message -----
  From: Cenk Andac
  Sent: Thursday, December 14, 2006 6:17 AM
  Subject: AMBER: parallel computation time vs serial computation time

  Dear Amber community,

  I installed amber9 some time ago on a dual-core Pentium D 3.4 GHz PC (that is, one node with two CPU cores) operating under SUSE Linux v10.0. To the best of my knowledge, the serial and parallel versions of sander passed all tests... I installed sander.MPI using MPICH2 and ifort 9.0. As far as I know, I set up the library and binary paths correctly for mpich2.
  In addition to the test cases provided with amber9, I conducted two minimization experiments on a small ligand, one with serial sander and the other with sander.mpi (mpiexec -n 2 sander.mpi -O -i in -o out -p prmtop -c incrd -r out.incrd). I do not know if it is useful info here, but I activated mpich2 by typing mpd on another console.
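One quick way to sanity-check an MPICH2/mpd setup before involving sander at all (mpd, mpdtrace, and mpiexec are standard MPICH2 commands; the expected behavior noted in the comments is what a working two-process launch should show):

```shell
# Start the mpd daemon if it is not already running, and verify it is up.
mpd &
mpdtrace          # should print the local hostname

# Launch a trivial two-process job; a working setup prints the hostname twice.
# If only one line appears, mpiexec is not actually starting two processes.
mpiexec -n 2 hostname
```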

  Now, looking at the sander outputs for the parallel and serial runs, I see no difference in the time to complete these experiments: both took about ~50 secs to reach 500 steps of minimization.
  I am just wondering if this is the way it should be on a single node regardless of the number of CPUs. Can sander.serial really use both CPUs on one node, or is there something else I am missing here? Or is it just that only one CPU is being used in both runs? In that case, I would say there is something wrong with my mpich2 settings, and I cannot figure out what it is at the moment.
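One way to tell from the output itself whether sander actually ran on two MPI tasks: the parallel sander mdout reports the number of processes near the top, so you can grep for that banner. The exact wording below is from memory and may differ slightly between versions; the sample file here is fabricated for illustration - grep your real mdout instead:

```shell
# Fabricated mdout excerpt for illustration only.
cat > mdout.sample <<'EOF'
|  Running AMBER/MPI version on    2 nodes
EOF

# A serial run (or an MPI run that silently fell back to one process)
# will not show "2 nodes" in this banner.
grep "Running AMBER/MPI version" mdout.sample
```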
  In another test, I executed sander.mpi without mpiexec and got exactly the same timing result (~50 secs).
  Can anyone please shed some light on the timing results I got for the parallel and serial sander runs? I am a little bit confused here...
  best regards,


  Cenk Andac, M.S., Ph.D. Student

  School of Pharmacy at
  Gazi University-Ankara Turkiye

  Address: Bandirma Sok. No:6

  Etiler, Ankara, 06330 Turkey

  Cell: +90-(536)-4813012

The AMBER Mail Reflector
Received on Thu Dec 14 2006 - 16:48:22 PST