Re: [AMBER] very slow work of ptraj on cluster

From: Andrew Voronkov <drugdesign.yandex.ru>
Date: Sat, 28 Aug 2010 20:36:31 +0400

Hm, now the task seems to finish, but the output file is not written. Here is the output:

  \-/
  -/- PTRAJ: a utility for processing trajectory files
  /-\
  \-/ Version: "AMBER 10.0 integrated" (2/15/2008)
  -/- Executable is: "ptraj"
  /-\
  \-/ Residue labels:

 ALA PRO VAL CYX GLN GLU ILE THR VAL PRO
 MET CYX ARG GLY ILE GLY TYR ASN LEU THR
 HIE MET PRO ASN GLN PHE ASN HIE ASP THR
 GLN ASP GLU ALA GLY LEU GLU VAL HIE GLN
 PHE TRP PRO LEU VAL GLU ILE GLN CYX SER
 PRO ASP LEU ARG PHE PHE LEU CYX SER MET
 TYR THR PRO ILE CYX LEU PRO ASP TYR HIE
 LYS PRO LEU PRO PRO CYX ARG SER VAL CYX
 GLU ARG ALA LYS ALA GLY CYX SER PRO LEU
 MET ARG GLN TYR GLY PHE ALA TRP PRO GLU
 ARG MET SER CYX ASP ARG LEU PRO VAL LEU
 GLY ARG ASP ALA GLU VAL LEU CYX MET ASP
 TYR ASN ARG ALA PRO VAL CYX GLN GLU ILE
 THR VAL PRO MET CYX ARG GLY ILE GLY TYR
 ASN LEU THR HIE MET PRO ASN GLN PHE ASN
 HIE ASP THR GLN ASP GLU ALA GLY LEU GLU
 VAL HIE GLN PHE TRP PRO LEU VAL GLU ILE
 GLN CYX SER PRO ASP LEU ARG PHE PHE LEU
 CYX SER MET TYR THR PRO ILE CYX LEU PRO
 ASP TYR HIE LYS PRO LEU PRO PRO CYX ARG
 SER VAL CYX GLU ARG ALA LYS ALA GLY CYX
 SER PRO LEU MET ARG GLN TYR GLY PHE ALA
 TRP PRO GLU ARG MET SER CYX ASP ARG LEU
 PRO VAL LEU GLY ARG ASP ALA GLU VAL LEU
 CYX MET ASP TYR ASN ARG Na+ Na+ Na+ Na+
 Na+ Na+ WAT WAT WAT WAT WAT WAT WAT WAT
 WAT WAT WAT WAT WAT WAT WAT WAT WAT WAT
 ...
 WAT

 Setting box to be an exact truncated octahedron, angle is 109.471221

PTRAJ: Processing input from file fzd5_w_ca.calc_rms

PTRAJ: trajin fzd5-7_w_prod1.mdcrd
  Checking coordinates: fzd5-7_w_prod1.mdcrd

PTRAJ: trajin fzd5-7_w_prod2.mdcrd
  Checking coordinates: fzd5-7_w_prod2.mdcrd

PTRAJ: trajin fzd5-7_w_prod3.mdcrd
  Checking coordinates: fzd5-7_w_prod3.mdcrd

PTRAJ: trajin fzd5-7_w_prod4.mdcrd
  Checking coordinates: fzd5-7_w_prod4.mdcrd

PTRAJ: rms first mass out /home/voronkov/RMS/fzd5-7_w_prod1.rms @CA time 10
Mask [.CA] represents 246 atoms
[No output trajectory specified (trajout)]

PTRAJ: Successfully read the input file.
       Coordinate processing will occur on 1800 frames.
       Summary of I/O and actions follows:

INPUT COORDINATE FILES
  File (fzd5-7_w_prod1.mdcrd) is an AMBER trajectory (with box info) with 481 sets
  File (fzd5-7_w_prod2.mdcrd) is an AMBER trajectory (with box info) with 464 sets
  File (fzd5-7_w_prod3.mdcrd) is an AMBER trajectory (with box info) with 471 sets
  File (fzd5-7_w_prod4.mdcrd) is an AMBER trajectory (with box info) with 384 sets

NO OUTPUT COORDINATE FILE WAS SPECIFIED

ACTIONS
  1> RMS to first frame using mass weighting
      Dumping RMSd vs. time (with time interval 10.00) to a file named /home/voronkov/RMS/fzd5-7_w_prod1.rms
      Atom selection follows :1-246.CA


Processing AMBER trajectory file fzd5-7_w_prod1.mdcrd

Set 1 .................................................
Set 50 .................................................
Set 100 .................................................
Set 150 .................................................
Set 200 .................................................
Set 250 .................................................
Set 300 .................................................
Set 350 .................................................
Set 400 .................................................
Set 450 ...............................
Processing AMBER trajectory file fzd5-7_w_prod2.mdcrd

Set 1 .................................................
Set 50 .................................................
Set 100 .................................................
Set 150 .................................................
Set 200 .................................................
Set 250 .................................................
Set 300 .................................................
Set 350 .................................................
Set 400 .................................................
Set 450 ..............
Processing AMBER trajectory file fzd5-7_w_prod3.mdcrd

Set 1 .................................................
Set 50 .................................................
Set 100 .................................................
Set 150 .................................................
Set 200 .................................................
Set 250 .................................................
Set 300 .................................................
Set 350 .................................................
Set 400 .................................................
Set 450 .....................
Processing AMBER trajectory file fzd5-7_w_prod4.mdcrd

Set 1 .................................................
Set 50 .................................................
Set 100 .................................................
Set 150 .................................................
Set 200 .................................................
Set 250 .................................................
Set 300 .................................................
Set 350 ..................................

PTRAJ: Successfully read in 1800 sets and processed 1800 sets.
       Dumping accumulated results (if any)

PTRAJ RMS: dumping RMSd vs time data
Connection to node-56-07 closed.
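
One thing that could be checked on the compute node, as a minimal sketch rather than a diagnosis: whether the directory named after `out` in the ptraj input exists there and is writable. This assumes the missing RMS file is related to the output path, which the log above does not confirm. The helper names `ptraj_output_paths` and `check_writable` are made up for illustration and are not part of ptraj.

```python
import os


def ptraj_output_paths(script_text):
    """Return the file names that follow an `out` keyword in a ptraj input script."""
    paths = []
    for line in script_text.splitlines():
        tokens = line.split()
        # Look at every token except the last, so tokens[i + 1] always exists.
        for i, token in enumerate(tokens[:-1]):
            if token == "out":
                paths.append(tokens[i + 1])
    return paths


def check_writable(path):
    """Describe whether the directory that should hold `path` exists and is writable."""
    directory = os.path.dirname(os.path.abspath(path))
    if not os.path.isdir(directory):
        return "missing directory: " + directory
    if not os.access(directory, os.W_OK):
        return "directory not writable: " + directory
    return "ok"


# The rms line from fzd5_w_ca.calc_rms, checked on whatever machine runs this.
example = "rms first mass out /home/voronkov/RMS/fzd5-7_w_prod1.rms @CA time 10"
for out_path in ptraj_output_paths(example):
    print(out_path, "->", check_writable(out_path))
```

Running this through the queue in the same way as ptraj would show what the compute node sees, which may differ from the central node.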


28.08.10, 16:54, "Daniel Roe" <daniel.r.roe.gmail.com>:

> That certainly is strange behavior. One reason this might be happening
> is you are directing your input file in as standard input, which in
> the context of having to submit with another command (mpirun) could be
> causing strange behavior; the ptraj input file might be redirected to
> mpirun instead of ptraj. In particular this seems odd:
>
> > Connection to node-25-06 closed by remote host.
> > Connection to node-25-06 closed.
>
> Try running ptraj without the redirect operator:
>
> mpirun -as single -np 1 -maxtime 360 ptraj fzd5-7_w.prmtop fzd5_w_ca.calc_rms
>
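
The redirect issue described above can be illustrated with a small Python sketch. It does not model mpirun itself: `LAUNCHER` is a hypothetical stand-in for a launcher that does not pass its standard input on to the program it starts, and `READER` stands in for a program (such as ptraj) that reads its commands from standard input. Whether the cluster's mpirun actually behaves this way is an assumption.

```python
import subprocess
import sys

# Stand-in for a program that reads its input script from standard input.
READER = "import sys; print(len(sys.stdin.read()))"

# Hypothetical launcher: starts the reader but does NOT forward its own stdin.
LAUNCHER = (
    "import subprocess, sys; "
    "subprocess.run([sys.executable, '-c', sys.argv[1]], stdin=subprocess.DEVNULL)"
)


def characters_seen(text, via_launcher):
    """Return how many characters of `text` reach the reader's standard input."""
    command = [sys.executable, "-c", READER]
    if via_launcher:
        # The text is now attached to the launcher's stdin, not the reader's.
        command = [sys.executable, "-c", LAUNCHER, READER]
    result = subprocess.run(command, input=text, capture_output=True, text=True)
    return int(result.stdout.strip())


script = "trajin fzd5-7_w_prod1.mdcrd\n"
print(characters_seen(script, via_launcher=False))
print(characters_seen(script, via_launcher=True))
```

Run directly, the reader sees the whole script; run through the stand-in launcher, it sees nothing. In this sketch the reader gets an immediate end-of-file, whereas a real program could instead wait for input until the time limit. Passing the script as a file name argument avoids depending on standard input at all.
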
> Also, I would recommend you upgrade to AmberTools 1.4
> (http://ambermd.org/#AmberTools). There are significant performance
> improvements when compared to amber 10 ptraj. If you do decide to run
> AT1.4, be sure to apply all bugfixes after downloading
> (http://ambermd.org/bugfixesat.html).
>
> Let me know if you have any more issues - good luck!
>
> -Dan
>
> On Sat, Aug 28, 2010 at 7:33 AM, Andrew Voronkov wrote:
> > This is the Amber 10 package. I am not sure about the compilers, but this is a single-processor run, not a parallel one; I use mpirun only to submit the job to the cluster queue, following the cluster's instructions for running in single-processor mode. The system has about 250 amino acids and four trajectories of 5 ns each. For now I am only calculating the CA RMSD. Without mpirun, on the central node, the task completes in just a few minutes, but there are quite a lot of trajectories, the administration does not tolerate running software on the central node, and I also want to put the analysis of all trajectories into a shell script.
> > Here is the calc_rms file:
> > trajin fzd5-7_w_prod1.mdcrd
> > trajin fzd5-7_w_prod2.mdcrd
> > trajin fzd5-7_w_prod3.mdcrd
> > trajin fzd5-7_w_prod4.mdcrd
> > rms first mass out /home/voronkov/RMS/fzd5-7_w_prod1.rms @CA time 10
> >
> > Here is the cluster log file ptraj.out-114713:
> >
> >   \-/
> >   -/- PTRAJ: a utility for processing trajectory files
> >   /-\
> >   \-/ Version: "AMBER 10.0 integrated" (2/15/2008)
> >   -/- Executable is: "ptraj"
> >   /-\
> >   \-/ Residue labels:
> >
> >  ALA PRO VAL CYX GLN GLU ILE THR VAL PRO
> >  MET CYX ARG GLY ILE GLY TYR ASN LEU THR
> >  HIE MET PRO ASN GLN PHE ASN HIE ASP THR
> >  GLN ASP GLU ALA GLY LEU GLU VAL HIE GLN
> >  PHE TRP PRO LEU VAL GLU ILE GLN CYX SER
> >  PRO ASP LEU ARG PHE PHE LEU CYX SER MET
> >  TYR THR PRO ILE CYX LEU PRO ASP TYR HIE
> >  LYS PRO LEU PRO PRO CYX ARG SER VAL CYX
> >  GLU ARG ALA LYS ALA GLY CYX SER PRO LEU
> >  MET ARG GLN TYR GLY PHE ALA TRP PRO GLU
> >  ARG MET SER CYX ASP ARG LEU PRO VAL LEU
> >  GLY ARG ASP ALA GLU VAL LEU CYX MET ASP
> >  TYR ASN ARG ALA PRO VAL CYX GLN GLU ILE
> >  THR VAL PRO MET CYX ARG GLY ILE GLY TYR
> >  ASN LEU THR HIE MET PRO ASN GLN PHE ASN
> >  HIE ASP THR GLN ASP GLU ALA GLY LEU GLU
> >  VAL HIE GLN PHE TRP PRO LEU VAL GLU ILE
> >  GLN CYX SER PRO ASP LEU ARG PHE PHE LEU
> >  CYX SER MET TYR THR PRO ILE CYX LEU PRO
> >  ASP TYR HIE LYS PRO LEU PRO PRO CYX ARG
> >  SER VAL CYX GLU ARG ALA LYS ALA GLY CYX
> >  SER PRO LEU MET ARG GLN TYR GLY PHE ALA
> >  TRP PRO GLU ARG MET SER CYX ASP ARG LEU
> >  PRO VAL LEU GLY ARG ASP ALA GLU VAL LEU
> >  CYX MET ASP TYR ASN ARG Na+ Na+ Na+ Na+
> >  Na+ Na+ WAT WAT WAT WAT WAT WAT WAT WAT
> >  WAT WAT WAT WAT WAT WAT WAT WAT WAT WAT
> >  ...
> >  WAT
> >
> >  Setting box to be an exact truncated octahedron, angle is 109.471221
> >
> > PTRAJ: Processing input from "STDIN" ...
> > Connection to node-25-06 closed by remote host.
> > Connection to node-25-06 closed.
> >
> >
> > ptraj.rep-114713
> >
> > Task     : ptraj
> > Args     : ptraj
> > Nproc    : 1
> > Exit code: unknown
> > Note     : Time limit exceeded
> > Output in: /home/voronkov/fzd5-ions-tip3p-1ijy/ptraj.out-114713
> > Work dir : /home/voronkov/fzd5-ions-tip3p-1ijy
> > Work time: 6 hours 0 minutes 6 seconds
> > Started  : Fri Aug 27 19:55:33 2010
> > Nodes    : node-25-06:1
> >
> >
> >   \-/
> >   -/- PTRAJ: a utility for processing trajectory files
> >   /-\
> >   \-/ Version: "AMBER 10.0 integrated" (2/15/2008)
> >   -/- Executable is: "ptraj"
> >   /-\
> >   \-/ Residue labels:
> >
> >  ALA PRO VAL CYX GLN GLU ILE THR VAL PRO
> >  MET CYX ARG GLY ILE GLY TYR ASN LEU THR
> >  HIE MET PRO ASN GLN PHE ASN HIE ASP THR
> >  GLN ASP GLU ALA GLY LEU GLU VAL HIE GLN
> >  PHE TRP PRO LEU VAL GLU ILE GLN CYX SER
> >  PRO ASP LEU ARG PHE PHE LEU CYX SER MET
> >  TYR THR PRO ILE CYX LEU PRO ASP TYR HIE
> >  LYS PRO LEU PRO PRO CYX ARG SER VAL CYX
> >  GLU ARG ALA LYS ALA GLY CYX SER PRO LEU
> >  MET ARG GLN TYR GLY PHE ALA TRP PRO GLU
> >  ARG MET SER CYX ASP ARG LEU PRO VAL LEU
> >  GLY ARG ASP ALA GLU VAL LEU CYX MET ASP
> >  TYR ASN ARG ALA PRO VAL CYX GLN GLU ILE
> >  THR VAL PRO MET CYX ARG GLY ILE GLY TYR
> >  ASN LEU THR HIE MET PRO ASN GLN PHE ASN
> >  HIE ASP THR GLN ASP GLU ALA GLY LEU GLU
> >  VAL HIE GLN PHE TRP PRO LEU VAL GLU ILE
> >  GLN CYX SER PRO ASP LEU ARG PHE PHE LEU
> >  CYX SER MET TYR THR PRO ILE CYX LEU PRO
> >  ASP TYR HIE LYS PRO LEU PRO PRO CYX ARG
> >  SER VAL CYX GLU ARG ALA LYS ALA GLY CYX
> >  SER PRO LEU MET ARG GLN TYR GLY PHE ALA
> >  TRP PRO GLU ARG MET SER CYX ASP ARG LEU
> >  PRO VAL LEU GLY ARG ASP ALA GLU VAL LEU
> >  CYX MET ASP TYR ASN ARG Na+ Na+ Na+ Na+
> >  Na+ Na+ WAT WAT WAT WAT WAT WAT WAT WAT
> >  WAT WAT WAT WAT WAT WAT WAT WAT WAT WAT
> >  ...
> >  WAT
> >
> >  Setting box to be an exact truncated octahedron, angle is 109.471221
> >
> > PTRAJ: Processing input from "STDIN" ...
> > Connection to node-12-10 closed by remote host.
> > Connection to node-12-10 closed.
> >
> >
> > 28.08.10, 00:08, "Daniel Roe" :
> >
> >> It's difficult to comment in too much detail based on the information
> >> you have given. What version of ptraj are you running, and what
> >> compilers did you use to build it? How big is the system you are
> >> running ptraj on, and how long is the trajectory you are processing?
> >> What sort of actions are you performing (I assume at least an RMSD
> >> calculation from the name of your input file). What output does ptraj
> >> give, if any? Also, why are you using 'mpirun' for a single processor
> >> job? If you don't use 'mpirun' does it complete faster?
> >>
> >> -Dan
> >>
> >> On Fri, Aug 27, 2010 at 12:33 PM, Andrew Voronkov wrote:
> >> > It's a bit strange, but when running on the cluster in single-processor mode, ptraj runs for hours:
> >> > mpirun -as single -np 1 -maxtime 360 ptraj fzd5-7_w.prmtop < fzd5_w_ca.calc_rms
> >> > while on the central node it takes just several minutes.
> >> > I am not sure whether this is a question for the Amber mailing list or for the cluster admins, but I don't understand what causes such a difference.
> >> >
> >> > Sincerely yours,
> >> > Andrew
> >> >
> >> > _______________________________________________
> >> > AMBER mailing list
> >> > AMBER.ambermd.org
> >> > http://lists.ambermd.org/mailman/listinfo/amber
> >> >
> >>
> >>
> >>
> >
> >
>
>
>

Received on Sat Aug 28 2010 - 10:00:03 PDT