Re: [AMBER] very slow work of ptraj on cluster

From: Andrew Voronkov <drugdesign.yandex.ru>
Date: Mon, 30 Aug 2010 13:10:24 +0400

Yes, that is what confuses me: the log shows that ptraj completed the job without errors, but no .rms file was written.
I have e-mailed the administrator, but the funny thing is that when I put the ptraj command into a shell script and submitted it with
mpirun -as single -np 1 -maxtime 360 ptraj.sh
the job ran correctly and I got the .rms files.
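
For anyone hitting the same problem, here is a minimal sketch of what such a ptraj.sh wrapper might look like. The actual script was not posted in this thread, so the argument order, the function names (can_write_dir, run_ptraj) and the writability check are assumptions for illustration only. The idea is that the ptraj command line is interpreted by the shell on the compute node, and that the job fails loudly if the output directory is not writable from that node.

```shell
#!/bin/sh
# ptraj.sh: hypothetical wrapper; the real script from this thread was not posted.
# Usage (assumed): ptraj.sh <prmtop> <ptraj input> <output dir>

# Succeed only if the directory exists and is writable from the current node.
can_write_dir() {
    [ -d "$1" ] && [ -w "$1" ]
}

# Check the output directory first, then run ptraj with the input file
# passed as an argument rather than redirected through stdin.
run_ptraj() {
    prmtop=$1
    input=$2
    outdir=$3
    if ! can_write_dir "$outdir"; then
        echo "error: $outdir is not writable on $(uname -n)" >&2
        return 1
    fi
    ptraj "$prmtop" "$input"
}

# Run only when all three arguments are given, so the file can also be
# sourced to reuse the functions.
if [ "$#" -eq 3 ]; then
    run_ptraj "$@"
fi
```

It could then be submitted much as before, for example
mpirun -as single -np 1 -maxtime 360 ptraj.sh fzd5-7_w.prmtop fzd5_w_ca.calc_rms /home/voronkov/RMS
assuming the cluster's mpirun passes the extra arguments through to the script.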

Best regards,
Andrew

29.08.10, 02:47, "Daniel Roe" <daniel.r.roe.gmail.com>:

> 2010/8/28 Andrew Voronkov :
> > Hm, now it seems the task has finished, but the output file was not written. Here is the output:
>
> Just to be specific, you mean that the file
> '/home/voronkov/RMS/fzd5-7_w_prod1.rms' is not written? From the
> output you posted it looks like ptraj completed without any errors.
> Since you had mentioned that ptraj completes on the head node fine,
> this makes me suspect there is some issue with the cluster. Are you
> certain that the cluster nodes have access to the '/home/voronkov'
> directory? Since I'm not familiar with how your cluster is set up you
> may want to check with the cluster's system administrator.
>
> -Dan
>
>
>
> >
> >  \-/
> >  -/-   PTRAJ: a utility for processing trajectory files
> >  /-\
> >  \-/   Version: "AMBER 10.0 integrated" (2/15/2008)
> >  -/-   Executable is: "ptraj"
> >  /-\
> >  \-/   Residue labels:
> >
> >  ALA  PRO  VAL  CYX  GLN  GLU  ILE  THR  VAL  PRO
> >  MET  CYX  ARG  GLY  ILE  GLY  TYR  ASN  LEU  THR
> >  HIE  MET  PRO  ASN  GLN  PHE  ASN  HIE  ASP  THR
> >  GLN  ASP  GLU  ALA  GLY  LEU  GLU  VAL  HIE  GLN
> >  PHE  TRP  PRO  LEU  VAL  GLU  ILE  GLN  CYX  SER
> >  PRO  ASP  LEU  ARG  PHE  PHE  LEU  CYX  SER  MET
> >  TYR  THR  PRO  ILE  CYX  LEU  PRO  ASP  TYR  HIE
> >  LYS  PRO  LEU  PRO  PRO  CYX  ARG  SER  VAL  CYX
> >  GLU  ARG  ALA  LYS  ALA  GLY  CYX  SER  PRO  LEU
> >  MET  ARG  GLN  TYR  GLY  PHE  ALA  TRP  PRO  GLU
> >  ARG  MET  SER  CYX  ASP  ARG  LEU  PRO  VAL  LEU
> >  GLY  ARG  ASP  ALA  GLU  VAL  LEU  CYX  MET  ASP
> >  TYR  ASN  ARG  ALA  PRO  VAL  CYX  GLN  GLU  ILE
> >  THR  VAL  PRO  MET  CYX  ARG  GLY  ILE  GLY  TYR
> >  ASN  LEU  THR  HIE  MET  PRO  ASN  GLN  PHE  ASN
> >  HIE  ASP  THR  GLN  ASP  GLU  ALA  GLY  LEU  GLU
> >  VAL  HIE  GLN  PHE  TRP  PRO  LEU  VAL  GLU  ILE
> >  GLN  CYX  SER  PRO  ASP  LEU  ARG  PHE  PHE  LEU
> >  CYX  SER  MET  TYR  THR  PRO  ILE  CYX  LEU  PRO
> >  ASP  TYR  HIE  LYS  PRO  LEU  PRO  PRO  CYX  ARG
> >  SER  VAL  CYX  GLU  ARG  ALA  LYS  ALA  GLY  CYX
> >  SER  PRO  LEU  MET  ARG  GLN  TYR  GLY  PHE  ALA
> >  TRP  PRO  GLU  ARG  MET  SER  CYX  ASP  ARG  LEU
> >  PRO  VAL  LEU  GLY  ARG  ASP  ALA  GLU  VAL  LEU
> >  CYX  MET  ASP  TYR  ASN  ARG  Na+  Na+  Na+  Na+
> >  Na+  Na+  WAT  WAT  WAT  WAT  WAT  WAT  WAT  WAT
> >  WAT  WAT  WAT  WAT  WAT  WAT  WAT  WAT  WAT  WAT
> >  ...
> >  WAT
> >
> >  Setting box to be an exact truncated octahedron, angle is 109.471221
> >
> > PTRAJ: Processing input from file fzd5_w_ca.calc_rms
> >
> > PTRAJ: trajin fzd5-7_w_prod1.mdcrd
> >  Checking coordinates: fzd5-7_w_prod1.mdcrd
> >
> > PTRAJ: trajin fzd5-7_w_prod2.mdcrd
> >  Checking coordinates: fzd5-7_w_prod2.mdcrd
> >
> > PTRAJ: trajin fzd5-7_w_prod3.mdcrd
> >  Checking coordinates: fzd5-7_w_prod3.mdcrd
> >
> > PTRAJ: trajin fzd5-7_w_prod4.mdcrd
> >  Checking coordinates: fzd5-7_w_prod4.mdcrd
> >
> > PTRAJ: rms first mass out /home/voronkov/RMS/fzd5-7_w_prod1.rms @CA time 10
> > Mask [@CA] represents 246 atoms
> > [No output trajectory specified (trajout)]
> >
> > PTRAJ: Successfully read the input file.
> >        Coordinate processing will occur on 1800 frames.
> >        Summary of I/O and actions follows:
> >
> > INPUT COORDINATE FILES
> >  File (fzd5-7_w_prod1.mdcrd) is an AMBER trajectory (with box info) with 481 sets
> >  File (fzd5-7_w_prod2.mdcrd) is an AMBER trajectory (with box info) with 464 sets
> >  File (fzd5-7_w_prod3.mdcrd) is an AMBER trajectory (with box info) with 471 sets
> >  File (fzd5-7_w_prod4.mdcrd) is an AMBER trajectory (with box info) with 384 sets
> >
> > NO OUTPUT COORDINATE FILE WAS SPECIFIED
> >
> > ACTIONS
> >  1>  RMS to first frame using mass weighting
> >      Dumping RMSd vs. time (with time interval 10.00) to a file named /home/voronkov/RMS/fzd5-7_w_prod1.rms
> >      Atom selection follows :1-246@CA
> >
> >
> > Processing AMBER trajectory file fzd5-7_w_prod1.mdcrd
> >
> > Set      1 .................................................
> > Set     50 .................................................
> > Set    100 .................................................
> > Set    150 .................................................
> > Set    200 .................................................
> > Set    250 .................................................
> > Set    300 .................................................
> > Set    350 .................................................
> > Set    400 .................................................
> > Set    450 ...............................
> > Processing AMBER trajectory file fzd5-7_w_prod2.mdcrd
> >
> > Set      1 .................................................
> > Set     50 .................................................
> > Set    100 .................................................
> > Set    150 .................................................
> > Set    200 .................................................
> > Set    250 .................................................
> > Set    300 .................................................
> > Set    350 .................................................
> > Set    400 .................................................
> > Set    450 ..............
> > Processing AMBER trajectory file fzd5-7_w_prod3.mdcrd
> >
> > Set      1 .................................................
> > Set     50 .................................................
> > Set    100 .................................................
> > Set    150 .................................................
> > Set    200 .................................................
> > Set    250 .................................................
> > Set    300 .................................................
> > Set    350 .................................................
> > Set    400 .................................................
> > Set    450 .....................
> > Processing AMBER trajectory file fzd5-7_w_prod4.mdcrd
> >
> > Set      1 .................................................
> > Set     50 .................................................
> > Set    100 .................................................
> > Set    150 .................................................
> > Set    200 .................................................
> > Set    250 .................................................
> > Set    300 .................................................
> > Set    350 ..................................
> >
> > PTRAJ: Successfully read in 1800 sets and processed 1800 sets.
> >        Dumping accumulated results (if any)
> >
> > PTRAJ RMS: dumping RMSd vs time data
> > Connection to node-56-07 closed.
> >
> >
> > 28.08.10, 16:54, "Daniel Roe" :
> >
> >> That certainly is strange behavior. One reason this might be happening
> >> is you are directing your input file in as standard input, which in
> >> the context of having to submit with another command (mpirun) could be
> >> causing strange behavior; the ptraj input file might be redirected to
> >> mpirun instead of ptraj. In particular this seems odd:
> >>
> >> > Connection to node-25-06 closed by remote host.
> >> > Connection to node-25-06 closed.
> >>
> >> Try running ptraj without the redirect operator:
> >>
> >> mpirun -as single -np 1 -maxtime 360 ptraj fzd5-7_w.prmtop fzd5_w_ca.calc_rms
> >>
> >> Also, I would recommend you upgrade to AmberTools 1.4
> >> (http://ambermd.org/#AmberTools). There are significant performance
> >> improvements when compared to amber 10 ptraj. If you do decide to run
> >> AT1.4, be sure to apply all bugfixes after downloading
> >> (http://ambermd.org/bugfixesat.html).
> >>
> >> Let me know if you have any more issues - good luck!
> >>
> >> -Dan
> >>
> >> On Sat, Aug 28, 2010 at 7:33 AM, Andrew Voronkov wrote:
> >> > This is the Amber10 package. I am not sure about the compilers, but this is a single-processor run, not a parallel one; I use mpirun only to submit it to the cluster queue, following the cluster's instructions for running in single-processor mode. The system has about 250 amino acids and four trajectories of 5 ns each. For now I only calculate the RMSD of the CA atoms. Without mpirun, on the central node, the task completes in just a few minutes, but there are quite a lot of trajectories, running software on the central node is not tolerated by the administration, and I also want to put the whole trajectory analysis into a shell script.
> >> > Here is the calc_rms file:
> >> > trajin fzd5-7_w_prod1.mdcrd
> >> > trajin fzd5-7_w_prod2.mdcrd
> >> > trajin fzd5-7_w_prod3.mdcrd
> >> > trajin fzd5-7_w_prod4.mdcrd
> >> > rms first mass out /home/voronkov/RMS/fzd5-7_w_prod1.rms @CA time 10
> >> >
> >> > Here is the cluster log file ptraj.out-114713:
> >> >
> >> >  \-/
> >> >  -/-   PTRAJ: a utility for processing trajectory files
> >> >  /-\
> >> >  \-/   Version: "AMBER 10.0 integrated" (2/15/2008)
> >> >  -/-   Executable is: "ptraj"
> >> >  /-\
> >> >  \-/   Residue labels:
> >> >
> >> >  ALA  PRO  VAL  CYX  GLN  GLU  ILE  THR  VAL  PRO
> >> >  MET  CYX  ARG  GLY  ILE  GLY  TYR  ASN  LEU  THR
> >> >  HIE  MET  PRO  ASN  GLN  PHE  ASN  HIE  ASP  THR
> >> >  GLN  ASP  GLU  ALA  GLY  LEU  GLU  VAL  HIE  GLN
> >> >  PHE  TRP  PRO  LEU  VAL  GLU  ILE  GLN  CYX  SER
> >> >  PRO  ASP  LEU  ARG  PHE  PHE  LEU  CYX  SER  MET
> >> >  TYR  THR  PRO  ILE  CYX  LEU  PRO  ASP  TYR  HIE
> >> >  LYS  PRO  LEU  PRO  PRO  CYX  ARG  SER  VAL  CYX
> >> >  GLU  ARG  ALA  LYS  ALA  GLY  CYX  SER  PRO  LEU
> >> >  MET  ARG  GLN  TYR  GLY  PHE  ALA  TRP  PRO  GLU
> >> >  ARG  MET  SER  CYX  ASP  ARG  LEU  PRO  VAL  LEU
> >> >  GLY  ARG  ASP  ALA  GLU  VAL  LEU  CYX  MET  ASP
> >> >  TYR  ASN  ARG  ALA  PRO  VAL  CYX  GLN  GLU  ILE
> >> >  THR  VAL  PRO  MET  CYX  ARG  GLY  ILE  GLY  TYR
> >> >  ASN  LEU  THR  HIE  MET  PRO  ASN  GLN  PHE  ASN
> >> >  HIE  ASP  THR  GLN  ASP  GLU  ALA  GLY  LEU  GLU
> >> >  VAL  HIE  GLN  PHE  TRP  PRO  LEU  VAL  GLU  ILE
> >> >  GLN  CYX  SER  PRO  ASP  LEU  ARG  PHE  PHE  LEU
> >> >  CYX  SER  MET  TYR  THR  PRO  ILE  CYX  LEU  PRO
> >> >  ASP  TYR  HIE  LYS  PRO  LEU  PRO  PRO  CYX  ARG
> >> >  SER  VAL  CYX  GLU  ARG  ALA  LYS  ALA  GLY  CYX
> >> >  SER  PRO  LEU  MET  ARG  GLN  TYR  GLY  PHE  ALA
> >> >  TRP  PRO  GLU  ARG  MET  SER  CYX  ASP  ARG  LEU
> >> >  PRO  VAL  LEU  GLY  ARG  ASP  ALA  GLU  VAL  LEU
> >> >  CYX  MET  ASP  TYR  ASN  ARG  Na+  Na+  Na+  Na+
> >> >  Na+  Na+  WAT  WAT  WAT  WAT  WAT  WAT  WAT  WAT
> >> >  WAT  WAT  WAT  WAT  WAT  WAT  WAT  WAT  WAT  WAT
> >> >  ...
> >> >  WAT
> >> >
> >> >  Setting box to be an exact truncated octahedron, angle is 109.471221
> >> >
> >> > PTRAJ: Processing input from "STDIN" ...
> >> > Connection to node-25-06 closed by remote host.
> >> > Connection to node-25-06 closed.
> >> >
> >> >
> >> > ptraj.rep-114713
> >> >
> >> > Task     : ptraj
> >> > Args     : ptraj
> >> > Nproc    : 1
> >> > Exit code: unknown
> >> > Note     : Time limit exceeded
> >> > Output in: /home/voronkov/fzd5-ions-tip3p-1ijy/ptraj.out-114713
> >> > Work dir : /home/voronkov/fzd5-ions-tip3p-1ijy
> >> > Work time: 6 hours 0 minutes 6 seconds
> >> > Started  : Fri Aug 27 19:55:33 2010
> >> > Nodes    : node-25-06:1
> >> >
> >> >
> >> >  \-/
> >> >  -/-   PTRAJ: a utility for processing trajectory files
> >> >  /-\
> >> >  \-/   Version: "AMBER 10.0 integrated" (2/15/2008)
> >> >  -/-   Executable is: "ptraj"
> >> >  /-\
> >> >  \-/   Residue labels:
> >> >
> >> >  ALA  PRO  VAL  CYX  GLN  GLU  ILE  THR  VAL  PRO
> >> >  MET  CYX  ARG  GLY  ILE  GLY  TYR  ASN  LEU  THR
> >> >  HIE  MET  PRO  ASN  GLN  PHE  ASN  HIE  ASP  THR
> >> >  GLN  ASP  GLU  ALA  GLY  LEU  GLU  VAL  HIE  GLN
> >> >  PHE  TRP  PRO  LEU  VAL  GLU  ILE  GLN  CYX  SER
> >> >  PRO  ASP  LEU  ARG  PHE  PHE  LEU  CYX  SER  MET
> >> >  TYR  THR  PRO  ILE  CYX  LEU  PRO  ASP  TYR  HIE
> >> >  LYS  PRO  LEU  PRO  PRO  CYX  ARG  SER  VAL  CYX
> >> >  GLU  ARG  ALA  LYS  ALA  GLY  CYX  SER  PRO  LEU
> >> >  MET  ARG  GLN  TYR  GLY  PHE  ALA  TRP  PRO  GLU
> >> >  ARG  MET  SER  CYX  ASP  ARG  LEU  PRO  VAL  LEU
> >> >  GLY  ARG  ASP  ALA  GLU  VAL  LEU  CYX  MET  ASP
> >> >  TYR  ASN  ARG  ALA  PRO  VAL  CYX  GLN  GLU  ILE
> >> >  THR  VAL  PRO  MET  CYX  ARG  GLY  ILE  GLY  TYR
> >> >  ASN  LEU  THR  HIE  MET  PRO  ASN  GLN  PHE  ASN
> >> >  HIE  ASP  THR  GLN  ASP  GLU  ALA  GLY  LEU  GLU
> >> >  VAL  HIE  GLN  PHE  TRP  PRO  LEU  VAL  GLU  ILE
> >> >  GLN  CYX  SER  PRO  ASP  LEU  ARG  PHE  PHE  LEU
> >> >  CYX  SER  MET  TYR  THR  PRO  ILE  CYX  LEU  PRO
> >> >  ASP  TYR  HIE  LYS  PRO  LEU  PRO  PRO  CYX  ARG
> >> >  SER  VAL  CYX  GLU  ARG  ALA  LYS  ALA  GLY  CYX
> >> >  SER  PRO  LEU  MET  ARG  GLN  TYR  GLY  PHE  ALA
> >> >  TRP  PRO  GLU  ARG  MET  SER  CYX  ASP  ARG  LEU
> >> >  PRO  VAL  LEU  GLY  ARG  ASP  ALA  GLU  VAL  LEU
> >> >  CYX  MET  ASP  TYR  ASN  ARG  Na+  Na+  Na+  Na+
> >> >  Na+  Na+  WAT  WAT  WAT  WAT  WAT  WAT  WAT  WAT
> >> >  WAT  WAT  WAT  WAT  WAT  WAT  WAT  WAT  WAT  WAT
> >> >  ...
> >> >  WAT
> >> >
> >> >  Setting box to be an exact truncated octahedron, angle is 109.471221
> >> >
> >> > PTRAJ: Processing input from "STDIN" ...
> >> > Connection to node-12-10 closed by remote host.
> >> > Connection to node-12-10 closed.
> >> >
> >> >
> >> > 28.08.10, 00:08, "Daniel Roe" :
> >> >
> >> >> It's difficult to comment in too much detail based on the information
> >> >> you have given. What version of ptraj are you running, and what
> >> >> compilers did you use to build it? How big is the system you are
> >> >> running ptraj on, and how long is the trajectory you are processing?
> >> >> What sort of actions are you performing (I assume at least an RMSD
> >> >> calculation from the name of your input file)? What output does ptraj
> >> >> give, if any? Also, why are you using 'mpirun' for a single processor
> >> >> job? If you don't use 'mpirun' does it complete faster?
> >> >>
> >> >> -Dan
> >> >>
> >> >> On Fri, Aug 27, 2010 at 12:33 PM, Andrew Voronkov wrote:
> >> >> > It's a bit strange, but when run on the cluster in single-processor mode, ptraj runs for hours:
> >> >> > mpirun -as single -np 1 -maxtime 360 ptraj fzd5-7_w.prmtop < fzd5_w_ca.calc_rms
> >> >> > while on the central node it takes just several minutes.
> >> >> > I am not sure whether this is a question for the Amber mailing list or for the cluster admins, but I don't understand what causes such a difference.
> >> >> >
> >> >> > Sincerely yours,
> >> >> > Andrew
> >> >> >
> >> >> > _______________________________________________
> >> >> > AMBER mailing list
> >> >> > AMBER.ambermd.org
> >> >> > http://lists.ambermd.org/mailman/listinfo/amber

_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Mon Aug 30 2010 - 02:30:04 PDT