Re: [AMBER] very slow work of ptraj on cluster

From: Daniel Roe <daniel.r.roe.gmail.com>
Date: Sat, 28 Aug 2010 08:54:54 -0400

That certainly is strange behavior. One possible cause is that you are
feeding the input file in on standard input. Because the job has to be
launched through another command (mpirun), the redirect may be applied
to mpirun instead of ptraj, in which case ptraj would wait on STDIN for
input that never arrives. In particular, this looks odd:

> Connection to node-25-06 closed by remote host.
> Connection to node-25-06 closed.

Try running ptraj without the redirect operator, passing the input
file as the second argument instead:

mpirun -as single -np 1 -maxtime 360 ptraj fzd5-7_w.prmtop fzd5_w_ca.calc_rms
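
As a rough illustration of why the redirect can go missing, here is a
small shell sketch. The "launcher" function is a hypothetical stand-in
for a job launcher that does not pass its own standard input on to the
program it starts; whether your cluster's mpirun behaves this way is an
assumption on my part, and "cat" simply stands in for ptraj.

```shell
#!/bin/sh
# Hypothetical stand-in for a launcher that runs a command but does NOT
# forward its own standard input to it (an assumption for illustration).
launcher() {
    "$@" < /dev/null
}

# Throwaway input file; its content is only a placeholder.
input_file=$(mktemp)
printf 'trajin demo.mdcrd\n' > "$input_file"

# The shell attaches "<" to the launcher, so the program sees no input.
redirected=$(launcher cat < "$input_file")

# A file name passed as an argument reaches the program unchanged.
as_argument=$(launcher cat "$input_file")

echo "via redirect: '${redirected}'"
echo "via argument: '${as_argument}'"

rm -f "$input_file"
```

With a launcher like this, the redirected form gets nothing to read,
while the form that names the file as an argument still works.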

Also, I would recommend that you upgrade to AmberTools 1.4
(http://ambermd.org/#AmberTools). Its ptraj is significantly faster
than the one in Amber 10. If you do decide to run AT1.4, be sure to
apply all bugfixes after downloading
(http://ambermd.org/bugfixesat.html).

Let me know if you have any more issues - good luck!

-Dan

On Sat, Aug 28, 2010 at 7:33 AM, Andrew Voronkov <drugdesign.yandex.ru> wrote:
> This is the Amber10 package. I am not sure about the compilers, but this is a single-processor run, not a parallel one; I use mpirun only to submit the job to the cluster queue, following the cluster's instructions for running in single-processor mode. The system has about 250 amino acids and four trajectories of 5 ns each. For now I am only doing the Ca RMS calculation. Without mpirun, the task completes in just a few minutes on the central node, but there are quite a lot of trajectories, the administrators do not tolerate running software on the central node, and I also want to put the analysis of all trajectories into a shell script.
> Here is the calc_rms file:
> trajin fzd5-7_w_prod1.mdcrd
> trajin fzd5-7_w_prod2.mdcrd
> trajin fzd5-7_w_prod3.mdcrd
> trajin fzd5-7_w_prod4.mdcrd
> rms first mass out /home/voronkov/RMS/fzd5-7_w_prod1.rms @CA time 10
>
> Here is the cluster log file ptraj.out-114713:
>
>  \-/
>  -/-   PTRAJ: a utility for processing trajectory files
>  /-\
>  \-/   Version: "AMBER 10.0 integrated" (2/15/2008)
>  -/-   Executable is: "ptraj"
>  /-\
>  \-/   Residue labels:
>
>  ALA  PRO  VAL  CYX  GLN  GLU  ILE  THR  VAL  PRO
>  MET  CYX  ARG  GLY  ILE  GLY  TYR  ASN  LEU  THR
>  HIE  MET  PRO  ASN  GLN  PHE  ASN  HIE  ASP  THR
>  GLN  ASP  GLU  ALA  GLY  LEU  GLU  VAL  HIE  GLN
>  PHE  TRP  PRO  LEU  VAL  GLU  ILE  GLN  CYX  SER
>  PRO  ASP  LEU  ARG  PHE  PHE  LEU  CYX  SER  MET
>  TYR  THR  PRO  ILE  CYX  LEU  PRO  ASP  TYR  HIE
>  LYS  PRO  LEU  PRO  PRO  CYX  ARG  SER  VAL  CYX
>  GLU  ARG  ALA  LYS  ALA  GLY  CYX  SER  PRO  LEU
>  MET  ARG  GLN  TYR  GLY  PHE  ALA  TRP  PRO  GLU
>  ARG  MET  SER  CYX  ASP  ARG  LEU  PRO  VAL  LEU
>  GLY  ARG  ASP  ALA  GLU  VAL  LEU  CYX  MET  ASP
>  TYR  ASN  ARG  ALA  PRO  VAL  CYX  GLN  GLU  ILE
>  THR  VAL  PRO  MET  CYX  ARG  GLY  ILE  GLY  TYR
>  ASN  LEU  THR  HIE  MET  PRO  ASN  GLN  PHE  ASN
>  HIE  ASP  THR  GLN  ASP  GLU  ALA  GLY  LEU  GLU
>  VAL  HIE  GLN  PHE  TRP  PRO  LEU  VAL  GLU  ILE
>  GLN  CYX  SER  PRO  ASP  LEU  ARG  PHE  PHE  LEU
>  CYX  SER  MET  TYR  THR  PRO  ILE  CYX  LEU  PRO
>  ASP  TYR  HIE  LYS  PRO  LEU  PRO  PRO  CYX  ARG
>  SER  VAL  CYX  GLU  ARG  ALA  LYS  ALA  GLY  CYX
>  SER  PRO  LEU  MET  ARG  GLN  TYR  GLY  PHE  ALA
>  TRP  PRO  GLU  ARG  MET  SER  CYX  ASP  ARG  LEU
>  PRO  VAL  LEU  GLY  ARG  ASP  ALA  GLU  VAL  LEU
>  CYX  MET  ASP  TYR  ASN  ARG  Na+  Na+  Na+  Na+
>  Na+  Na+  WAT  WAT  WAT  WAT  WAT  WAT  WAT  WAT
>  WAT  WAT  WAT  WAT  WAT  WAT  WAT  WAT  WAT  WAT
>  ...
>  WAT
>
>  Setting box to be an exact truncated octahedron, angle is 109.471221
>
> PTRAJ: Processing input from "STDIN" ...
> Connection to node-25-06 closed by remote host.
> Connection to node-25-06 closed.
>
>
> ptraj.rep-114713
>
> Task     : ptraj
> Args     : ptraj
> Nproc    : 1
> Exit code: unknown
> Note     : Time limit exceeded
> Output in: /home/voronkov/fzd5-ions-tip3p-1ijy/ptraj.out-114713
> Work dir : /home/voronkov/fzd5-ions-tip3p-1ijy
> Work time: 6 hours 0 minutes 6 seconds
> Started  : Fri Aug 27 19:55:33 2010
> Nodes    : node-25-06:1
>
>
>  \-/
>  -/-   PTRAJ: a utility for processing trajectory files
>  /-\
>  \-/   Version: "AMBER 10.0 integrated" (2/15/2008)
>  -/-   Executable is: "ptraj"
>  /-\
>  \-/   Residue labels:
>
>  ALA  PRO  VAL  CYX  GLN  GLU  ILE  THR  VAL  PRO
>  MET  CYX  ARG  GLY  ILE  GLY  TYR  ASN  LEU  THR
>  HIE  MET  PRO  ASN  GLN  PHE  ASN  HIE  ASP  THR
>  GLN  ASP  GLU  ALA  GLY  LEU  GLU  VAL  HIE  GLN
>  PHE  TRP  PRO  LEU  VAL  GLU  ILE  GLN  CYX  SER
>  PRO  ASP  LEU  ARG  PHE  PHE  LEU  CYX  SER  MET
>  TYR  THR  PRO  ILE  CYX  LEU  PRO  ASP  TYR  HIE
>  LYS  PRO  LEU  PRO  PRO  CYX  ARG  SER  VAL  CYX
>  GLU  ARG  ALA  LYS  ALA  GLY  CYX  SER  PRO  LEU
>  MET  ARG  GLN  TYR  GLY  PHE  ALA  TRP  PRO  GLU
>  ARG  MET  SER  CYX  ASP  ARG  LEU  PRO  VAL  LEU
>  GLY  ARG  ASP  ALA  GLU  VAL  LEU  CYX  MET  ASP
>  TYR  ASN  ARG  ALA  PRO  VAL  CYX  GLN  GLU  ILE
>  THR  VAL  PRO  MET  CYX  ARG  GLY  ILE  GLY  TYR
>  ASN  LEU  THR  HIE  MET  PRO  ASN  GLN  PHE  ASN
>  HIE  ASP  THR  GLN  ASP  GLU  ALA  GLY  LEU  GLU
>  VAL  HIE  GLN  PHE  TRP  PRO  LEU  VAL  GLU  ILE
>  GLN  CYX  SER  PRO  ASP  LEU  ARG  PHE  PHE  LEU
>  CYX  SER  MET  TYR  THR  PRO  ILE  CYX  LEU  PRO
>  ASP  TYR  HIE  LYS  PRO  LEU  PRO  PRO  CYX  ARG
>  SER  VAL  CYX  GLU  ARG  ALA  LYS  ALA  GLY  CYX
>  SER  PRO  LEU  MET  ARG  GLN  TYR  GLY  PHE  ALA
>  TRP  PRO  GLU  ARG  MET  SER  CYX  ASP  ARG  LEU
>  PRO  VAL  LEU  GLY  ARG  ASP  ALA  GLU  VAL  LEU
>  CYX  MET  ASP  TYR  ASN  ARG  Na+  Na+  Na+  Na+
>  Na+  Na+  WAT  WAT  WAT  WAT  WAT  WAT  WAT  WAT
>  WAT  WAT  WAT  WAT  WAT  WAT  WAT  WAT  WAT  WAT
>  ...
>  WAT
>
>  Setting box to be an exact truncated octahedron, angle is 109.471221
>
> PTRAJ: Processing input from "STDIN" ...
> Connection to node-12-10 closed by remote host.
> Connection to node-12-10 closed.
>
>
> 28.08.10, 00:08, "Daniel Roe" <daniel.r.roe.gmail.com>:
>
>> It's difficult to comment in too much detail based on the information
>>  you have given. What version of ptraj are you running, and what
>>  compilers did you use to build it? How big is the system you are
>>  running ptraj on, and how long is the trajectory you are processing?
>>  What sort of actions are you performing (I assume at least an RMSD
>>  calculation from the name of your input file). What output does ptraj
>>  give, if any? Also, why are you using 'mpirun' for a single processor
>>  job? If you don't use 'mpirun' does it complete faster?
>>
>>  -Dan
>>
>>  On Fri, Aug 27, 2010 at 12:33 PM, Andrew Voronkov  wrote:
>>  > It's a bit strange, but when run on the cluster in single-processor mode, ptraj runs for hours:
>>  > mpirun -as single -np 1 -maxtime 360 ptraj fzd5-7_w.prmtop < fzd5_w_ca.calc_rms
>>  > while on the central node it takes just several minutes.
>>  > I am not sure whether this is a question for the Amber mailing list or for the cluster admins, but I don't understand what causes such a difference.
>>  >
>>  > Sincerely yours,
>>  > Andrew
>>  >
>>  > _______________________________________________
>>  > AMBER mailing list
>>  > AMBER.ambermd.org
>>  > http://lists.ambermd.org/mailman/listinfo/amber
>>  >
>>
>

Received on Sat Aug 28 2010 - 06:00:03 PDT