Re: [AMBER] problem while running amber in parallel (wall time ?)

From: MUHAMMAD IMTIA SHAFIQ <imtiazshafiq.gmail.com>
Date: Fri, 29 Jan 2010 20:17:07 +0000

Dear Jason,

Thanks for your reply,

I was submitting my job like this:

qsub -d /home/imtiaz/amber10/ script -V

Here, script is a file containing the job information I described in my last email. It was working fine until a few days ago. amber10 is my working directory, and I was submitting the job with the above command from within the amber10 directory. Now it is not working, as I mentioned in my last email.

On the cluster, $AMBERHOME is already set to the directory where Amber10 is installed.


Regards
Imtiaz




On 29 Jan 2010, at 20:06, Jason Swails wrote:

> Perhaps add a "cd $PBS_O_WORKDIR" to make sure the calculation is
> working in the same directory that the calculation was submitted from?
> It is unclear to me where this calculation would be performed
> (perhaps in the root directory?) without a "cd" statement.
>
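A minimal sketch of the "cd" suggestion above, written for bash rather than the tcsh used in the quoted script (an assumption, made only so the snippet is easy to try outside the batch system). PBS_O_WORKDIR is the variable PBS/Torque sets to the directory from which qsub was run; the function name enter_workdir is hypothetical.

```shell
#!/bin/bash
# enter_workdir is a hypothetical helper name, not part of PBS or Amber.
# PBS_O_WORKDIR is set by the batch system inside a job; when it is unset
# (for example, when testing interactively) fall back to the current directory.
enter_workdir() {
    cd "${PBS_O_WORKDIR:-$PWD}" || return 1
    # Record the directory so the job's .o file is never completely empty.
    echo "working directory: $(pwd)"
}
```

In a tcsh job script the corresponding line would simply be cd $PBS_O_WORKDIR, placed before lamboot is called.
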
> Moreover, the way you set your "path" variable isn't doing anything as
> far as I can tell (though admittedly my experience with csh variants
> is not extensive). First of all, you need to set environment
> variables with "setenv" rather than "set". Also, the path environment
> variable that includes directories to be searched for executables is
> PATH (unices are typically case-sensitive except for Mac OS X). That
> said, only directories should be added to PATH (not mpirun).
>
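The point that only directories belong in PATH can be sketched as below. This is bash, not tcsh (an assumption for illustration), and prepend_path is a hypothetical helper name. The directory /opt/lam/7.1.3/bin is taken from the mpirun path used later in the quoted script. As a side note, csh and tcsh keep the lowercase path shell variable synchronized with PATH, so "set path = (...)" does change the search path there; the entry that cannot work is the one naming the mpirun executable itself.

```shell
#!/bin/bash
# prepend_path is a hypothetical helper: add a directory to the front of
# PATH, refusing anything that is not a directory (for example, the full
# path to the mpirun executable).
prepend_path() {
    if [ -d "$1" ]; then
        PATH="$1:$PATH"
        export PATH
    else
        echo "skipping $1: not a directory" >&2
        return 1
    fi
}

# Directory of the LAM installation used by the quoted script.
prepend_path /opt/lam/7.1.3/bin || true
```

The tcsh equivalent of the last line would be: setenv PATH /opt/lam/7.1.3/bin:${PATH}
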
> On Fri, Jan 29, 2010 at 2:05 PM, imtiaz shafiq <imtiazshafiq.gmail.com> wrote:
>> Dear All,
>>
>> Have a nice day. I was able to run Amber successfully in parallel a few
>> days ago using this qsub script:
>>
>> #!/bin/tcsh
>> # This is file Run__amber
>>
>> #PBS -l nodes=1:ppn=4
>> #PBS -V
>> #PBS -N Imtiaz
>> #PBS -l walltime=00:59:00
>>
>> set path = (/home/imtiaz/amber10/src/lam-7.1.3
>> /opt/lam_install_intel/bin/mpirun $path .)
>
> It looks like you may be confusing the lam included with amber10 and a
> pre-existing installation of lam/mpi already on your cluster. If a
> previous installation of lam/mpi exists on your system already, then
> it is unnecessary to compile the version included with amber. You
> just have to make sure that you use the mpi compiler wrappers
> (mpif90/mpicc, etc) included with the mpi you plan to use for your
> simulations to compile amber10.
>
>>
>> echo "simulation started at" `date`
>>
>> /opt/lam/7.1.3/bin/lamboot
>>
>> /opt/lam/7.1.3/bin/mpirun -ssi rpi lamd N sander.MPI -O -i heat.in -o
>> heat.out -p ras-raf_solvated.prmtop -c min.rst -r
>> heat.rst -x heat.mdcrd -ref min.rst
>>
>
> Perhaps you should have a "/opt/lam/7.1.3/bin/lamhalt" here as well?
> It's probably not required, but will help to clean up any rogue
> threads after a failed MPI run.
>
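The boot, run, halt ordering suggested above can be sketched as follows, again in bash as an assumption (the quoted script is tcsh). run_lam_job is a hypothetical wrapper name; lamboot, mpirun and lamhalt are the LAM/MPI commands already named in this thread, looked up through PATH here rather than by absolute path.

```shell
#!/bin/bash
# run_lam_job is a hypothetical wrapper: start LAM, run the MPI command
# with the given arguments, and always call lamhalt afterwards.
run_lam_job() {
    lamboot || return 1          # give up if the LAM daemons cannot start
    status=0
    mpirun "$@" || status=$?     # remember the exit code of the MPI run
    lamhalt                      # clean up even after a failed run
    return "$status"
}

# Example call, using the arguments from the quoted script:
# run_lam_job -ssi rpi lamd N sander.MPI -O -i heat.in -o heat.out \
#     -p ras-raf_solvated.prmtop -c min.rst -r heat.rst -x heat.mdcrd -ref min.rst
```

Returning the exit code of mpirun rather than that of lamhalt keeps a failed simulation visible to the batch system.
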
>> echo "simulation ended at" `date`
>>
>>
>> I am not sure what has happened now. When I submit the same script as a
>> qsub job, the job is submitted but produces no output and no error; even
>> showq does not show any job running. Our cluster admin says that it is
>> something related to Amber, not to the cluster software or hardware:
>
> If it was a problem related to amber, something would have been
> printed to stderr (which would have been the Imtiaz.e413948 file), and
> hopefully also an error message printed to the mdout file as well.
> The fact that no files were created means it's probably something in
> your submission script and not amber (though add a cd command so you
> know where files are being created).
>
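One way to make sure the .o file is never empty is to print a few diagnostics before anything else runs. A minimal bash sketch follows (the quoted script is tcsh, so the choice of shell is an assumption for illustration, and report_tool is a hypothetical helper name):

```shell
#!/bin/bash
# report_tool is a hypothetical helper: print where an executable is found
# in PATH, or say clearly that it is missing.
report_tool() {
    if tool_path=$(command -v "$1"); then
        echo "$1: $tool_path"
    else
        echo "$1: NOT FOUND in PATH"
    fi
}

# Print the job's starting directory and the executables the script relies on.
echo "job started in: $(pwd)"
report_tool mpirun
report_tool sander.MPI
```

If sander.MPI or mpirun is reported as missing, the problem lies in the environment seen by the job rather than in the simulation input.
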
>>
>> " ran a job for Imtiaz on cluster1 - came back with a walltime error -
>> that's not a system software/hardware problem."
>>
>> Please suggest something in this regard.
>>
>> Is there some problem in my qsub script? If so, why was it working fine
>> before, unchanged?
>>
>> What could be the potential problem with wall time?
>>
>> Here is an example screenshot
>>
>> [imtiaz.cluster1 amber10]$ qsub -d /home/mis9/amber_cdk2/ 1fin-min -V
>> 413948.cluster1
>> [imtiaz.cluster1 amber10]$
>> [imtiaz.cluster1 amber10]$ more Imtiaz.o413948
>> [imtiaz.cluster1 amber10]$ more Imtiaz.e413948
>> [imtiaz.cluster1 amber10]$
>>
>> * Both the qsub .o and .e files for job ID 413948 are empty.
>
> This means nothing was written to either standard output or standard
> error (which eliminates any debugging information that could be
> gathered from those files).
>
>
> Good luck!
> Jason
> --
> ---------------------------------------
> Jason M. Swails
> Quantum Theory Project,
> University of Florida
> Ph.D. Graduate Student
> 352-392-4032
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber


Received on Fri Jan 29 2010 - 12:30:03 PST