Dear Kailee,
My suspicion is that there is some confusion between where you think qsub
runs your job and where it really does. Who set up your system and, more
importantly, who set up the various queues etc.? It looks to me like
the system you are compiling and running interactively on is one
architecture (say x86) while the actual compute nodes that your job gets run
on through qsub are a different architecture (say x86_64). Unfortunately,
without seeing all the details about how your cluster is set up, how the
queuing system works, etc., it is very difficult to help much more. Basically,
the problem you are seeing comes down to two possibilities:
1) The architecture you are on interactively does not match the compute node
architecture.
2) The path/filesystem structure is different on the compute nodes such that
it cannot locate the file.
The first is more likely, as I would expect the second problem to give the
error "Path not found". Hence you should probably speak to whoever is in
charge of your cluster and see if they can help some more, and/or talk to
other users of the cluster and see what they do.
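Both possibilities can be checked directly. The following is only a rough sketch, not a tested recipe for any particular cluster: the function name check_binary is made up for illustration, and it assumes the standard uname and file utilities are available on the nodes. It reports the node name and architecture, whether the binary is visible and executable, and what format the binary is.

```shell
#!/bin/bash
# Hypothetical diagnostic (check_binary is an illustrative name, not part of AMBER).
# Reports where we are running and what kind of file the binary is.
check_binary() {
    local bin="$1"
    echo "node: $(uname -n)  arch: $(uname -m)"
    if [ ! -e "$bin" ]; then
        # Possibility 2: the path is not visible from this node
        echo "missing: $bin"
        return 1
    fi
    if [ ! -x "$bin" ]; then
        echo "not executable: $bin"
        return 2
    fi
    # Possibility 1: compare the reported binary format with the arch above
    if command -v file >/dev/null 2>&1; then
        file -L "$bin"
    fi
    return 0
}

# Default to the sander binary used in this thread; AMBERHOME must be set for that.
check_binary "${1:-${AMBERHOME:-}/exe/sander}" || echo "check failed with status $?"
```

Running this once at the interactive prompt and once as the body of a job submitted through qsub, then comparing the two outputs, should show whether the architecture or the path differs between the two places.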
All the best
Ross
/\
\/
|\oss Walker
| HPC Consultant and Staff Scientist |
| San Diego Supercomputer Center |
| Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
| http://www.rosswalker.co.uk | PGP Key available on request |
Note: Electronic Mail is not secure, has no guarantee of delivery, may not
be read every day, and should not be used for urgent or sensitive issues.
_____
From: owner-amber.scripps.edu [mailto:owner-amber.scripps.edu] On Behalf Of
Kailee
Sent: Monday, December 18, 2006 09:07
To: amber.scripps.edu
Subject: Re: AMBER: sander: cannot execute binary file
Hi, all,
Sorry for the unfinished email I just sent out, and looking forward to your
suggestions.
Best regards,
Kailee
On 12/18/06, Kailee <kaileeamber.googlemail.com> wrote:
Hi, all,
Thanks for your help, I looked at the run script for the tests and wrote a
similar script according to the test script, which looks like (named as
Run.test):
----------------------------------------------------------------------------
#!/bin/bash
cat << eof > in.md
test
&cntrl
imin = 0, irest = 1, ntx =7,
ntb = 2, pres0 = 1.0, ntp =1,
taup = 2.0,
cut = 10, ntr =0,
ntc = 2, ntf = 2,
tempi = 300.0, temp0 = 300.0,
ntt = 3, gamma_ln = 1.0,
nstlim = 10000, dt = 0.001,
ntpr = 200, ntwx = 200, ntwr = 2000
/
eof
$DO_PARALLEL $AMBERHOME/exe/sander -O -i in.md -o test3.out \
  -c model_mass_md2.rst -p model_mass.prmtop -r test3.rst -x test3.mdcrd
----------------------------------------------------------------------------
and then I used nohup to run this job as:
nohup ./Run.test &
It ran OK without any errors and gave me the same outputs as when I ran it on
other machines. Therefore, I think it is as David suggested: the problem is
caused by the qsub I used to submit my job. I also tried changing the shell
script I used, but still no luck.
On 12/15/06, David A. Case <case.scripps.edu> wrote:
On Fri, Dec 15, 2006, Kailee wrote:
>
> Thanks for your reply, as Ross has suggested, I have checked the location of
> sander, it was correct, and also I can see this file using my account. As to
> the architecture, the node I was trying to run was the one I compiled the
> amber.
I know that this may not be much help, but (as I understand it)
this is a situation where the test cases all pass, but your attempt to run
your own job never even gets far enough to look at your input, but just
reports that it cannot execute the binary. You need to look for *anything*
that is different between what you are doing and what the test cases are
doing. For example, try running the test cases "by hand" in the same way
you run your job. I didn't see anything in what you reported that looks
wrong, so you will probably have to poke around yourself. The good thing is
that the test cases indicate that you do have a working installation.
My best guess is that you used "qsub" for your job, but that the
test cases didn't use the queuing system. If this is in fact the cause of the
problem, then at least you will have narrowed it down.
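One way to look for "anything that is different" is to record the same facts about the environment in both situations and compare them. This is a minimal sketch under assumptions: snapshot_env is a hypothetical helper name, and the variables listed are just the ones that appear in this thread plus the usual search paths, so others may matter on a given cluster.

```shell
#!/bin/bash
# Hypothetical helper (snapshot_env is an illustrative name).
# Prints a few facts about the environment the script is running in.
snapshot_env() {
    echo "node=$(uname -n)"
    echo "arch=$(uname -m)"
    echo "shell=${SHELL:-unset}"
    echo "AMBERHOME=${AMBERHOME:-unset}"
    echo "DO_PARALLEL=${DO_PARALLEL:-unset}"
    echo "PATH=${PATH:-unset}"
    echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-unset}"
}

snapshot_env
```

Redirect the output to one file when running it interactively and to another when running it as the first step of the job submitted with qsub, then run diff on the two files. Any line that differs is a candidate for the cause.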
...regards...dac
-----------------------------------------------------------------------
The AMBER Mail Reflector
To post, send mail to amber.scripps.edu
To unsubscribe, send "unsubscribe amber" to majordomo.scripps.edu
-----------------------------------------------------------------------
Received on Wed Dec 20 2006 - 06:07:30 PST