Hi all,
As seen before on this mailing list, more and more people are using AMBER on
Linux clusters.
We bought one, and we now have 4 dual-processor Athlon 1.2 GHz nodes.
I compiled AMBER with mpich-1.2.2.3, following the procedure described in
README and README.parallel and using Machine.g77_mpich.
The program compiled without errors.
But I am not able to get through the test procedure.
I get lots of errors like:
diffing run3e_w_300.out.save with run3e_w_300.out
PASSED
diffing run3e_w_300.mc.save with run3e_w_300.mc
PASSED
cd lj_lj/test; ./run
b run1_b_300
[1] MPI Abort by user Aborting program !
[1] Aborting program!
Error in setpar: check code, input
p1_28946: p4_error: : 1
bm_list_15437: p4_error: net_recv read: probable EOF on socket: 1
make[1]: *** [test] Interruption
make: *** [CMC] Interruption
I know that not all of the code is parallelized, but I think the test
procedure copes with that, so I don't know why it is not working.
I also tried a well-known script for MPI, and I get the same errors when I try
to run it.
As you can see below, I tried to modify P4_GLOBMEMSIZE, but it did not help:
P4 est de : 100000000
DO est de :/usr/local/mpich-1.2.2.3/bin/mpirun -np 4 -nolocal
dirbas est de : /home/admin/Tests/Test_4_proc/DM_Tcte300_H2O/
La valeur de i est de : 2
Dynamique #2\tGCCGGGTCGC.dn300K2\tven nov 30 15:56:59 CET 2001
All processors started
3 - MPI_RECV : Invalid rank 133
[3] Aborting program !
[3] Aborting program!
p3_9590: p4_error: : 8262
Connection failed for reason: : Connection refused
Connection failed for reason: : Connection refused
rm_l_1_13537: (2.644713) net_recv failed for fd = 6
rm_l_1_13537: p4_error: net_recv read, errno = : 104
Connection failed for reason: : Connection refused
Connection failed for reason: : Connection refused
bm_list_28969: (4.683018) net_recv failed for fd = 5
bm_list_28969: p4_error: net_recv read, errno = : 104
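For reference, the settings behind the output above amount to roughly the
following. This is only a sketch reconstructed from the values printed above:
the variable name MPIRUN is illustrative, and the actual program and its
arguments are left out, so the last line only prints the command instead of
running it.

```shell
# Sketch only: the values come from the output above, nothing is executed.
# Global memory size for the MPICH p4 device (value printed as "P4").
export P4_GLOBMEMSIZE=100000000

# Launcher prefix (printed as "DO"); MPIRUN is an illustrative variable name.
MPIRUN=/usr/local/mpich-1.2.2.3/bin/mpirun
DO="$MPIRUN -np 4 -nolocal"

# Print the settings; the program name is deliberately a placeholder.
echo "P4_GLOBMEMSIZE=$P4_GLOBMEMSIZE"
echo "would run: $DO <program> <arguments>"
```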
Last of all: I applied the patches up to patch 27 (thanks D. Case, I could use
the patch command; I had to apply those from B. Ross manually) and checked that
I was using the right version.
Again, it compiled well, and all my environment variables (PATH and
AMBERHOME) were set correctly.
Again, the tests failed.
I should mention that mpich-1.2.2.3 comes with tests to check
whether everything is working, and that these tests passed.
I hope I have provided enough information and that you can help me :-))
PS: I have been told that, because of PME, AMBER cannot go beyond 2
processors. Is that true?
Regards,
Stephane Teletchea
Received on Fri Nov 30 2001 - 07:36:37 PST