Hi,
Thanks for the reply and the original report.
This is under investigation (issue 31 in Amber's non-public issue tracker).
scott
On Fri, Mar 08, 2019 at 07:05:50PM +0000, Charles Lin wrote:
> So I looked into this, but I'm not familiar with sander, so I don't know the details of what's going on.
>
> The run_lmod command calls lmodC, which later calls a gradient-descent routine that makes the sander force call.
>
> In general, the bug seems to stem from an MPI communication issue within the lmodC library ($AMBERHOME/AmberTools/src/sff/lmodC.c), where the single MPI broadcast of the random seed doesn't work. If you comment out this broadcast, the code seems to work fine.
>
> My initial thought, which may be wrong, is that the lmodC library actually runs serially, since what comes out of it is essentially a set of coordinates to be used in a force calculation. If that is the case, then syncing the random seed probably isn't important, because the force calculation broadcasts all the forces (AMBER in general doesn't do domain decomposition and instead uses a replicated-data approach to communication).
>
> If someone more experienced with how sander works wants to look into it, that'd be great.
>
> -Charlie
>
> On 2/24/19, 5:29 PM, "Istvan Kolossvary" <istvan.kolossvary.hu> wrote:
>
> Hi, I noticed that LMOD tests were disabled in the sander test suite, so
> this problem may have been overlooked. I am running tests with both xmin
> and lmod in sander using sander.MPI. The xmin minimizer works fine,
> however, the lmod minimizer has a problem with the exact same system,
> which is very strange since lmod is calling xmin for all force
> computations. In $AMBERHOME/AmberTools/src/sander/force.F90 (amber18),
> line 258:
>
>     call mpi_bcast(xx(lcrd), 3*natom, MPI_DOUBLE_PRECISION, 0, &
>                    commsander, ierr)
>
> The call never returns.
>
>     ! xx: global real array (holding e.g. coordinates at position
>     ! l15 etc., see locmem.F90)
>
> In this case, the xx array holds coords and forces. lcrd=12481,
> natom=6240, commsander=0, and ierr=0. As I said, the exact same system
> works fine with xmin.
>
> The lmod job works fine with serial sander.
>
> Does anyone have any suggestions?
Received on Fri Mar 08 2019 - 18:30:02 PST
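
The open question in the thread is whether syncing the random seed matters
under a replicated-data scheme. Below is a minimal sketch in plain C of what
a seed broadcast is meant to guarantee: every rank draws the same random
stream, so every rank would build the same perturbed coordinates. It is an
illustration under stated assumptions, not the code in lmodC.c. No MPI is
used; "ranks" are simulated as array entries, and the names rank_rng,
rng_seed, rng_uniform, broadcast_seed, and ranks_agree are hypothetical. The
generator is a simple 64-bit LCG chosen for portability, not the generator
Amber uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-rank generator: a plain 64-bit LCG used as a stand-in.
   This is NOT the random number generator used by lmodC.c. */
typedef struct {
    uint64_t state;
} rank_rng;

static void rng_seed(rank_rng *r, uint64_t seed) {
    r->state = seed;
}

/* Advance the LCG and map the top 53 bits of the state to [0, 1). */
static double rng_uniform(rank_rng *r) {
    r->state = r->state * 6364136223846793005ULL + 1442695040888963407ULL;
    return (double)(r->state >> 11) / 9007199254740992.0; /* divide by 2^53 */
}

/* Simulated "broadcast from rank 0": every rank adopts rank 0's seed.
   In a real MPI program this would be a collective call that all ranks
   in the communicator must reach. */
static void broadcast_seed(uint64_t *seeds, size_t nranks) {
    assert(nranks > 0);
    for (size_t i = 1; i < nranks; ++i) {
        seeds[i] = seeds[0];
    }
}

/* Return 1 if every simulated rank produces the same first ndraws
   random numbers as rank 0, and 0 otherwise. */
static int ranks_agree(const uint64_t *seeds, size_t nranks, size_t ndraws) {
    assert(nranks > 0);
    for (size_t i = 1; i < nranks; ++i) {
        rank_rng ref, cur;
        rng_seed(&ref, seeds[0]);
        rng_seed(&cur, seeds[i]);
        for (size_t k = 0; k < ndraws; ++k) {
            if (rng_uniform(&ref) != rng_uniform(&cur)) {
                return 0; /* streams diverged: ranks would disagree */
            }
        }
    }
    return 1;
}
```

Calling ranks_agree on an array of distinct seeds reports disagreement;
calling it again after broadcast_seed reports agreement. If only one rank
ever consumes the random numbers (the "lmodC runs serially" hypothesis in
the thread), the agreement check is moot, which is the situation in which
dropping the broadcast would be harmless.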