Well, that is interesting...
I always thought there was a small chance of this happening. However,
it requires either that many jobs land on the nodes at the SAME
microsecond count, or that gettimeofday() on different nodes be
synchronized 'just so', to get you in trouble. I do not believe this
is the same as people's birthdays, since those are uniformly
distributed, etc.
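For scale, the idealized birthday-style number is easy to compute. This is a minimal sketch that assumes seeds are independent and uniform over the possible values, which real job launches may well not satisfy; the function name collision_probability is made up for illustration.

```c
#include <assert.h>

/* Probability that at least two of `jobs` seeds coincide when each seed is
 * drawn independently and uniformly from `slots` possible values.
 * Assumption: independence and uniformity, which a cluster scheduler
 * launching jobs in bursts may violate. */
double collision_probability(int jobs, double slots)
{
    assert(jobs >= 0 && slots >= 1.0);
    double p_all_unique = 1.0;
    for (int k = 1; k < jobs; k++) {
        if ((double)k >= slots)
            return 1.0;                      /* more jobs than values */
        p_all_unique *= 1.0 - (double)k / slots;
    }
    return 1.0 - p_all_unique;
}
```

With 23 people and 365 days this gives the familiar figure of just over 50%. With 100 jobs and 1,000,000 microsecond values it gives about 0.5% per batch, so anything much above that points to synchronized launches or a coarse clock rather than plain chance.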
The fact that gettimeofday() is not accurate to the microsecond is
important for timing processes, but should not be as important for what
we use it for.
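One way to see what the clock actually delivers on a given node is to sample it in a tight loop and record the smallest nonzero step. This is a rough sketch; the function name smallest_clock_step_usec is made up here, and the result only describes the granularity seen from user space on the machine where it runs.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Smallest nonzero step, in microseconds, seen between consecutive
 * gettimeofday() readings over `samples` calls. Returns -1 if the clock
 * never moved forward during the run. Backward jumps are ignored. */
long smallest_clock_step_usec(int samples)
{
    assert(samples > 1);
    struct timeval prev, now;
    long best = -1;

    gettimeofday(&prev, NULL);
    for (int i = 1; i < samples; i++) {
        gettimeofday(&now, NULL);
        long step = (long)(now.tv_sec - prev.tv_sec) * 1000000L
                  + (long)(now.tv_usec - prev.tv_usec);
        if (step > 0 && (best < 0 || step < best))
            best = step;
        prev = now;
    }
    return best;
}
```

A result of 1 suggests the microsecond field really does tick at microsecond granularity; a result in the hundreds or thousands would mean far fewer distinct seed values than the nominal 1,000,000.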
Anyways, of your options below, I would not go with a or b, since those
are bound to be heavily machine dependent.
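For reference, option b amounts to something like the sketch below. The stage*10000 + run formula is the one suggested in the original message; the helper names and the environment variable names in the usage note are hypothetical, and the real names depend on the scheduler and launch scripts, which is exactly the machine dependence in question.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Combine an ASMD stage number and run number into a seed as
 * stage*10000 + run. Returns -1 if either input is out of range
 * or the result would not fit in a positive int. */
int seed_from_stage_and_run(int stage, int run)
{
    if (stage < 0 || run < 0 || run >= 10000)
        return -1;
    if (stage > (INT_MAX - 9999) / 10000)
        return -1;
    return stage * 10000 + run;
}

/* Read a non-negative integer from an environment variable.
 * Returns -1 if the variable is unset, empty, or not a plain number. */
int env_int(const char *name)
{
    assert(name != NULL);
    const char *text = getenv(name);
    if (text == NULL || *text == '\0')
        return -1;
    char *end = NULL;
    errno = 0;
    long value = strtol(text, &end, 10);
    if (errno != 0 || *end != '\0' || value < 0 || value > INT_MAX)
        return -1;
    return (int)value;
}
```

A launch script could then call seed_from_stage_and_run(env_int("ASMD_STAGE"), env_int("ASMD_RUN")), where both variable names are placeholders, and write the result into the input file as ig. Stage 3, run 42 would give 30042.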
c looks like a good idea, but you would need to create patches for
sander, pmemd and pmemd.cuda.
Thanks for looking into this!
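As a starting point for such a patch, the "all 31 bits from /dev/urandom" variant could look roughly like this. It is a sketch under assumptions, not the existing pmemd_clib.c code: the function name seed_from_urandom_or_clock is made up, and falling back to the microsecond count when /dev/urandom cannot be read is one possible choice, not an agreed design.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>
#include <unistd.h>

/* Returns a seed in [0, 2^31 - 1]. Tries /dev/urandom first; if it cannot
 * be opened or read, falls back to the microsecond field of
 * gettimeofday(), i.e. a value in [0, 999999]. */
int seed_from_urandom_or_clock(void)
{
    uint32_t bits = 0;
    int fd = open("/dev/urandom", O_RDONLY);
    if (fd != -1) {
        ssize_t got = read(fd, &bits, sizeof bits);
        close(fd);
        if (got == (ssize_t)sizeof bits)
            return (int)(bits & 0x7fffffffu);   /* clear the sign bit */
    }

    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (int)tv.tv_usec;
}
```

Each of the three programs would still need its own hook to call something like this when the user asks for a random seed.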
Adrian
On 6/20/17 2:29 PM, Chris Moth wrote:
> I have just had the "interesting" experience that "ig=-1" does not
> always generate unique random seeds, and I thought I should share that
> experience.....
>
> I have used "ig = -1" to randomize seeds for some time. I used it
> without hesitation as I worked through the ASMD tutorial here:
>
> http://ambermd.org/tutorials/advanced/tutorial26/
>
> However, intriguingly, on our high performance cluster at Vanderbilt,
> when I submit 100 jobs-at-a-time (an ASMD "stage"), I am seeing
> duplicate ig values returned every few hundred runs.
>
> This could conceivably be attributable to a combination of factors:
>
> 1) Just as in a room of 23 people there is a 50% chance that 2 will
> share the same birthday, in a collection of 100 MD jobs there is
> roughly a 1% chance that 2 will share the same microsecond start
> time. (Apologies in advance if I butchered some math.)
> 2) A high performance cluster, launching multiple simultaneous jobs "at
> once", could conceivably turn a 1% chance into a 10% chance on highly
> synchronized nodes.
> 3) The resolution of the gettimeofday() function (called from
> pmemd_clib.c) could be significantly coarser than one microsecond in
> practice (if Google is to be believed)
>
> https://www.google.com/#q=resolution+of+gettimeofday
>
> It is admittedly a nuisance issue. The choices are:
>
> a) Ignore the issue entirely. Statistically, it's likely not too
> important if only one in every 200 md runs is a duplicate run.
>
> b) Set random seeds with environment variables available in the high
> performance cluster or ASMD job launch ecosystem *instead* of
> "trusting" ig=-1 (examples: task IDs, (ASMD stage*10000 + ASMD_run),
> etc.), in which case we should update the ASMD tutorial so that ig=-1
> at least has a caution around it.
>
> c) Modify the pmemd code to enhance the randomness of ig=-1 by adding
> entropy. (The current 0 to 999999 range is only using 20 bits of the 31
> that could be used).
>
> In case "c" is interesting.... read on....
>
> Below is a sketch of code that honors the current microsecond concept,
> but adds another 1000 possibilities based on the contents of
> /dev/urandom on a Linux system. Portability issues are rightly of great
> concern to the community. You could activate code like this in response
> to a new "ig = -2" possibility, or when "./configure" reports at
> install time that /dev/urandom is available. The code below does not
> require any new third-party libraries (as a "better" entropy generation
> scheme or a GUID generator might), and I think it will work on any
> Linux system I am aware of from the last decade.
>
> Again, this code below is not intended to be _the_ solution - just some
> food-for-thought if the team should consider enhancing randomness beyond
> the current 0-999999 limited sys clock. You might want "ig = -3" (say)
> to init all 31 bits of the seed from /dev/urandom.........
>
> #include <stdio.h>
> #include <unistd.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <sys/time.h>
> #include <assert.h>
>
> int main(void)
> {
> struct timeval my_tv;
> int entropyRead;
> int entropyBits;
>
> int entropyFile = open("/dev/urandom",O_RDONLY);
> assert(entropyFile != -1);
>
> entropyRead = read(entropyFile,&entropyBits,sizeof(entropyBits));
> assert (entropyRead == sizeof(entropyBits));
> close(entropyFile);
> entropyBits &= 0x7fffffff; // Mask off sign bit
> entropyBits %= 1000;
>
> // What you do today in pmemd_clib.c
> gettimeofday(&my_tv,NULL);
>
> printf("Today's random seed: %09d\n",(int)my_tv.tv_usec);
> printf("Enhanced random seed: %09d\n",(int)my_tv.tv_usec +
> entropyBits * 1000000);
> return 0;
>
> }
>
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
--
Dr. Adrian E. Roitberg
University of Florida Research Foundation Professor
Department of Chemistry
University of Florida
roitberg.ufl.edu
352-392-6972
Received on Tue Jun 20 2017 - 12:00:04 PDT