Thanks, Robin. The system admins are fixing some issues before I can create an amber12 module for users. My earlier problem arose because some nodes on Keeneland had accidentally been rolled back to an old GPU driver version. The admins have already found and fixed the affected nodes. We have no GPU on the login node, so I am not sure whether you ran into the same problem.
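
For anyone who wants to check a node for a rolled-back driver, here is a minimal sketch. It assumes the nvidia-smi banner contains a "Driver Version: X.Y.Z" field, as in the output quoted below, and it takes the minimum of 295 from Ross's note. The function names are only illustrative.

```shell
#!/bin/sh
# Minimum driver major version for CUDA 4.2, as stated by Ross in this thread.
MIN_DRIVER_MAJOR=295

# Read nvidia-smi banner text on stdin and print the first driver version found.
parse_driver_version() {
    sed -n 's/.*Driver Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Succeed if the major part of the given version is at least MIN_DRIVER_MAJOR.
driver_is_new_enough() {
    major=${1%%.*}
    [ -n "$major" ] && [ "$major" -ge "$MIN_DRIVER_MAJOR" ]
}

# Only query the GPU where nvidia-smi exists; otherwise do nothing.
if command -v nvidia-smi >/dev/null 2>&1; then
    version=$(nvidia-smi | parse_driver_version)
    if driver_is_new_enough "$version"; then
        echo "$(hostname): driver $version is OK for CUDA 4.2"
    else
        echo "$(hostname): driver '$version' is too old or could not be read"
    fi
fi
```

Run inside a job on a compute node, this prints one line per node. On a login node without nvidia-smi it prints nothing.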
On Sep 14, 2012, at 5:11 PM, Robin Betz wrote:
> Hi Shiquan,
>
> Ross is correct here-- attempting to run GPU code from the login node will
> result in errors. If you submit a job to run the tests, it should work
> fine. I've been running Amber successfully on Keeneland with CUDA 4.2 on
> the compute nodes but have encountered the problem you mentioned on the
> login node.
>
> Regards,
> Robin Betz
>
> On Wed, Sep 12, 2012 at 12:52 PM, Ross Walker <ross.rosswalker.co.uk> wrote:
>
>> Hi Shiquan,
>>
>> You should probably file this as a ticket with the Keeneland support
>> people. To use CUDA 4.2 you need driver v295 or later installed, which is
>> clearly not the case here. Something is obviously wrong with their setup:
>> loading the module for cuda/4.2 should update the driver, since it is no
>> use having the latest CUDA with an old driver.
>>
>> It may just be the login node which has the ancient driver on. Try
>> requesting an interactive node through the queuing system (qsub -I) and
>> running the tests there (after setting AMBERHOME correctly). Alternatively
>> you could submit the test as a qsub script itself.
>>
>> Hopefully it is just the login node which has the mismatched driver.
>>
>> ps. I would make sure you clear the DO_PARALLEL environment variable as
>> well. Running the tests with that set to mpirun -np 8 is going to cause
>> all sorts of problems.
>>
>> All the best
>> Ross
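
A minimal sketch of the batch-job route Ross describes, for readers who want something to start from. The job name, resource request and walltime are assumptions and need adjusting to the local queue policy. The AMBERHOME path is the one that appears in the test log below. The -V option exports the submitting shell's environment, including loaded modules, which is also why DO_PARALLEL is unset explicitly.

```shell
#!/bin/sh
# Write a hypothetical Torque job script that runs the Amber CUDA tests on a
# compute node. Job name, resources and walltime are illustrative assumptions.
cat > amber_cuda_test.pbs <<'EOF'
#!/bin/sh
#PBS -N amber_cuda_test
#PBS -l nodes=1:ppn=1
#PBS -l walltime=01:00:00
#PBS -j oe
#PBS -V

# Install path taken from the test log in this thread
export AMBERHOME=/nics/e/sw/keeneland/amber/12/centos5.5_gnu4.4.0/amber12

# The GPU tests here are serial: make sure no MPI launcher is inherited
unset DO_PARALLEL

# Record which driver this compute node has before running anything
nvidia-smi

cd "$AMBERHOME" && make test.cuda
EOF

echo "Wrote amber_cuda_test.pbs; submit it with: qsub amber_cuda_test.pbs"
```

The nvidia-smi output lands in the job's output file, so a failing run also shows which driver that node had.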
>>
>>
>>
>> On 9/12/12 12:38 PM, "Su, Shiquan" <ssu2.utk.edu> wrote:
>>
>>> Dear Developers:
>>> I am a new user of Amber. I tried to install amber12 on our center's
>>> machine (Keeneland, GATech & UTK & ORNL). I loaded the following modules,
>>> very similar to the settings we used to install amber11.
>>> [shiquan1.kid117 amber12]$ module list
>>> Currently Loaded Modulefiles:
>>> 1) modules         4) gold                7) gcc/4.4.0      10) mpiexec/0.84
>>> 2) torque/2.5.11   5) openmpi/1.5.1-gnu   8) cuda/4.2       11) mvapich2/1.8
>>> 3) moab/6.1.5      6) PE-gnu              9) limic2/0.5.4
>>>
>>>
>>> I followed the instructions in AmberTools12.pdf:
>>> ./configure -cuda gnu
>>> make install
>>> make test.cuda
>>>
>>> The configuration and install steps seemed to complete normally, but when I
>>> ran make test.cuda I got the error messages below. I had already loaded
>>> cuda/4.2, so why do the tests still fail with a "CUDA driver version" error?
>>> Which CUDA driver version do I need? This is what I have on the machine.
>>> Thanks for any help:
>>> [shiquan1.kid117 amber12]$ nvidia-smi
>>> Wed Sep 12 15:37:09 2012
>>> +------------------------------------------------------+
>>> | NVIDIA-SMI 2.285.05 Driver Version: 285.05.33 |
>>>
>>> (cd test && make test.cuda)
>>> make[1]: Entering directory `/nics/e/sw/keeneland/amber/12/centos5.5_gnu4.4.0/amber12/test'
>>> ./test_amber_clean.sh
>>> ./test_amber_cuda.sh
>>> Using default PREC_MODEL = SPFP
>>> Warning. DO_PARALLEL is set to mpirun -np 8. This environment variable is being unset.
>>> make[2]: Entering directory `/nics/e/sw/keeneland/amber/12/centos5.5_gnu4.4.0/amber12/test'
>>> cd cuda && make -k test.pmemd.cuda PREC_MODEL=SPFP
>>> make[3]: Entering directory `/nics/e/sw/keeneland/amber/12/centos5.5_gnu4.4.0/amber12/test/cuda'
>>> ------------------------------------
>>> Running CUDA Implicit solvent tests.
>>> Precision Model = SPFP
>>> ------------------------------------
>>> cd gb_ala3/ && ./Run.igb1_ntc1_min SPFP
>>> /sw/keeneland/amber/12/centos5.5_gnu4.4.0/amber12/include/netcdf.mod
>>> cudaGetDeviceCount failed CUDA driver version is insufficient for CUDA runtime version
>>> ./Run.igb1_ntc1_min: Program error
>>> make[3]: [test.pmemd.cuda.gb.serial] Error 1 (ignored)
>>> cd gb_ala3/ && ./Run.irest1_ntt0_igb1_ntc1 SPFP
>>> /sw/keeneland/amber/12/centos5.5_gnu4.4.0/amber12/include/netcdf.mod
>>> cudaGetDeviceCount failed CUDA driver version is insufficient for CUDA runtime version
>>> ./Run.irest1_ntt0_igb1_ntc1: Program error
>>> make[3]: [test.pmemd.cuda.gb.serial] Error 1 (ignored)
>>> ...
>>>
>>> _______________________________________________
>>> AMBER mailing list
>>> AMBER.ambermd.org
>>> http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Sep 14 2012 - 15:30:06 PDT