Hey guys,
We have been running Amber v12 for a while now, but our jobs on these GPUs
recently started failing with the "all CUDA-capable devices are busy or
unavailable" error message. This is surprising because no processes are
currently using these GPUs:
--
# nvidia-smi
Wed Feb 26 21:29:03 2014
+------------------------------------------------------+
| NVIDIA-SMI 4.304.54   Driver Version: 304.54         |
|-------------------------------+----------------------+----------------------+
| GPU  Name                     | Bus-Id        Disp.  | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap| Memory-Usage         | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla M2050              | 0000:06:00.0     Off |                  Off |
| N/A   N/A    P0    N/A /  N/A |   0%    6MB / 3071MB |      0%   E. Process |
+-------------------------------+----------------------+----------------------+
|   1  Tesla M2050              | 0000:14:00.0     Off |                  Off |
| N/A   N/A    P0    N/A /  N/A |   0%    6MB / 3071MB |      0%   E. Process |
+-------------------------------+----------------------+----------------------+
|   2  Tesla M2050              | 0000:11:00.0     Off |                  Off |
| N/A   N/A    P0    N/A /  N/A |   0%    6MB / 3071MB |      0%   E. Process |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|  No running compute processes found                                         |
+-----------------------------------------------------------------------------+
# cat /sys/module/nvidia/version
304.54
--
Thanks in advance for your help.
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Feb 26 2014 - 22:00:03 PST