Re: [AMBER] pmemd.cuda segfaults

From: <pavel.banas.upol.cz>
Date: Sat, 08 Mar 2014 09:39:13 +0100 (CET)

Dear all,
thank you for all your help and suggestions. Finally, we were able to solve
the problem by downgrading the Linux kernel on the nodes.

With the 3.11.8 kernel we were getting segfaults on both CPU and GPU, most
likely caused by memory leaks. After a downgrade to 3.8.13 the problems with
the CPU code were solved but remained on the GPU, and after a further
downgrade to 3.2.55 all segfaults disappeared. So now we have a healthy build
even with the Intel compilers, and after testing we found that none of the
cards has any hardware errors.
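
For anyone comparing nodes, the kernel observations above can be written down
as a small lookup. This is only a sketch: the table restates the three
releases reported in this message and nothing more, the names REPORTED,
kernel_version and reported_outcome are hypothetical, and any release not
listed here was simply not tested by us.

```python
import platform
import re

# Outcomes reported in this message, keyed by exact kernel version.
# Nothing is implied about other releases in the same series.
REPORTED = {
    (3, 11, 8): "segfaults on both CPU and GPU",
    (3, 8, 13): "CPU problems solved, GPU segfaults remain",
    (3, 2, 55): "no segfaults",
}


def kernel_version(release):
    """Parse the leading 'X.Y.Z' of a release string such as '3.2.55-amd64'.

    Returns a tuple of three ints, or None if the string does not start
    with three dot-separated numbers.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def reported_outcome(release):
    """Look up what was observed for this kernel release, if anything."""
    return REPORTED.get(kernel_version(release), "not tested in this thread")


if __name__ == "__main__":
    # platform.release() returns the same string as `uname -r` on Linux.
    running = platform.release()
    print(running, "->", reported_outcome(running))
```

Run on a node, it prints the running kernel release and the outcome we saw
for it, or "not tested in this thread" for any other release.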

By the way, on the recent Linux kernel all of Ross's tests ended with
segfaults, while now all of them pass and give consistent energies.
Thank you very much,

Pavel


-- 
Pavel Banáš
pavel.banas.upol.cz
Department of Physical Chemistry, 
Palacky University Olomouc 
Czech Republic 
---------- Original message ----------
From: Tru Huynh <tru.pasteur.fr>
To: AMBER Mailing List <amber.ambermd.org>
Date: 6. 3. 2014 20:28:16
Subject: Re: [AMBER] pmemd.cuda segfaults
"On Wed, Mar 05, 2014 at 09:09:15PM +0100, pavel.banas.upol.cz wrote:
> 
> Dear all,
> 
...
> 
> Please, does anybody have the same architecture (GPU 
> SuperWorkstations 7047GR-TPRF with Super X9DRG-QF motherboards)?
We have one of those running CentOS-5 x86_64.
dmidecode 
Manufacturer: Supermicro
Product Name: X9DRG-QF
.
+------------------------------------------------------+
| NVIDIA-SMI 5.325.15   Driver Version: 325.15         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX TITAN   Off  | 0000:03:00.0     N/A |                  N/A |
| 31%   43C  N/A     N/A /  N/A |    107MB /  6143MB   |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX TITAN   Off  | 0000:04:00.0     N/A |                  N/A |
| 33%   47C  N/A     N/A /  N/A |     87MB /  6143MB   |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX TITAN   Off  | 0000:83:00.0     N/A |                  N/A |
| 31%   44C  N/A     N/A /  N/A |     87MB /  6143MB   |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX TITAN   Off  | 0000:84:00.0     N/A |                  N/A |
| 34%   50C  N/A     N/A /  N/A |     87MB /  6143MB   |     N/A      Default |
+-------------------------------+----------------------+----------------------+
We had the 4 initial cards replaced (and then one of the second batch);
since then, no issues.
Cheers,
Tru
-- 
Dr Tru Huynh | http://www.pasteur.fr/recherche/unites/Binfs/
mailto:tru.pasteur.fr | tel/fax +33 1 45 68 87 37/19
Institut Pasteur, 25-28 rue du Docteur Roux, 75724 Paris CEDEX 15 France 
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber"
Received on Sat Mar 08 2014 - 01:00:03 PST