Hi Pavel,
Well, that's a completely new one for me, and also VERY worrying. I've
never used Core linux (or even heard of it until now), but this makes it
sound like a seriously ghetto Linux distro. Do other programs fail on
it, or just AMBER? Either way, it would scare me to use something flaky
like that. Is it even on the list of supported distros for CUDA?
I'd be tempted to dump it completely (I'm not sure I'd even trust the
older version) and switch to something more extensively tested. I like
CentOS 6, which is identical to Red Hat EL6 but free. It also tends to be
the most widely supported in my experience.
All the best
Ross
On 3/8/14, 12:39 AM, "pavel.banas.upol.cz" <pavel.banas.upol.cz> wrote:
>Dear all,
>thank you for all your help and suggestions. We were finally able to
>solve the problem by downgrading the version of the Linux kernel (core)
>on the nodes.
>
>With the 3.11.8 kernel we were getting segfaults on both CPU and GPU,
>most likely caused by memory leaks. After the downgrade to 3.8.13 the
>problems with the CPU code were solved but remained on the GPU, and after
>a further downgrade to 3.2.55 all segfaults disappeared. So now we have a
>healthy compilation even with the Intel compilers, and after testing we
>found that none of the cards has any hardware errors.
>
>By the way, on the recent kernel all of Ross's tests ended with segfaults,
>while now all of them pass and give consistent energies.
>Thank you very much,
>
>Pavel
>
>
>--
>Pavel Banáš
>pavel.banas.upol.cz
>Department of Physical Chemistry,
>Palacky University Olomouc
>Czech Republic
>
>
>
>---------- Original message ----------
>From: Tru Huynh <tru.pasteur.fr>
>To: AMBER Mailing List <amber.ambermd.org>
>Date: 6. 3. 2014 20:28:16
>Subject: Re: [AMBER] pmemd.cuda segfaults
>
>"On Wed, Mar 05, 2014 at 09:09:15PM +0100, pavel.banas.upol.cz wrote:
>>
>> Dear all,
>>
>...
>>
>> Please, does anybody have the same architecture (GPU
>> SuperWorkstations 7047GR-TPRF with Super X9DRG-QF motherboards)?
>
>We have one of those running CentOS-5 x86_64. From dmidecode:
>Manufacturer: Supermicro
>Product Name: X9DRG-QF
>.
>+------------------------------------------------------+
>| NVIDIA-SMI 5.325.15   Driver Version: 325.15         |
>|-------------------------------+----------------------+----------------------+
>| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
>| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
>|===============================+======================+======================|
>|   0  GeForce GTX TITAN   Off  | 0000:03:00.0     N/A |                  N/A |
>| 31%   43C  N/A     N/A /  N/A |    107MB /  6143MB   |     N/A      Default |
>+-------------------------------+----------------------+----------------------+
>|   1  GeForce GTX TITAN   Off  | 0000:04:00.0     N/A |                  N/A |
>| 33%   47C  N/A     N/A /  N/A |     87MB /  6143MB   |     N/A      Default |
>+-------------------------------+----------------------+----------------------+
>|   2  GeForce GTX TITAN   Off  | 0000:83:00.0     N/A |                  N/A |
>| 31%   44C  N/A     N/A /  N/A |     87MB /  6143MB   |     N/A      Default |
>+-------------------------------+----------------------+----------------------+
>|   3  GeForce GTX TITAN   Off  | 0000:84:00.0     N/A |                  N/A |
>| 34%   50C  N/A     N/A /  N/A |     87MB /  6143MB   |     N/A      Default |
>+-------------------------------+----------------------+----------------------+
>
>We had the 4 initial cards replaced (and then one of the second batch);
>since then, no issues.
>
>Cheers,
>
>Tru
>--
>Dr Tru Huynh | http://www.pasteur.fr/recherche/unites/Binfs/
>mailto:tru.pasteur.fr | tel/fax +33 1 45 68 87 37/19
>Institut Pasteur, 25-28 rue du Docteur Roux, 75724 Paris CEDEX 15 France
>
>_______________________________________________
>AMBER mailing list
>AMBER.ambermd.org
>http://lists.ambermd.org/mailman/listinfo/amber"
Received on Sat Mar 08 2014 - 10:30:03 PST