Re: [AMBER] Max relative errors questions in parallel and cuda.serial test

From: 石谷沁 <guqin.shi.qilu-pharma.com>
Date: Thu, 22 Apr 2021 02:56:57 +0000

I double-checked the cuda.serial log. All of the possible failures occur with the DPFP precision model:
==============================================================
cd myoglobin/ && ./Run_md_myoglobin_igb7 DPFP yes
Note: The following floating-point exceptions are signalling: IEEE_INVALID_FLAG IEEE_UNDERFLOW_FLAG IEEE_DENORMAL
diffing myoglobin_md_igb7.out.GPU_DPFP with myoglobin_md_igb7.out
possible FAILURE: check myoglobin_md_igb7.out.dif

According to the manual, SPFP is the default precision model for pmemd.cuda, and I didn't get any failure reports with SPFP. So I guess I can ignore these DPFP errors in most situations?

Thanks,
Guqin

From: 石谷沁
Sent: April 22, 2021 10:50
To: AMBER Mailing List <amber.ambermd.org>
Subject: Max relative errors questions in parallel and cuda.serial test

Dear Amber community,

I recently installed AMBER20 and AmberTools20. During the test phase, a few comparisons failed that cannot be ignored, at least according to the log file. Details can be found at the end of this message.

The differences in the parallel test appear in rism_solventPotentialEnergy (erism.pme.out.dif, etc.).
The differences in the cuda.serial test appear in Etot and EPtot (irest1_ntt0_igb7_ntc2.out.dif, myoglobin_md_igb7.out.dif, etc.).

The absolute errors in the cuda.serial test are quite large, partly because the energies themselves are large in magnitude.
The maximum relative errors are usually on the order of 1e-01.
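
To make the absolute and relative figures concrete, here is a minimal Python sketch that recomputes them from the Etot values in the cuda.serial diffs quoted at the end of this message. The helper names abs_err and rel_err are hypothetical, and dividing by the smaller magnitude is an assumption: it happens to reproduce the reported maxima, but I have not checked it against the test suite's actual formula.

```python
# Etot pairs copied from the "<" and ">" lines of the quoted diffs.
pairs = {
    "gb_ala3 line 132": (21.2648, 15.1451),
    "gb_ala3 line 168": (21.2464, 15.1271),
    "myoglobin line 62": (-819.2397, -1598.9356),
    "myoglobin line 86": (-808.7370, -1588.0945),
}


def abs_err(a, b):
    """Absolute difference between the two reported values."""
    return abs(a - b)


def rel_err(a, b):
    """Relative difference.

    Assumption: the denominator is the smaller of the two magnitudes.
    This matches the numbers in the log, but it is a guess about the
    comparison script, not its documented behaviour.
    """
    return abs(a - b) / min(abs(a), abs(b))


for label, (a, b) in pairs.items():
    print(f"{label}: abs = {abs_err(a, b):.2e}, rel = {rel_err(a, b):.2e}")
```

Under that assumption, gb_ala3 line 132 gives 6.12e+00 (absolute) and line 168 gives 4.05e-01 (relative), while myoglobin line 62 gives 7.80e+02 and line 86 gives 9.64e-01, which are the maxima the logs report.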

Can I safely ignore these differences, especially the ones in cuda.serial related to Etot and EPtot?

Thanks!
Guqin

PS:
My system is CentOS 7 with 32 Xeon Gold CPUs @ 3.3 GHz and one Quadro RTX 5000.
The CUDA toolkit version is 11.1.


export DO_PARALLEL="mpirun -np 4" && make test.parallel
tail test_at_parallel/2021-04-19_11-25-25.log
1161 file comparisons passed
4 file comparisons failed (0 of which can be ignored)
0 tests experienced errors

In .diff file:
possible FAILURE: check erism.pme.out.dif
/home/gqshi/amber20/AmberTools/test/rism3d.periodic/1ahoa
2c2
< rism_solventPotentialEnergy -6.1473273988673300E+3 -3.3560132001761408E+3 -2.7645733652922204E+3 -1.3500096132126034 -2.5390823785755686E+1
> rism_solventPotentialEnergy -6.1181761175251750E+3 -3.3843866027301365E+3 -2.7061568218185312E+3 -1.6108250062402840 -2.6021867970266317E+1

### Maximum absolute error in matching lines = 2.61e-01 at line 2 field 5
### Maximum relative error in matching lines = 2.45e-01 at line 5 field 5

possible FAILURE: check erism.pme.out.dif
/home/gqshi/amber20/test/rism3d/1ahoa
3c3
< rism_solventPotentialEnergy -6.1473273988683177E+3 -3.3560132001766083E+3 -2.7645733652927374E+3 -1.3500096132125337 -2.5390823785758968E+1
> rism_solventPotentialEnergy -6.1181761175271131E+3 -3.3843866027308341E+3 -2.7061568218197699E+3 -1.6108250062405469 -2.6021867970267994E+1

### Maximum absolute error in matching lines = 2.61e-01 at line 3 field 5
### Maximum relative error in matching lines = 2.45e-01 at line 5 field 4



export CUDA_VISIBLE_DEVICES=0 && make test.cuda.serial
tail test_amber_cuda/2021-04-21_17-48-29.log

243 file comparisons passed
6 file comparisons failed (1 of which can be ignored)
0 tests experienced errors

In .diff file:
possible FAILURE: check irest1_ntt0_igb7_ntc2.out.dif
/home/gqshi/amber20/test/cuda/gb_ala3

131c131
< NSTEP = 12 TIME(PS) = 1050.006 TEMP(K) = 299.40 PRESS = 0.
> NSTEP = 12 TIME(PS) = 1050.006 TEMP(K) = 301.37 PRESS = 0.
132c132
< Etot = 21.2648 EKtot = 29.1533 EPtot = -7.8885
> Etot = 15.1451 EKtot = 29.3449 EPtot = -14.1998
133c133
< BOND = 7.7047 ANGLE = 14.2964 DIHED = 26.0152
> BOND = 7.6595 ANGLE = 14.3108 DIHED = 26.0126

167c167
< NSTEP = 18 TIME(PS) = 1050.009 TEMP(K) = 344.62 PRESS = 0.
> NSTEP = 18 TIME(PS) = 1050.009 TEMP(K) = 346.06 PRESS = 0.
168c168
< Etot = 21.2464 EKtot = 33.5563 EPtot = -12.3098
> Etot = 15.1271 EKtot = 33.6970 EPtot = -18.5699
169c169
< BOND = 6.5764 ANGLE = 12.9554 DIHED = 25.6576
> BOND = 6.6253 ANGLE = 13.1980 DIHED = 25.6610

### Maximum absolute error in matching lines = 6.12e+00 at line 132 field 3
### Maximum relative error in matching lines = 4.05e-01 at line 168 field 3
---------------------------------------
possible FAILURE: check myoglobin_md_igb7.out.dif
/home/gqshi/amber20/test/cuda/myoglobin
61c61
< NSTEP = 1 TIME(PS) = 1.502 TEMP(K) = 305.34 PRESS = 0.
> NSTEP = 1 TIME(PS) = 1.502 TEMP(K) = 305.16 PRESS = 0.
62c62
< Etot = -819.2397 EKtot = 1881.3075 EPtot = -2700.5472
> Etot = -1598.9356 EKtot = 1880.1797 EPtot = -3479.1153
65c65
< EELEC = -382.5872 EGB = -12266.1876 RESTRAINT = 0.
> EELEC = -382.5872 EGB = -13044.7557 RESTRAINT = 0.

85c85
< NSTEP = 5 TIME(PS) = 1.510 TEMP(K) = 286.43 PRESS = 0.
> NSTEP = 5 TIME(PS) = 1.510 TEMP(K) = 284.05 PRESS = 0.
86c86
< Etot = -808.7370 EKtot = 1764.7594 EPtot = -2573.4964
> Etot = -1588.0945 EKtot = 1750.1056 EPtot = -3338.2001
87c87
< BOND = 496.5916 ANGLE = 1594.7609 DIHED = 803.8983
> BOND = 486.2442 ANGLE = 1581.0681 DIHED = 799.8229

### Maximum absolute error in matching lines = 7.80e+02 at line 62 field 3
### Maximum relative error in matching lines = 9.64e-01 at line 86 field 3
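
A quick arithmetic check on the step-1 myoglobin lines above suggests that the whole EPtot difference at that step sits in the EGB term. The sketch below only restates numbers already in the diff (the dictionary names left and right are hypothetical labels for the "<" and ">" sides) and says nothing about why EGB differs.

```python
# Step-1 values from myoglobin_md_igb7.out.dif ("<" side and ">" side).
left = {"EPtot": -2700.5472, "EGB": -12266.1876, "EELEC": -382.5872}
right = {"EPtot": -3479.1153, "EGB": -13044.7557, "EELEC": -382.5872}

# Per-term difference between the two sides of the diff.
delta = {term: left[term] - right[term] for term in left}

for term, d in delta.items():
    print(f"{term}: {d:+.4f}")

# True if the EPtot gap equals the EGB gap to within print precision.
print(abs(delta["EPtot"] - delta["EGB"]) < 1e-3)
```

EELEC is identical on both sides, and the EPtot and EGB differences agree to the four decimals printed in the output file.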




_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Apr 21 2021 - 20:00:03 PDT