
could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR

程序员文章站 2022-05-26 23:43:45

This error occasionally shows up when training a deep learning model. Here is the original error output:

Epoch 1/16
2019-09-11 09:34:11.000335: E C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\stream_executor\cuda\cuda_dnn.cc:455] could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-09-11 09:34:11.000570: F C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\kernels\conv_ops.cc:713] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo<T>(), &algorithms) 

Process finished with exit code -1073740791 (0xC0000409)

As I understand it, this happens because TensorFlow tries to grab nearly all of the GPU's memory up front when you train on the GPU, and cuDNN then fails to create its handle. Training on the CPU still works, but CPU and GPU training speeds are not remotely comparable; in my case the difference was about tenfold. The fix only requires adding two lines of code after the imports, so that TensorFlow does not claim so much GPU memory at startup.
As shown below, only the last two lines are the actual fix:

import cv2
# ... various other imports ...
import matplotlib.pyplot as plt
import tensorflow as tf

# Let the GPU allocator grow on demand instead of
# pre-allocating (nearly) all GPU memory at startup.
config = tf.ConfigProto(gpu_options=tf.GPUOptions(allow_growth=True))
sess = tf.Session(config=config)
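If the model is trained through Keras on top of TensorFlow 1.x (the `Epoch 1/16` progress lines suggest `model.fit`), the session with the modified config also needs to be registered with Keras, or Keras will create its own default session. A minimal sketch, assuming the standalone `keras` package is in use:

```python
import tensorflow as tf
from keras import backend as K  # for tf.keras, use: from tensorflow.keras import backend as K

# Build a session whose GPU allocator grows on demand,
# then make it the session Keras uses for model.fit().
config = tf.ConfigProto(gpu_options=tf.GPUOptions(allow_growth=True))
K.set_session(tf.Session(config=config))
```

Do this before constructing the model, so all Keras ops are created in the configured session.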

With this change, training proceeds:

Epoch 1/16
2019-09-11 09:41:21.671435: E C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\grappler\clusters\utils.cc:81] Failed to get device properties, error code: 30
2019-09-11 09:41:33.268699: W C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.09GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-09-11 09:41:33.282936: W C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.27GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-09-11 09:41:33.301694: W C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.29GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-09-11 09:41:33.342715: W C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.16GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-09-11 09:41:33.352678: W C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.09GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-09-11 09:41:33.359968: W C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.19GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-09-11 09:41:33.399060: W C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.13GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-09-11 09:41:33.410350: W C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.14GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-09-11 09:41:33.416374: W C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.19GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-09-11 09:41:33.614362: W C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.06GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.

  1/300 [..............................] - ETA: 1:15:45 - loss: 90.8376
  2/300 [..............................] - ETA: 38:57 - loss: 89.4276  
  3/300 [..............................] - ETA: 26:41 - loss: 87.2493
  4/300 [..............................] - ETA: 20:32 - loss: 84.7477
  5/300 [..............................] - ETA: 16:51 - loss: 82.1824
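Note that `ConfigProto` and `Session` were removed from the top-level API in TensorFlow 2.x. The equivalent of `allow_growth` there is per-GPU memory growth; a sketch of the same idea under the 2.x API:

```python
import tensorflow as tf

# TF 2.x equivalent of allow_growth=True: enable memory growth
# on each GPU before any tensors are allocated on it.
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)
```

This must run before the first GPU operation, otherwise TensorFlow raises a RuntimeError because the devices are already initialized.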