could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
When training a deep learning model, this error occasionally shows up. Here is the original error output:
Epoch 1/16
2019-09-11 09:34:11.000335: E C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\stream_executor\cuda\cuda_dnn.cc:455] could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-09-11 09:34:11.000570: F C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\kernels\conv_ops.cc:713] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo<T>(), &algorithms)
Process finished with exit code -1073740791 (0xC0000409)
Here is what is going on: as I understand it, the GPU ran out of memory during training. By default, TensorFlow 1.x reserves nearly all of the GPU's memory as soon as the session starts, and when that grab fails, the cuDNN handle cannot be created. Falling back to the CPU still works, but CPU and GPU training speeds are nowhere near comparable; in my case the gap was about tenfold. The fix is simply to add two lines of code after the imports so that the initial GPU memory footprint stays small.
As shown below; only the last two lines matter:
import cv2
# ... other imports ...
import matplotlib.pyplot as plt
import tensorflow as tf

# Let TensorFlow allocate GPU memory on demand instead of reserving it all up front.
config = tf.ConfigProto(gpu_options=tf.GPUOptions(allow_growth=True))
sess = tf.Session(config=config)
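If the model is trained through Keras, as the Epoch 1/16 progress bars here suggest, and enabling growth this way does not seem to take effect, the session may also need to be registered explicitly with the Keras backend, since Keras otherwise opens its own default session. A minimal sketch, assuming the standalone keras package with the TensorFlow backend:

import tensorflow as tf
from keras import backend as K

# Register a growth-enabled session so Keras reuses it for fit()/predict().
config = tf.ConfigProto(gpu_options=tf.GPUOptions(allow_growth=True))
K.set_session(tf.Session(config=config))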
With that in place, training runs:
Epoch 1/16
2019-09-11 09:41:21.671435: E C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\grappler\clusters\utils.cc:81] Failed to get device properties, error code: 30
2019-09-11 09:41:33.268699: W C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.09GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-09-11 09:41:33.282936: W C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.27GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-09-11 09:41:33.301694: W C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.29GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-09-11 09:41:33.342715: W C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.16GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-09-11 09:41:33.352678: W C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.09GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-09-11 09:41:33.359968: W C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.19GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-09-11 09:41:33.399060: W C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.13GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-09-11 09:41:33.410350: W C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.14GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-09-11 09:41:33.416374: W C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.19GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-09-11 09:41:33.614362: W C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.06GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
1/300 [..............................] - ETA: 1:15:45 - loss: 90.8376
2/300 [..............................] - ETA: 38:57 - loss: 89.4276
3/300 [..............................] - ETA: 26:41 - loss: 87.2493
4/300 [..............................] - ETA: 20:32 - loss: 84.7477
5/300 [..............................] - ETA: 16:51 - loss: 82.1824
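Note that on TensorFlow 2.x, ConfigProto and Session no longer exist; the equivalent switch is per-GPU memory growth. A minimal sketch, assuming TF 2.x (not tested against the setup above):

import tensorflow as tf

# Must run before any op touches the GPU.
for gpu in tf.config.experimental.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)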