Ubuntu 18: fixing the error "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver."

程序员文章站 2022-05-27 12:40:05
After the server was rebooted, running nvidia-smi produced the following error:

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

Running nvcc -V produces the following output:

$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
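
Since nvcc still reports CUDA 10.0, the toolkit itself looks intact, which points at the kernel-side driver rather than at CUDA. A common cause, assumed here rather than stated in the original report, is that the nvidia kernel module is not loaded for the currently running kernel. The following is a minimal diagnostic sketch; the helper name nvidia_module_loaded is hypothetical, and lsmod and dkms are only called if they are present on the machine.

```shell
#!/bin/sh
# Diagnostic sketch. nvidia_module_loaded is an illustrative helper,
# not part of any NVIDIA or DKMS tool.

# Read text in `lsmod` format on stdin; succeed only if a module
# named exactly "nvidia" appears in the first column.
nvidia_module_loaded() {
    awk '$1 == "nvidia" { found = 1 } END { exit !found }'
}

# Show which kernel is running, since a module is built per kernel version.
echo "running kernel: $(uname -r)"

# Check the live module list, if lsmod is available.
if command -v lsmod >/dev/null 2>&1 && lsmod | nvidia_module_loaded; then
    echo "nvidia kernel module is loaded"
else
    echo "nvidia kernel module is not loaded"
fi

# List the modules DKMS knows about and the kernels they are built for.
if command -v dkms >/dev/null 2>&1; then
    dkms status
fi
```

If the module is reported as not loaded and dkms status shows no nvidia entry for the running kernel, the steps below are a reasonable next thing to try.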

Solution:

  • sudo apt-get install dkms

  • Run ll /usr/src/ to find the installed NVIDIA driver version (here, nvidia-410.48 on the last line):

    $ ll /usr/src/
    total 36
    drwxr-xr-x  9 root root 4096 Dec 14 06:40 ./
    drwxr-xr-x 12 root root 4096 Dec 27 15:46 ../
    drwxr-xr-x 27 root root 4096 Feb 26  2019 linux-headers-4.15.0-45/
    drwxr-xr-x  8 root root 4096 Feb 26  2019 linux-headers-4.15.0-45-generic/
    drwxr-xr-x 27 root root 4096 Apr  3  2019 linux-headers-4.15.0-47/
    drwxr-xr-x  8 root root 4096 Apr  3  2019 linux-headers-4.15.0-47-generic/
    drwxr-xr-x 25 root root 4096 Dec 13 06:15 linux-headers-4.15.0-72/
    drwxr-xr-x  8 root root 4096 Dec 13 06:15 linux-headers-4.15.0-72-generic/
    drwxr-xr-x  7 root root 4096 Feb 26  2019 nvidia-410.48/
    
  • sudo dkms install -m nvidia -v 410.48 (replace the value after -v with your own NVIDIA driver version)

  • The problem is now resolved; running nvidia-smi gives the following output:

$ nvidia-smi
Sun Jan  5 21:10:18 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.48                 Driver Version: 410.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:0B:00.0 Off |                    0 |
| N/A   33C    P8    26W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K80           Off  | 00000000:0C:00.0 Off |                    0 |
| N/A   25C    P8    30W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla K80           Off  | 00000000:8A:00.0 Off |                    0 |
| N/A   30C    P8    25W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla K80           Off  | 00000000:8B:00.0 Off |                    0 |
| N/A   25C    P8    29W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
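
The manual steps above can be sketched as a small script that reads the version from the nvidia-<version> directory name under /usr/src and prints the matching dkms command instead of running it. The function names and the SRC_DIR variable are illustrative assumptions, not part of dkms or the NVIDIA driver; sort -V assumes GNU coreutils, as shipped with Ubuntu.

```shell
#!/bin/sh
# Sketch: derive the driver version from /usr/src and build the dkms command.
# nvidia_src_versions, dkms_install_cmd and SRC_DIR are hypothetical names.

# Print the version part of every nvidia-<version> directory under $1
# (default: /usr/src), one per line.
nvidia_src_versions() {
    local src_dir="${1:-/usr/src}"
    local dir
    for dir in "$src_dir"/nvidia-*/; do
        [ -d "$dir" ] || continue          # skip the unexpanded glob when nothing matches
        dir="${dir%/}"                     # drop the trailing slash
        printf '%s\n' "${dir##*/nvidia-}"  # keep only the text after "nvidia-"
    done
}

# Print (do not execute) the dkms command for the version given as $1.
dkms_install_cmd() {
    printf 'sudo dkms install -m nvidia -v %s\n' "$1"
}

# Pick the highest version found, in case several source trees exist.
version="$(nvidia_src_versions "${SRC_DIR:-/usr/src}" | sort -V | tail -n 1)"
if [ -n "$version" ]; then
    dkms_install_cmd "$version"
else
    echo "no nvidia-* source directory found" >&2
fi
```

With the /usr/src listing shown earlier, this prints sudo dkms install -m nvidia -v 410.48. Printing rather than executing the command is a deliberate choice here, so the version can be checked before anything is built.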