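Ubuntu: NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Summary of fixes
I run a CUDA parallel-rendering project on Ubuntu (the GUI part of the source code for the paper "Massively Parallel Rendering of Complex Closed-Form Implicit Surfaces"). The CUDA version is 10.0.130 and the GPU is an NVIDIA GeForce GTX 960M. The GUI project had been running successfully.
Yesterday I went to run the project again to test some data, and it suddenly failed with a CUDA runtime error. Running nvidia-smi to check the driver produced the following error: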
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver.
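Being inexperienced, I followed a random article and uninstalled and reinstalled the NVIDIA driver, but I skipped the step that starts X Window again, so I could not get into the GUI at all: after boot the screen stayed black with a blinking cursor _ in the top-left corner. In that state I found that, as long as the NVIDIA driver was not installed, the method from this blog post got me straight into the graphical interface. Without the NVIDIA driver, however, the CUDA project could not run, so I was stuck in a loop.
Today I calmed down and worked out what was causing the error.
It turned out that running sudo apt-get update and sudo apt-get upgrade to refresh the package lists and upgrade packages had also upgraded the kernel. The previously installed NVIDIA driver, whose kernel module was built for the old kernel, no longer matched the running one. The fix is either to reinstall the NVIDIA driver or to boot an older Linux kernel.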
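A quick way to confirm this diagnosis is to check whether an NVIDIA kernel module exists for the kernel that is currently running. The following is a minimal sketch, not part of the original procedure: the helper name has_nvidia_module and its optional second argument are made up for illustration, and it assumes the modules live somewhere under /lib/modules/<release>, as they normally do on Ubuntu.

```shell
#!/bin/sh
# Return 0 if an nvidia kernel module file exists for a kernel release.
# $1 = kernel release, as printed by `uname -r`
# $2 = modules root, /lib/modules by default (parameter added for this sketch)
has_nvidia_module() {
    find "${2:-/lib/modules}/$1" -name 'nvidia*.ko*' 2>/dev/null | grep -q .
}

running="$(uname -r)"
if has_nvidia_module "$running"; then
    echo "nvidia module present for running kernel $running"
else
    echo "no nvidia module for running kernel $running"
fi
```

If the second message is printed right after a kernel upgrade, the driver was most likely built only for the previous kernel.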
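- Method 1, boot an older Linux kernel: in the boot menu shown before the system starts, open the advanced options entry and choose an earlier kernel version (one without the recovery suffix). On this machine (HP-Laptop-960m) all the older kernels were broken and would not boot properly, so I reinstalled the driver instead.
- Method 2, reinstall the NVIDIA driver: this machine has CUDA 10.0.130, which requires NVIDIA driver 410.48 or later. In practice, installing driver 418.88 failed with a kernel mismatch, so it is best to install the latest driver. Here I installed version 450.66.
Source: https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html
I installed from the .run file. First download the latest driver that matches your hardware from the official NVIDIA site, then install it as follows:
1. Uninstall the old driver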
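The minimum-version requirement can be checked mechanically. The sketch below compares two driver version strings using sort -V, which is a GNU coreutils option and is assumed to be available (it is on Ubuntu). The helper name version_ge is hypothetical; the version numbers are the ones quoted above.

```shell
#!/bin/sh
# Return 0 if version $1 is greater than or equal to version $2.
# Relies on GNU `sort -V` (version sort): the smaller version sorts first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

min_driver="410.48"   # minimum driver for CUDA 10.0.130, per the release notes
candidate="450.66"    # driver version installed in this article

if version_ge "$candidate" "$min_driver"; then
    echo "driver $candidate satisfies the CUDA minimum $min_driver"
else
    echo "driver $candidate is too old for this CUDA version"
fi
```

Note that this only checks the CUDA side of the requirement. Whether a given driver builds against the running kernel is a separate matter, as the 418.88 attempt showed.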
sudo apt-get autoremove --purge 'nvidia-*'
sudo /usr/bin/nvidia-uninstall
The second command shows the following prompt:
"If you plan to no longer use NVIDIA driver, you should make sure that no X screens are configured to use the NVIDIA X driver in your X configuration file. If you used nvidia-xconfig to configure X, it may have created a backup of your original configuration. Would you like to run 'nvidia-xconfig --restore-original-backup' to attempt restoration of the original X configuration file?"
Select 'No', then click 'OK'.
2. Disable the nouveau driver (nouveau is an open-source driver for NVIDIA cards that Linux installs by default)
Create the blacklist file:
sudo vim /etc/modprobe.d/blacklist-nouveau.conf
Add the following lines to the file:
blacklist nouveau
options nouveau modeset=0
Regenerate the initramfs so that the blacklist takes effect at boot:
sudo update-initramfs -u
Reboot the system, then check whether nouveau has been disabled:
lsmod | grep nouveau
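If the command prints nothing, nouveau is disabled. If you already disabled it earlier, there is no need to do it again.
3. Stop X Window, i.e. the graphical interface
Press Ctrl+Alt+F1 (any of F1 to F6 works; F7 is the graphical interface) to switch to a tty console, log in, and run: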
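The nouveau check can also be scripted. This is a minimal sketch that reads /proc/modules, the same kernel file lsmod takes its list from. The function name nouveau_loaded and its optional file argument are assumptions added for illustration.

```shell
#!/bin/sh
# Return 0 if the nouveau module appears in the loaded-module list.
# $1 = file to read, /proc/modules by default (parameter added for this sketch)
nouveau_loaded() {
    grep -q '^nouveau ' "${1:-/proc/modules}" 2>/dev/null
}

if nouveau_loaded; then
    echo "nouveau is still loaded: check the blacklist file and reboot"
else
    echo "nouveau is not loaded"
fi
```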
sudo service lightdm stop
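The command above assumes the display manager is lightdm. On other setups it may be a different one, for example gdm3. As a hedged sketch, the name can be read from /etc/X11/default-display-manager, the file Debian and Ubuntu use to record the default display manager. The helper name display_manager_name is hypothetical, and the script only prints a suggested command rather than stopping anything.

```shell
#!/bin/sh
# Print the name of the default display manager (e.g. lightdm or gdm3).
# $1 = file to read, /etc/X11/default-display-manager by default
#      (parameter added for this sketch)
display_manager_name() {
    file="${1:-/etc/X11/default-display-manager}"
    [ -r "$file" ] || return 1
    basename "$(cat "$file")"
}

if dm="$(display_manager_name)"; then
    echo "to stop the graphical interface, run: sudo service $dm stop"
else
    echo "could not detect the display manager"
fi
```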
4. Install the NVIDIA driver
sudo ./NVIDIAXXX.run
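For the installer prompts, choose in order: Continue installation, No, No, Yes, OK
After installation, run nvidia-smi to check that it worked. On success it prints a table of detailed GPU information. Press Ctrl+Alt+F7 to switch back to the graphical interface. If the graphical interface does not come back, start the display manager again with sudo service lightdm start.
With the installation finished, I tested the GUI project again and it ran successfully.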