Ubuntu 18.04: NVIDIA driver + tensorflow-gpu 1.15.0 setup summary
Installing the graphics driver
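1. Disable Secure Boot
This step matters: with Secure Boot left enabled, the kernel refuses to load the unsigned NVIDIA module and the installation fails later on.
First, enter the BIOS; the key depends on your machine (for example F12 or F10).
Change Secure Boot Option to Disabled.
I use a ThundeRobot (雷神) laptop, and after I changed this setting and rebooted it reverted to Enabled. Other machines may behave the same way. The fix is to switch to custom mode: enable the Change to Customization entry just below it, after which Secure Boot switches to Disabled automatically.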
2. Disable nouveau
Edit the file blacklist.conf:
sudo vim /etc/modprobe.d/blacklist.conf
Append the following two lines at the end of the file:
blacklist nouveau
options nouveau modeset=0
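If you would rather script this edit than make it by hand in vim, here is a minimal Python sketch. The helper names `ensure_lines` and `apply_to_file` are my own, chosen for illustration; only the file path and the two lines come from the step above. Each line is appended only if it is not already present, so running it twice does no harm.

```python
from pathlib import Path

# Path and lines taken from the step above.
BLACKLIST_CONF = Path("/etc/modprobe.d/blacklist.conf")
NOUVEAU_LINES = ["blacklist nouveau", "options nouveau modeset=0"]


def ensure_lines(text, wanted):
    """Return text with every line in wanted present, appending the missing ones."""
    existing = {line.strip() for line in text.splitlines()}
    missing = [line for line in wanted if line not in existing]
    if not missing:
        return text
    # Make sure the appended lines start on a fresh line.
    if text and not text.endswith("\n"):
        text += "\n"
    return text + "\n".join(missing) + "\n"


def apply_to_file(path=BLACKLIST_CONF):
    """Hypothetical helper: rewrite the file in place (needs root)."""
    original = path.read_text() if path.exists() else ""
    updated = ensure_lines(original, NOUVEAU_LINES)
    if updated != original:
        path.write_text(updated)


# Dry run on a sample string; nothing on disk is touched here.
print(ensure_lines("# existing entries\nblacklist nouveau\n", NOUVEAU_LINES))
```

To change the real file, call `apply_to_file()` from a script run with sudo.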
Rebuild the initramfs so that the change takes effect at boot:
sudo update-initramfs -u
Reboot the system.
Verify that nouveau is disabled:
lsmod | grep nouveau
If this prints nothing, nouveau is disabled and you can go on to install the NVIDIA driver.
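The same check can be scripted. The sketch below reads /proc/modules, the file lsmod itself reads, and looks for a module named exactly nouveau. The function names are my own for illustration. Unlike the grep, it ignores lines where nouveau only appears as a dependency of another module.

```python
def module_loaded(name, proc_modules_text):
    """Return True if a kernel module called name is listed in the given text."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # The first field of each /proc/modules line is the module name.
        if fields and fields[0] == name:
            return True
    return False


def nouveau_loaded(path="/proc/modules"):
    """Check the running kernel; path is a parameter so it can be swapped in tests."""
    with open(path) as f:
        return module_loaded("nouveau", f.read())


try:
    print("nouveau loaded:", nouveau_loaded())
except OSError:
    print("could not read /proc/modules")
```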
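3. Look up your graphics card model on the NVIDIA website and download the matching driver. URL: http://www.nvidia.cn
Copy the downloaded .run file to your home directory.
4. Switch to a text console in Ubuntu
On my machine this is Ctrl+Alt+F3; the key combination differs between machines.
First switch to the root user (if root has no password set, which is the Ubuntu default, use sudo -i instead):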
su root
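Stop the graphical session; the installation fails if you skip this. The command below assumes lightdm is your display manager. A default Ubuntu 18.04 install uses gdm3, in which case stop gdm3 instead.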
service lightdm stop
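Then remove any existing driver (the pattern is quoted so the shell does not expand it):
apt-get remove 'nvidia-*'
Make the driver .run file executable:
chmod a+x [NVIDIA run file]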
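Install:
./[NVIDIA run file] --no-x-check --no-nouveau-check --no-opengl-files
--no-x-check: do not check whether an X server is running
--no-nouveau-check: do not check whether the nouveau driver is present
--no-opengl-files: install only the driver files, without NVIDIA's OpenGL files
Skipping the OpenGL files avoids the login loop problem.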
Prompts during installation:
- Continue installation
- Install without signing
For everything else, choose OK or Yes.
Load the NVIDIA kernel module:
modprobe nvidia
Check that the driver was installed successfully:
nvidia-smi
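For a scripted version of this check, nvidia-smi can print selected fields as CSV through its --query-gpu and --format options. The sketch below wraps that call. The function names and the choice of fields are my own, and it returns an empty list when nvidia-smi is missing or fails, which is how a broken install would show up.

```python
import shutil
import subprocess

# Ask nvidia-smi for three fields per GPU, one CSV line each, no header row.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,driver_version,memory.total",
    "--format=csv,noheader",
]


def parse_gpu_lines(output):
    """Turn the CSV lines into a list of dicts, skipping malformed lines."""
    gpus = []
    for line in output.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) == 3:
            gpus.append({
                "name": parts[0],
                "driver_version": parts[1],
                "memory_total": parts[2],
            })
    return gpus


def query_gpus():
    """Return the GPUs nvidia-smi reports, or [] if it is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        QUERY,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_lines(result.stdout)


gpus = query_gpus()
if not gpus:
    print("no GPU reported; the driver may not be loaded")
for gpu in gpus:
    print(gpu)
```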
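Installing tensorflow-gpu 1.15.0 with conda
I chose this version because it bridges the two major lines: it is the last 1.x release and already exposes the 2.0 API through the tf.compat.v2 module.
Installing through conda also pulls in a matching CUDA toolkit and cuDNN automatically, so they do not have to be set up by hand.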
conda install tensorflow-gpu=1.15.0
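Troubleshooting:
First, downloads may fail on a slow connection. In that case, configure conda to use the Tsinghua (TUNA) mirror.
Run the following commands: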
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/
conda config --set show_channel_urls yes
Second, I ran into the following error:
Verifying transaction: failed
RemoveError: 'setuptools' is a dependency of conda and cannot be removed from
conda's operating environment.
At first I tried:
conda install -c anaconda setuptools
But the error persisted.
It looked like conda itself needed updating:
conda update --force conda
That fixed it.
Verifying the GPU
import tensorflow as tf
a = tf.test.is_built_with_cuda()  # whether this TensorFlow build has CUDA support
b = tf.test.is_gpu_available(
    cuda_only=False,
    min_cuda_compute_capability=None
)  # whether a GPU is available to TensorFlow
print(a)
print(b)
The output is:
True
True
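This means TensorFlow was built with CUDA support and can use the GPU. Creating a session with log_device_placement=True also lists the devices it found: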
import tensorflow as tf
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
The output is as follows:
2020-04-13 22:44:58.936998: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2020-04-13 22:44:58.968713: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2799925000 Hz
2020-04-13 22:44:58.969389: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55aab2112f20 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-04-13 22:44:58.969426: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-04-13 22:44:58.972287: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-04-13 22:44:59.320078: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-13 22:44:59.320520: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55aab1df0a10 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-04-13 22:44:59.320539: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce GTX 1050 Ti, Compute Capability 6.1
2020-04-13 22:44:59.320701: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-13 22:44:59.320951: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1050 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.62
pciBusID: 0000:01:00.0
2020-04-13 22:44:59.357052: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-04-13 22:44:59.361052: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-04-13 22:44:59.400897: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-04-13 22:44:59.445225: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-04-13 22:44:59.446472: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-04-13 22:44:59.497395: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-04-13 22:44:59.528163: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-04-13 22:44:59.528302: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-13 22:44:59.528658: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-13 22:44:59.528860: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-04-13 22:44:59.528901: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-04-13 22:44:59.529559: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-04-13 22:44:59.529571: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2020-04-13 22:44:59.529576: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2020-04-13 22:44:59.529651: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-13 22:44:59.529887: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-13 22:44:59.530106: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3686 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
Device mapping:
/job:localhost/replica:0/task:0/device:XLA_CPU:0 -> device: XLA_CPU device
/job:localhost/replica:0/task:0/device:XLA_GPU:0 -> device: XLA_GPU device
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1
2020-04-13 22:44:59.530773: I tensorflow/core/common_runtime/direct_session.cc:359] Device mapping:
/job:localhost/replica:0/task:0/device:XLA_CPU:0 -> device: XLA_CPU device
/job:localhost/replica:0/task:0/device:XLA_GPU:0 -> device: XLA_GPU device
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1
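The line to look for in this log is the one starting with "Created TensorFlow device", which names the GPU and the memory TensorFlow reserved on it. If you want to check for it automatically, for example in a setup script, the sketch below extracts those fields with a regular expression. The pattern is written against the log format shown above, so treat it as an assumption that may not hold for other TensorFlow versions; the function name is my own.

```python
import re

# Pattern modeled on the "Created TensorFlow device" line in the log above.
DEVICE_RE = re.compile(
    r"Created TensorFlow device \((?P<device>\S+) with (?P<memory_mb>\d+) MB memory\)"
    r" -> physical GPU \(device: (?P<index>\d+), name: (?P<name>[^,]+),"
    r" pci bus id: (?P<pci_bus_id>[^,]+), compute capability: (?P<capability>[\d.]+)\)"
)


def find_gpu_devices(log_text):
    """Return one dict per 'Created TensorFlow device' line found in log_text."""
    devices = []
    for match in DEVICE_RE.finditer(log_text):
        info = match.groupdict()
        info["memory_mb"] = int(info["memory_mb"])
        info["index"] = int(info["index"])
        devices.append(info)
    return devices


# The relevant line from the log above, used as sample input.
sample = (
    "2020-04-13 22:44:59.530106: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] "
    "Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3686 MB memory) "
    "-> physical GPU (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, "
    "compute capability: 6.1)"
)
for device in find_gpu_devices(sample):
    print(device["name"], device["memory_mb"], "MB, compute capability", device["capability"])
```

An empty result means no GPU device was created, and TensorFlow is running on the CPU only.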