
02. Manually Installing the NVIDIA GPU Driver on Linux

程序员文章站 2022-05-31 16:03:59 | Blog category: GPU servers

 

 

1. Environment

1.1 Software environment:

    (1) Run cat /etc/redhat-release to check the CentOS version:

 

       CentOS Linux release 7.3.1611 (Core) 

(2) Run cat /proc/version to check the kernel version.


gcc version used to build the kernel, as reported in that output: 4.8.5 (needed later)
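cat /proc/version prints the kernel release and the compiler that built it on a single line. The sketch below pulls both values out of such a line. The function names are hypothetical, and the parsing assumes the CentOS 7 style line "Linux version <release> (...) (gcc version <x.y.z> ...)"; newer kernels format the compiler part differently.

```shell
#!/bin/sh
# Hypothetical helpers: extract fields from one /proc/version line.
# Assumes the CentOS 7 layout "Linux version <release> ... (gcc version <x.y.z> ...".

# Print the kernel release (third word of the line).
kernel_release() {
    printf '%s\n' "$1" | sed -n 's/^Linux version \([^ ]*\).*/\1/p'
}

# Print the gcc version the kernel was built with.
gcc_version() {
    printf '%s\n' "$1" | sed -n 's/.*gcc version \([0-9][0-9.]*\).*/\1/p'
}

# Usage on a live system:
#   line=$(cat /proc/version)
#   kernel_release "$line"
#   gcc_version "$line"
```

Both functions print nothing if the line does not have the assumed shape, so an empty result is a hint to read the raw output by hand.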

2. Configuration

2.1 Method and steps:

(1) Install gcc and the kernel packages that match your kernel version

         Run:

# sudo yum install -y gcc   (or: sudo yum install -y gcc-c++, or: yum install http://mirror.centos.org/centos/7/updates/x86_64/Packages/gcc-4.8.5-36.el7_6.2.x86_64.rpm; the gcc version should match the one the kernel was built with)

# sudo yum install -y kernel

# sudo yum install -y kernel-devel

# sudo yum install -y kernel-headers


Check the installed versions to confirm that gcc, kernel, kernel-devel and kernel-headers have all been downloaded and installed.

 Run:

# rpm -qa | grep gcc


#rpm -qa|grep kernel  


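In the author's output two kernel versions were installed. Remove one of them, and make sure the kernel, kernel-devel and kernel-headers packages all have the same version number.

To remove a kernel package, run:

# rpm -e --nodeps kernel-3.10.0-514.el7.x86_64   (removes the package without checking dependencies)

To confirm that the kernel was removed, run:

# rpm -qa | grep kernel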
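The version check above can also be scripted. The following is a minimal sketch, not the author's method: check_kernel_packages is a hypothetical helper that takes a kernel release string and an rpm -qa style package list, and reports whether kernel, kernel-devel and kernel-headers are all present for exactly that release.

```shell
#!/bin/sh
# Hypothetical helper: verify that the three kernel packages share one release.
#   $1 = kernel release, e.g. 3.10.0-957.el7.x86_64
#   $2 = package list, one package per line (e.g. the output of `rpm -qa`)
check_kernel_packages() {
    release=$1
    packages=$2
    status=0
    for name in kernel kernel-devel kernel-headers; do
        # -x matches the whole line, -F treats the pattern as a fixed string
        if printf '%s\n' "$packages" | grep -qxF "$name-$release"; then
            echo "ok: $name-$release"
        else
            echo "missing: $name-$release"
            status=1
        fi
    done
    return $status
}

# Usage on a live system (pass the release of the kernel you intend to boot):
#   check_kernel_packages "$(uname -r)" "$(rpm -qa)"
```

Note that uname -r reports the running kernel, which may still be the old one until the reboot in step (4).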


 

(2) Check whether nouveau is disabled

         Run: lsmod | grep nouveau


Any output means the nouveau module is still loaded, that is, it has not been disabled.

 

 

(3) Disable the system's nouveau driver

         Run (note that > overwrites any existing /etc/modprobe.d/blacklist.conf):

# su

# echo -e "blacklist nouveau\noptions nouveau modeset=0" > /etc/modprobe.d/blacklist.conf
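As a hedged alternative to the echo -e one-liner, the sketch below writes the same two lines with printf, which behaves the same way across shells. The function name and the default file name blacklist-nouveau.conf are assumptions chosen so that an existing blacklist.conf is left alone; modprobe reads every .conf file in /etc/modprobe.d.

```shell
#!/bin/sh
# Hypothetical helper: write the nouveau blacklist to a dedicated file.
#   $1 = target file (assumed default: /etc/modprobe.d/blacklist-nouveau.conf)
write_nouveau_blacklist() {
    target=${1:-/etc/modprobe.d/blacklist-nouveau.conf}
    # One argument per output line; same content as the echo -e command above.
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' > "$target"
}

# Usage (as root):
#   write_nouveau_blacklist
#   cat /etc/modprobe.d/blacklist-nouveau.conf
```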

 

(4) Reboot the system

 

(5) Verify that nouveau is disabled

         Run: lsmod | grep nouveau


No output means nouveau was disabled successfully; any output means it is still loaded.
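For use in a script, the same check can be written so that it matches only the module-name column rather than any line containing the word. This is a sketch; nouveau_loaded is a hypothetical name, and it reads lsmod output from standard input.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if lsmod-style input lists a module
# whose name is exactly "nouveau"; fail (exit 1) otherwise.
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit !found }'
}

# Usage on a live system:
#   if lsmod | nouveau_loaded; then
#       echo "nouveau is still loaded"
#   else
#       echo "nouveau is disabled"
#   fi
```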

 

(6) Download the NVIDIA driver

         On the NVIDIA website (https://www.nvidia.cn/Download/index.aspx?lang=cn), select your GPU model and operating system to find the driver.


Click Search.


Click DOWNLOAD.


Right-click [AGREE&DOWNLOAD] and copy the link address from the context menu. Here it is: http://us.download.nvidia.com/tesla/418.67/NVIDIA-Linux-x86_64-418.67.run

Run wget http://us.download.nvidia.com/tesla/418.67/NVIDIA-Linux-x86_64-418.67.run to download the driver.


 

(7) Install the NVIDIA driver

         Press Ctrl+Alt+F2 to switch to a text console, then switch to runlevel 3.

Run:

# init 3

# chmod +x NVIDIA-Linux-x86_64-418.67.run

# sudo ./NVIDIA-Linux-x86_64-418.67.run --no-x-check --no-nouveau-check --no-opengl-files
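The download, chmod and install commands must all name the same installer file. One way to keep them in sync is to derive the file name from a single version variable, as in the sketch below. The helper names are hypothetical, and reusing the tesla/<version>/ URL layout for versions other than 418.67 is an assumption; copy the real link from the download page as described in step (6).

```shell
#!/bin/sh
# Driver version used in this article; override via the environment if needed.
DRIVER_VERSION=${DRIVER_VERSION:-418.67}

# Hypothetical helper: installer file name for a given driver version.
installer_name() {
    printf 'NVIDIA-Linux-x86_64-%s.run\n' "$1"
}

# Hypothetical helper: download URL, ASSUMING the same layout as the 418.67 link.
installer_url() {
    printf 'http://us.download.nvidia.com/tesla/%s/%s\n' "$1" "$(installer_name "$1")"
}

# Usage (as root, in text mode after `init 3`):
#   file=$(installer_name "$DRIVER_VERSION")
#   wget "$(installer_url "$DRIVER_VERSION")"
#   chmod +x "$file"
#   ./"$file" --no-x-check --no-nouveau-check --no-opengl-files
```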


Select OK.


Select Yes.


Select OK. The driver is now installed.

 

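(8) Verify that the NVIDIA driver was installed successfully

Run nvidia-smi. If it prints the GPU status table (driver version, GPU list, memory usage), the installation succeeded.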
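In a script, it helps to distinguish "nvidia-smi is missing" from "nvidia-smi ran but failed". The sketch below does that and prints a compact one-line summary per GPU. check_driver is a hypothetical helper; its optional argument exists only so that a different binary name can be substituted.

```shell
#!/bin/sh
# Hypothetical helper: report GPU name and driver version, or fail clearly.
#   $1 = name of the nvidia-smi binary (default: nvidia-smi)
check_driver() {
    smi=${1:-nvidia-smi}
    if ! command -v "$smi" >/dev/null 2>&1; then
        echo "$smi not found; the driver is probably not installed" >&2
        return 1
    fi
    # Prints one CSV line per GPU; exits non-zero if the driver cannot be reached.
    "$smi" --query-gpu=name,driver_version --format=csv,noheader
}

# Usage:
#   check_driver && echo "driver OK"
```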


 

3. Troubleshooting

Error 1:

ERROR: The Nouveau kernel driver is currently in use by your system.  This driver is incompatible with the NVIDIA driver, and must be disabled before proceeding.  
Please consult the NVIDIA driver README and your Linux distribution's documentation for details on how to correctly disable the Nouveau kernel driver.

Explanation: this error is reported if nouveau has not been blacklisted.

Error 2:

unable to find the development tool 'cc' in your path; please make sure that you have the package 'gcc'

Solution:

yum install gcc

Error 3:

 ERROR: Unable to find the kernel source tree for the currently running kernel.  Please make sure you have installed the kernel source files for your kernel 
and that they are properly configured; on Red Hat Linux systems, for example, be sure you have the  'kernel-source' or 'kernel-devel' RPM installed.  
If you know the correct kernel source files are installed, you may specify the kernel source path with the '--kernel-source-path' command line option.

Solution:

yum install kernel-devel

Error 4 (same message as Error 3):

 ERROR: Unable to find the kernel source tree for the currently running kernel.  Please make sure you have installed the kernel source files for your kernel 
and that they are properly configured; on Red Hat Linux systems, for example, be sure you have the         'kernel-source' or 'kernel-devel' RPM installed.  
If you know the correct kernel source files are installed, you may specify the kernel source path with the '--kernel-source-path' command line option.

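 Solution: pass the kernel source path explicitly (adjust the installer file name and the path to your system):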

./NVIDIA-Linux-x86_64-390.67.run --kernel-source-path=/usr/src/kernels/3.10.0-862.3.2.el7.x86_64/ 
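The path in the command above is specific to the author's kernel. On CentOS, kernel-devel installs its tree under /usr/src/kernels/<release>, so the path can be derived from the kernel release instead of typed by hand. A minimal sketch, with kernel_source_path as a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: kernel-devel source directory for a kernel release.
#   $1 = kernel release (default: the running kernel, from `uname -r`)
kernel_source_path() {
    printf '/usr/src/kernels/%s\n' "${1:-$(uname -r)}"
}

# Usage (installer file name as downloaded in step (6)):
#   src=$(kernel_source_path)
#   [ -d "$src" ] || echo "no kernel-devel tree at $src" >&2
#   ./NVIDIA-Linux-x86_64-418.67.run --kernel-source-path="$src"
```

If the directory does not exist, the installed kernel-devel version does not match the running kernel, which is the situation described under Error 5.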

Error 5:

ERROR: Unable to load the kernel module 'nvidia.ko'.  This happens most frequently when this kernel module was built against the wrong 
or improperly configured kernel sources, with a version of gcc that differs from the one used to build the target kernel, 
or if another driver, such as nouveau, is present and prevents the NVIDIA kernel module from obtaining ownership of the NVIDIA GPU(s), 
or no NVIDIA GPU installed in this system is supported by this NVIDIA Linux graphics driver release. 
Please see the log entries 'Kernel module load error' and 'Kernel messages' at the end of the file '/var/log/nvidia-installer.log' for more information.

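Solution:

  • Check the kernel version and the kernel source package version:
    ls /boot | grep vmlinuz
  • If the command above lists several kernels, the one specified in grub.conf is the one in use.
    rpm -aq | grep kernel-devel
    kernel-devel-2.6.35.13-92.fc14.i686
  • The two outputs show the kernel version and the kernel-devel (kernel source) version. To fix this error, download and install the kernel-devel package that matches the kernel version. In the example above it comes from the official Fedora site.

         It can be downloaded from:
        http://rpmfind.net/linux/rpm2html/search.php?query=kernel-devel

Note: after updating the kernel, you need to blacklist nouveau again and rebuild the initramfs image.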
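The article does not show the initramfs rebuild itself. On CentOS 7 this is usually done with dracut; the sketch below only prints the commands for a given kernel release so they can be reviewed before being run as root. The helper name is hypothetical, and the image path /boot/initramfs-<release>.img is the CentOS default, assumed here.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the commands that back up and
# rebuild the initramfs image for a kernel release.
#   $1 = kernel release (default: the running kernel, from `uname -r`)
initramfs_rebuild_commands() {
    release=${1:-$(uname -r)}
    image=/boot/initramfs-$release.img   # assumed CentOS default location
    printf 'cp -p %s %s.bak\n' "$image" "$image"
    printf 'dracut --force %s %s\n' "$image" "$release"
}

# Usage: review the output, then run the printed commands as root and reboot.
#   initramfs_rebuild_commands
```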

Warning:

 

WARNING: nvidia-installer was forced to guess the X library path '/usr/lib64' and X module path '/usr/lib64/xorg/modules'; 
these paths were not queryable from the system.  If X fails to find the NVIDIA X driver module, please install the `pkg-config` utility 
and the X.Org SDK/development package for your distribution and reinstall the driver.

This warning appears when installing in text mode and can be ignored.

 

 

Reference: https://www.cnblogs.com/2012blog/p/9431432.html