NVIDIA/nvidia-drivers
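The x11-drivers/nvidia-drivers package contains the proprietary graphics driver for NVIDIA graphics cards.
The x11-drivers/nvidia-drivers in the tree are released by NVIDIA and are built against the Linux kernel. They contain a binary blob that does the heavy lifting of talking to the card. The driver consists of two parts: a kernel module and an X11 driver. Both parts are included in a single package. Because of the way NVIDIA packages its drivers, some choices have to be made before installing them.
The x11-drivers/nvidia-drivers package contains the latest drivers from NVIDIA with support for most cards, with several versions available depending on how old the card is. It uses an eclass to detect what kind of card the system is running so that it installs the proper version.
It is advised (and good practice) to check the documentation NVIDIA provides for the driver version installed on the system, as that information is likely to be the most up to date and to cover more use cases. The README can be very helpful if any problems arise. The official README for every NVIDIA driver version is also available online.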
USE flags
USE flags for x11-drivers/nvidia-drivers NVIDIA Accelerated Graphics Driver
USE flag | Description |
---|---|
+X | Add support for X11 |
+modules | Build the kernel modules |
+static-libs | Install the XNVCtrl static library for accessing sensors and other features |
+strip | Allow symbol stripping to be performed by the ebuild for special files |
+tools | Install additional tools such as nvidia-settings |
dist-kernel | Enable subslot rebuilds on Distribution Kernel upgrades |
kernel-open | Use the open source variant of the drivers (Turing/Ampere+ GPUs only, aka GTX 1650+ -- recommended with >=560.xx drivers if usable) |
modules-compress | Install compressed kernel modules (if kernel config enables module compression) |
modules-sign | Cryptographically sign installed kernel modules (requires CONFIG_MODULE_SIG=y in the kernel) |
persistenced | Install the persistence daemon for keeping devices state when unused (e.g. for headless) |
powerd | Install the NVIDIA dynamic boost support daemon (only useful with specific laptops, ignore if unsure) |
wayland | Enable dev-libs/wayland backend |
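Open source kernel modules
In May 2022, NVIDIA announced that it was open-sourcing its GPU kernel driver. The driver is hosted on GitHub.
Users can try the open source driver through the corresponding USE flag: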
x11-drivers/nvidia-drivers kernel-open
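Hardware compatibility
The x11-drivers/nvidia-drivers package supports a range of NVIDIA cards. Several versions are available for installation, depending on the card(s) in the system. See the feature support list and the official NVIDIA documentation, What's a legacy driver?, to find out which version of nvidia-drivers should be used.
Legacy hardware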
If the card has been identified as a legacy card (470 or lower) then it is highly recommended to either replace the hardware or switch to the nouveau driver, as the official driver is no longer receiving security updates.
It is possible to still use the unsupported driver, if the user is happy with the risks and lack of support by Nvidia and Gentoo:
~x11-drivers/nvidia-drivers-470.256.02
>x11-drivers/nvidia-drivers-471
Change the values to the 390 variants if that driver is the one required.
Installation
Distribution kernel
When using a distribution kernel (sys-kernel/gentoo-kernel or sys-kernel/gentoo-kernel-bin), building the driver only requires adding the following to /etc/portage/make.conf:
USE="dist-kernel"
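This makes the NVIDIA drivers rebuild automatically on every kernel update. Rebooting the system after a kernel update is recommended.
Manually compiled kernel
As mentioned above, the NVIDIA kernel driver installs and runs against the current kernel. It is built as a module, so the kernel must support the loading of kernel modules (see below).
The kernel module (nvidia.ko) consists of a proprietary part (commonly known as the "binary blob"), which drives the graphics chip(s), and an open source part (the "glue"), which acts at runtime as an intermediary between the proprietary part and the kernel. These all need to work together well; otherwise the user may face data loss (through kernel panics, or X servers crashing with unsaved data in X applications) and even hardware failure (overheating and other power management related issues come to mind).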
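Kernel compatibility
From time to time, a new kernel release changes the internal ABI for drivers, which means that all drivers using those ABIs must be changed accordingly. For open source drivers, especially those distributed with the kernel, these changes are nearly trivial to fix, since the entire chain of calls between the drivers and other parts of the kernel can be reviewed quite easily. For a proprietary driver like nvidia.ko, this does not work. When the internal ABIs change, it is not possible to merely fix the "glue", because nobody knows how the glue is used by the proprietary part. Even after managing to patch things up so that they seem to work, the user still runs the risk that running nvidia.ko on the new, unsupported kernel will lead to data loss and hardware failure.
When a new, incompatible kernel version is released, it is probably best to stick with the newest supported kernel for a while. NVIDIA usually takes a few weeks to prepare a new proprietary release that it considers fit for general use. Be patient. If absolutely necessary, it is possible to use the epatch_user command with the nvidia-drivers ebuilds: this allows the user to patch nvidia-drivers to somehow fit the latest, unsupported kernel release. Note that neither the nvidia-drivers maintainers nor NVIDIA will support this situation. The hardware warranty will most likely be void, Gentoo's maintainers cannot begin to fix the issues since this is a proprietary driver that only NVIDIA can properly debug, and the kernel maintainers (both Gentoo's and upstream) will certainly not support proprietary drivers, or indeed any "tainted" system that happens to run into trouble.
If genkernel all was used to configure the kernel, then everything is all set. If not, double check the kernel configuration so that the following support is enabled: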
[*] Enable loadable module support --->
The "Memory Type Range Register" option also needs to be enabled in the kernel (it provides /proc/mtrr):
Processor type and features --->
[*] MTRR (Memory Type Range Register) support
With at least some, if not all, driver versions it may also be necessary to enable VGA Arbitration and the IPMI message handler:
Device Drivers --->
PCI support --->
[*] VGA Arbitration
Device Drivers --->
Character devices --->
[*] IPMI top-level message handler
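If the system has an AGP graphics card, agpgart support can optionally be enabled in the kernel, either compiled in or as a module. If the in-kernel agpgart module is not used, the drivers will use their own agpgart implementation, called NvAGP. On some systems this performs better than the in-kernel agpgart, and on others it may perform worse. Evaluate both options on the system to get the best performance. When unsure, use the in-kernel agpgart: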
Device Drivers --->
Graphics support --->
-*- /dev/agpgart (AGP Support) --->
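For x86 and AMD64 processors, the in-kernel framebuffer driver conflicts with the binary driver provided by NVIDIA. When compiling the kernel for these CPUs, completely remove support for the in-kernel driver, as shown below.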
Device Drivers --->
Graphics support --->
Frame buffer Devices --->
<*> Support for frame buffer devices --->
< > nVidia Framebuffer Support
< > nVidia Riva support
Now make sure the nouveau driver is disabled:
Device Drivers --->
Graphics support --->
< > Nouveau (nVidia) cards
SimpleDRM must not be built in (CONFIG_DRM_SIMPLEDRM=y)[1]. Building it in can cause problems such as no TTY and non-working Xorg sessions or Wayland compositors. As a module (CONFIG_DRM_SIMPLEDRM=m), however, it is harmless.
Device Drivers --->
Graphics support --->
< > Simple framebuffer driver
A framebuffer driver is required for rendering the Linux console (TTY), as this functionality is not yet provided by the proprietary NVIDIA driver[2][3]. Unlike in-tree DRM drivers, nvidia-drivers relies on other framebuffer drivers to provide Linux console (TTY) support instead of providing its own. As shown below, set Mark VGA/VBE/EFI FB as generic system framebuffer (CONFIG_SYSFB_SIMPLEFB=y), and then enable a framebuffer driver. Common options are efifb (CONFIG_FB_EFI=y) for UEFI devices or vesafb (CONFIG_FB_VESA=y) for BIOS/CSM devices. simplefb (CONFIG_FB_SIMPLE=y|m) may also be chosen; reports on whether it works are mixed, so the decision is left to the end user.
Device Drivers --->
Firmware Drivers --->
[*] Mark VGA/VBE/EFI FB as generic system framebuffer
Graphics support --->
Frame buffer Devices --->
<*> Support for frame buffer devices --->
[*] VESA VGA graphics support
[*] EFI-based Framebuffer Support
<*> Simple framebuffer support
The nvidia-drivers ebuild automatically discovers the kernel version based on the /usr/src/linux symlink. Please ensure that this symlink is pointing to the correct sources and that the kernel is correctly configured. Please refer to the "Configuring the Kernel" section of the Gentoo Handbook for details on configuring the kernel.
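The kernel options discussed above can be checked against a .config file before building the driver. The following is only a minimal sketch: the function name check_nvidia_kconfig is hypothetical, and the symbol names CONFIG_MODULES, CONFIG_MTRR, CONFIG_DRM_NOUVEAU, CONFIG_FB_NVIDIA and CONFIG_FB_RIVA are assumptions mapped from the menu entries (only CONFIG_DRM_SIMPLEDRM and CONFIG_SYSFB_SIMPLEFB are named in the text). The optional VGA Arbitration, IPMI and AGP settings are not checked.

```shell
# check_nvidia_kconfig FILE
# Hypothetical helper: report settings in a kernel .config that contradict
# the recommendations above. Returns 0 when nothing suspicious is found.
check_nvidia_kconfig() {
    cfg=$1
    status=0
    # Options that should be built in (=y). Symbol names other than
    # CONFIG_SYSFB_SIMPLEFB are assumptions based on the menu entries.
    for opt in CONFIG_MODULES CONFIG_MTRR CONFIG_SYSFB_SIMPLEFB; do
        if ! grep -q "^${opt}=y\$" "$cfg"; then
            echo "missing: ${opt}=y"
            status=1
        fi
    done
    # Drivers that conflict with nvidia-drivers and should be disabled.
    for opt in CONFIG_DRM_NOUVEAU CONFIG_FB_NVIDIA CONFIG_FB_RIVA; do
        if grep -q "^${opt}=[ym]\$" "$cfg"; then
            echo "conflict: ${opt} is enabled"
            status=1
        fi
    done
    # SimpleDRM is only a problem when built in; =m is harmless.
    if grep -q '^CONFIG_DRM_SIMPLEDRM=y$' "$cfg"; then
        echo "conflict: CONFIG_DRM_SIMPLEDRM=y"
        status=1
    fi
    return "$status"
}
```

A typical call would be check_nvidia_kconfig /usr/src/linux/.config. The function only reads the file, so it is safe to run as a regular user.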
First, choose the right kernel source using eselect. When using sys-kernel/gentoo-sources version 3.7.10 for instance, the kernel listing might look something like this:
root #
eselect kernel list
Available kernel symlink targets:
  [1]   linux-3.7.10-gentoo *
  [2]   linux-3.7.9-gentoo
In the above output, notice that the linux-3.7.10-gentoo kernel is marked with an asterisk (*) to show that it is the kernel that the symbolic link points to.
If the symlink is not pointing to the correct sources, update the link by selecting the number of the desired kernel sources, as in the example above.
root #
eselect kernel set 1
Kernel GCC plugins
If the kernel's GCC plugins are enabled, the compilation of nvidia-drivers will use them. If the compiler version that was used to compile the plugins does not match the compiler used for nvidia-drivers, an error will occur.
General architecture-dependent options --->
GCC plugins --->
...
This behavior cannot be fixed, see bug #804618 or various forum posts [4] [5] [6]. Using these plugins seems controversial, too [7].
If the problem occurs, re-compile the plugins (in /usr/src/linux):
root #
make oldconfig && make prepare
Configuration
Drivers
Now it's time to install the drivers. First follow the X Server Configuration Guide and set VIDEO_CARDS="nvidia" in /etc/portage/make.conf. During the installation of the X server, it will then install the right version of x11-drivers/nvidia-drivers.
The drivers can be installed with the tools USE flag. This will install nvidia-settings, a handy graphical tool for monitoring and configuring several aspects of the NVIDIA card.
Every time a kernel is built, it is necessary to reinstall the NVIDIA kernel modules. An easy way to rebuild the modules installed by ebuilds (such as x11-drivers/nvidia-drivers) is to run emerge @module-rebuild.
Once the installation has finished, run modprobe nvidia to load the kernel module into memory. If this is an upgrade, remove the previous module first.
root #
lsmod | grep nvidia
root #
rmmod nvidia
root #
modprobe nvidia
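Kernel module signing (optional)
The information in this section is not needed on systems that do not implement signed kernel modules. Feel free to skip it.
If secure boot kernel signing is used, then the NVIDIA kernel modules need to be signed before they can be loaded.
This can be accomplished by using the kernel-provided sign-file utility as follows.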
root #
/usr/src/linux/scripts/sign-file sha512 /usr/src/linux/certs/signing_key.pem /usr/src/linux/certs/signing_key.x509 /lib/modules/Kernel-Version-modules-path/video/nvidia-uvm.ko
root #
/usr/src/linux/scripts/sign-file sha512 /usr/src/linux/certs/signing_key.pem /usr/src/linux/certs/signing_key.x509 /lib/modules/Kernel-Version-modules-path/video/nvidia.ko
As of driver version 358.09, a new module handles monitor mode setting. From that driver version onward, this module must also be signed.
root #
/usr/src/linux/scripts/sign-file sha512 /usr/src/linux/certs/signing_key.pem /usr/src/linux/certs/signing_key.x509 /lib/modules/Kernel-Version-modules-path/video/nvidia-modeset.ko
When using a Wayland compositor or PRIME offload, also sign the following two modules:
root #
/usr/src/linux/scripts/sign-file sha512 /usr/src/linux/certs/signing_key.pem /usr/src/linux/certs/signing_key.x509 /lib/modules/Kernel-Version-modules-path/video/nvidia-drm.ko
root #
/usr/src/linux/scripts/sign-file sha512 /usr/src/linux/certs/signing_key.pem /usr/src/linux/certs/signing_key.x509 /lib/modules/Kernel-Version-modules-path/video/nvidia-peermem.ko
Once the modules are signed, the driver will load as expected on boot up. This module signing method can be used to sign other modules too - not only the nvidia-drivers. Just modify the path and corresponding module accordingly.
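Since the same sign-file invocation is repeated for every module, the commands above can be generated in a loop. This is a sketch under stated assumptions, not an official procedure: the function name sign_nvidia_modules is hypothetical, the module directory must be supplied by the user (the Kernel-Version-modules-path placeholder above is not filled in here), and the key locations are the ones used in the commands above. The function only prints the commands so that they can be reviewed first.

```shell
# sign_nvidia_modules MODULE_DIR [KERNEL_SRC]
# Hypothetical helper: print one sign-file command for every nvidia*.ko
# file found in MODULE_DIR. Nothing is signed by this function itself.
sign_nvidia_modules() {
    module_dir=$1
    kernel_src=${2:-/usr/src/linux}
    for mod in "$module_dir"/nvidia*.ko; do
        # Skip the unexpanded pattern when no module matches.
        [ -e "$mod" ] || continue
        echo "$kernel_src/scripts/sign-file" sha512 \
            "$kernel_src/certs/signing_key.pem" \
            "$kernel_src/certs/signing_key.x509" \
            "$mod"
    done
}
```

After checking the printed commands, they can be run as root, for example by piping the output to sh. Paths containing spaces would need extra quoting, which this sketch does not handle.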
Dracut configuration (optional)
When using Dracut, it may be worthwhile to ensure that the NVIDIA modules are not bundled in the generated ramdisk (initramfs) image. Otherwise, every update may require regeneration of the image.
# Omit the nvidia driver from the ramdisk, to avoid needing to regenerate
# the ramdisk on updates.
omit_drivers+=" nvidia nvidia-drm nvidia-modeset nvidia-uvm "
The X server
Once the appropriate drivers are installed, the X server should work without any extra configuration. An example of /etc/X11/xorg.conf for single-GPU systems is provided below.
Section "Device"
Identifier "nvidia"
Driver "nvidia"
EndSection
For laptops with an integrated Intel graphics card, try the Xorg configuration suggested on the NVIDIA/Optimus page.
nvidia-persistenced
NVIDIA packages a daemon called nvidia-persistenced to assist in situations where the tearing down of the GPU device state isn't desired. Typically, the tearing down of the device state is the intended behavior of the device driver. Still, the latencies incurred by repetitive device initialization can significantly impact performance for some applications.
nvidia-persistenced is intended to be run as a daemon from system initialization and is generally designed as a tool for compute-only platforms where the NVIDIA device is not used to display a graphical user interface. Depending on the user's system and its uses, it may not be necessary to set the persistenced USE flag.
Currently, Gentoo does not set the persistenced USE flag by default.
Permissions
The user(s) needing to access the video card will need to be added to the video group:
root #
gpasswd -a larry video
Note that users will be able to run X without permission to the DRI subsystem, but hardware acceleration will be disabled. For Wayland sessions, missing this permission may result in very low FPS.
PCI-Express Runtime D3 (RTD3) Power Management
NVIDIA GPUs have many power-saving mechanisms. Some of them reduce clocks and voltages to different parts of the chip. In some cases they turn off clocks or power to parts of the chip entirely, either without affecting functionality or while the chip continues to function at a slower speed.
The NVIDIA Linux driver includes initial experimental support for dynamically managing power to the NVIDIA GPU.
Thus, this feature is available only when the following conditions are satisfied:
- This feature is supported only on notebooks.
- This feature requires system hardware as well as ACPI support. The necessary hardware and ACPI support was first added in the Intel Coffeelake chipset series. Hence, this feature is supported from Intel Coffeelake chipset series.
- This feature requires a Turing or newer GPU (i.e. GTX 1650+).
- This feature is supported with Linux kernel versions 4.18 and newer. With older kernel versions, it may not work as intended.
- This feature is supported when the Linux kernel defines CONFIG_PM=y. Typically, if the system supports S3 (suspend-to-RAM), then CONFIG_PM would be defined.
Setup
If the user wants to enable this feature, then it's recommended to follow the 'Automated Setup' section in Chapter 22 of the official NVIDIA README documentation. It has also been outlined below for convenience.
Create a file named 80-nvidia-pm.rules in /etc/udev/rules.d/ directory with the following contents:
# Enable runtime PM for NVIDIA VGA/3D controller devices on driver bind
ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030000", TEST=="power/control", ATTR{power/control}="auto"
ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030200", TEST=="power/control", ATTR{power/control}="auto"
# Disable runtime PM for NVIDIA VGA/3D controller devices on driver unbind
ACTION=="unbind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030000", TEST=="power/control", ATTR{power/control}="on"
ACTION=="unbind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030200", TEST=="power/control", ATTR{power/control}="on"
A file with the following contents needs to be added to the /etc/modprobe.d/ directory to enable this feature seamlessly.
# Enable RTD3
options nvidia NVreg_DynamicPowerManagement=0x02
More information and other configuration options are documented in Chapter 22 of NVIDIA's README documentation.
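Once the udev rule and module option are in place, the kernel's runtime power management state of the GPU can be inspected through sysfs. The sketch below assumes the standard power/control and power/runtime_status attributes of a PCI device directory; the function name show_runtime_pm and the example PCI address are hypothetical.

```shell
# show_runtime_pm SYSFS_DEVICE_DIR
# Hypothetical helper: print the runtime power management attributes of a
# PCI device directory such as /sys/bus/pci/devices/<address>.
show_runtime_pm() {
    dev=$1
    for attr in control runtime_status; do
        if [ -r "$dev/power/$attr" ]; then
            printf '%s: %s\n' "$attr" "$(cat "$dev/power/$attr")"
        else
            printf '%s: unavailable\n' "$attr"
        fi
    done
}
```

For example, show_runtime_pm /sys/bus/pci/devices/0000:01:00.0 (substitute the address reported by lspci). With the udev rule above applied, control would be expected to read auto.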
Enabling global NVIDIA support
Most NVIDIA GPUs have hardware encoding/decoding. Here is the support matrix. On GeForce 8 series and later GPUs, VDPAU supersedes the legacy XvMCNVIDIA support. Some ebuilds, like media-video/ffmpeg and media-video/obs-studio, have the USE flags vdpau and nvenc to enable NVIDIA hardware encoding/decoding.
Some applications also support the nvidia USE flag, which can be set in /etc/portage/make.conf.
To rebuild applications with new USE flags:
root #
emerge -uD --newuse @world
Using the NVIDIA settings tool
NVIDIA also provides a settings tool. This tool allows the user to monitor and change graphical settings without restarting the X server and is available through Portage as part of x11-drivers/nvidia-drivers with the tools USE flag set.
Usage
Testing the card
To test the NVIDIA card, fire up X and run glxinfo, which is part of the x11-apps/mesa-progs package. It should say that direct rendering is activated:
user $
glxinfo | grep direct
direct rendering: Yes
To monitor the FPS, run glxgears.
Troubleshooting
Random freezes
Freezes can occur for various reasons. Check that:
- All power saving options are turned off in the system firmware setup.
- Only the original (from installation) driver options are defined in the /etc/modprobe.d/nvidia.conf file.
FATAL: modpost: GPL-incompatible module *.ko uses GPL-only symbol
When the ebuild is complaining about the 'mutex_destroy' GPL-only symbol:
root #
emerge x11-drivers/nvidia-drivers
FATAL: modpost: GPL-incompatible module nvidia.ko uses GPL-only symbol 'mutex_destroy'
Be sure to disable CONFIG_DEBUG_MUTEXES in the kernel's .config file, as suggested by this forum thread.
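Driver fails to initialize when MSI interrupts are enabled
The Linux NVIDIA driver uses Message Signaled Interrupts (MSI) by default. This provides compatibility and scalability benefits, mainly due to the avoidance of IRQ sharing. Some systems have problems supporting MSI while working fine with virtual wire interrupts. These problems manifest as an inability to start X with the NVIDIA driver, or as CUDA initialization failures.
MSI interrupts can be disabled via the NVIDIA kernel module parameter NVreg_EnableMSI=0. This can be set on the command line when loading the module, or more appropriately via the distribution's kernel module configuration files (such as those under /etc/modprobe.d/).
For example: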
# Nvidia drivers support
alias char-major-195 nvidia
alias /dev/nvidiactl char-major-195
# To tweak the driver the following options can be used, note that
# you should be careful, as it could cause instability!! For more
# options see /usr/share/doc/nvidia-drivers-337.19/README
#
# !!! SECURITY WARNING !!!
# DO NOT MODIFY OR REMOVE THE DEVICE FILE RELATED OPTIONS UNLESS YOU KNOW
# WHAT YOU ARE DOING.
# ONLY ADD TRUSTED USERS TO THE VIDEO GROUP, THESE USERS MAY BE ABLE TO CRASH,
# COMPROMISE, OR IRREPARABLY DAMAGE THE MACHINE.
options nvidia NVreg_DeviceFileMode=0660 NVreg_DeviceFileUID=0 NVreg_DeviceFileGID=27 NVreg_ModifyDeviceFiles=1 NVreg_EnableMSI=0
Getting 2D acceleration to work on machines with 4 GB of memory or more
When NVIDIA 2D acceleration is giving problems, then it is likely that the system is unable to set up a write-combining range with MTRR. To verify, check the contents of /proc/mtrr:
root #
cat /proc/mtrr
Every line should contain write-back or write-combining. When a line shows up with uncachable in it, a BIOS setting needs to be changed to fix this.
Reboot and enter the BIOS, then find the MTRR settings (probably under "CPU Settings"). Change the setting from continuous to discrete and boot back into Linux. There should no longer be an uncachable entry, and 2D acceleration should work without glitches.
Alternatively, it might be necessary to enable MTRR cleanup support (CONFIG_MTRR_SANITIZER=y) in the Linux kernel:
Processor type and features --->
[*] MTRR (Memory Type Range Register) support
[*] MTRR cleanup support
(0) MTRR cleanup enable value (0-1)
(1) MTRR cleanup spare reg num (0-7)
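The /proc/mtrr check described above can be scripted. This is a small sketch with a hypothetical function name, find_uncachable_mtrr. It relies only on the uncachable keyword mentioned in the text and takes an optional file argument so that saved output can be examined as well.

```shell
# find_uncachable_mtrr [FILE]
# Hypothetical helper: print every MTRR entry marked "uncachable".
# Returns 1 when such an entry exists, 0 otherwise.
find_uncachable_mtrr() {
    file=${1:-/proc/mtrr}
    if grep 'uncachable' "$file"; then
        return 1
    fi
    return 0
}
```

Running find_uncachable_mtrr with no argument reads /proc/mtrr. Any printed line points at a range that the BIOS setting or the MTRR cleanup option above would need to address.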
Failed to initialize DMA on Ryzen
Disable AMD Secure Memory Encryption[8]:
Processor type and features --->
[ ] AMD Secure Memory Encryption (SME) support
"no such device" appears when trying to load the kernel module
This is usually caused by one of the following issues:
- The system does not have an NVIDIA card at all. Check the lspci output to confirm that the system has an NVIDIA graphics card installed and detected.
- The currently installed version of x11-drivers/nvidia-drivers does not support the installed graphics card model. Check the README file in /usr/share/nvidia-drivers-*/ for a list of supported devices, or use the driver search at http://www.geforce.com/drivers.
- Another kernel driver has control of the hardware. Check lspci -k to see if another driver like "nouveau" or "efifb" is bound to the graphics card. If so, disable or blacklist this driver.
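The last check in the list above can be narrowed down by filtering the lspci -k output. The sketch below uses a hypothetical function name, nvidia_bound_drivers, and assumes the usual lspci -k layout, in which an indented "Kernel driver in use:" line follows each device line.

```shell
# nvidia_bound_drivers
# Hypothetical helper: read `lspci -k` output on standard input and print
# the kernel driver currently bound to each NVIDIA device.
nvidia_bound_drivers() {
    awk '
        # A device line starts in column one with the PCI address.
        /^[0-9a-fA-F]/ { is_nvidia = ($0 ~ /NVIDIA/); addr = $1 }
        # Indented detail lines belong to the most recent device line.
        is_nvidia && /Kernel driver in use:/ {
            sub(/^.*Kernel driver in use:[ \t]*/, "")
            print addr ": " $0
        }
    '
}
```

It would be used as lspci -k | nvidia_bound_drivers. Any result other than nvidia suggests that another driver has claimed the card. No output at all means that no driver is bound or that no NVIDIA device was found.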
Direct rendering is not enabled
If direct rendering does not work, it may be because the kernel has Direct Rendering Manager enabled, which conflicts with the driver. See the direct rendering status by following instructions in the section Testing the card.
First, disable Direct Rendering Manager (CONFIG_DRM) in the kernel:
Device drivers --->
Graphics support --->
< > Direct Rendering Manager (XFree86 4.1.0 and higher DRI support)
Next, rebuild x11-drivers/nvidia-drivers since the driver may have built against the kernel DRM symbols. It should fix the problem.
Video playback stuttering or slow
There appears to be some recent breakage with playback of certain types of video with the NVIDIA binary drivers, causing slow video playback or significant stuttering. This problem seems to be caused by the Intel CPU idle driver replacing the common ACPI CPU idling method on certain CPUs.
Disable the Intel CPU idling method using intel_idle.max_cstate=0 on the kernel command line, which should cause the kernel to automatically fall back to the normal or older ACPI CPU idling method. Disabling the NVIDIA Powermizer feature, or setting Powermizer to maximum performance within nvidia-settings, has also been said to help. The Intel CPU idling method, which was recently introduced as the default CPU idling method for i5 and i7 CPUs (in place of ACPI CPU idling), is the root cause here. Switching the idling method largely solves the problem; however, some minimal stuttering or slow video is still encountered if deinterlacing is enabled when the video is likely already deinterlaced (a workaround is to alias mplayer-nodeint to an mplayer invocation that has deinterlacing turned off).
If using GRUB as the bootloader, add this kernel parameter to /etc/default/grub like so:
GRUB_CMDLINE_LINUX_DEFAULT="intel_idle.max_cstate=0"
Don't forget to run grub-mkconfig -o /boot/grub/grub.cfg after making the change, so that the new configuration is generated (see the GRUB article for further details).
After you have rebooted, you can verify that the change is active:
user $
cat /sys/module/intel_idle/parameters/max_cstate
0
No HDMI Output/Video/Sound
This problem tends to occur whenever the HDMI hub device has been turned off for a period of time, or the computer was started with the HDMI hub device turned off.
First, find the PCI device ID, using lspci.
When this problem occurs, substitute the PCI ID within the following command for rescanning the PCI bus:
root #
echo on > /sys/bus/pci/devices/0000\:06\:00.0/power/control
This disables runtime power management for PCI function 0, keeping the device always powered on.
No vertical synchronization (no VSync, tearing) in OpenGL applications
Adding the following option to the screen section prevents tearing on GTX 660, 660 Ti, and probably some other GPUs (reference):
Section "Screen"
. . .
Option "metamodes" "nvidia-auto-select +0+0 { ForceFullCompositionPipeline = On }"
. . .
EndSection
udevd using 100% of the CPU, X server failed to start
Workaround available in bug #670340 comment #8
Distorted white lines during early boot
If nothing but a black screen with distorted white lines appears right after the kernel and initramfs are loaded, try disabling CONFIG_SYSFB_SIMPLEFB and all framebuffer device drivers except CONFIG_FB_EFI.
ERROR: Kernel configuration is invalid.
When building nvidia-drivers, a message could appear like:
root #
emerge x11-drivers/nvidia-drivers
...
 * Preparing nvidia module
make -j8 HOSTCC=x86_64-pc-linux-gnu-gcc 'LDFLAGS=-m elf_x86_64' NV_VERBOSE=1 IGNORE_CC_MISMATCH=yes SYSSRC=/usr/src/linux SYSOUT=/usr/src/linux modules
make[1]: Entering directory '/usr/src/linux-5.15.23-gentoo'
test -e include/generated/autoconf.h -a -e include/config/auto.conf || ( \
echo >&2; \
echo >&2 " ERROR: Kernel configuration is invalid."; \
echo >&2 " include/generated/autoconf.h or include/config/auto.conf are missing.";\
echo >&2 " Run 'make oldconfig && make prepare' on kernel src to fix it."; \
echo >&2 ; \
/bin/false)
This is not an error but the code logic that checks for this error. Therefore, the kernel configuration is in fact not invalid.
Plymouth can't find nvidia-uvm module
When using systemd, it may be worth adding the following configuration to /etc/modprobe.d to ensure that nvidia-uvm is loaded as a soft dependency of the nvidia module. This helps prevent an error that occurs when the configuration file is added to the initrd but the nvidia-uvm module is not, which causes Plymouth to report that it cannot find the nvidia-uvm module.
This may not be required unless Dracut and systemd are both in use and the Plymouth error (nvidia-uvm not found) appears in the logs.
# Make a soft dependency for nvidia-uvm as adding the module loading to
# /usr/lib/modules-load.d/nvidia-uvm.conf for systemd consumption, makes the
# configuration file to be added to the initrd but not the module, throwing an
# error on plymouth about not being able to find the module.
# Ref: /usr/lib/dracut/modules.d/00systemd/module-setup.sh
# Even adding the module is not the correct thing, as we don't want it to be
# included in the initrd, so use this configuration file to specify the
# dependency.
softdep nvidia post: nvidia-uvm
Wayland GLAMOR (weird keyboard typing bug)
Symptoms: erratic keyboard behavior when typing; characters are deleted and redrawn.
Affected: XWayland applications, notably Discord and Skype.
Workaround:
XWAYLAND_NO_GLAMOR=1
Setting this environment variable disables GLAMOR. This uses more resources, but the issue no longer occurs. It can be added to the /etc/environment file. More information is available in the GitLab link.
- Notice: disabling GLAMOR makes wine-proton games run much slower, and there are reports of Steam for Linux not starting at all.
API mismatch
Symptoms: An API mismatch can cause GPU-accelerated applications to fail to launch. It can also cause external displays connected via a discrete NVIDIA graphics card to be detected but not enabled or activated (the screen will show up in xrandr but will refuse to display output, and the display will stay in low power mode).
Detection: This problem can be detected using a few different methods:
1. Compare the currently loaded kernel module version with the currently available userspace management utilities.
Kernel module check:
user $
modinfo nvidia | grep version | head -n 1
version: 515.65.01
Userspace utility version:
user $
nvidia-settings --version | grep version
nvidia-settings: version 520.56.06
Observe that the versions in the two command outputs differ: 515.65.01 vs 520.56.06.
2. Something like the following message will be written to the dmesg log:
user $
dmesg
[ 337.995427] NVRM: API mismatch: the client has the version 520.56.06, but
NVRM: this kernel module has the version 515.65.01. Please
NVRM: make sure that this kernel module and all NVIDIA driver
NVRM: components have the same version.
[ 339.048386] [drm:nv_drm_dumb_map_offset [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to lookup gem object for mapping: 0x00000006
[ 339.048400] [drm:nv_drm_dumb_map_offset [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to lookup gem object for mapping: 0x00000007
3. The post-install ebuild log output also detects the API mismatch and includes instructions for the solution:
user $
emerge @module-rebuild
>>> Installing (1 of 1) x11-drivers/nvidia-drivers-520.56.06::gentoo
 * >>> SetUID: [chmod go-r] /usr/bin/nvidia-modprobe ... [ ok ]
 * Removing x11-drivers/nvidia-drivers-520.56.06 from moduledb.
 * Updating module dependencies for 6.0.5-gentoo-x86_64 ... [ ok ]
 * Adding module to moduledb.
 * Currently loaded NVIDIA modules do not match the newly installed
 * libraries and may prevent launching GPU-accelerated applications.
 * The easiest way to fix this is usually to reboot
Cause: An API mismatch occurs when the nvidia kernel modules are of a different version than the userspace utilities. This happens when a full system reboot is not performed after an nvidia-drivers package update.
Solution: The simplest solution is to perform a full system reboot.
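The first detection method above can be automated by comparing the version numbers in the two command outputs. The following sketch uses hypothetical function names (extract_version and versions_match) and assumes that both outputs contain a dotted version number as in the examples shown. It relies on grep -o, which is widely available but not required by POSIX.

```shell
# extract_version
# Print the first dotted version number (e.g. 515.65.01) found on stdin.
extract_version() {
    grep -Eo '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1
}

# versions_match KERNEL_LINE USERSPACE_LINE
# Hypothetical helper: compare the version found in the modinfo output
# with the one found in the nvidia-settings output.
versions_match() {
    kver=$(printf '%s\n' "$1" | extract_version)
    uver=$(printf '%s\n' "$2" | extract_version)
    if [ -n "$kver" ] && [ "$kver" = "$uver" ]; then
        echo "match: $kver"
    else
        echo "mismatch: kernel module $kver, userspace $uver"
        return 1
    fi
}
```

A possible invocation, reusing the commands from the detection steps, is versions_match "$(modinfo nvidia | grep version | head -n 1)" "$(nvidia-settings --version | grep version)".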
Expert configuration
Documentation
The x11-drivers/nvidia-drivers package also comes with comprehensive documentation. This is installed into /usr/share/doc and can be viewed with the following command:
user $
less /usr/share/doc/nvidia-drivers-*/README.bz2
Kernel module parameters
The nvidia kernel module accepts a number of parameters (options) which can be used to tweak the behavior of the driver. Most of these are mentioned in the documentation. To add or change the values of these parameters, edit the file /etc/modprobe.d/nvidia.conf. Remember to run update-modules after modifying this file, and to reload the nvidia module before the new settings take effect.
Pay close attention to this section as these kernel options can enable features that the hardware may or may not support. These options are not forgiving, so be careful with the parameters. Do not make any changes without validating and double-checking that the change is needed.
Attribute | Default | Description |
---|---|---|
NVreg_DeviceFileUID | 0 | Modify the user ID for the device file. The default value sets it to the root user. Setting this to another user ID will make the driver module create the device file with access available to that user ID. |
NVreg_DeviceFileGID | 27 | Modify the group ID for the device file. The default value sets it to the video group. |
NVreg_DeviceFileMode | Undefined | Set the permissions for the device file. A value of 0660 grants the owner and group-owner read-write access while other users cannot access the device file. |
NVreg_ModifyDeviceFiles | 1 | Instruct the driver to enable or disable dynamic device file management. |
NVreg_EnablePCIeGen3 | 0 | Enable PCIe Gen 3.x support. If the system supports this 8 GT/s high speed bus, enable it with this module option flag. When this is enabled but the system does not support Gen 3.0, the behavior of the system can become erratic and unstable. Some have even reported hardware damage from enabling this when it is not properly supported. By default the NVIDIA driver uses PCIe Gen 2.x for compatibility reasons. |
NVreg_UsePageAttributeTable | 0 | One of the newer additions to the NVIDIA driver module options. It allows the driver to take full advantage of the Page Attribute Table (PAT) technology, a newer way of allocating memory that replaces the older Memory Type Range Register (MTRR) method. The PAT method creates a page attribute table at a specific address mapped inside the register and uses the memory architecture and instruction set more efficiently. If the computer supports PAT and the feature is enabled in the kernel, this flag can be enabled. Without PAT support, users may experience unstable performance and even crashes if this is enabled, so be careful with this option. |
NVreg_EnableVia4x | 0 | Enable AGP 4x mode in the NVIDIA driver on Via-chipset-powered systems. Some of these hardware configurations do not work properly in AGP 4x mode while others do. The default leaves it at AGP 2x mode. |
NVreg_EnableALiAGP | 0 | On ALi1541 and ALi1647 chipsets, AGP support is disabled by default by the NVIDIA drivers. The value specifies the speed factor to use, so the values 1, 2, 4 and 8 represent AGP 1x, 2x, 4x and 8x respectively. NVIDIA does not recommend changing the value as it may lead to unstable systems. |
NVreg_ReqAGPRate | Unspecified | Forces the AGP mode on the driver. For instance, a value of 1 means AGP 1x, while a value of 4 means AGP 4x. |
NVreg_NvAGP | | Changes the AGP GART mode setting. Possible values are: 0 (Disable), 1 (Enable using NVIDIA's internal AGP GART), 2 (Enable using the Linux kernel AGP GART) and 3 (Enable and use any available, but try the NVIDIA internal one first). |
NVreg_EnableAGPSBA | 0 | Disables (0) or enables (1) AGP Side Banding. For stability reasons, the setting is disabled by default, but it can be enabled for testing and debugging purposes. This is not supported by NVIDIA though. |
NVreg_EnableAGPFW | 0 | Enables AGP Fast-Writes when set to 1. Depending on the system's chipset this may cause stability issues if enabled. |
NVreg_Mobile | 0 | Through this setting, users can force the EDID information for particular systems. This workaround is provided for mobile GPUs where EDID information is either non-functional or disabled. Potential values are 0 (Auto detection of the correct setting), 1 (Dell notebooks), 2 (non-Compaq Toshiba laptops), 3 (All other notebooks/laptops), 4 (Compaq Toshiba laptops) or 5 (Gateway machines). |
NVreg_RemapLimit | 60 | Maximum amount of system memory remapping. It specifies the amount of memory that the driver will be allowed to remap through the IOMMU/SWIOTLB on a 64-bit system. Only use it if the IOMMU or SWIOTLB is larger than 64 MB. NVIDIA recommends subtracting 4 MB from the total amount of memory to use. For instance, the default value of 60 corresponds to 64 MB. To set it to 128 MB, set the value to 124. |
NVreg_UpdateMemoryTypes | 0 | Tweak the use of page table attributes. Possible values are: 0 (NVIDIA's logic mechanism), 1 (Enable the use of changed page table attributes) and 2 (Disable the use of page table attributes). |
NVreg_InitializeSystemMemoryAllocations | 1 | Tell the NVIDIA driver to clear system memory allocations prior to using them for the GPUs. Disabling this can give a slight performance boost but at the cost of increased security risks. By default the driver wipes the allocated memory by zeroing out its content. |
NVreg_UseVBios | 1 | Enable or disable the use of the video BIOS int10 code. Set to 0 to disable. |
NVreg_RMEdgeIntrCheck | Unspecified | Enable or disable checking for edge-triggered interrupts. |
NVreg_EnableMSI | 1 | Enable or disable PCIe-MSI capabilities. Enable this to use MSI interrupts instead of wired interrupts. |
NVreg_MapRegistersEarly | 0 | If set to 1, allow the driver to map the memory locations early when the system is probing the hardware, instead of the default of doing this when loaded by modprobe or during startx. This is a debugging feature. |
NVreg_RegisterForACPIEvents | 1 | Enable the driver to register with the ACPI of the system to receive ACPI events. This can be disabled (0) when issues occur with ACPI or while debugging an issue. |
NVreg_EnableGpuFirmware | Varies | Enable or disable use of GSP firmware. Turing and later GPUs include a GPU System Processor (GSP) which can be used to offload GPU initialization and management tasks. When using GSP firmware, the driver will not yet correctly support display-related features or power management related features. These features will be added to GSP firmware in future driver releases. |
Edit the /etc/modprobe.d/nvidia.conf file, and then update the module information:
root #
update-modules
Unload the nvidia module...
root #
modprobe -r nvidia
...and load it once again:
root #
modprobe nvidia
Advanced X configuration
The GLX layer also has a plethora of options which can be configured. These control the configuration of TV out, dual displays, monitor frequency detection, etc. Again, all of the available options are detailed in the documentation.
To use any of these options, list them in the relevant Device section of the X config file (usually /etc/X11/xorg.conf.d/nvidia.conf). For example, to disable the splash logo:
Section "Device"
Identifier "nVidia Inc. GeForce2"
Driver "nvidia"
Option "NoLogo" "true"
VideoRam 65536
EndSection
See also
- nouveau & nvidia-drivers switching - Hybrid graphics mode using the open source drivers.
- NVIDIA Optimus - Configuring a system to use the closed-source drivers for hybrid graphics (modesetting).
References
- ↑ https://github.com/NVIDIA/open-gpu-kernel-modules/issues/228
- ↑ https://bugzilla.kernel.org/show_bug.cgi?id=216303#c5
- ↑ https://github.com/NVIDIA/open-gpu-kernel-modules/issues/341#issuecomment-1201748789
- ↑ https://forums.gentoo.org/viewtopic-t-1103798-start-0.html
- ↑ https://forums.gentoo.org/viewtopic-t-1103938-start-0.html
- ↑ https://forums.gentoo.org/viewtopic-t-1117792-start-0.html
- ↑ https://lwn.net/ml/linux-kernel/CAHk-=whA6zru0BaNm4uu5KyZe+aQpRScOnmc9hdOpO3W+xN9Xw@mail.gmail.com/
- ↑ https://devtalk.nvidia.com/default/topic/1039297/linux/unable-to-start-x-failed-to-initialize-dma-/
This page is based on a document formerly found on our main website gentoo.org.
The following people contributed to the original document: Sven Vermeulen (SwifT) , M Curtis Napier, and Chris Gianelloni
They are listed here because wiki history does not allow for any external attribution. If you edit the wiki article, please do not add yourself here; your contributions are recorded on each article's associated history page.