AMDGPU

From Gentoo Wiki

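AMDGPU is the next generation family of open source graphics drivers. It uses the new Display Core (DC) framework and is designed for Vega, Raven Ridge, and later GPUs. It can also drive newer AMD/ATI Radeon graphics cards based on the GCN1.0+ architecture, including but not limited to the Southern Islands, Sea Islands, Volcanic Islands, and Arctic Islands chipset families.

If the graphics card in question is not listed in the feature support section below, it is not driven by AMDGPU. In that case, see the radeon article, which contains installation instructions for the older open source AMD/ATI Radeon graphics driver.

Before kernel 4.15, Display Core (DC, which evolved from the Display Abstraction Layer, DAL) was not included in the vanilla kernel sources,[1][2][3] so AMDGPU could not drive display output on Vega and later chips.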

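Installation

Setting up the system to use AMDGPU involves the following steps: identifying the graphics card, installing the corresponding firmware, configuring the kernel, and installing the X11 driver.

Prerequisites

Hardware detection

To choose the right driver, first detect the graphics card. Use lspci for this: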

root #lspci | grep -i VGA

Check whether the output contains one of the product names listed in the table below.
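The same tool can also show which kernel driver is currently bound to each graphics device. The following is a minimal sketch, assuming the output of lspci -k is piped in; gpu_drivers is a hypothetical helper name chosen for this example, not part of pciutils.

```shell
# Print each graphics device line from "lspci -k" output (read on stdin)
# together with its "Kernel driver in use" line, if any.
gpu_drivers() {
    awk '
        /^[0-9a-f]/ {
            # A new device entry starts: remember whether it is a GPU.
            gpu = ($0 ~ /VGA compatible controller|Display controller|3D controller/)
            if (gpu) print
            next
        }
        gpu && /Kernel driver in use:/ {
            sub(/^[ \t]+/, "")
            print "  " $0
        }
    '
}
```

After the function has been defined in the current shell, it can be used like this:

user $lspci -k | gpu_drivers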

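Feature support

Video cores supported by the AMDGPU driver provide OpenGL 4.6 and OpenGL ES 3.2. VIDEO_CARDS must be set to "amdgpu radeonsi". With media-libs/mesa (version 20.0 or later), the driver additionally supports Vulkan (through the RADV driver) and OpenCL 2.0 through ROCm (dev-libs/rocm-opencl-runtime). VDPAU and VAAPI are also supported through radeonsi.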

Family | Chipset name | Microarchitecture[4] | Instruction set architecture[5] | Product name | Notes
Southern Islands | CAPE VERDE, PITCAIRN, TAHITI, OLAND, HAINAN | GCN1.0+ | DCE 6.x | HD7750-HD7970, R9 270, R9 280, R9 370X, R7 240, R7 250 | Experimental, optional support since kernel 4.9-rc1. Stable support for GCN1.x is available in the older radeon driver.
Sea Islands | BONAIRE, KABINI, KAVERI, HAWAII, MULLINS | GCN2.x | DCE 8.x | HD7790, R7 260, R9 290, R7 360, R9 390 | Kernel support is optional and must be activated by setting DRM_AMDGPU_CIK=y. Otherwise the older radeon driver provides stable support for Sea Islands (GCN2.x) cards.
Volcanic Islands | CARRIZO, FIJI, STONEY, TONGA, TOPAZ, WANI | GCN3.x | DCE 10/11.x | R9 285, R9 380, R9 380X, R9 Fury, R9 Nano, R9 Fury X, Pro Duo | Supported since Linux kernel 4.7-rc6.
Arctic Islands | POLARIS10/11/12, VEGAM | GCN4.x | DCE 11.2 | RX 460, RX 470, RX 480, RX 540, RX 550, RX 560, RX 570, RX 580, RX 590, Pro WX 3200 | Supported since Linux kernel 4.15.
Vega | VEGA10/11/12/20 | GCN5.x | DCE 12.x | RX Vega 56, RX Vega 64, Radeon Vega II, Radeon VII | Supported since Linux kernel 4.15.
Vega | RAVEN | GCN5.x | DCN 1.0 | Raven Ridge APU series | Since kernel 4.16.[6][7]
Vega | RENOIR | GCN5.x | DCN 2.1 | Renoir, Lucienne, and Cezanne APU series |
Navi | NAVI10/14 | RDNA | DCN 2.0 | RX 5500, RX 5500 XT, RX 5600, RX 5600 XT, RX 5700, RX 5700 XT | Requires at least kernel 5.3, Mesa 19.2, and LLVM 9.0.[8]
Navi | NAVI21/22/23/24 | RDNA2 | DCN 3.0 | RX 6500 XT, RX 6600, RX 6600 XT, RX 6650 XT, RX 6700, RX 6700 XT, RX 6750 XT, RX 6800, RX 6800 XT, RX 6900 XT, RX 6950 XT | Requires at least kernel 5.3, Mesa 19.2, and LLVM 9.0.[8] RX 6*00 series since kernel 5.9.12 with CONFIG_DRM_AMD_DC_DCN3_0=Y.[9][10]

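Firmware

The appropriate firmware (or microcode) for the graphics card must be installed. The firmware files are provided by the sys-kernel/linux-firmware package.

There are two ways to load the firmware:

  1. Build AMDGPU as a module and simply install the sys-kernel/linux-firmware package (the firmware is loaded at runtime), or
  2. Build both AMDGPU and the required firmware into the kernel (the firmware is included at build time).


The easiest approach is to start with method 1 and then, if desired, work out which firmware blobs are needed and switch to method 2.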

USE flags for sys-kernel/linux-firmware Linux firmware files

bindist Flag to enable or disable options for prebuilt (GRP) packages (eg. due to licensing issues)
compress-xz Compress firmware using xz (app-arch/xz-utils) before installation
compress-zstd Compress firmware using zstd (app-arch/zstd) before installation
deduplicate Create symlinks for all firmware that is duplicate using rdfind
initramfs Create and install initramfs for early microcode loading in /boot (only AMD for now)
redistributable Install also non-free (but redistributable) firmware files
savedconfig Allows individual selection of firmware files
unknown-license Install firmware files whose license is unknown

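If the savedconfig USE flag is enabled, make sure that all firmware files needed by the hardware are listed in the configuration file. If in doubt, leave savedconfig disabled until the required files are known.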

root #emerge --ask sys-kernel/linux-firmware

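The firmware files installed this way will be incorporated into the kernel.

Note
Navi10 cards (RX 5700, RX 5700 XT [FE]) require sys-kernel/linux-firmware version 20190923 or later.

Kernel

Note
The easiest setup is to select "AMD GPU" as a module (M) and include it in the initramfs. The driver is then loaded a little later, once udev becomes active, and in that case the firmware does not need to be managed by hand. Otherwise, read the Incorporating firmware section below carefully.

Set the following kernel options for the chipsets mentioned above: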

KERNEL Configuring the kernel for AMD graphics (Linux kernels 4.15 and newer)
Processor type and features  --->
    [*] MTRR (Memory Type Range Register) support (CONFIG_MTRR)
Memory Management options  --->
    [*] Allow for memory hot-add
    [*] Allow for memory hot remove
    [*] Device memory (pmem, HMM, etc...) hotplug support
    [*] Unaddressable device memory (GPU memory, ...)
Device Drivers  --->
    Graphics support  --->
        <*/M> Direct Rendering Manager (XFree86 4.1.0 and higher DRI support) --->
              [*]   Enable legacy fbdev support for your modesetting driver (DRM_FBDEV_EMULATION)
        <   > ATI Radeon
        <M/*> AMD GPU
              [ /*] Enable amdgpu support for SI parts (DRM_AMDGPU_SI)
                    (only needed for Southern Islands GPUs with the amdgpu driver)
              [ /*] Enable amdgpu support for CIK parts (DRM_AMDGPU_CIK)
                    (only needed for Sea Islands GPUs with the amdgpu driver)
              ACP (Audio CoProcessor) Configuration  --->
                  [*] Enable AMD Audio CoProcessor IP support (CONFIG_DRM_AMD_ACP)
                        (only needed for APUs)
              Display Engine Configuration  --->
                  [*] AMD DC - Enable new display engine (DRM_AMD_DC)
                  [ /*] DC support for Polaris and older ASICs
                        (only needed for Polaris, Carrizo, Tonga, Bonaire, Hawaii)
                  [ /*] AMD FBC - Enable Frame Buffer Compression
                  [ /*] DCN 1.0 Raven family
                        (only needed for Vega RX as part of Raven Ridge APUs)
                  [ /*] DCN 3.0 family
                        (only needed for NAVI21/Sienna Cichlid GPUs with the amdgpu driver)
        <*/M> HSA kernel driver for AMD GPU devices (HSA_AMD)
    <*/M> Sound card support  --->
        <*/M> Advanced Linux Sound Architecture  --->
            [*]   PCI sound devices ---> (CONFIG_SND_PCI)
                  HD-Audio  --->
                      <*> HD Audio PCI (CONFIG_SND_HDA_INTEL)
                      [*] Support initialization patch loading for HD-audio (CONFIG_SND_HDA_PATCH_LOADER)
                      <*> whatever audio codec your soundcard needs
                      <*> Build HDMI/DisplayPort HD-audio codec support (CONFIG_SND_HDA_CODEC_HDMI)
                  (2048) Pre-allocated buffer size for HD-audio driver (CONFIG_SND_HDA_PREALLOC_SIZE)
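Note
When using AMDGPU, it is recommended to leave the ATI Radeon option unset so that the radeon module is not built. Alternatively, the module can be built and blacklisted (after rebooting, verify with lsmod | grep radeon that it is not loaded). The amdgpu and radeon modules should not be loaded at the same time unless a special setup, such as a multiseat system, requires it.

The options under the "Sound card support" menu need to be set only if the card supports HDMI or DisplayPort audio and that feature is wanted. On newer kernels that offer "Enable AMD Audio CoProcessor IP support", enable that option as well.

Display Core support in AMDGPU was first implemented for VEGA10 (GCN5.0) and RAVEN (with DCN 1.0) GPUs/APUs. Kernels older than 4.17 provide (experimental) Display Core support for older cards (GCN1.1 and later) through the kernel command-line option amdgpu.dc=1, which may work better than the older radeon kernel module. Likewise, if Display Core needs to be disabled for some reason, pass amdgpu.dc=0 on the kernel command line.

For more details on HDMI/DisplayPort audio, see the radeon article.

Incorporating firmware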

The firmware package installed in an earlier section provides files in /lib/firmware/amdgpu (for Volcanic Islands and newer cards) and/or /lib/firmware/radeon (for Southern Islands and Sea Islands cards). AMDGPU must be able to access the correct firmware files when it is loaded.

Important
If the amdgpu module is compiled as a loadable kernel module (i.e. AMDGPU in the kernel configuration is set to M), the firmware files need to be accessible at the time it is loaded. In particular, if the module is loaded from an initrd/initramfs, the kernel will initialize it during early boot, just like when the module is built into the kernel directly (i.e. AMDGPU in the kernel configuration is set to *). For the firmware files to be accessible at this stage they need to be either included in the initrd/initramfs (which needs to be loaded by the bootloader, e.g. GRUB) or included directly in the kernel image.
KERNEL Including firmware in the kernel (prior to 4.18)
Device Drivers  --->
    Generic Driver Options  --->
        -*- Userspace firmware loading support
        [*] Include in-kernel firmware blobs in kernel binary
            (amdgpu/<YOUR-MODEL>.bin or radeon/<YOUR-MODEL>.bin) (CONFIG_EXTRA_FIRMWARE)
            (/lib/firmware) Firmware blobs root directory
KERNEL Including firmware in the kernel (4.18 and later)
Device Drivers  --->
    Generic Driver Options  --->
        Firmware loader --->
          -*- Firmware loading facility
          (amdgpu/<YOUR-MODEL>.bin or radeon/<YOUR-MODEL>.bin) Build named firmware blobs into the kernel binary
          (/lib/firmware) Firmware blobs root directory
Note
With sys-kernel/genkernel newer than 4.0, specified firmware files can be included in an initramfs. Refer to the Firmware loading section of the genkernel article. Dracut can likewise add files to the image.
Important
Kernels before 4.15.x (Aug. 2018) and 4.19.9[11] (Dec. 2018) require a different (older) set of firmware files than listed here in order to boot successfully. For all current kernels it is recommended to always make sure that sys-kernel/linux-firmware is updated.

In the case that the firmware needs to be included in the kernel or in an initramfs, and if using the savedconfig USE flag for sys-kernel/linux-firmware, make sure that the savedconfig configuration file is updated with a changed set of firmware files as well (like the change in 2018 mentioned above). Incorporate all the newly added files to the kernel configuration file in the firmware line, then rebuild and install the new kernel image. Otherwise boot will likely fail with a blank screen and firmware load errors thrown to the kernel log.

It is important you include all the firmware blobs that are needed by the driver. The required blobs can either be determined by a discovery approach or, if you know your card model, using the table in the next section.

Discovering which firmware blobs are needed

In the case you are unsure which blobs are needed, a trial and error method often leads to success. In a multi-step process a basic bootable system may suffice to get the required information: missing firmware is indicated by an amdgpu error in dmesg, which helps to identify the required firmware files.

Note
Without any firmware files, this method will very likely result in a blank screen, since the AMDGPU driver does not work properly without firmware. A very basic way to still get the required information is to type blind and save the dmesg output into a file, which can be analyzed after rebooting without the AMDGPU driver in use. A better choice might be to temporarily include all the firmware (amdgpu/*), since dmesg normally shows which firmware was loaded when CONFIG_GENTOO_PRINT_FIRMWARE_INFO=y is set in the kernel .config, or to force the use of another framebuffer driver (like vesafb or efifb).
root #dmesg -t | grep amdgpu | grep firmware
amdgpu 0000:07:00.0: Direct firmware load for amdgpu/green_sardine_sdma.bin failed with error -2
[drm:sdma_v4_0_early_init] *ERROR* sdma_v4_0: Failed to load firmware "amdgpu/green_sardine_sdma.bin"
amdgpu 0000:07:00.0: Direct firmware load for amdgpu/green_sardine_asd.bin failed with error -2
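The names of the missing files can be pulled out of such error lines with a small filter. This is a sketch only: missing_firmware is a hypothetical helper name, and it assumes the kernel keeps the "Direct firmware load for ... failed" wording shown above.

```shell
# Print the unique firmware paths that the kernel failed to load.
# Reads kernel log text on stdin (for example the output of dmesg -t).
missing_firmware() {
    sed -n 's/.*Direct firmware load for \([^ ]*\) failed.*/\1/p' | sort -u
}
```

With the function defined in the current shell:

root #dmesg -t | missing_firmware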
root #dmesg -t | grep amdgpu | grep firmware
Loading firmware: amdgpu/green_sardine_sdma.bin
Loading firmware: amdgpu/green_sardine_asd.bin
Loading firmware: amdgpu/green_sardine_ta.bin
Loading firmware: amdgpu/green_sardine_pfp.bin
Loading firmware: amdgpu/green_sardine_me.bin
Loading firmware: amdgpu/green_sardine_ce.bin
Loading firmware: amdgpu/green_sardine_rlc.bin
Loading firmware: amdgpu/green_sardine_mec.bin
Loading firmware: amdgpu/green_sardine_dmcub.bin
Loading firmware: amdgpu/green_sardine_vcn.bin
Important
The following will only work when sys-kernel/linux-firmware is installed and the required (but in the example above missing) firmware is actually available. For very new graphics cards the firmware may be included in the unstable package, which can be installed using ~ in ACCEPT_KEYWORDS, e.g. ~amd64 like in ACCEPT_KEYWORDS="~amd64" emerge --ask sys-kernel/linux-firmware or by adding it to /etc/portage/package.accept_keywords.

The way the AMDGPU firmware files are named, all files starting with the GPU model code name are the right firmware blobs to include. In the above example the code name is "Green Sardine", thus this command looking for green_sardine will get the required list for CONFIG_EXTRA_FIRMWARE:

user $ls /lib/firmware/amdgpu/green_sardine*.bin | sed 's/\/lib\/firmware\///' | echo $(cat)
amdgpu/green_sardine_asd.bin amdgpu/green_sardine_ce.bin amdgpu/green_sardine_dmcub.bin amdgpu/green_sardine_me.bin amdgpu/green_sardine_mec2.bin amdgpu/green_sardine_mec.bin amdgpu/green_sardine_pfp.bin amdgpu/green_sardine_rlc.bin amdgpu/green_sardine_sdma.bin amdgpu/green_sardine_ta.bin amdgpu/green_sardine_vcn.bin
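The same list can be produced without parsing the output of ls. The sketch below relies only on shell globbing; firmware_list is a hypothetical helper name, and the second argument (the firmware root) is an addition made for this example so that another directory can be inspected.

```shell
# Print all firmware files for one GPU code name as a single
# space-separated line, relative to the firmware root directory.
# $1: code name prefix (e.g. green_sardine)
# $2: firmware root (default: /lib/firmware)
firmware_list() {
    prefix=$1
    root=${2:-/lib/firmware}
    (
        cd "$root" || exit 1
        set -- amdgpu/"$prefix"*.bin
        # If nothing matched, $1 is still the unexpanded pattern.
        [ -e "$1" ] || exit 1
        echo "$*"
    )
}
```

With the function defined in the current shell:

user $firmware_list green_sardine

The pattern only matches names ending in .bin. If the firmware was installed compressed (compress-xz or compress-zstd USE flags), the file names may carry an additional suffix and the pattern would need to be adjusted.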
Note
If using genkernel, refer to the Firmware loading section there. If using Dracut, refer to the Adding files to the image section.
Firmware blobs for a known card model

If you know what card model you have then this section will tell you which blobs you need.

amdgpu/<YOUR-MODEL>.bin or radeon/<YOUR-MODEL>.bin should be replaced with the full list of filenames given with the chipset's name in the table below, separated by spaces. Use echo to expand the filenames. E.g. for Volcanic Islands/TONGA, run:

user $echo amdgpu/tonga_{ce,k_smc,mc,me,mec2,mec,pfp,rlc,sdma1,sdma,smc,uvd,vce}.bin
amdgpu/tonga_ce.bin amdgpu/tonga_k_smc.bin amdgpu/tonga_mc.bin amdgpu/tonga_me.bin amdgpu/tonga_mec2.bin amdgpu/tonga_mec.bin amdgpu/tonga_pfp.bin amdgpu/tonga_rlc.bin amdgpu/tonga_sdma1.bin amdgpu/tonga_sdma.bin amdgpu/tonga_smc.bin amdgpu/tonga_uvd.bin amdgpu/tonga_vce.bin

Then amdgpu/tonga_ce.bin amdgpu/tonga_k_smc.bin amdgpu/tonga_mc.bin amdgpu/tonga_me.bin amdgpu/tonga_mec2.bin amdgpu/tonga_mec.bin amdgpu/tonga_pfp.bin amdgpu/tonga_rlc.bin amdgpu/tonga_sdma1.bin amdgpu/tonga_sdma.bin amdgpu/tonga_smc.bin amdgpu/tonga_uvd.bin amdgpu/tonga_vce.bin is the string that should be put into the kernel configuration.

After expanding the firmware file names from the following table and copying them into the kernel configuration, save the configuration, then compile and install the new kernel and modules.
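Because a single missing blob is enough to make the build or the boot fail, it can help to verify the expanded list against the installed files before compiling. A minimal sketch, where check_firmware is a hypothetical helper name:

```shell
# Report every firmware file from the list that is absent under the
# firmware root.
# $1: firmware root directory
# remaining arguments: paths as they would appear in the kernel configuration
# Returns 0 if all files exist, 1 otherwise.
check_firmware() {
    root=$1
    shift
    status=0
    for f in "$@"; do
        if [ ! -e "$root/$f" ]; then
            echo "missing: $f"
            status=1
        fi
    done
    return "$status"
}
```

With the function defined in a shell that supports brace expansion (such as bash):

user $check_firmware /lib/firmware amdgpu/tonga_{ce,k_smc,mc,me,mec2,mec,pfp,rlc,sdma1,sdma,smc,uvd,vce}.bin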

Chipset name Firmware
CAPE VERDE radeon/verde_{ce,mc,me,pfp,rlc,smc}.bin radeon/TAHITI_{uvd,vce}.bin
PITCAIRN radeon/pitcairn_{ce,mc,me,pfp,rlc,smc,k_smc}.bin radeon/TAHITI_{uvd,vce}.bin
TAHITI radeon/tahiti_{ce,mc,me,pfp,rlc,smc,uvd,vce}.bin
OLAND amdgpu/oland_{uvd,smc,rlc,pfp,me,mc,ce}.bin
HAINAN radeon/hainan_{ce,mc,me,pfp,rlc,smc}.bin radeon/TAHITI_uvd.bin
BONAIRE radeon/bonaire_{ce,k_smc,mc,me,mec,pfp,rlc,sdma1,sdma,smc,uvd,vce}.bin
KABINI radeon/kabini_{ce,me,mec,pfp,rlc,sdma1,sdma,uvd,vce}.bin
KAVERI radeon/kaveri_{ce,me,mec2,mec,pfp,rlc,sdma1,sdma,uvd,vce}.bin
HAWAII amdgpu/hawaii_{ce,k_smc,mc,me,mec,pfp,rlc,sdma,sdma1,smc,uvd,vce}.bin
MULLINS radeon/mullins_{ce,me,mec,pfp,rlc,sdma1,sdma,uvd,vce}.bin
CARRIZO amdgpu/carrizo_{ce,me,mec2,mec,pfp,rlc,sdma1,sdma,uvd,vce}.bin
FIJI amdgpu/fiji_{ce,mc,me,mec2,mec,pfp,rlc,sdma1,sdma,smc,uvd,vce}.bin
TONGA amdgpu/tonga_{ce,k_smc,mc,me,mec2,mec,pfp,rlc,sdma1,sdma,smc,uvd,vce}.bin
TOPAZ amdgpu/topaz_{ce,mc,me,mec2,mec,pfp,rlc,sdma1,sdma,smc}.bin
STONEY amdgpu/stoney_{ce,me,mec,pfp,rlc,sdma,uvd,vce}.bin
POLARIS10 amdgpu/polaris10_{ce,ce_2,k_smc,k2_smc,k_mc,mc,me,me_2,mec2,mec2_2,mec,mec_2,pfp,pfp_2,rlc,sdma1,sdma,smc,smc_sk,uvd,vce}.bin
POLARIS11 amdgpu/polaris11_{ce,k_smc,k2_smc,k_mc,mc,me,mec2,mec,pfp,rlc,sdma1,sdma,smc,smc_sk,uvd,vce}.bin
POLARIS12 amdgpu/polaris12_{ce,ce_2,k_mc,k_smc,mc,me,me_2,mec,mec2,mec2_2,mec_2,pfp,pfp_2,rlc,sdma,sdma1,smc,uvd,vce}.bin
VEGA10 amdgpu/vega10_{acg_smc,asd,ce,gpu_info,me,mec,mec2,pfp,rlc,sdma,sdma1,smc,sos,uvd,vce}.bin
RAVEN amdgpu/raven_{asd,ce,gpu_info,me,mec,mec2,pfp,rlc,sdma,vcn}.bin
VEGA12 amdgpu/vega12_{asd,ce,gpu_info,me,mec,mec2,pfp,rlc,sdma,sdma1,smc,sos,uvd,vce}.bin
RENOIR amdgpu/renoir_{asd,ce,dmcub,gpu_info,me,mec2,mec,pfp,rlc,sdma,ta,vcn}.bin
CEZANNE amdgpu/green_sardine_{asd,ce,dmcub,me,mec2,mec,pfp,rlc,sdma,ta,vcn}.bin
REMBRANDT amdgpu/yellow_carp_{asd,ce,dmcub,me,mec2,mec,pfp,rlc,sdma,ta,toc,vcn}.bin
NAVI10 amdgpu/navi10_{asd,ce,gpu_info,me,mec2,mec,pfp,rlc,sdma1,sdma,smc,sos,ta,vcn}.bin
NAVI14 amdgpu/navi14_{asd,ce,ce_wks,gpu_info,me,mec2,mec2_wks,mec,mec_wks,me_wks,pfp,pfp_wks,rlc,sdma1,sdma,smc,sos,ta,vcn}.bin
NAVI21 amdgpu/sienna_cichlid_{ce,dmcub,me,mec2,mec,pfp,rlc,sdma,smc,sos,ta,vcn}.bin
NAVI22 amdgpu/navy_flounder_{ce,me,mec2,rlc,smc,ta,dmcub,mec,pfp,sdma,sos,vcn}.bin
NAVI23 amdgpu/dimgrey_cavefish_{ce,me,mec2,rlc,smc,ta,dmcub,mec,pfp,sdma,sos,vcn}.bin
NAVI24 amdgpu/beige_goby_{ce,ta,rlc,sos,dmcub,smc,sdma,mec,mec2,pfp,vcn,me}.bin
NAVI31 amdgpu/gc_11_0_0_{imu,pfp,me,rlc,mec,mes,mes1,mes_2}.bin amdgpu/psp_13_0_0_{sos,ta}.bin amdgpu/smu_13_0_0.bin amdgpu/dcn_3_2_0_dmcub.bin amdgpu/sdma_6_0_0.bin amdgpu/vcn_4_0_0.bin
NAVI32 amdgpu/dcn_3_2_0_dmcub.bin amdgpu/gc_11_0_3_{imu,me,mec,mes1,mes_2,pfp,rlc}.bin amdgpu/psp_13_0_10_{sos,ta}.bin amdgpu/sdma_6_0_3.bin amdgpu/smu_13_0_10.bin amdgpu/vcn_4_0_0.bin

X11 driver

Emerge

Portage uses the VIDEO_CARDS USE_EXPAND variable for enabling support for various graphics cards in packages. Setting VIDEO_CARDS to amdgpu radeonsi (see the feature matrix section above) and asking Portage to update changed USE flags in the @world set will pull in the correct driver:

FILE /etc/portage/make.conf
VIDEO_CARDS="amdgpu radeonsi"
Note
All AMD video cards supported by amdgpu require video_cards_radeonsi to enable OpenGL support provided by media-libs/mesa. This adds a hard-dependency on x11-libs/libdrm with video_cards_radeon enabled, which may be satisfied via /etc/portage/package.use if support for the old radeon kernel driver is not desired.
root #emerge --ask --deep --changed-use @world

The system should now be prepared to use amdgpu after the next reboot.

Power management

Note
This section only covers the newer AMDGPU Dynamic Power Management (DPM) methods (beginning from Radeon HD 2000 series / r600). Older dynpm and profile methods can be found on the radeon wiki page.

Dynamic Power Management (DPM) is a technique that allows for the driver to dynamically adjust the core clock frequency, memory clock frequency, and voltage levels based on the current GPU demand. Since kernel 3.13, DPM is enabled by default for a majority of AMD hardware.[12]

Starting with kernel 4.5,[13] AMDGPU supports PowerPlay profiles. These profiles replace power_dpm_state on newer hardware.

Important
The following sections assume that card0 is the GPU users want to adjust. Identify the correct card number listed in /sys/class/drm/ and modify the commands accordingly.
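On systems with more than one GPU, the card number can be found by checking which driver each card is bound to. The following sketch assumes the usual sysfs layout, in which /sys/class/drm/cardN/device/driver is a symbolic link to the driver directory; amdgpu_cards is a hypothetical helper name.

```shell
# List DRM card nodes whose kernel driver is amdgpu.
# $1: DRM class directory (default: /sys/class/drm)
amdgpu_cards() {
    drm=${1:-/sys/class/drm}
    for card in "$drm"/card[0-9]*; do
        # Skip connector entries such as card0-DP-1.
        case ${card##*/} in *-*) continue ;; esac
        [ -e "$card/device/driver" ] || continue
        driver=$(basename "$(readlink -f "$card/device/driver")")
        [ "$driver" = amdgpu ] && echo "${card##*/}"
    done
    return 0
}
```

With the function defined in the current shell:

user $amdgpu_cards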

To check if the system is using PowerPlay, review the contents of the device's sysfs directory:

user $ls /sys/class/drm/card0/device/pp_*

Any files returned with the pp_ prefix indicate PowerPlay is implemented by the drivers.

Warning
The 'files' in /sys/class/drm/card0/device/ expose low-level graphics APIs. Changing their contents requires specific command operations and may irrevocably damage system hardware.[14]

Enabling DPM and PowerPlay features

DPM

The following kernel parameter can be used to explicitly enable (1) or disable (0) DPM. The default is -1 (auto)[15].

CODE Enable DPM
amdgpu.dpm=1

PowerPlay feature mask

Note
Enabling DPM also appears to enable PowerPlay, if it is supported, as the kernel parameter amdgpu.powerplay has not been documented since kernel 4.5[16].

The PowerPlay feature mask kernel parameter overrides display features of the GPU. It is required to unlock access to adjust clocks and voltages in sysfs. The mask consists of 32 bits, of which 20 features are currently implemented[17]. The default is the current set of stable display features[18].

CODE kernel:root/drivers/gpu/drm/amd/include/amd_shared.h
* @PP_SCLK_DPM_MASK: Dynamic adjustment of the system (graphics) clock.
* @PP_MCLK_DPM_MASK: Dynamic adjustment of the memory clock.
* @PP_PCIE_DPM_MASK: Dynamic adjustment of PCIE clocks and lanes.
* @PP_SCLK_DEEP_SLEEP_MASK: System (graphics) clock deep sleep.
* @PP_POWER_CONTAINMENT_MASK: Power containment.
* @PP_UVD_HANDSHAKE_MASK: Unified video decoder handshake.
* @PP_SMC_VOLTAGE_CONTROL_MASK: Dynamic voltage control.
* @PP_VBI_TIME_SUPPORT_MASK: Vertical blank interval support.
* @PP_ULV_MASK: Ultra low voltage.
* @PP_ENABLE_GFX_CG_THRU_SMU: SMU control of GFX engine clockgating.
* @PP_CLOCK_STRETCH_MASK: Clock stretching.
* @PP_OD_FUZZY_FAN_CONTROL_MASK: Overdrive fuzzy fan control.
* @PP_SOCCLK_DPM_MASK: Dynamic adjustment of the SoC clock.
* @PP_DCEFCLK_DPM_MASK: Dynamic adjustment of the Display Controller Engine Fabric clock.
* @PP_OVERDRIVE_MASK: Over- and under-clocking support.
* @PP_GFXOFF_MASK: Dynamic graphics engine power control.
* @PP_ACG_MASK: Adaptive clock generator.
* @PP_STUTTER_MODE: Stutter mode.
* @PP_AVFS_MASK: Adaptive voltage and frequency scaling.
* @PP_GFX_DCS_MASK: GFX Async DCS.

Determine the current system mask:

user $printf 'amdgpu.ppfeaturemask=0x%08x\n' "$(($(cat /sys/module/amdgpu/parameters/ppfeaturemask)))"
amdgpu.ppfeaturemask=0x0007bfff
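To see which features a given mask enables, the bits can be mapped back to the names from amd_shared.h. The sketch below is illustrative only. It assumes that bit N corresponds to the Nth entry (counting from 0) of the list quoted above, which is consistent with PP_OVERDRIVE_MASK being 0x4000 but should be checked against the kernel sources in use. pp_features is a hypothetical helper name.

```shell
# Print the name of every PowerPlay feature whose bit is set in a mask.
# $1: mask, decimal or 0x-prefixed hexadecimal
# ASSUMPTION: bit N matches the Nth entry (from 0) of the amd_shared.h list.
pp_features() {
    mask=$(( $1 ))
    bit=0
    for name in \
        PP_SCLK_DPM_MASK PP_MCLK_DPM_MASK PP_PCIE_DPM_MASK \
        PP_SCLK_DEEP_SLEEP_MASK PP_POWER_CONTAINMENT_MASK \
        PP_UVD_HANDSHAKE_MASK PP_SMC_VOLTAGE_CONTROL_MASK \
        PP_VBI_TIME_SUPPORT_MASK PP_ULV_MASK PP_ENABLE_GFX_CG_THRU_SMU \
        PP_CLOCK_STRETCH_MASK PP_OD_FUZZY_FAN_CONTROL_MASK \
        PP_SOCCLK_DPM_MASK PP_DCEFCLK_DPM_MASK PP_OVERDRIVE_MASK \
        PP_GFXOFF_MASK PP_ACG_MASK PP_STUTTER_MODE PP_AVFS_MASK \
        PP_GFX_DCS_MASK
    do
        if [ $(( (mask >> bit) & 1 )) -eq 1 ]; then
            echo "$name"
        fi
        bit=$(( bit + 1 ))
    done
}
```

With the function defined in the current shell:

user $pp_features "$(cat /sys/module/amdgpu/parameters/ppfeaturemask)"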

Features may be changed by setting the kernel parameter at boot.

CODE Set PowerPlay mask
amdgpu.ppfeaturemask=0x0007bfff
Note
Setting all 32 bits (0xffffffff) is not recommended as this will enable potentially unstable features by default.
Warning
Enabling or disabling features without understanding their intent may lead to hardware damage or data loss.

Configuration

Warning
Some of these instructions interface directly with the hardware and have the potential to irreversibly damage the device. Proofread the commands before executing.

AMDGPU handles configuration of the hardware through exposed APIs, using sysfs files located in /sys/class/drm/card0/device/. The files contained within this directory will depend on the specific hardware and features that are enabled. Some of the files can be safely read by an unprivileged user with cat, less, or any other text viewer. Many of the files, however, output binary data that is not human readable.

Adjusting the clock rates and voltages (under/over clocking) is accomplished through the DPM and PowerPlay APIs. The full documentation can be found at kernel.org and should be reviewed before proceeding.

Viewing current metrics

The amdgpu driver provides a sysfs API for retrieving current GPU metrics data through the gpu_metrics file, which gives a snapshot of all sensors at the same time. This includes temperature, frequency, engine utilization, power consumption, throttler status, fan speed, and CPU core statistics (available for APUs only).

It can be parsed using a script such as amdgpu_metrics.py.

Update feature mask

Before any parameters can be adjusted, the correct feature mask must be set with a kernel parameter. Generally, setting the PP_OVERDRIVE_MASK bit 0x4000 in combination with the system's current mask is sufficient for adjusting the profile, clock, and voltage values.

Determine the new system mask.

user $printf 'amdgpu.ppfeaturemask=0x%08x\n' "$(($(cat /sys/module/amdgpu/parameters/ppfeaturemask) | 0x4000))"
amdgpu.ppfeaturemask=0x0007ffff

Update kernel parameter.

CODE kernel parameter: PowerPlay feature mask
amdgpu.ppfeaturemask=0x0007ffff

Performance profiles

The amdgpu driver provides a sysfs API for adjusting certain power related parameters. The file power_dpm_force_performance_level is used for this. A full description of the profiles can be found in the kernel documentation. The default is set to 'auto'.

The performance profile must be set to manual to enable modification of power profiles, clock speeds, and voltages.

To change the current profile:

root #echo 'manual' > /sys/class/drm/card0/device/power_dpm_force_performance_level
Note
These changes do not persist after a reboot.
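A typo in the value written to power_dpm_force_performance_level is rejected by the kernel with a write error, so a small wrapper that checks the value first and then reads the result back can be convenient. This is a sketch: set_perf_level is a hypothetical helper name, and the accepted values are those named in the kernel documentation at the time of writing, which is an assumption that may not hold for every kernel version.

```shell
# Write a performance level after checking it against a list of known
# values, then print the level that is in effect afterwards.
# $1: level
# $2: device directory (default: /sys/class/drm/card0/device)
set_perf_level() {
    level=$1
    dev=${2:-/sys/class/drm/card0/device}
    case $level in
        auto|low|high|manual) ;;
        profile_standard|profile_min_sclk|profile_min_mclk|profile_peak) ;;
        *)
            echo "unknown level: $level" >&2
            return 1
            ;;
    esac
    echo "$level" > "$dev/power_dpm_force_performance_level" || return 1
    cat "$dev/power_dpm_force_performance_level"
}
```

With the function defined in a root shell:

root #set_perf_level manual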

Power states

The amdgpu driver provides a sysfs API for adjusting the heuristics related to switching between power levels in a power state. The file pp_power_profile_mode is used for this. A full description of the profiles can be found in the kernel documentation.

To view the supported profiles, look at the contents of the pp_power_profile_mode file (the asterisk * marks the current profile):

Note
The output of this command will vary depending on the specific hardware and kernel drivers.
user $cat /sys/class/drm/card0/device/pp_power_profile_mode
PROFILE_INDEX(NAME) CLOCK_TYPE(NAME) FPS MinFreqType MinActiveFreqType MinActiveFreq BoosterFreqType BoosterFreq PD_Data_limit_c PD_Data_error_coeff PD_Data_error_rate_coeff
 0 BOOTUP_DEFAULT :
                    0(       GFXCLK)       0       5       1       0       4     800 4587520  -65536       0
                    1(       SOCCLK)       0       5       1       0       1       0 3276800   -6553   -6553
                    2(        MEMLK)       0       5       1       0       4     800  327680  -65536       0
 1 3D_FULL_SCREEN :
                    0(       GFXCLK)       0       5       1       0       4     650 4587520   -3276  -65536
                    1(       SOCCLK)       0       5       1       0       1       0  655360   -6553   -6553
                    2(        MEMLK)       0       5       4     850       4     800  327680  -65536       0
 2   POWER_SAVING :
                    0(       GFXCLK)       0       5       1       0       3       0 5898240  -65536       0
                    1(       SOCCLK)       0       5       1       0       1       0 3407872   -6553   -6553
                    2(        MEMLK)       0       5       1       0       3       0 1966080  -65536       0
 3          VIDEO*:
                    0(       GFXCLK)       0       5       1       0       4     500 4587520  -65536       0
                    1(       SOCCLK)       0       5       1       0       1       0 3473408   -6553   -6553
                    2(        MEMLK)       0       5       1       0       4     500 1966080  -65536       0
 4             VR :
                    0(       GFXCLK)       0       5       4    1000       1       0 3932160       0       0
                    1(       SOCCLK)       0       5       1       0       1       0  655360   -6553   -6553
                    2(        MEMLK)       0       5       1       0       4     800  327680  -65536       0
 5        COMPUTE :
                    0(       GFXCLK)       0       5       4    1000       1       0 3932160       0       0
                    1(       SOCCLK)       0       5       1       0       1       0  655360   -6553   -6553
                    2(        MEMLK)       0       5       4     850       3       0  327680  -65536  -32768
 6         CUSTOM :
                    0(       GFXCLK)       0       5       1       0       4     800 4587520  -65536       0
                    1(       SOCCLK)       0       5       1       0       1       0 3276800   -6553   -6553
                    2(        MEMLK)       0       5       1       0       4     800  327680  -65536       0

To update the power profile, first change the performance mode to manual.

root #echo 'manual' > /sys/class/drm/card0/device/power_dpm_force_performance_level

Then update pp_power_profile_mode with the number of the pre-defined profile.

root #echo '3' > /sys/class/drm/card0/device/pp_power_profile_mode

The power profiles can be modified by sending commands to the pp_power_profile_mode file. The command syntax begins with the profile index number, then the clock type number, followed by a number for each column in the output.

For example, to change the CUSTOM power profile GFXCLK Booster Frequency from 800 to 500:

root #echo '6 0 0 5 1 0 4 500 4587520 -65536 0' > /sys/class/drm/card0/device/pp_power_profile_mode
Note
These changes do not persist after a reboot.
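To confirm which profile is active after a change, the line carrying the asterisk can be extracted from the file. A minimal sketch, assuming the layout shown in the example output above (the format differs between hardware generations); current_profile is a hypothetical helper name.

```shell
# Print the index and name of the profile marked with "*" in
# pp_power_profile_mode text read from stdin.
current_profile() {
    awk '/\*/ { sub(/\*.*/, ""); print $1, $2; exit }'
}
```

With the function defined in the current shell:

user $current_profile < /sys/class/drm/card0/device/pp_power_profile_mode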

Power levels

The amdgpu driver provides a sysfs API for adjusting what power levels are enabled for a given power state. The files pp_dpm_sclk, pp_dpm_mclk, pp_dpm_socclk, pp_dpm_fclk, pp_dpm_dcefclk and pp_dpm_pcie are used for this. A full description of the profiles can be found in the kernel documentation.

pp_dpm_socclk and pp_dpm_dcefclk interfaces are only available for Vega10 and later ASICs. pp_dpm_fclk interface is only available for Vega20 and later ASICs.

Reading back the files will show the available power levels within the power state and the clock information for those levels.

user $cat /sys/class/drm/card0/device/pp_dpm_sclk
0: 500Mhz 
1: 700Mhz *
2: 2765Mhz 
user $cat /sys/class/drm/card0/device/pp_dpm_mclk
0: 96Mhz *
1: 541Mhz 
2: 675Mhz 
3: 1094Mhz 
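The level marked with an asterisk is the one currently in use. The sketch below extracts it and also shows how the set of enabled levels might be restricted. It assumes the interface described in the kernel documentation, where a space-separated list of level indexes is written to the file after the performance level has been set to manual. active_clock and limit_levels are hypothetical helper names.

```shell
# Print the clock of the level marked active (*) in pp_dpm_* text
# read from stdin.
active_clock() {
    awk '/\*/ { print $2; exit }'
}

# Restrict one clock domain to the given level indexes.
# $1: device directory, e.g. /sys/class/drm/card0/device
# $2: file name, e.g. pp_dpm_sclk
# remaining arguments: level indexes to keep enabled
limit_levels() {
    dev=$1
    file=$2
    shift 2
    echo manual > "$dev/power_dpm_force_performance_level" &&
    echo "$*" > "$dev/$file"
}
```

With the functions defined in the current shell:

user $active_clock < /sys/class/drm/card0/device/pp_dpm_sclk
root #limit_levels /sys/class/drm/card0/device pp_dpm_sclk 0 1

As with the other sysfs settings, such a restriction does not persist after a reboot.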

Clock speed and voltage

The amdgpu driver provides a sysfs API for adjusting the clocks and voltages in each power level within a power state. The file pp_od_clk_voltage is used for this. A full description of the profiles can be found in the kernel documentation.

Determine the current values.

user $cat /sys/class/drm/card0/device/pp_od_clk_voltage
OD_SCLK:
 0: 700Mhz
 1: 2744Mhz
OD_MCLK:
 0: 97Mhz
 1: 1094MHz
OD_VDDGFX_OFFSET:
 0mV
OD_RANGE:
 SCLK:     500Mhz       3150Mhz
 MCLK:     674Mhz       1200Mhz
Note
The actual memory controller clock rates are shown here, not the effective clock of the DRAMs.

To update the clock speeds and voltages, first change the performance mode to manual.

root #echo 'manual' > /sys/class/drm/card0/device/power_dpm_force_performance_level

Then write a string to the file for each adjustment. Follow the syntax given in the kernel documentation.

root #echo 's 1 2410' > /sys/class/drm/card0/device/pp_od_clk_voltage
root #echo 'm 1 1024' > /sys/class/drm/card0/device/pp_od_clk_voltage

Once finished, commit your changes.

root #echo 'c' > /sys/class/drm/card0/device/pp_od_clk_voltage
user $cat /sys/class/drm/card0/device/pp_od_clk_voltage
OD_SCLK:
 0: 700Mhz
 1: 2410Mhz
OD_MCLK:
 0: 97Mhz
 1: 1024MHz
OD_VDDGFX_OFFSET:
 0mV
OD_RANGE:
 SCLK:     500Mhz       3150Mhz
 MCLK:     674Mhz       1200Mhz

These changes can be reverted.

root #echo 'r' > /sys/class/drm/card0/device/pp_od_clk_voltage
Note
These changes do not persist after a reboot.
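Before writing a new clock, it is worth checking that the value lies inside the limits the driver reports under OD_RANGE. The following sketch assumes the OD_RANGE layout shown in the example output above; in_od_range is a hypothetical helper name.

```shell
# Succeed if a clock (in MHz) lies inside the OD_RANGE limits of one
# domain. Reads pp_od_clk_voltage text on stdin.
# $1: domain (SCLK or MCLK)
# $2: clock in MHz
in_od_range() {
    awk -v dom="$1:" -v mhz="$2" '
        $1 == "OD_RANGE:" { in_range = 1; next }
        in_range && $1 == dom {
            lo = $2 + 0      # "500Mhz" + 0 evaluates to 500
            hi = $3 + 0
            found = 1
            exit
        }
        END { exit !(found && mhz + 0 >= lo && mhz + 0 <= hi) }
    '
}
```

With the function defined in the current shell, the check can guard the write:

root #in_od_range SCLK 2410 < /sys/class/drm/card0/device/pp_od_clk_voltage && echo 's 1 2410' > /sys/class/drm/card0/device/pp_od_clk_voltage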

Troubleshooting

Debug tools

x11-apps/mesa-progs

It might be helpful to install the package x11-apps/mesa-progs, which provides the glxgears and glxinfo utilities.

app-misc/radeontop

View the GPU utilization, both for the total activity percent and individual blocks:

user $radeontop
Collecting data, please wait....
            radeontop 1.4, running on RAVEN bus 06, 120 samples/sec
                                         │
                   Graphics pipe   0.00% │
─────────────────────────────────────────┼──────────────────────────────────────
                    Event Engine   0.00% │
                                         │
     Vertex Grouper + Tesselator   0.00% │
                                         │
               Texture Addresser   0.00% │
                                         │
                   Shader Export   0.00% │
     Sequencer Instruction Cache   0.00% │
             Shader Interpolator   0.00% │
                                         │
                  Scan Converter   0.00% │
              Primitive Assembly   0.00% │
                                         │
                     Depth Block   0.00% │
                     Color Block   0.00% │
                                         │
                67M / 2016M VRAM   3.35% │ 
                 23M / 3063M GTT   0.76% │
      1.20G / 1.20G Memory Clock 100.00% │████████████████████████████████████████                               
      0.20G / 1.20G Shader Clock  16.67% │██████
                                         │

Identifying which graphics card is in use

First make sure that the kernel was compiled with the following settings:

KERNEL Activate VGA Arbitration (CONFIG_VGA_ARB) and Laptop Hybrid Graphics (CONFIG_VGA_SWITCHEROO)
Device Drivers --->
    Graphics support  --->
        -*- VGA Arbitration
        [*] Laptop Hybrid Graphics - GPU switching support

Check whether the discrete graphics card was recognized:

user $lspci -k
[...]
01:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Mars [Radeon HD 8670A/8670M/8750M]
        Subsystem: Lenovo Mars [Radeon HD 8670A/8670M/8750M]
        Kernel driver in use: radeon
[...]

After that, make sure that /sys/kernel/debug/ was mounted successfully:

root #findmnt debugfs
TARGET            SOURCE  FSTYPE  OPTIONS
/sys/kernel/debug debugfs debugfs rw,nosuid,nodev,noexec,relatime

Then check whether the vga_switcheroo driver was loaded successfully and reports values:

root #cat /sys/kernel/debug/vgaswitcheroo/switch
0:DIS: :DynOff:0000:01:00.0
1:IGD:+:Pwr:0000:00:02.0

This output has the following structure[19]:

Iterator | ID | Active state | Power state | Device ID (xxxx:xx:xx.x)
0 | DIS | inactive (denoted by the lack of a + symbol) | DynOff | 0000:01:00.0
1 | IGD | active (denoted by + symbol) | Pwr | 0000:00:02.0

DIS represents the discrete graphics card, which is inactive (no + symbol) and currently powered down (DynOff).
IGD is the integrated graphics card, which is active (+) and currently powered on (Pwr).
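
The field layout described above can be split mechanically. The following is a minimal sketch, assuming every line follows the five-field, colon-separated layout shown in the example output; the names SwitchEntry and parse_switch are hypothetical helpers, not part of any kernel or distribution tooling.

```python
from typing import List, NamedTuple


class SwitchEntry(NamedTuple):
    index: int         # iterator, e.g. 0
    client_id: str     # "DIS" or "IGD"
    active: bool       # True when the "+" marker is present
    power_state: str   # e.g. "Pwr", "DynOff", "DynPwr"
    pci_address: str   # e.g. "0000:01:00.0"


def parse_switch(text: str) -> List[SwitchEntry]:
    """Parse the text read from /sys/kernel/debug/vgaswitcheroo/switch."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split into at most five fields, because the PCI address
        # itself contains colons.
        index, client_id, marker, power, pci = line.split(":", 4)
        entries.append(
            SwitchEntry(int(index), client_id, marker == "+", power, pci)
        )
    return entries
```

Reading the file requires root and a mounted debugfs, so the function takes the file contents as a string; for the output above it would report DIS as inactive with power state DynOff.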

The status can be manipulated using the following command:

root #echo "<some_parameter>" > /sys/kernel/debug/vgaswitcheroo/switch

Replace <some_parameter> with one of the following parameters[20]:

Parameter | Description
ON | Turns on the GPU that is currently disconnected (not displaying anything); does not switch outputs.
IGD | Connects the integrated graphics card with the display.
DIS | Connects the discrete graphics card with the display.
OFF | Turns off the graphics card that is currently disconnected.
DIGD | Inside of an X session: queues a switch to the integrated graphics card, which occurs when the X server is next restarted.
DDIS | Inside of an X session: queues a switch to the discrete graphics card, which occurs when the X server is next restarted.

Setting the environment variable DRI_PRIME=1 renders an individual application on the discrete graphics card:

user $DRI_PRIME=1 glxgears

This opens an X window with rotating gears.

Let it run in the background and check vga_switcheroo again:

root #cat /sys/kernel/debug/vgaswitcheroo/switch
0:DIS: :DynPwr:0000:01:00.0
1:IGD:+:Pwr:0000:00:02.0
Note
This time the status of the discrete graphics card switched to DynPwr, which means that it is powered on and running.

Another indicator is to check the temperature sensors. This requires sys-apps/lm-sensors:

user $sensors
[...]
radeon-pci-0100
Adapter: PCI adapter
temp1:            +42.0°C  (crit = +120.0°C, hyst = +90.0°C)
[...]
Note
When vga_switcheroo displays the status DynOff, sensors will display the temperature as N/A or as a value that makes no sense, for example -128°C.

To use the discrete graphics card globally, one can set the environment variable in the /etc/environment file:

FILE /etc/environment
DRI_PRIME=1

Alternatively, export it in the ~/.bashrc file:

FILE /home/larry/.bashrc
export DRI_PRIME=1

Or set it per application by prefixing the command, as with glxgears above:

user $DRI_PRIME=1 /usr/bin/chromium
user $DRI_PRIME=1 /usr/bin/vlc

Prime Synchronization

The x11-drivers/xf86-video-amdgpu driver does not support Prime Synchronization. This might cause tearing on monitors connected to the integrated GPU if the AMD GPU is set as the primary GPU. One possible workaround is to use the modesetting driver instead: either remove amdgpu from the VIDEO_CARDS variable, or use an Xorg configuration file to force the use of the modesetting driver. That being said, other issues may be encountered with the modesetting driver[21].

FILE /etc/X11/xorg.conf.d/force-modesetting.conf
Section "Device"
  Identifier "modesetting"
  Driver "modesetting"
EndSection

Another possible workaround is to set the integrated GPU as the primary GPU. This will not enable Prime Synchronization, but tearing will still be prevented by AMD's TearFree. In this case, set the DRI_PRIME=1, VDPAU_DRIVER=radeonsi (for VDPAU), and LIBVA_DRIVER_NAME=radeonsi (for VAAPI) variables for applications that should be rendered on the AMD GPU.

Fallback driver

If no other machine is available to search the web for solutions, the vesa or fbdev drivers can be used to start X without 2D and 3D acceleration.

  • Vesa for classic BIOS systems
  • Fbdev for UEFI booted systems
FILE /etc/portage/make.conf
VIDEO_CARDS="... vesa fbdev"
root #emerge --ask --update --newuse --deep @world

Kernel

Older kernels

Older kernels that do not support the amdgpu driver will not provide the AMDGPU option. For VEGA and newer chips there is no video output without DC (Display Core), which was first included in vanilla kernel 4.15. In both cases a fairly recent kernel can provide the required drivers. For very new AMD graphics cards and APUs, trying an unstable (denoted by a ~) kernel may provide the required kernel sources.

AMD Secure Memory Encryption

If amdgpu fails to load or the screen stays frozen, it might be an incompatibility of the amdgpu module with AMD Secure Memory Encryption (SME).

SME can be temporarily disabled by adding mem_encrypt=off to the kernel command line (either interactively in GRUB, or through GRUB_CMDLINE_LINUX in /etc/default/grub). If this fixes the issue, a permanent solution is to configure the kernel accordingly.

KERNEL
Processor type and features  --->
    [*] AMD Secure Memory Encryption (SME) support
    [ ]   Activate AMD Secure Memory Encryption (SME) by default

AMD_MEM_ENCRYPT may remain enabled, but either AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT must remain unset or the kernel command line option mem_encrypt=off must be used in order to turn Memory Encryption off. Likewise, with mem_encrypt=on SME can be activated for unaffected systems on the kernel command line or more permanently using GRUB_CMDLINE_LINUX in /etc/default/grub for GRUB.
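
To confirm which mem_encrypt setting the running kernel was actually booted with, the command line can be inspected. The following is a minimal sketch, assuming the command line is available as plain text (on Linux it can be read from /proc/cmdline); mem_encrypt_values is a hypothetical helper name.

```python
from typing import List


def mem_encrypt_values(cmdline: str) -> List[str]:
    """Return every value passed as mem_encrypt=<value>, in order of appearance."""
    prefix = "mem_encrypt="
    return [
        token[len(prefix):]
        for token in cmdline.split()
        if token.startswith(prefix)
    ]


if __name__ == "__main__":
    # /proc/cmdline holds the command line of the running kernel.
    with open("/proc/cmdline") as handle:
        print(mem_encrypt_values(handle.read()))
```

An empty list means the option was not passed at all, in which case the kernel falls back to the AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT setting described above.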

AMDGPU/RadeonSI drivers do not work

If the graphics card is not supported by including amdgpu and radeonsi alone in VIDEO_CARDS, try adding radeon to make.conf's VIDEO_CARDS definition. For example:

FILE /etc/portage/make.conf
VIDEO_CARDS="amdgpu radeonsi radeon"

After the values have been set, update the system so the changes take effect:

root #emerge --ask --changed-use --deep @world

Full-screen windows perform poorly

The installed version of sys-devel/llvm may be too old. Try emerging an unstable/testing version.

GPU Name shows up as id

The installed version of x11-libs/libdrm may be too old. Try emerging an unstable/testing version. This might also improve performance.

Xrandr doesn't see HDMI port with hybrid system

On a hybrid system with an AMD iGPU and dGPU, xrandr may show only the eDP port, but not HDMI:

user $xrandr
Screen 0: minimum 320 x 200, current 1920 x 2160, maximum 16384 x 16384
eDP connected primary 1920x1080+0+1080 (normal left inverted right x axis y axis) 382mm x 215mm
   1920x1080    144.03*+  60.01  
   1680x1050    144.03  
   1280x1024    144.03  
   1440x900     144.03  
   1280x800     144.03  
   1280x720     144.03  
   1024x768     144.03  
   800x600      144.03  
   640x480      144.03  

The Xorg log, however, shows that the port was detected and the monitor's EDID was decoded without issues:

user $cat /var/log/Xorg.0.log
[...]
[     8.282] (II) AMDGPU(G0): Output HDMI-A-1-0 has no monitor section
[     8.294] (II) AMDGPU(G0): EDID for output HDMI-A-1-0
[     8.295] (II) AMDGPU(G0): Manufacturer: DEL  Model: a11e  Serial#: 843731010
[...]
[     8.295] (II) AMDGPU(G0): Supported established timings:
[     8.295] (II) AMDGPU(G0): 720x400@70Hz
[...]
[     8.295] (II) AMDGPU(G0): EDID (in hex):
[     8.295] (II) AMDGPU(G0):   00ffffffffffff0010ac1ea142504a32
[     8.295] (II) AMDGPU(G0):   0c1f010380351e78ea05f5a557529c27
[...]
[     8.296] (II) AMDGPU(G0): Printing probed modes for output HDMI-A-1-0
[     8.296] (II) AMDGPU(G0): Modeline "1920x1080"x60.0  148.50  1920 2008 2052 2200  1080 1084 1089 1125 +hsync +vsync (67.5 kHz eP)
[...]

If the log does not show this, the problem is a different one and should be addressed first.

xrandr should list two providers, since there are two GPUs:

user $xrandr --listproviders
Providers: number : 2
Provider 0: id: 0x54 cap: 0xf, Source Output, Sink Output, Source Offload, Sink Offload crtcs: 4 outputs: 1 associated providers: 1 name:Unknown AMD Radeon GPU @ pci:0000:07:00.0
Provider 1: id: 0x84 cap: 0xf, Source Output, Sink Output, Source Offload, Sink Offload crtcs: 5 outputs: 1 associated providers: 1 name:Radeon RX 5500M @ pci:0000:03:00.0

The output provider then needs to be linked to its source provider:

user $xrandr --setprovideroutputsource provider source

For the example above, this would be:

user $xrandr --setprovideroutputsource 1 0

After this, xrandr shows the HDMI output and can manipulate the layout properly:

user $xrandr
Screen 0: minimum 320 x 200, current 1920 x 2160, maximum 16384 x 16384
eDP connected primary 1920x1080+0+1080 (normal left inverted right x axis y axis) 382mm x 215mm
   1920x1080    144.03*+  60.01  
[...]
HDMI-A-1-0 connected 1920x1080+0+0 (normal left inverted right x axis y axis) 527mm x 296mm
   1920x1080     60.00*+  50.00    59.94  
[...]

Screen Tearing

One method to prevent screen tearing on Xorg is to enable the TearFree option in X11 like so:

FILE /usr/share/X11/xorg.conf.d/10-amdgpu.conf
Section "OutputClass"
	Identifier "AMDgpu"
	MatchDriver "amdgpu"
	Driver "amdgpu"
	Option "TearFree" "true"
EndSection

Flickering and white screens

Note
This issue has already been reported in the Gentoo forums: https://forums.gentoo.org/viewtopic-t-1160883.html

The fix suggested upstream is to set the sg_display module parameter to 0, for example with amdgpu.sg_display=0 on the kernel command line.

As an alternative apply the following patch to the kernel source code: https://patchwork.freedesktop.org/patch/519023

This issue appears to affect Linux kernels >= 6.1.4.

Frequent and Sporadic Crashes

Some users may be experiencing frequent and seemingly random graphics card crashes while using the AMDGPU drivers. Checking the kernel log may show many different errors, some common ones involving *ERROR* Waiting for fences timed out! and *ERROR* ring gfx timeout. This is usually followed by a reset of the graphics device/drivers.
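
To get a rough idea of how often these errors occur, a saved kernel log can be scanned for the two messages quoted above. The following is a minimal sketch, assuming the log has already been captured as text (for example from dmesg output); count_gpu_errors is a hypothetical helper, and only the two message fragments named in this section are matched.

```python
from typing import Dict

# Message fragments quoted in this section; other amdgpu errors are not covered.
SIGNATURES = ("Waiting for fences timed out!", "ring gfx timeout")


def count_gpu_errors(log_text: str) -> Dict[str, int]:
    """Count kernel log lines that contain *ERROR* and one of the known fragments."""
    counts = {signature: 0 for signature in SIGNATURES}
    for line in log_text.splitlines():
        if "*ERROR*" not in line:
            continue
        for signature in SIGNATURES:
            if signature in line:
                counts[signature] += 1
    return counts
```

Non-zero counts only confirm that the symptoms described here are present; they do not by themselves identify the cause.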

This may be caused by an unintentional overclocking of the hardware, either by the AMDGPU driver or the device firmware. The following steps will show how to check the current system configuration and state. If discrepancies are found, reference the Power management section above for details on how to modify these values.

Note
This is a specific example discovered by one user. The examples below assume the GPU is CARD0 and the output is specific to a Radeon™ RX 6650 XT EAGLE 8G.
Note
This only applies to hardware that has Dynamic Power Management (DPM) enabled. DPM is turned on by default for most modern AMDGPUs.

Begin by looking up the graphics card specifications, using a database such as the TechPowerUp GPU Database or the manufacturer's specifications.

Radeon™ RX 6650 XT EAGLE 8G
Source | Base Clock | Game Clock | Boost Clock | Effective Memory Clock | Effective VRAM Bus | Bandwidth
Specification | 2055 MHz | 2410 MHz | 2635 MHz | 2190 MHz (17.5 Gbps) | 128-bit | 280.3 GBps

In this example:

  • Base Clock is the default clock rate.
  • Game Clock is the expected clock rate when running typical gaming applications.
  • Boost Clock is the maximum clock rate when running a burst (infrequent) workload.
  • Memory Clock is the effective memory clock rate; the base DRAM clock rate multiplied by the number of channels.
  • VRAM Bus is the effective data bus bit width.
  • Bandwidth is the rate of data transfer (data_rate * bus_width / 8)


Navigate to the device's AMDGPU sysfs directory.

user $cd /sys/class/drm/card0/device/

View the defined core and memory clock rates listed in the pp_dpm_sclk and pp_dpm_mclk files. The system uses these values to automatically adjust the clock rates under various loads. The current rate is denoted with an asterisk.

user $cat pp_dpm_sclk
0: 500Mhz 
1: 700Mhz *
2: 2765Mhz 
user $cat pp_dpm_mclk
0: 96Mhz *
1: 541Mhz 
2: 675Mhz 
3: 1094Mhz 

Verify the engine clock SCLK is within the limits of the hardware. In this case, the minimum rate is 500 MHz; it increases to 700 MHz under load, and then up to a maximum of 2765 MHz.

Verify the memory clock MCLK is within the limits of the hardware. In this case, the minimum reported rate is 96 MHz and the maximum is 1094 MHz.
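
The comparison against the specification can also be done programmatically. The following is a minimal sketch, assuming the pp_dpm_sclk and pp_dpm_mclk files use the "index: valueMhz" layout shown above, with an asterisk marking the current level; parse_dpm_levels and exceeds_spec are hypothetical helper names.

```python
import re
from typing import List, Optional, Tuple

# Matches lines such as "1: 700Mhz *"; the unit is matched case-insensitively.
LEVEL_LINE = re.compile(r"\s*(\d+):\s*(\d+)\s*mhz\s*(\*)?", re.IGNORECASE)


def parse_dpm_levels(text: str) -> Tuple[List[int], Optional[int]]:
    """Return (all clock levels in MHz, currently selected level or None)."""
    levels = []
    current = None
    for line in text.splitlines():
        match = LEVEL_LINE.match(line)
        if not match:
            continue
        mhz = int(match.group(2))
        levels.append(mhz)
        if match.group(3):
            current = mhz
    return levels, current


def exceeds_spec(levels: List[int], spec_max_mhz: int) -> bool:
    """True when the highest reported level is above the specified maximum."""
    return bool(levels) and max(levels) > spec_max_mhz
```

For the pp_dpm_sclk output above, the levels are 500, 700 and 2765 MHz with 700 MHz selected, and exceeds_spec(levels, 2635) is true because 2765 MHz is above the 2635 MHz boost clock from the specification.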

This device uses GDDR6 (G6), which is dual channel Double Data Rate (DDR) memory[22][23][24].

  • The maximum data transfer rate in transfers per second = clock cycles per second * transfers per clock cycle * data frequency multiplier; 1,094 MHz * 2 T (double data rate) * 8 = 17,504 MT/s. This corresponds to 17.5 Gbps.
  • The bus width is per chip per channel. In this case there are 4 physical ICs on the card[25], each with an I/O width of 16 bits per channel; 16 bits * 2 channels * 4 ICs = 128 bits total effective VRAM bus width.
  • The bandwidth = transfers per second * bus width; 17,504 MT/s * 128 bits/T = 2,240,512 Mb/s (280,064 MB/s).
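
The arithmetic in the list above can be reproduced for other clock values. The following is a minimal sketch that simply encodes the same formulas; the default multipliers (2 transfers per clock, data frequency multiplier of 8, 16 bits per channel, 2 channels per chip) are the values used in this example, and gddr6_figures is a hypothetical helper name.

```python
from typing import Tuple


def gddr6_figures(
    memory_clock_mhz: int,
    chips: int,
    bits_per_channel: int = 16,
    channels_per_chip: int = 2,
    transfers_per_clock: int = 2,
    data_multiplier: int = 8,
) -> Tuple[int, int, float]:
    """Return (data rate in MT/s, bus width in bits, bandwidth in MB/s)."""
    data_rate = memory_clock_mhz * transfers_per_clock * data_multiplier
    bus_width = bits_per_channel * channels_per_chip * chips
    bandwidth = data_rate * bus_width / 8  # bits to bytes
    return data_rate, bus_width, bandwidth


# Values from this example: 1094 MHz memory clock, 4 memory ICs.
print(gddr6_figures(1094, 4))  # (17504, 128, 280064.0)
```

With a 1200 MHz memory clock the same function gives 19200 MT/s and 307200 MB/s, that is 19.2 Gbps and 307.2 GBps.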


Now view the overdrive (boost) clock and voltage configuration in the pp_od_clk_voltage file.

user $cat pp_od_clk_voltage
OD_SCLK:
 0: 700Mhz
 1: 2744Mhz
OD_MCLK:
 0: 97Mhz
 1: 1094MHz
OD_VDDGFX_OFFSET:
 0mV
OD_RANGE:
 SCLK:     500Mhz       3150Mhz
 MCLK:     674Mhz       1200Mhz

The overdrive (boost) engine SCLK range is set to allow values between 500 MHz and 3150 MHz for the engine clock, with discrete steps at 700 MHz and 2744 MHz.

The overdrive (boost) memory MCLK range is set to allow values between 674 MHz and 1200 MHz for the actual memory clock (1348 MHz and 2400 MHz effective), with discrete steps at 97 MHz and 1094 MHz.

Note
The relationship between OD_SCLK, OD_MCLK, and OD_RANGE is not well documented. The values presented above were as-found on the system in question.
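
The permitted ranges can be extracted from the file for comparison with the specification. The following is a minimal sketch, assuming the OD_RANGE section uses the "NAME: minMhz maxMhz" layout shown above; parse_od_range is a hypothetical helper name, and the layout on other GPU generations may differ.

```python
import re
from typing import Dict, Tuple

# Matches lines such as " SCLK:     500Mhz       3150Mhz".
RANGE_LINE = re.compile(
    r"^\s*(\w+):\s*(\d+)\s*mhz\s+(\d+)\s*mhz\s*$", re.IGNORECASE
)


def parse_od_range(text: str) -> Dict[str, Tuple[int, int]]:
    """Return {name: (min MHz, max MHz)} from the OD_RANGE section."""
    ranges = {}
    in_range_section = False
    for line in text.splitlines():
        if line.strip() == "OD_RANGE:":
            in_range_section = True
            continue
        if in_range_section:
            match = RANGE_LINE.match(line)
            if match:
                ranges[match.group(1)] = (
                    int(match.group(2)),
                    int(match.group(3)),
                )
    return ranges
```

For the pp_od_clk_voltage output above this yields a range of 500 to 3150 MHz for SCLK and 674 to 1200 MHz for MCLK, which can then be checked against the card's specified boost and memory clocks.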

Combining all of this information together and comparing the reported and specified values shows a discrepancy for clock rates, with some well above the recommended values. Adjusting these limits (under-clocking the defaults) has resulted in zero crashes, and improved thermal and FPS performance.

Radeon™ RX 6650 XT EAGLE 8G
Source | Base Clock | Game Clock | Boost Clock | Effective Memory Clock | Effective VRAM Bus | Bandwidth
Specification | 2055 MHz | 2410 MHz | 2635 MHz | 2190 MHz (17.5 Gbps) | 128-bit | 280.3 GBps
sysfs | - MHz | 2765 MHz | 2744/3150 MHz | 2400 MHz (19.2 Gbps) | 128-bit | 307.2 GBps (@1200 MHz)

Missing cursor on RDNA3 GPUs

The hardware cursor may not work on RDNA3 GPUs. To make the cursor visible, add the software cursor option to 20-amdgpu.conf.

FILE /etc/X11/xorg.conf.d/20-amdgpu.conf
Section "Device"
	Identifier "AMD"
	Driver "amdgpu"
	Option "SWCursor" "True"
EndSection

See also

  • AMDGPU-PRO — the next generation closed source graphics component that operates on top of the open source AMDGPU drivers for newer AMD/ATI Radeon graphics cards.
  • AMDVLK — an open-source Vulkan driver for AMD Radeon™ graphics adapters on Linux

External resources

References

  1. https://www.phoronix.com/news/AMDGPU-DC-Accepted
  2. https://lkml.org/lkml/2017/11/16/899
  3. https://github.com/torvalds/linux/commit/f6705bf959efac87bca76d40050d342f1d212587
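  4. AMD formerly referred to this microarchitecture as Display Core (DC). GCN stands for Graphics Core Next and was introduced with the Radeon HD7000 series (GCN1.0). The GCN microarchitecture was later superseded by RDNA (short for Radeon DNA), which entered the market and became mainstream with the Radeon RX 5000 series (NAVI) graphics cards released in 2019.
  5. The actual instruction set architecture (ISA) is defined by the Display Core Engine (DCE), which was later superseded by Display Core Next (DCN), introduced together with the Raven Ridge APUs (mobile Vega graphics cores).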
  6. Phoronix - Report: Ryzen "Raven Ridge" APU Not Using HBM2 Memory
  7. Phoronix - 25 More AMDGPU DC Patches, Mostly Focused On Raven DCN
  8. Phoronix - AMD Navi 10 Firmware Finally Lands In The Linux-Firmware Tree
  9. https://www.phoronix.com/scan.php?page=news_item&px=Radeon-RX-6900-XT
  10. https://cateee.net/lkddb/web-lkddb/DRM_AMD_DC_DCN3_0.html
  11. Linux kernel commit 39bdb32 with added firmware files to POLARIS10 and POLARIS11
  12. https://kernelnewbies.org/Linux_3.11#AMD_Radeon_experimental_dynamic_power_management_support
  13. https://www.phoronix.com/news/AMDGPU-PP-4.5-Steps
  14. https://www.kernel.org/doc/html/latest/gpu/amdgpu/thermal.html
  15. https://www.kernel.org/doc/html/latest/gpu/amdgpu/module-parameters.html
  16. https://www.phoronix.com/news/AMDGPU-PP-4.5-Steps
  17. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/amd/include/amd_shared.h
  18. https://www.kernel.org/doc/html/latest/gpu/amdgpu/module-parameters.html?highlight=ppfeaturemask
  19. https://github.com/torvalds/linux/blob/be8454afc50f43016ca8b6130d9673bdd0bd56ec/drivers/gpu/vga/vga_switcheroo.c#L653-L660
  20. https://help.ubuntu.com/community/HybridGraphics
  21. https://gitlab.freedesktop.org/xorg/driver/xf86-video-amdgpu/-/issues/11
  22. https://www.micron.com/-/media/client/global/documents/products/technical-note/dram/tned03_gddr6.pdf
  23. http://monitorinsider.com/GDDR5X.html
  24. http://monitorinsider.com/GDDR6.html
  25. https://csstalker.gjisland.net/blog/archives/6848