Difference between revisions of "Ryzen"

From Gentoo Wiki
(Added 5950x)
(→‎Kernel: Add P-State driver and suggest using schedulutil CPUFreq governor.)

Revision as of 06:15, 9 July 2022


Ryzen is a family of multithreaded, high-performance processors manufactured by AMD. The first generation, based on the Zen microarchitecture (µarch), was released in Q1 2017 as the Ryzen 1000 series. A refresh called Zen+ was released in Q2 2018 as the Ryzen 2000 series. The second generation, the Ryzen 3000/4000 series, is based on the Zen 2 microarchitecture and was released in Q3 2019. The third generation, based on the Zen 3 microarchitecture, was released in Q4 2020. The Ryzen 5000 series includes processors of both the Zen 2 and the Zen 3 microarchitectures.

Hardware

Tip
The microcode version can be inspected by running dmesg | grep -i microcode from a terminal.
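As a complement to the dmesg approach, the running microcode revision is also exposed in /proc/cpuinfo on x86 systems. The following is a minimal sketch; the function name microcode_rev is hypothetical and chosen only for illustration. It reads cpuinfo-formatted text on standard input so it can be tried on saved output as well.

```shell
# Print the microcode revision of the first CPU listed in
# /proc/cpuinfo-style text read from stdin.
# (microcode_rev is an illustrative name, not an existing tool.)
microcode_rev() {
    # Lines look like "microcode<TAB>: 0x8001137"; split on the colon.
    awk -F': *' '/^microcode/ { print $2; exit }'
}
```

Typical usage would be microcode_rev < /proc/cpuinfo, and the result can be compared against the "Latest microcode" column of the tables below.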

Ryzen Threadripper

Device | µarch | Status | Bus ID | Kernel driver(s) | Kernel version | Latest microcode | Notes
Ryzen TR 1900X | Zen | Unknown | N/A | N/A | ? | ? |
Ryzen TR 1920X | Zen | Works | N/A | N/A | 4.19.44+ | 0x08001137 | For stability use kernel parameters: processor.max_cstate=1 rcu_nocbs=0-11 idle=nomwait
Ryzen TR 1950X | Zen | Works | N/A | N/A | 4.19.52+ | 0x8001137 | Firmware blob: amd-ucode/microcode_amd_fam17h.bin
Ryzen TR 2920X | Zen+ | Unknown | N/A | N/A | ? | ? |
Ryzen TR 2950X | Zen+ | Works | N/A | N/A | ? | 0x0800820d | Firmware blob: amd-ucode/microcode_amd_fam17h.bin
Ryzen TR 2970WX | Zen+ | Unknown | N/A | N/A | ? | ? |
Ryzen TR 2990WX | Zen+ | Unknown | N/A | N/A | ? | ? |

Ryzen 9

Device | µarch | Status | Bus ID | Kernel driver(s) | Kernel version | Latest microcode | Notes
Ryzen 9 5950X | Zen 3 | Works | N/A | N/A | ? | 0xa20120a | Tested with 5.15.41. Firmware blob: amd-ucode/microcode_amd_fam19h.bin & linux-firmware-20220509
Ryzen 9 5900X | Zen 3 | Works | N/A | N/A | 5.10+ | ? | System booted with 5.10.15 with minor issues; 5.13+ is recommended. Also tested with 5.15.2 without issues. Firmware blob: amd-ucode/microcode_amd_fam17h.bin & linux-firmware-20211027
Ryzen 9 3950X | Zen 2 | Works | N/A | N/A | 5.4.0-rc5 | ? | Earlier kernel versions were not tried (an experimental kernel was used because its AMDGPU drivers are needed for the newest AMD video card). Booting with a keyboard that requires two USB connectors fails at the GRUB stage. Workaround: disconnect the keyboard until the kernel is loading; once the kernel has started, the keyboard can be connected as usual. It is not yet clear whether the problem is related to the CPU, the motherboard, the keyboard, or a combination. With the same motherboard and keyboard but a Ryzen 5, it worked fine.
Ryzen 9 3900X | Zen 2 | Works | N/A | N/A | 5.4.38 | ? | Earlier kernel versions not tested

Ryzen 7

Device | µarch | Status | Bus ID | Kernel driver(s) | Kernel version | Latest microcode | Notes
Ryzen 7 PRO 5850U | Zen 3 | Works | N/A | amdgpu | 5.10+ | 0x0a50000c | "Cezanne" APU; kernel 5.11 recommended, full support since 5.13
Ryzen 7 5700U | Zen 2 | Works | N/A | amdgpu | 5.15.32+ | 0x8608103 | "Lucienne" APU
Ryzen 7 4800H | Zen 2 | Works | N/A |  | 5.16.15+ | 0x08600106 |
Ryzen 7 3700X | Zen 2 | Works | N/A | N/A | ? | ? |
Ryzen 7 2700X | Zen+ | Works | N/A | N/A | 4.4.10+ | 0x08008206 | AGESA 1002c
Ryzen 7 1800X | Zen | Works | N/A | N/A | 4.4.10+ | 0x08001138 | AGESA 0072
Ryzen 7 1700X | Zen | Works | N/A | N/A | 4.4.10+ | 0x08001129 | ?
Ryzen 7 1700 | Zen | Works | N/A | N/A | 4.4.10+ | 0x08001138 | ?

Ryzen 5

Device | µarch | Status | Bus ID | Kernel driver(s) | Kernel version | Latest microcode | Notes
Ryzen 5 3600X | Zen 2 | Works | N/A | N/A | 4.19.66+ | 0x08701013 | 1.0.0.4B
Ryzen 5 1600X | Zen | Works | N/A | N/A | ? | ? | ?
Ryzen 5 1600 | Zen | Works | N/A | N/A | 4.4.10+ | 0x08001137 | 1.0.0.4C
Ryzen 5 1500X | Zen | Works | N/A | N/A | ? | ? | ?
Ryzen 5 1400 | Zen | Works | N/A | N/A | ? | ? | ?

Installation

Firmware

To install the Zen microcode, emerge sys-kernel/linux-firmware:

root #emerge --ask sys-kernel/linux-firmware

The firmware blobs will need to be added to the kernel in order to be loaded.

Kernel

Enable support for Ryzen hardware in kernel 4.11.0 and newer:

KERNEL Kernel 4.11.0 or newer
Processor type and features  --->
  [*] Symmetric multi-processing support
  [*] Support x2apic
  [*] AMD ACPI2Platform devices support
  Processor family (Opteron/Athlon64/Hammer/K8)  --->
    (X) Opteron/Athlon64/Hammer/K8
  [*] Supported processor vendors  --->
    [*]   Support AMD processors (NEW)
  [*] SMT (Hyperthreading) scheduler support
  [*] Multi-core scheduler support
  [*] Machine Check / overheating reporting
  [*]   AMD MCE features
  Performance monitoring  --->
    <*> AMD Processor Power Reporting Mechanism
  [*]   AMD microcode loading support
Power management and ACPI options  --->
  CPU Frequency scaling  --->
      Default CPUFreq governor (ondemand)  --->
    <*>   ACPI Processor P-States driver
    [ /*]     Legacy cpb sysfs knob support for AMD CPUs
    < >   AMD Opteron/Athlon64 PowerNow!
    <*>   AMD frequency sensitivity feedback powersave bias
Device Drivers  --->
  Generic Driver Options --->
    (amd-ucode/microcode_amd_fam17h.bin) External firmware blobs to build into the kernel binary
    (/lib/firmware) Firmware blobs root directory
  [*] IOMMU Hardware Support  --->
    [*]   AMD IOMMU support
    <*>     AMD IOMMU Version 2 driver
  [*] Hardware Monitoring support --->
    <*>   AMD Family 10h+ temperature sensor
    <*>   AMD Family 15h processor power

For Zen 3 (or newer) APUs (e.g. in notebooks or Chromebooks), additionally select:

KERNEL Kernel 5.11 or newer (CONFIG_AMD_PMC)
Device Drivers  --->
  [*] X86 Platform Specific Device Drivers  --->
    <*>   AMD SoC PMC driver

For Zen 3 (or newer) CPUs, the AMD P-State driver can be used as an alternative to the traditional ACPI P-States driver. AMD P-State supports the schedutil and ondemand governors for dynamic frequency control. See the upstream documentation at https://www.kernel.org/doc/html/latest/admin-guide/pm/amd-pstate.html for more information.

KERNEL Kernel 5.17 or newer can optionally use the AMD P-State driver (CONFIG_X86_AMD_PSTATE)
Power management and ACPI options  --->
  CPU Frequency scaling  --->
      Default CPUFreq governor (schedutil)  --->
    <*>   AMD Processor P-State driver
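After booting the new kernel, the active CPUFreq driver and governor can be read back from sysfs to confirm which driver is in use. The sketch below assumes the standard cpufreq sysfs layout; the function name cpufreq_status is hypothetical. With the AMD P-State driver active, the reported driver name is expected to begin with amd-pstate, whereas the ACPI driver reports acpi-cpufreq.

```shell
# Report the active CPUFreq driver and governor for one CPU.
# $1: cpufreq sysfs directory (defaults to that of cpu0).
# (cpufreq_status is an illustrative name, not an existing tool.)
cpufreq_status() {
    dir=${1:-/sys/devices/system/cpu/cpu0/cpufreq}
    if [ ! -r "$dir/scaling_driver" ]; then
        echo "cpufreq not available in $dir" >&2
        return 1
    fi
    printf 'driver=%s governor=%s\n' \
        "$(cat "$dir/scaling_driver")" \
        "$(cat "$dir/scaling_governor")"
}
```

Calling cpufreq_status without arguments inspects cpu0; a different directory can be passed to inspect another CPU.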

While configuring the kernel, it is a good idea to build in any appropriate AMD microcode updates needed by the CPU.
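Which blob to build in depends on the CPU family. As a rough sketch, the decimal "cpu family" value shown in /proc/cpuinfo can be mapped to the blob names used in this article: family 23 (17h) covers Zen, Zen+ and Zen 2, and family 25 (19h) covers Zen 3. The function name amd_ucode_blob is hypothetical, and only these two families are handled.

```shell
# Map the decimal "cpu family" value from /proc/cpuinfo to the
# firmware blob path relative to /lib/firmware.
# (amd_ucode_blob is an illustrative name; other families are not handled.)
amd_ucode_blob() {
    case "$1" in
        23) echo "amd-ucode/microcode_amd_fam17h.bin" ;;  # family 17h: Zen, Zen+, Zen 2
        25) echo "amd-ucode/microcode_amd_fam19h.bin" ;;  # family 19h: Zen 3
        *)  echo "unhandled CPU family: $1" >&2; return 1 ;;
    esac
}
```

The printed path is the value to enter under "External firmware blobs to build into the kernel binary".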

Those using sys-kernel/gentoo-sources with the experimental USE flag will have additional Processor family options made available:

KERNEL Kernel 4.11.0 (gentoo-sources) – choose one of:
Processor type and features  --->
  Processor family  --->
    (X) AMD Zen (MZEN)
    ( ) AMD Zen 2 (MZEN2)
    ( ) AMD Zen 3 (MZEN3)

This enables -march=znver1 (MZEN), -march=znver2 (MZEN2) or -march=znver3 (MZEN3) to be set for the kernel's make process.

Tip
Alternatively, Generic-x86-64 can be selected as the Processor family for more generic CPU support. In theory this makes the kernel binaries portable in the event that they are used on CPUs other than AMD Ryzen.
Note
For APUs (processors that include graphics), additional configuration is required. See AMDGPU for further information.

Configuration

GCC

GCC 10.3 and newer

The znver3 compiler optimization for Zen 3 was introduced in GCC 10.3[1].

FILE /etc/portage/make.conf Zen 3 compiler optimization
CFLAGS="-O2 -march=znver3"

GCC 9.2 and newer

The znver2 compiler optimization for Zen 2 was backported from GCC 10.

FILE /etc/portage/make.conf Zen 2 compiler optimization
CFLAGS="-O2 -march=znver2"

GCC 6.3 and newer

GCC 6.3+ has support for the znver1 compiler optimization. For optimal performance, this can be enabled in make.conf.

FILE /etc/portage/make.conf Zen compiler optimization
CFLAGS="-O2 -march=znver1"
Warning
GCC 6.3/6.x is presently not optimized for Ryzen,[2] and neither is GCC 7.[3] GCC 8 brings some "znver1" optimization,[4][5] as does GCC 9.[6] Issues may be experienced when compiling.

GCC 5.4

While GCC 5.4 does not support Zen core specific optimization, -march=bdver4 has been shown to be functional and stable. However, since Zen dropped the instruction set extensions FMA4, TBM, XOP and LWP, they should be disabled accordingly:

FILE /etc/portage/make.conf Zen compiler optimization for GCC 5.4 and lower
CFLAGS="-O2 -march=bdver4 -mno-fma4 -mno-tbm -mno-xop -mno-lwp"
Important
Previously -march=haswell was said to be functional with Zen[7], but a Gentoo developer experienced various SEGVs with this option.
Important
Using bare -march=bdver4 was said to work without issues; nevertheless, it may still produce faulty code because Zen lacks the aforementioned instruction set extensions. Bulldozer has them, Zen does not.

Optional, but may produce better code: Add new instruction set extensions introduced with Zen individually (ADCX, RDSEED, MWAITX, SHA, CLZERO, XSAVEC, XSAVES, CLFLUSHOPT, POPCNT), using -march=bdver4 (Bulldozer Version 4 i.e. Excavator) as the starting point:

FILE /etc/portage/make.conf EXPERIMENTAL compiler optimization for GCC 5.4 specifying new extensions for Zen
CFLAGS="-O2 -march=bdver4 -mno-fma4 -mno-tbm -mno-xop -mno-lwp -mclzero -madx -mrdseed -mmwaitx -msha -mxsavec -mxsaves -mclflushopt -mpopcnt"
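The version thresholds in this section can be summarized as a small selection routine. The sketch below is illustrative only: the function name zen_march is hypothetical, and falling back to an older znverN value when the installed GCC is too old for the CPU's own generation is an assumption made here, not something this article states.

```shell
# zen_march GEN MAJOR MINOR
#   GEN:          Zen generation (1, 2 or 3)
#   MAJOR, MINOR: GCC version, e.g. "10 3" for GCC 10.3
# Prints the -march flags per the thresholds in this section.
# (zen_march is an illustrative name; the fallback order is an assumption.)
zen_march() {
    gen=$1
    ver=$(( $2 * 100 + $3 ))   # 10.3 -> 1003, 9.2 -> 902, 6.3 -> 603
    if [ "$gen" -ge 3 ] && [ "$ver" -ge 1003 ]; then
        echo "-march=znver3"
    elif [ "$gen" -ge 2 ] && [ "$ver" -ge 902 ]; then
        echo "-march=znver2"
    elif [ "$ver" -ge 603 ]; then
        echo "-march=znver1"
    else
        # GCC 5.4 and lower: Excavator baseline minus extensions Zen dropped
        echo "-march=bdver4 -mno-fma4 -mno-tbm -mno-xop -mno-lwp"
    fi
}
```

For example, zen_march 3 10 3 selects znver3, while zen_march 3 9 2 falls back to znver2 under the stated assumption.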

Drivers for lm-sensors

The HWMON (lm-sensors) driver for ASUS motherboards of this class is currently not part of the kernel. Consider using the asus-wmi-sensors driver available at https://github.com/electrified/asus-wmi-sensors. Check first whether the motherboard is supported.

An ebuild is available using an overlay. Add it with either:

root #eselect repository add nightdragon_layman git https://github.com/NightDragon1/nightdragon_layman.git

or:

root #layman -o https://github.com/NightDragon1/nightdragon_layman/master/layman.xml -f -a nightdragon_layman

Alternatively there is a community ebuild repository available:

root #mkdir -p /etc/portage/repos.conf
root #wget -O /etc/portage/repos.conf/gyakovlev.conf https://raw.githubusercontent.com/gyakovlev/gentoo-overlay/master/gyakovlev.conf

or:

root #curl -Lo /etc/portage/repos.conf/gyakovlev.conf --create-dirs https://raw.githubusercontent.com/gyakovlev/gentoo-overlay/master/gyakovlev.conf

Troubleshooting

Ryzen 1700 series

Segmentation faults during compilation

If segmentation faults (segfaults, or SEGVs for short) are encountered frequently on Zen, the cause might be anything from a software bug to a hardware bug. Since the CPU is under heavy load during compilation, this is when such recurring SEGVs most commonly surface. With certain adjustments it may be possible to mitigate these segfaults; there have been reports of both success and failure.

When encountering frequent SEGVs, first ensure that the most recently compiled binutils is selected:

user $eselect binutils list
 [1] x86_64-pc-linux-gnu-2.27
 [2] x86_64-pc-linux-gnu-2.28 *

If an early CPU batch is affected (2017), it should be/have been replaced through RMA (Return Merchandise Authorization) which AMD provides on its website. The recommendation to disable all overclocking and set proper timings for RAM is only for systems that were overclocked—at the designated speed CPU and RAM will not produce recurring SEGVs!

Faulty hardware

As of 2017-08-08 AMD confirmed a problem residing inside the Ryzen processor itself. This problem should only affect the very few early Ryzen batches that were produced (available and sold mid-2017). AMD confirmed the issue[8][9] and RMA was possible within the warranty period.

Note
The following was only applicable to mitigate CPUs that produced segfaults due to faulty hardware. For replaced CPUs and newer revisions, none of the following is recommended!
  • Consider downgrading or upgrading the BIOS/UEFI to the most stable.
  • Some motherboards' BIOS/UEFI setups have an option to disable OPCache. This has been observed to limit or stop segfaults albeit with a 5-7% performance cost.
  • Some users have reported that disabling ASLR resolves the segfault issues. This can be done at runtime by issuing echo 0 > /proc/sys/kernel/randomize_va_space; to make the change permanent:
FILE /etc/sysctl.conf Disabling ASLR
kernel.randomize_va_space = 0

Related forum topics: 1 and 2. And a Phoronix forum topic.

Note
No longer necessary, but left here as general information on the issue:

Ryzen users could fill out the Gentoo and Ryzen config and stability questionnaire to help out collecting data.

See also the datasheet generated from above questionnaire.

Soft freezes on 1st gen Ryzen 7

Problem: First generation Ryzen 7 systems will mysteriously soft freeze after a period of time.[10] The keyboard and mouse do not respond to input, and output on the display freezes. The system requires a hard reset (pressing the reset button on the case, pressing and holding the power button for 5 seconds, or pulling the power cord) in order to unfreeze. This is specifically a kernel freeze, not a segfault issue.[11]

Solution: This issue may be correctable by disabling C6 power states in the motherboard's firmware, or by enabling the following kernel symbols and passing a kernel command-line parameter at boot time:

KERNEL Enable CONFIG_RCU_EXPERT, CONFIG_RCU_NOCB_CPU

Then pass the kernel command-line parameter rcu_nocbs=0-15 at boot time. This is generally done via a secondary bootloader such as GRUB or systemd-boot. Alternatively, when booting via EFI stub, parameters can be set within the kernel's .config file and built into the kernel binary. Refer to the appropriate article for details on updating the system's bootloader configuration.
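The range 0-15 covers the 16 hardware threads of a first generation Ryzen 7; on a CPU with a different thread count the upper bound changes accordingly. A minimal sketch for deriving the parameter from a thread count follows; the function name rcu_nocbs_param is hypothetical.

```shell
# rcu_nocbs_param N: print the rcu_nocbs= parameter covering CPUs 0..N-1.
# (rcu_nocbs_param is an illustrative name, not an existing tool.)
rcu_nocbs_param() {
    n=$1
    # Reject anything that is not a positive integer.
    if ! [ "$n" -ge 1 ] 2>/dev/null; then
        echo "CPU count must be a positive integer" >&2
        return 1
    fi
    if [ "$n" -eq 1 ]; then
        echo "rcu_nocbs=0"
    else
        echo "rcu_nocbs=0-$(( n - 1 ))"
    fi
}
```

On a running system the count can be supplied with rcu_nocbs_param "$(nproc)", assuming nproc from coreutils is available and all CPUs are online.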

Another possible fix is to add the following kernel cmdline parameters: clearcpuid=514 rcu_nocbs=0-15 pci=noaer idle=nomwait

Overclocking or wrong settings

When experiencing segfaults on an otherwise healthy system, the following could help to solve the problem:

  • Ensure the newest binutils is in use; an older binutils could be built against older opcodes, facilitating crashes due to poor linkage.
  • Ensure RAM voltage and timing are correct for the RAM; BIOS/UEFI implementations are conservative while performing autosetting.
  • Consider downgrading the BIOS/UEFI to the most stable version. ASUS and ASRock have been known to push very beta BIOS/UEFI versions that have shown to be quite unstable.

Random reboots with MCE events

Likely due to erratum 1109, a Ryzen system may encounter random, spontaneous reboots with MCE hardware errors being logged on startup. An example MCE event looks like this:

Oct 31 11:46:23 fire kernel: [    0.677235] [Hardware Error]: System Fatal error.
Oct 31 11:46:23 fire kernel: [    0.677439] [Hardware Error]: CPU:10 (17:1:1) MC5_STATUS[-|UE|MiscV|PCC|AddrV|-|-|SyndV|TCC]: 0xbea0000000000108
Oct 31 11:46:23 fire kernel: [    0.677798] [Hardware Error]: Error Addr: 0x0001ffff810796c0
Oct 31 11:46:23 fire kernel: [    0.678003] [Hardware Error]: IPID: 0x000500b000000000, Syndrome: 0x000000004d000000
Oct 31 11:46:23 fire kernel: [    0.678356] [Hardware Error]: Execution Unit Extended Error Code: 0
Oct 31 11:46:23 fire kernel: [    0.678562] [Hardware Error]: Execution Unit Error: Watchdog timeout error.
Oct 31 11:46:23 fire kernel: [    0.678562] [Hardware Error]: cache level: RESV, tx: GEN, mem-tx: GEN
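To check whether a system has logged events like the one above, the kernel log can be filtered for machine check status lines. The sketch below assumes the log format shown in the example; the function name mce_status is hypothetical. It reads log text on standard input and prints the bank register name and its raw value for each match.

```shell
# Extract "MCn_STATUS <hex value>" pairs from kernel log text on stdin.
# (mce_status is an illustrative name; the pattern assumes the log
# format of the example above.)
mce_status() {
    sed -n 's/.*\(MC[0-9]*_STATUS\).*: \(0x[0-9a-fA-F]*\)$/\1 \2/p'
}
```

A typical invocation would be dmesg | mce_status; no output means no matching lines were found in the current kernel log.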

Suggested workarounds include:

  • Adding the kernel boot parameter idle=nomwait. Note that any solution that prevents the kernel from executing the MWAIT instruction will not prevent the issue from occurring 100% of the time, as other code could execute the instruction.
  • Modifying the "Power Supply Idle Control" setting in the BIOS.
  • Consider disabling C-States. This can be done in the BIOS/UEFI or with the boot parameter processor.max_cstate=5.

See also this kernel Bugzilla entry, this AMD forum discussion, and many other discussions.

See also

  • AMDGPU — the next generation family of open source graphics drivers using the new Display Core (DC) framework for Vega, Raven Ridge and later GPUs. It is however also capable of handling newer AMD/ATI Radeon graphics cards based on GCN1.0+, namely the Southern Islands, Sea Islands, Volcanic Islands, and Arctic Islands chipsets.
  • AMDGPU-PRO — the next generation closed source graphics component that operates on top of the open source AMDGPU drivers for newer AMD/ATI Radeon graphics cards.
  • AMD microcode — describes updating the microcode for AMD processors.

External resources

References