Ryzen

From Gentoo Wiki

Ryzen is a multithreaded, high-performance processor released by AMD in Q1 2017. It is the first CPU based on the Zen microarchitecture. Its goal is to compete directly with Intel's Broadwell-E processor line, primarily the Core i7-6900K.

Hardware

Ryzen 7

Device             Make/model  Status  Bus ID  Kernel driver(s)  Kernel version  Latest microcode  Notes
AMD Ryzen 7 2700X  AMD         Works   N/A     N/A               4.4.10+         0x08008206        AGESA 1002c
AMD Ryzen 7 1800X  AMD         Works   N/A     N/A               4.4.10+         0x08001137        AGESA 1002a
AMD Ryzen 7 1700X  AMD         Works   N/A     N/A               4.4.10+         0x08001129        ?
AMD Ryzen 7 1700   AMD         Works   N/A     N/A               4.4.10+         0x0800111c        ?

Ryzen 5

Device             Make/model  Status  Bus ID  Kernel driver(s)  Kernel version  Latest microcode  Notes
AMD Ryzen 5 1600X  AMD         Works   N/A     N/A               ?               ?                 ?
AMD Ryzen 5 1600   AMD         Works   N/A     N/A               4.4.10+         0x08001137        1.0.0.4C
AMD Ryzen 5 1500X  AMD         Works   N/A     N/A               ?               ?                 ?
AMD Ryzen 5 1400   AMD         Works   N/A     N/A               ?               ?                 ?

Installation

Firmware

To install the Zen microcode, emerge sys-kernel/linux-firmware:

root #emerge --ask sys-kernel/linux-firmware

For the microcode to be loaded, the firmware blob must also be built into the kernel, as configured in the Kernel section.
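Before configuring the kernel, it can help to confirm that the Zen microcode blob was actually installed. The following is a minimal sketch, assuming the blob path and firmware root used in the kernel configuration on this page (amd-ucode/microcode_amd_fam17h.bin under /lib/firmware). The function name zen_ucode_status is made up for illustration.

```shell
# Report whether the Family 17h (Zen) microcode blob exists under a
# firmware root. The default root is /lib/firmware; pass another
# directory as the first argument to check elsewhere.
zen_ucode_status() {
    fw_root="${1:-/lib/firmware}"
    blob="amd-ucode/microcode_amd_fam17h.bin"
    if [ -f "$fw_root/$blob" ]; then
        printf 'present: %s\n' "$fw_root/$blob"
    else
        printf 'missing: %s\n' "$fw_root/$blob"
        return 1
    fi
}
```

Calling zen_ucode_status without arguments checks /lib/firmware. The microcode revision currently in use can usually be read with grep -m1 microcode /proc/cpuinfo.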

Kernel

Enable support for Ryzen hardware in kernel 4.11.0:

KERNEL Kernel 4.11.0
Processor type and features  --->
  [*] Symmetric multi-processing support
  [*] AMD ACPI2Platform devices support
  Processor family (Opteron/Athlon64/Hammer/K8)  --->
    (X) Opteron/Athlon64/Hammer/K8
  [*] Supported processor vendors  --->
    [*]   Support AMD processors (NEW)
  [*] SMT (Hyperthreading) scheduler support
  [*] Multi-core scheduler support
  [*] Machine Check / overheating reporting
  [*]   AMD MCE features
  Performance monitoring  --->
    <*> AMD Processor Power Reporting Mechanism
  [*]   AMD microcode loading support
Power management and ACPI options  --->
  CPU Frequency scaling  --->
    <*>   AMD Opteron/Athlon64 PowerNow!
    <*>   AMD frequency sensitivity feedback powersave bias
Device Drivers  --->
  Generic Driver Options --->
    (amd-ucode/microcode_amd_fam17h.bin) External firmware blobs to build into the kernel binary
    (/lib/firmware) Firmware blobs root directory
  [*] IOMMU Hardware Support  --->
    [*]   AMD IOMMU support
    <*>     AMD IOMMU Version 2 driver

While configuring the kernel, it is a good idea to build in any appropriate AMD microcode updates needed by the CPU.
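After saving the configuration, the resulting .config can be checked for the options above. This is a sketch only: the CONFIG_ symbol names are my assumption of how a few of the menu entries map to symbols, the list is deliberately partial, and check_kernel_config is a hypothetical helper name.

```shell
# List which of the given CONFIG_ symbols are not built in (=y) in a
# kernel .config file. Prints one "missing: SYMBOL" line per gap and
# returns non-zero if anything is missing.
check_kernel_config() {
    config_file="$1"
    shift
    rc=0
    for symbol in "$@"; do
        if ! grep -q "^${symbol}=y\$" "$config_file"; then
            printf 'missing: %s\n' "$symbol"
            rc=1
        fi
    done
    return "$rc"
}

# Assumed symbol names for some of the menu entries shown above
# (SMP, AMD vendor support, AMD MCE, AMD microcode, AMD IOMMU).
ZEN_SYMBOLS="CONFIG_SMP CONFIG_CPU_SUP_AMD CONFIG_X86_MCE_AMD CONFIG_MICROCODE_AMD CONFIG_AMD_IOMMU"
```

On a typical Gentoo layout this might be run as check_kernel_config /usr/src/linux/.config $ZEN_SYMBOLS. No output means every listed symbol is set to y.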

Those using sys-kernel/gentoo-sources with the experimental USE flag will have additional Processor family options made available:

KERNEL Kernel 4.11.0 (gentoo-sources)
Processor type and features  --->
  Processor family (MZEN)  --->
    (X) AMD Zen

This enables -march=znver1 to be set for the kernel's make process.

Tip
Alternatively, Generic-x86-64 can be set as the Processor family for more generic CPU support. In theory this makes the kernel binaries portable in the event that they are used on CPUs other than AMD Ryzen.

Configuration

GCC

GCC 6.3+

GCC 6.3 and later accept the znver1 target. To build packages for Zen, set it in make.conf.

FILE /etc/portage/make.conf Zen compiler optimization
CFLAGS="-O2 -march=znver1"

GCC 6.x is not yet tuned for Ryzen,[1] and neither is GCC 7.[2] GCC 8 brings some znver1 tuning.[3][4]

GCC 5.4

While GCC 5.4 does not support Zen-specific optimization, -march=bdver4 has been shown to be functional and stable. However, since Zen dropped the FMA4, TBM, XOP and LWP instruction set extensions, they should be disabled explicitly:

FILE /etc/portage/make.conf Zen compiler optimization for GCC 5.4 and lower
CFLAGS="-O2 -march=bdver4 -mno-fma4 -mno-tbm -mno-xop -mno-lwp"
Important
Previously -march=haswell was said to be functional with Zen[5], but a Gentoo developer experienced various SEGVs with this option.
Important
Bare -march=bdver4 was said to work without issues, but it may still produce faulty code because Zen lacks the aforementioned instruction set extensions. Bulldozer has them, Zen does not.
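The choice between the two settings above can be sketched as a small helper that maps a GCC version to the flags given on this page. The function name zen_march_flags is hypothetical, and the 6.3 threshold is simply the one stated in this article, not something verified against GCC release notes.

```shell
# Print a -march setting for Zen based on a GCC version string
# (e.g. the output of `gcc -dumpversion`), following the thresholds
# given on this page: 6.3 and later accept znver1, older releases
# fall back to bdver4 minus the extensions Zen dropped.
zen_march_flags() {
    version="$1"
    major="${version%%.*}"
    rest="${version#*.}"
    # A bare major version such as "8" has no minor part; treat it as .0
    if [ "$rest" = "$version" ]; then
        minor=0
    else
        minor="${rest%%.*}"
    fi
    if [ "$major" -gt 6 ] || { [ "$major" -eq 6 ] && [ "$minor" -ge 3 ]; }; then
        printf '%s\n' "-march=znver1"
    else
        printf '%s\n' "-march=bdver4 -mno-fma4 -mno-tbm -mno-xop -mno-lwp"
    fi
}
```

For example, zen_march_flags "$(gcc -dumpversion)" prints the flags to append after -O2 in CFLAGS for the currently selected compiler.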

Optional, but may produce better code: Add new instruction set extensions introduced with Zen individually (ADCX, RDSEED, MWAITX, SHA, CLZERO, XSAVEC, XSAVES, CLFLUSHOPT, POPCNT), using -march=bdver4 (Bulldozer Version 4 i.e. Excavator) as the starting point:

FILE /etc/portage/make.conf EXPERIMENTAL compiler optimization for GCC 5.4 specifying new extensions for Zen
CFLAGS="-O2 -march=bdver4 -mno-fma4 -mno-tbm -mno-xop -mno-lwp -mclzero -madx -mrdseed -mmwaitx -msha -mxsavec -mxsaves -mclflushopt -mpopcnt"
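Rather than enabling every extension blindly, the list can be derived from what the CPU itself reports. The sketch below assumes the flag spellings used on the flags line of /proc/cpuinfo (notably sha_ni for SHA). Both those spellings and the helper name zen_extension_flags are assumptions for illustration.

```shell
# Translate CPU feature flags (as listed on the "flags" line of
# /proc/cpuinfo) into the matching GCC -m options for the extensions
# named above. Flags that are absent produce no option.
zen_extension_flags() {
    cpu_flags=" $1 "
    result=""
    for pair in adx:-madx rdseed:-mrdseed mwaitx:-mmwaitx sha_ni:-msha \
                clzero:-mclzero xsavec:-mxsavec xsaves:-mxsaves \
                clflushopt:-mclflushopt popcnt:-mpopcnt; do
        flag="${pair%%:*}"
        option="${pair#*:}"
        # Match whole, space-delimited words only (xsave must not match xsavec)
        case "$cpu_flags" in
            *" $flag "*) result="$result $option" ;;
        esac
    done
    # Strip the leading space before printing
    printf '%s\n' "${result# }"
}
```

A possible invocation is zen_extension_flags "$(grep -m1 '^flags' /proc/cpuinfo)". The output can then be compared against the experimental CFLAGS line above.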

Troubleshooting

Segmentation faults during compilation

Frequent segmentation faults (segfaults, or SEGVs for short) on Zen can be caused by anything from a software bug to a hardware bug. Because compilation puts the CPU under heavy load, recurring SEGVs most commonly surface then. Certain adjustments may mitigate these segfaults; there have been reports of both success and failure.

If you encounter frequent SEGVs, first ensure that the most recently compiled binutils is selected:

user $eselect binutils list
 [1] x86_64-pc-linux-gnu-2.27
 [2] x86_64-pc-linux-gnu-2.28 *
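The check above can also be scripted. The following sketch parses the list output and assumes the entries are ordered oldest to newest, as in the sample. The helper name binutils_selection_status is hypothetical.

```shell
# Read `eselect binutils list` output on stdin and report whether the
# active entry (marked with a trailing "*") is the last one listed.
# Assumes the list is ordered oldest to newest, as in the sample above.
binutils_selection_status() {
    awk '
        /^[ \t]*\[[0-9]+\]/ { last = $2; if ($NF == "*") active = $2 }
        END {
            if (active == "") { print "none selected"; exit 1 }
            if (active == last) { print "ok: " active }
            else { print "stale: " active " (newest listed: " last ")"; exit 1 }
        }
    '
}
```

It can be used as eselect binutils list | binutils_selection_status. If it reports a stale selection, eselect binutils set followed by the entry number switches to the newer version.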

If a CPU from an early batch (2017) is affected, it should be replaced through the RMA (Return Merchandise Authorization) process that AMD provides on its website. The recommendation to disable all overclocking and set proper RAM timings applies only to systems that were overclocked; at their designated speeds the CPU and RAM will not produce recurring SEGVs.

Faulty Hardware

As of 2017-08-08 AMD confirmed a problem residing in the Ryzen processor itself.[6][7] It should only affect the few early Ryzen batches that were available and sold in mid-2017, and an RMA was (is) possible within the warranty period.

Note
The following was only applicable to mitigate CPUs that produced segfaults due to faulty hardware. For replaced CPUs and newer revisions, none of the following is recommended!
  • Consider downgrading or upgrading the BIOS/UEFI to the most stable version.
  • Some motherboards' BIOS/UEFI setups have an option to disable OPCache. This has been observed to limit or stop segfaults, albeit with a 5-7% performance cost.
  • Some users have reported that disabling ASLR resolves the segfault issues. This can be done at runtime by issuing echo 0 > /proc/sys/kernel/randomize_va_space, and made permanent with:
FILE /etc/sysctl.conf Disabling ASLR
kernel.randomize_va_space = 0
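To see which ASLR mode is in effect before or after changing it, the setting can be read back and interpreted. This is a minimal sketch. The helper name aslr_status is made up, and the descriptions of values 1 and 2 follow my understanding of the kernel's sysctl documentation rather than anything stated on this page.

```shell
# Describe a value of kernel.randomize_va_space. With no argument the
# current value is read from /proc.
aslr_status() {
    value="${1:-$(cat /proc/sys/kernel/randomize_va_space)}"
    case "$value" in
        0) printf '%s\n' "ASLR disabled" ;;
        1) printf '%s\n' "ASLR partial (stack, mmap base, VDSO)" ;;
        2) printf '%s\n' "ASLR full (also randomizes the heap)" ;;
        *) printf 'unknown value: %s\n' "$value"; return 1 ;;
    esac
}
```

Running aslr_status with no argument reports the live setting. Writing 2 back to /proc/sys/kernel/randomize_va_space restores full randomization once the workaround is no longer needed.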

Related forum topics: 1, 2, and a Phoronix forum topic.

Note
No longer necessary, but left here as general information on the issue:
Ryzen users could fill out the Gentoo and Ryzen config and stability questionnaire to help collect data. See also the datasheet generated from the questionnaire.

Overclocking or wrong settings

If you experience segfaults on an otherwise healthy system, the following could help to solve the problem:

  • Ensure the newest binutils is in use; an older binutils build could be based on older opcodes, leading to crashes due to poor linkage.
  • Ensure RAM voltage and timings are correct for the installed RAM; BIOS/UEFI implementations are conservative when configuring these automatically.
  • Consider downgrading the BIOS/UEFI to the most stable version. ASUS and ASRock have been known to push beta BIOS/UEFI versions that have proven to be quite unstable.

Random reboots with MCE events

If your system runs 24x7 and you encounter random, spontaneous reboots with MCE hardware errors logged on startup, consider disabling C-states. This can be done in the BIOS/UEFI or with the boot parameter processor.max_cstate=5. An example MCE event looks like this:

Oct 31 11:46:23 fire kernel: [    0.677235] [Hardware Error]: System Fatal error.
Oct 31 11:46:23 fire kernel: [    0.677439] [Hardware Error]: CPU:10 (17:1:1) MC5_STATUS[-|UE|MiscV|PCC|AddrV|-|-|SyndV|TCC]: 0xbea0000000000108
Oct 31 11:46:23 fire kernel: [    0.677798] [Hardware Error]: Error Addr: 0x0001ffff810796c0
Oct 31 11:46:23 fire kernel: [    0.678003] [Hardware Error]: IPID: 0x000500b000000000, Syndrome: 0x000000004d000000
Oct 31 11:46:23 fire kernel: [    0.678356] [Hardware Error]: Execution Unit Extended Error Code: 0
Oct 31 11:46:23 fire kernel: [    0.678562] [Hardware Error]: Execution Unit Error: Watchdog timeout error.
Oct 31 11:46:23 fire kernel: [    0.678562] [Hardware Error]: cache level: RESV, tx: GEN, mem-tx: GEN

See also this AMD forum discussion (and many other discussions).
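If the boot parameter route is chosen, it is worth confirming that the running kernel actually received it. The sketch below extracts processor.max_cstate from a kernel command line. The helper name max_cstate_from_cmdline is hypothetical.

```shell
# Print the value of processor.max_cstate from a kernel command line,
# or return non-zero if the parameter is absent. With no argument the
# running kernel's command line is read from /proc/cmdline.
max_cstate_from_cmdline() {
    cmdline="${1:-$(cat /proc/cmdline)}"
    # Put each parameter on its own line, then keep the first match
    value="$(printf '%s\n' "$cmdline" | tr ' ' '\n' |
        sed -n 's/^processor\.max_cstate=//p' | head -n 1)"
    [ -n "$value" ] || return 1
    printf '%s\n' "$value"
}
```

After a reboot, max_cstate_from_cmdline should print 5 if the parameter from this section was passed by the bootloader. A non-zero exit status means the parameter is not set.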

See also

  • AMDGPU — the next generation family of open source graphics drivers using the new Display Core (DC) framework for Vega GPUs and Raven Ridge APUs. It is however also capable of handling newer AMD/ATI Radeon graphics cards based on GCN1.1+, namely the Southern Islands, Sea Islands, Volcanic Islands, and Arctic Islands chipsets.
  • AMDGPU-PRO — the next generation closed source graphics component that operates on top of the open source AMDGPU drivers for newer AMD/ATI Radeon graphics cards.
  • AMD microcode — describes updating the microcode for AMD processors.

External resources

References