Project:Toolchain/724314-gcc-10-and-znver1

gcc-10 and znver1 stack smash.

= Symptom =

On systems with an AMD Ryzen (znver1) CPU and znver-specific optimizations enabled (CFLAGS=-march=znver1 or CFLAGS=-march=native), an attempt to build {{c|boost-1.73.0}} crashes with a stack protection violation, reported as "*** stack smashing detected ***: terminated".

= Workaround =

Remove arch-specific flags from CFLAGS/CXXFLAGS and rebuild. Example: change CFLAGS="-march=native -O2" to CFLAGS="-O2".
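
The edit above can also be sketched as a small shell helper, for example to preview the resulting flags before touching make.conf. The function name strip_arch_flags is hypothetical, and treating -mtune= as arch-specific alongside -march= is an assumption, not something this page confirms is required.

```shell
# Hypothetical helper: drop -march=... and -mtune=... tokens from a flags string.
# Assumption: these two options are the "arch-specific flags" to remove.
strip_arch_flags() {
    out=""
    for flag in $1; do                     # unquoted on purpose: split on whitespace
        case "$flag" in
            -march=*|-mtune=*) ;;          # skip arch-specific flags
            *) out="${out:+$out }$flag" ;; # keep everything else, space-separated
        esac
    done
    printf '%s\n' "$out"
}

strip_arch_flags "-march=native -O2"   # prints: -O2
```

The printed value is what would go back into CFLAGS and CXXFLAGS before rebuilding.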

= Affected systems =

= Hypotheses =

Bad hardware
Early CPUs had problems operating under load: https://community.amd.com/thread/219812

This bug looks very deterministic though: the same boost file triggers the failure with roughly the same backtrace. If it is a CPU bug, it is more likely triggered by a specific instruction sequence than by load.

GCC bug
Most probable. Somehow gcc manages to corrupt its own stack. Here is an example backtrace: https://724314.bugs.gentoo.org/attachment.cgi?id=645596

Somewhere (possibly in a GIMPLE phase?) gcc managed to corrupt its stack. The backtrace is about 300 calls deep, which is a huge stack depth, so we need to reduce the input sample first to make stack traces more useful.

Kernel bug
Could a particular kernel version or kernel configuration make stack corruption more likely?

glibc bug
= More info =

What you can do:


 * Attempt to reproduce the failure in a fresh stage3 chroot by changing as few settings from the defaults as possible. Ideally only CFLAGS/CXXFLAGS for gcc. If that is not enough, add more.
 * Attempt to minimize the source file that exhibits the stack smash using https://wiki.gentoo.org/wiki/Gcc-ICE-reporting-guide. You will need to reduce based on the presence of "*** stack smashing detected ***: terminated".
 * Attempt to find the exact source line where the stack smashing happens using https://wiki.gentoo.org/wiki/Stack-smashing-debugging-guide
 * Give slyfox@ access to hardware where it is reproducible.
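
For the minimization step, a reducer needs a check that succeeds only while the candidate file still triggers the failure. A minimal sketch of such a check follows, under stated assumptions: the function names are hypothetical, and the compiler command and flags are placeholders that must be replaced with the ones from the real failing build. Setting LIBC_FATAL_STDERR_=1 is meant to make glibc versions that would otherwise print fatal messages to the terminal send them to stderr instead, so that the message can be captured.

```shell
#!/bin/sh
# Marker string quoted in the list above.
MARKER='*** stack smashing detected ***: terminated'

# Succeeds (exit 0) when the given log file contains the marker.
has_stack_smash() {
    grep -qF -- "$MARKER" "$1"
}

# Hypothetical check: compile one candidate file and inspect compiler output.
# Assumption: g++ with these flags reproduces the failure; adjust as needed.
check_candidate() {
    log=$(mktemp) || return 2
    LIBC_FATAL_STDERR_=1 g++ -march=znver1 -O2 -c "$1" -o /dev/null >"$log" 2>&1
    has_stack_smash "$log"
    rc=$?
    rm -f "$log"
    return $rc
}
```

A reduction script would end by calling check_candidate on the file being reduced, so that its exit status is 0 only while the stack smash is still present.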