Project:Toolchain/SFrame
Motivation
Users want to be able to trace and profile applications. To obtain backtraces, frame pointers are often advocated as a solution because they don't require debug information, are fast to unwind with, and are (somewhat) reliable. But building with -fno-omit-frame-pointer costs the compiler a general-purpose register (GPR) and may cause more spills to the stack.
Distributions have come under pressure to go against the GCC and Clang default (-fomit-frame-pointer with optimization) to facilitate profiling and in some cases debugging. Notably Fedora and Ubuntu changed their defaults a few years ago.
Florian Weimer recently found that even with -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer, unwinding can be defeated by optimisations like shrink-wrapping (which GCC 16 trunk will do more often on x86-64).
Users are generally more willing to accept slightly increased disk space than decreased runtime performance for a use case they may not care about.
SFrames provide just the information needed for fast unwinding, in a compact representation, without costing a GPR.
Patches
GNU Binutils
Modern versions of sys-devel/binutils already support SFrames, but sys-devel/binutils-2.45, to be released on 2025-07-27, will add support for relocatable links, which is useful for the kernel.
Indu Bhagat has additional patches currently being upstreamed on a branch (the aim is to get them all in for 2.45).
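As a rough way to see whether the installed assembler emits SFrame data, the sketch below builds a trivial object with -Wa,--gsframe and looks for a .sframe section with readelf -S. The helper name has_sframe and the throwaway test file are made up for illustration. The check only says whether the section exists, not whether its contents are usable.

```shell
#!/bin/sh
# Sketch: report whether an ELF object carries a .sframe section.
# has_sframe is a hypothetical helper name, not part of binutils.
has_sframe() {
    # readelf -S lists section headers; -W avoids line wrapping.
    readelf -S -W "$1" 2>/dev/null | grep -q '[.]sframe'
}

# Build a trivial object while asking gas for SFrame data, then check
# the result. Skipped entirely when no C compiler is available.
if command -v cc >/dev/null 2>&1; then
    tmp=$(mktemp -d)
    printf 'int f(int x) { return x + 1; }\n' > "$tmp/t.c"
    if cc -O2 -Wa,--gsframe -c "$tmp/t.c" -o "$tmp/t.o" 2>/dev/null \
        && has_sframe "$tmp/t.o"; then
        echo "SFrame: present"
    else
        echo "SFrame: absent (assembler or target may lack support)"
    fi
    rm -rf "$tmp"
fi
```

readelf and objdump from sufficiently recent binutils also accept --sframe to dump the section contents.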
For multilib (32-bit x86 builds on amd64), it can be awkward to pass specific flags, so the following patch makes gas only warn (rather than error) when -Wa,--gsframe is passed for a target without SFrame support:
binutils-warn-on-no-sframe.patch
--- a/gas/dw2gencfi.c
+++ b/gas/dw2gencfi.c
@@ -2617,8 +2617,10 @@ cfi_finish (void)
alignment);
output_sframe (sframe_seg);
}
- else
- as_bad (_(".sframe not supported for target"));
+ else {
+ as_warn (_(".sframe not supported for target"));
+ return;
+ }
}
if ((all_cfi_sections & CFI_EMIT_debug_frame) != 0)
Bugs
- https://sourceware.org/PR33125 ("SFrame: provide a way to negate --gsframe in gas")
- https://sourceware.org/PR33126 ("SFrame: add a --enable-sframe (or similar) configure option to default-enable --gsframe in gas")
- https://sourceware.org/PR33127 ("FAIL: Link eh-group.o to eh-group ...")
- https://sourceware.org/PR33131 ("Failed assertion when linking gccgo")
Kernel
The patches are available combined at https://github.com/thesamesam/linux/tree/sframe-combined. A hacked-up sys-kernel/vanilla-kernel with the patches applied is available in sam's overlay.
- unwind_user: x86: Deferred unwinding infrastructure
- unwind_deferred: Implement sframe handling
- x86/vdso: VDSO updates and fixes for sframes
- v6: https://lore.kernel.org/all/20250425023750.669174660@goodmis.org/ (pending refresh AFAIK)
perf
dev-util/perf will be one of the main consumers of SFrames.
The patches are available combined at https://github.com/thesamesam/linux/tree/sframe-combined. A hacked-up dev-util/perf with the patches applied is available in sam's overlay.
- perf: Support the deferred unwinding infrastructure
glibc
glibc needs a way to unwind for the purposes of backtrace() (and some other more complex uses).
sys-libs/glibc-2.42 will be released on 2025-08-01. The plan is to get the unwinder changes in for that.
- glibc: Add SFrame support for stack backtracing
Impact
TODO: disk space measurements
Testing
To check whether perf is asking the kernel for a deferred callchain (a suggestion from perf maintainer Namhyung Kim):
user $
perf record -g -vv true |& grep defer
defer_callchain 1
defer_callchain 1
defer_callchain 1
defer_callchain 1
defer_callchain 1
defer_callchain 1
defer_callchain 1
defer_callchain 1
As long as "switching off deferred callchain support" doesn't appear, it should be fine.
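The two conditions above can be folded into one small filter. This is only a sketch: check_defer is a made-up name, and it assumes the verbose output contains the "defer_callchain 1" and "switching off deferred callchain support" strings exactly as quoted on this page.

```shell
#!/bin/sh
# Sketch: classify the verbose output of `perf record -g -vv ...`.
# check_defer is a hypothetical helper; the matched strings are the
# ones quoted on this page and may differ between perf versions.
check_defer() {
    input=$(cat)   # read all of stdin once so it can be searched twice
    if printf '%s\n' "$input" \
        | grep -q 'switching off deferred callchain support'; then
        echo "disabled"
        return 1
    fi
    if printf '%s\n' "$input" | grep -Eq 'defer_callchain[[:space:]]+1'; then
        echo "requested"
        return 0
    fi
    echo "unknown"
    return 2
}

# Example use (needs perf, so it is left commented out here):
#   perf record -g -vv true 2>&1 | check_defer
```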
To check whether the kernel is actually providing a deferred callchain:
user $
perf report -D | grep -A5 CALLCHAIN_DEFERRED
82795152792808 0x8a0 [0x38]: PERF_RECORD_CALLCHAIN_DEFERRED(IP, 0x2): 2554639/2554639: 0
... FP chain: nr:0
 ... thread: true:2554639
 ...... dso: <not found>
0x8d8@perf.data [0x48]: event: 9
[...]
82795153058898 0xe10 [0x50]: PERF_RECORD_CALLCHAIN_DEFERRED(IP, 0x2): 2554639/2554639: 0
... FP chain: nr:0
 ... thread: true:2554639
 ...... dso: /usr/lib64/ld-linux-x86-64.so.2
0xe60@perf.data [0x30]: event: 4
--
CALLCHAIN_DEFERRED events: 8 (25.0%)
FINISHED_ROUND events: 1 ( 3.1%)
ID_INDEX events: 1 ( 3.1%)
THREAD_MAP events: 1 ( 3.1%)
CPU_MAP events: 1 ( 3.1%)
EVENT_UPDATE events: 1 ( 3.1%)
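For scripting, the event summary can be reduced to a single number. The sketch below assumes the summary line has the "CALLCHAIN_DEFERRED events: N (P%)" shape seen in the sample output, and that the raw dump comes from perf report -D. deferred_count is a made-up name.

```shell
#!/bin/sh
# Sketch: print how many CALLCHAIN_DEFERRED events a raw perf dump reports.
# deferred_count is a hypothetical helper; the summary-line format is
# taken from the sample output on this page and is an assumption.
deferred_count() {
    # Match only the summary line (first field is the event name), not
    # the per-event PERF_RECORD_CALLCHAIN_DEFERRED records.
    awk '$1 == "CALLCHAIN_DEFERRED" && $2 == "events:" { n = $3 }
         END { print n + 0 }'
}

# Example use (needs perf and a perf.data file, so left commented out):
#   perf report -D | deferred_count
```

A result of 0 would suggest the kernel did not deliver any deferred callchains for that recording.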
Another example, with it working, looks like:
user $
perf record -g -- perf bench sched messaging
user $
perf report -s dso,sym -g none | grep -F -e Children -e '[.]' | head
Warning: 1630 out of order events recorded.
# Children      Self  Shared Object         Symbol
    12.36%    12.36%  libc.so.6             [.] __cxa_finalize
     2.71%     2.71%  ld-linux-x86-64.so.2  [.] do_lookup_x
     1.35%     1.35%  ld-linux-x86-64.so.2  [.] _dl_lookup_symbol_x
     1.00%     1.00%  ld-linux-x86-64.so.2  [.] _dl_relocate_object_no_relro
     0.64%     0.64%  perf                  [.] receiver
     0.62%     0.62%  perf                  [.] bench_sched_messaging
     0.58%     0.58%  libc.so.6             [.] __syscall_cancel
     0.53%     0.53%  libc.so.6             [.] __run_exit_handlers
     0.51%     0.51%  libc.so.6             [.] cfree@GLIBC_2.2.5
And with it broken, it looks like:
user $
perf record -g -- perf bench sched messaging
user $
perf report -s dso,sym -g none | grep -F -e Children -e '[.]' | head
# Children      Self  Shared Object        Symbol
    32.54%    29.70%  libc.so.6            [.] __cxa_finalize
    31.99%     0.00%  libstdc++.so.6.0.34  [.] __gxx_personality_v0
    25.81%     0.00%  libc.so.6            [.] __libc_start_call_main
    25.81%     0.00%  perf                 [.] 0x000055871450e0ec
    25.81%     0.00%  perf                 [.] 0x000055871459545e
    25.80%     0.00%  perf                 [.] 0x000055871459513e
    25.80%     0.00%  perf                 [.] 0x000055871460324c
    19.42%     0.00%  [unknown]            [.] 0000000000000000
    18.83%     0.53%  libc.so.6            [.] __syscall_cancel