Project Talk:Quality Assurance/Backtraces

From Gentoo Wiki

Core dump naming and save path

Talk status
This discussion is still ongoing.

The article currently says:

«…a core file that might be called either "core" or "core.pid"
(where pid is replaced with the actual pid of the program that died).»

It should instead read something like:

By default the core dump file is written to the current working directory of the process (often, but not always, the home directory of the user running the program). The name of the core dump file is controlled by the "kernel.core_pattern" sysctl setting, whose default value is "core". Remember: to make a change permanent, put your value into /etc/sysctl.conf.

The second notable setting, "kernel.core_uses_pid", is the most straightforward, but not the only, way to add a PID suffix. It is disabled by default, so if a process crashes several times in the same directory (see bug #504760 for an example), you will only see the last core file.

There is another, better way to customize the core dump file name.

I prefer the following value. It not only puts the name of the crashed program and the number of the signal that killed it into the file name, but also writes cores into a dedicated directory. That saves searching for cores and also catches crashes that would otherwise be missed (again, see, and maybe follow, bug #504760):

kernel.core_pattern = /tmp/cores/core_%e-%s.%p

See man 5 core for a detailed explanation and alternatives. The dedicated directory for cores must be fully accessible to everybody (if the user has no write access to the directory, the core dump will probably not be written). To create it in /tmp I use the following script:

/etc/local.d/mk_core_dir.start 
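#!/bin/sh
# Create the world-writable core dump directory at boot.
# -p avoids an error if the directory already exists.
mkdir -p -m 0777 /tmp/cores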

You may want to use a permanent directory instead, for example /var/tmp/cores.

Alternatively, you can manage the directory using the utilities from sys-apps/opentmpfiles.
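
To make the effect of such a pattern concrete, here is a minimal sketch of how a core_pattern template turns into a file name. The function predict_core_name is a hypothetical helper written for this comment, not part of any tool. It handles only the %e, %s and %p specifiers (man 5 core lists more) and mimics the rule that kernel.core_uses_pid appends .PID when the pattern contains no %p. The program name vim and the PID 4553 are made-up values.

```shell
#!/bin/sh
# Hypothetical helper: predict the core file name the kernel would use.
# Usage: predict_core_name PATTERN EXE SIGNAL PID [CORE_USES_PID]
# Only %e (executable name), %s (signal number) and %p (PID) are handled.
predict_core_name() {
    pattern=$1
    exe=$2
    sig=$3
    pid=$4
    uses_pid=${5:-0}

    # With core_uses_pid enabled and no %p in the pattern, ".PID" is appended.
    case $pattern in
        *%p*) ;;
        *)
            if [ "$uses_pid" -ne 0 ]; then
                pattern="$pattern.%p"
            fi
            ;;
    esac

    printf '%s\n' "$pattern" |
        sed -e "s|%e|$exe|g" -e "s|%s|$sig|g" -e "s|%p|$pid|g"
}

predict_core_name '/tmp/cores/core_%e-%s.%p' vim 6 4553   # /tmp/cores/core_vim-6.4553
predict_core_name 'core' vim 6 4553 1                     # core.4553
predict_core_name 'core' vim 6 4553 0                     # core
```

The last two calls show why the plain "core" default loses earlier dumps: without a PID in the name, every crash in the same directory maps to the same file.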

After configuring this you could (and probably should) verify that it works. To do so, run a program (for example, a text editor), terminate it with a signal whose default action writes a core, for example signal 6 (see man 1 kill and man 7 signal for details and alternatives), and look at the resulting core file.
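
This check can be sketched as a small script. It uses sleep as a stand-in for the crashing program (an assumption; any program will do) and assumes the /tmp/cores directory from the pattern above. A process killed by signal 6 (SIGABRT) is reported by the shell with exit status 134, that is 128 + 6. Whether a core file actually appears depends on the core size limit and on the sysctl settings.

```shell
#!/bin/sh
# Allow core files in this shell; this may fail if the hard limit is 0.
ulimit -c unlimited 2>/dev/null || echo "cannot raise core size limit" >&2

# Start a throwaway process and kill it with SIGABRT (signal 6).
sleep 30 &
victim=$!
kill -s ABRT "$victim"
wait "$victim"
status=$?

echo "exit status: $status"

# List the dedicated directory, if it exists, to look for the new core file.
ls -l /tmp/cores 2>/dev/null || echo "no /tmp/cores directory found"
```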

While performing this check I was surprised to find that the dedicated core dump directory needs the execute bit set (needed for cd?).

--Anarchist 2014

I have emailed the QA team. Hopefully they can look this over and (potentially) fix up the article... --Maffblaster (talk) 09:16, 5 March 2017 (UTC)

systemd/coredumpctl

Talk status
This discussion is still ongoing.

If anyone gets around to doing it: some instructions and details on how to configure and use systemd's coredump handling and coredumpctl would fit in nicely here.

--Eliasp (talk) 18:07, 29 December 2014 (UTC)

Getting useful backtraces for X errors [Please add this as a section]

Talk status
This discussion is still ongoing.

The problem with X errors is that the program using X usually just exits, either normally or with an error exit code, instead of crashing. This makes it non-obvious how to get a real backtrace in GDB. To achieve this, a number of additional steps have to be performed.

Also, a table of X error codes and request codes might come in handy.

As usual, make sure your program contains debug symbols:

root #CFLAGS="-Og -march=$YOUR_ARCH -pipe -ggdb" CXXFLAGS="$CFLAGS" FEATURES="nostrip" emerge --oneshot --nodeps $YOUR_PACKAGE

(--nodeps is used to save time. It is assumed the package has been installed normally, and does not lack any dependencies.)

Run gdb as follows (using hexchat as an example):

CODE
someuser@somemachine ~ $ export GDK_SYNCHRONIZE=1
someuser@somemachine ~ $ gdb hexchat
[...]
(gdb) break _XError
(gdb) run --sync
Starting program: /usr/bin/hexchat --sync
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7fffe35d1700 (LWP 4563)]

(hexchat:4553): Gdk-WARNING **: GdkWindow 0x56000ed unexpectedly destroyed

Thread 1 "hexchat" hit Breakpoint 1, _XError (dpy=dpy@entry=0x73f080, rep=rep@entry=0x14e0b70)
    at /var/tmp/portage/x11-libs/libX11-1.6.3/work/libX11-1.6.3/src/XlibInt.c:1396
1396    /var/tmp/portage/x11-libs/libX11-1.6.3/work/libX11-1.6.3/src/XlibInt.c: File or directory not found.
(gdb) set logging file backtrace.log
(gdb) set logging on
Copying output to backtrace.log.
(gdb) thread apply all bt full

Thread 2 (Thread 0x7fffe35d1700 (LWP 5197)):
#0  0x00007ffff55ae80d in poll () from /lib64/libc.so.6
No symbol table info available.
#1  0x00007fffe8e2558b in ?? () from /usr/lib64/libpulse.so.0
No symbol table info available.
[...]

(gdb) set logging off
Done logging to backtrace.log.
(gdb) quit
A debugging session is active.

        Inferior 1 [process 32061] will be killed.

Quit anyway? (y or n) y
someuser@somemachine ~ $

Setting a breakpoint at _XError is the key!

As you can see, the backtrace might not be particularly useful yet, because the libraries have not been compiled with debug information. The simplest way to solve that is to find the packages those libraries belong to (e.g. for libc and libpulse):

user $equery belongs /lib64/libc.so.6 /usr/lib64/libpulse.so.0 [...]
sys-libs/glibc-2.23-r2 (/lib64/libc.so.6 -> libc-2.23.so)
sys-libs/glibc-2.23-r2 (/lib64/libc-2.23.so)
media-sound/pulseaudio-8.0 (/usr/lib64/libpulse.so.0.19.0)
media-sound/pulseaudio-8.0 (/usr/lib64/libpulse.so.0 -> libpulse.so.0.19.0)
[...]

and then re-emerging all those libraries with debug information too:

root #CFLAGS="-Og -march=$YOUR_ARCH -pipe -ggdb" CXXFLAGS="$CFLAGS" FEATURES="nostrip" emerge --oneshot --nodeps =sys-libs/glibc-2.23-r2 =media-sound/pulseaudio-8.0 [...]

Of course, splitdebug can be used too.
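
For the re-run, the interactive session shown earlier can also be scripted. The following is a sketch, assuming a gdb that supports the -batch, -ex and --args options; the function name xerror_backtrace is made up for illustration. Because libX11 is usually not loaded yet when the breakpoint is requested, the sketch turns pending breakpoints on so that gdb keeps the breakpoint in batch mode.

```shell
#!/bin/sh
# Hypothetical wrapper: run a program under gdb, stop at _XError and
# write a full backtrace of all threads to backtrace.log.
# Usage: xerror_backtrace PROGRAM [ARGS...]
xerror_backtrace() {
    GDK_SYNCHRONIZE=1 gdb -batch \
        -ex 'set breakpoint pending on' \
        -ex 'break _XError' \
        -ex 'run' \
        -ex 'thread apply all bt full' \
        --args "$@" > backtrace.log 2>&1
}

# Example call, matching the session above:
#   xerror_backtrace hexchat --sync
```

Note that backtrace.log will then also contain the program's own output, not only what gdb prints.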

Now re-running gdb will result in a useful stack trace in the backtrace.log file, which can be used in bug reports.

-- The preceding unsigned comment was added by Evi1M4chine (talk • contribs)

Article description property

Talk status
This discussion is done as of 2018-04-12.

I would add an "Article description" property right at the beginning of this article, so we can use the See also templates in other articles linking to this one. Fturco (talk) 16:56, 12 April 2018 (UTC)

Done. Thanks, Fturco! --Maffblaster (talk) 18:46, 12 April 2018 (UTC)

Using compiler flags with portage

Talk status
This discussion is still ongoing.

There is content in the package.env wiki article that illustrates building packages with custom compiler flags. This article implies that custom flags are needed, but gives no practical examples of how to apply them, whether manually, using ebuild, or with Portage (emerge).

Should that content be replicated here, or linked to if it is useful? -- The preceding unsigned comment was added by Liamdennehy (talk • contribs)
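
As a starting point for such an example, here is a sketch of the package.env approach. It writes into a scratch directory (the DEMO_ROOT variable is made up for this sketch) rather than the real /etc/portage, so it can be tried without root. The file name debugsyms.conf and the choice of media-sound/pulseaudio as the target package are illustrative assumptions; the flags are the ones used elsewhere on this page.

```shell
#!/bin/sh
# Scratch copy of the Portage configuration tree; use /etc/portage for real.
DEMO_ROOT=${DEMO_ROOT:-$(mktemp -d)}
mkdir -p "$DEMO_ROOT/env"

# Environment file: extra flags applied only to packages that reference it.
cat > "$DEMO_ROOT/env/debugsyms.conf" <<'EOF'
CFLAGS="${CFLAGS} -ggdb"
CXXFLAGS="${CXXFLAGS} -ggdb"
FEATURES="${FEATURES} nostrip"
EOF

# package.env maps a package atom to one or more files under env/.
echo "media-sound/pulseaudio debugsyms.conf" >> "$DEMO_ROOT/package.env"

echo "wrote example configuration to $DEMO_ROOT"
```

With files like these in /etc/portage, a plain emerge --oneshot of the listed package would pick up the extra flags without touching the global make.conf.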

kde-plasma/drkonqi instead of kde-base/drkonqi

Talk status
This discussion is still ongoing.

In the section Project:Quality_Assurance/Backtraces#KDE_crash_handler.27s_notes, the article says "KDE-based applications runs [or better "run" ??] by default with their own crash handler, which is presented by the user by the means of "Dr. Konqi" if it's installed (the package is either kde-base/kdebase or kde-base/drkonqi". This seems to refer to older versions of the KDE packages: currently emerge finds no match for kde-base/kdebase or kde-base/drkonqi, whereas kde-plasma/drkonqi and kde-apps/kdebase-meta do match. Akar (talk) 15:21, 26 June 2020 (UTC)

system-wide ulimit that works even when ssh-ing to it

Talk status
This discussion is still ongoing.

Just place a line like rc_ulimit="-c unlimited" in /etc/rc.conf, assuming you don't want to set UsePAM yes in /etc/ssh/sshd_config. With that in place, ulimit -c no longer shows 0 after ssh-ing to the box. Note: /etc/security/limits.conf would work with PAM, but /etc/limits.conf had no effect either way. --Okwtwnow (talk) 19:23, 3 December 2020 (UTC)
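
A quick way to check the result after logging in is sketched below. The function name core_limit_status is made up; it only interprets the output of ulimit -c in the current shell and treats a limit of 0 as disabled.

```shell
#!/bin/sh
# Report whether the current shell's core file size limit allows core files.
core_limit_status() {
    limit=$(ulimit -c)
    if [ "$limit" = "0" ]; then
        echo "disabled"
    else
        echo "enabled ($limit)"
    fi
}

core_limit_status
```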

Adding links to relevant resources

Talk status
This discussion is still ongoing.

This page is referenced in the output of equery u foo, but it doesn't link to other relevant documentation:

  • Debugging — general advice on how to enable debugging
  • /etc/portage/package.env — can contain files to be called during the installation of specific packages, or files used to set Portage's environment variables on a per-package basis.

Senoraraton (talk) 21:56, 11 April 2023 (UTC)

Change sys-devel/gdb to dev-debug/gdb

gdb changed category in January from sys-devel to dev-debug. Maybe this wiki page should reflect the change?

See commit https://gitweb.gentoo.org/repo/gentoo.git/commit/dev-debug/gdb?id=7b205f67aa1e81d5665d2d88132ee9ce195f852a of the gentoo repo

--MarmotteCurieuse (talk) 10:03, 26 February 2024 (UTC)

Typo

Talk status
This discussion:
  • provides a proposal that can be merged as is, but as of 2024-05-14,
  • that proposal has not remained unchanged and uncontested for 30 days.

"gdb" should be "gcc":

The first is to tell Portage to not strip binaries at all, by adding nostrip to FEATURES. This will leave the installed files exactly as gdb created them, with all the debug information and symbol tables, which increases the disk space occupied by executables and libraries.

Proposed changes to section Stripping - Please make edits here until a final revision is agreed upon.

This will leave the installed files exactly as gcc created them

Waldo Lemmer 15:30, 14 May 2024 (UTC)