Package testing

Like all healthy software projects, developing Gentoo requires lots of testing. This article provides information for ebuild developers on testing ebuilds.

It is preferred that arch-specific ebuild testing takes place on a real system, inside a chroot, or within another type of non-virtualized container. Virtualization may be acceptable in situations where it is not possible or practical to test on real hardware.

Testing can be separated into generic "version bump" tests, stabilization work, and keywording work. For stabilization, it is important that the test environment only has stable packages installed, with no unstable ebuilds present (keyworded or unmasked). In any case, the test environment should be up to date, and it is recommended to have as few packages installed as possible, which helps in finding missing dependencies. Always aim to test each version bump as thoroughly as possible.
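As a rough illustration of the stable-only requirement, the following Python sketch flags entries in an ACCEPT_KEYWORDS-style value that would let unstable packages in. The function name and the sample values are hypothetical, and a real check would also need to look at per-package keyword and unmask overrides.

```python
def unstable_entries(accept_keywords):
    """Return the tokens of an ACCEPT_KEYWORDS-style string that accept testing keywords.

    Assumption: a token starting with "~" (for example "~amd64" or "~*") or the
    special token "**" allows non-stable ebuilds.
    """
    return [tok for tok in accept_keywords.split() if tok.startswith("~") or tok == "**"]


# Hypothetical values for illustration only.
print(unstable_entries("amd64"))         # stable-only value
print(unstable_entries("amd64 ~amd64"))  # value that also accepts testing ebuilds
```

An empty result suggests the global setting is suitable for stabilization work; anything else is worth reviewing before testing starts.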

make.conf & test.conf
The file should have settings similar to the following:

It's generally recommended to run your own or test system with very basic settings, then manipulate the package that's being tested via. The exception is running a tinderbox that tries to catch all errors.

This is easy to achieve with arch-testing tools such as pkg-testing-tools, and it takes little effort even when editing manually.

Then our

The QA_ checks are from. All flags and features are explained in make.conf(5) (man 5 make.conf).

Note that iwdevtools can be used separately through its provided scripts; it doesn't need to be integrated with portage hooks to be effective.

Testing
Each ebuild in Gentoo is different and therefore requires a slightly different approach to stabilization. Consider the following guidelines for each class of package, and use common sense when in doubt.

USE flags
While it is preferable to test every USE flag combination, this is not always possible or appropriate. The ebuild may have a large number of USE flags, a long compile time, or the stabilization in question may just not call for it.

In cases where not every USE flag combination is being tested, it is still recommended to test:


 * With all USE flags enabled.
 * With all USE flags disabled.
 * With the default USE flag settings.

Note that arch-testing tools, such as tatt and pkg-testing-tools, provide functionality to test random USE flags.
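The three recommended runs, plus a few random combinations, could be generated along these lines. This is a minimal sketch: the function name, the flag names, and the fixed seed are assumptions made for illustration, and it ignores constraints such as REQUIRED_USE that may make some combinations invalid for a given ebuild.

```python
import random


def use_flag_runs(iuse, defaults, extra_random=0, seed=0):
    """Build USE strings to test: all enabled, all disabled, defaults, then random picks."""
    flags = sorted(iuse)
    runs = []
    # The three baseline runs; duplicates are skipped (e.g. when nothing is enabled by default).
    for candidate in (set(flags), set(), set(defaults) & set(flags)):
        if candidate not in runs:
            runs.append(candidate)
    rng = random.Random(seed)
    # Never ask for more runs than there are distinct combinations.
    target = min(len(runs) + extra_random, 2 ** len(flags))
    while len(runs) < target:
        candidate = {flag for flag in flags if rng.random() < 0.5}
        if candidate not in runs:
            runs.append(candidate)
    # Render each run as a USE string, with "-flag" for disabled flags.
    return [" ".join(f if f in run else "-" + f for f in flags) for run in runs]


# Hypothetical flags for illustration only.
for use in use_flag_runs(["ssl", "doc", "test"], defaults=["ssl"], extra_random=2):
    print(f'USE="{use}"')
```

Seeding the random generator keeps the extra runs reproducible, which helps when a failing combination has to be retested later.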

Runtime testing
Consider the level of runtime testing that is required for the target package. Remember, the focus of stabilization is to integrate a testing ebuild into stable, not to identify routine upstream bugs or regressions; that is the purpose of the ebuild's 30-day wait time while it's marked ~ (unstable).

The level of runtime testing required will vary wildly based on a variety of factors. Consider the following examples:


 * Multiple days of "normal use" testing may be appropriate for a new version of.
 * Basic functionality testing, such as browsing some web pages, may make sense for a new version of.
 * Passing tests might be enough for.
 * A leaf package such as may not require any runtime testing at all.

Libraries
A new library version may introduce incompatibilities with reverse dependencies. Where there is a risk of such breakage, each stable reverse dependency must be rebuilt. Beware of reverse dependencies that only use the library conditionally (e.g. ).
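The selection of reverse dependencies to rebuild can be sketched as follows. The function name, the tuple layout, and the package atoms are all hypothetical; the list of reverse dependencies itself has to come from elsewhere.

```python
def revdeps_to_rebuild(revdeps, arch):
    """Pick the reverse dependencies that are stable on `arch`.

    `revdeps` is an iterable of (atom, keywords, use_flag) tuples, where
    `use_flag` is None for an unconditional dependency, or the USE flag that
    must be enabled for the package to use the library at all.
    """
    selected = []
    for atom, keywords, use_flag in revdeps:
        # "~arch" is a different token, so testing-only packages are skipped.
        if arch in keywords.split():
            selected.append((atom, use_flag))
    return selected


# Hypothetical reverse dependencies for illustration only.
sample = [
    ("app-misc/foo", "amd64 ~arm64", None),
    ("app-misc/bar", "~amd64", None),
    ("app-misc/baz", "amd64", "ssl"),
]
for atom, use_flag in revdeps_to_rebuild(sample, "amd64"):
    note = f' with USE="{use_flag}"' if use_flag else ""
    print(f"rebuild {atom}{note}")
```

Carrying the conditional USE flag along with the atom is a reminder that a rebuild without that flag enabled would not exercise the new library version.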

Kernel
Kernel ebuilds referenced in the Handbook have certain exemptions from the usual stabilization policy, so stabilization requests are normally only filed for the first version in a long term stable branch (subsequent versions can be stabilized at the discretion of the maintainer).

First, test all available kernel options:

If that succeeds, build with a normal kernel configuration:

After reboot, check for anything strange and use the system as normal, trying to get a bit of uptime.

If stabilizing a special feature variant such as, try to test relevant features.

Toolchain
New versions of toolchain packages can often introduce major changes and widespread breakage into the Gentoo ebuild repository. The purpose of a stabilization request for a toolchain package is to test the package itself on each architecture - not to detect build failures in miscellaneous packages. It is expected that such failures are managed and resolved by the maintainer (normally through tracker bugs and tinderboxing) prior to filing a stabilization request.

See the Toolchain Project's notes for more.

Once the normal testing is successful, rebuild  (or   if the hardware permits) and once successful, observe the system in normal operation for abnormalities.

pkg-testing-tools
Install the tool.

pkg-testing-tools is an alternative to tatt. Its strengths are ease of use and JSON-format reports. Please refer to upstream for documentation, or use.

pkg-testing-tools doesn't require any configuration to be usable. It is very simple and therefore has a shallow learning curve. Integrating it into daily ebuild development work is strongly encouraged.

At minimum you should have a file with some test-specific settings defined. Please see.

By default, pkg-testing-tools disables some USE flags, such as debug and doc, that aren't too important when making sure the source code compiles. However, when using the tool on version bumps, to make sure everything in your ebuild works, it may be a good idea to expand the pool of USE flags being tested. It's easy to edit the source afterwards, but there are also example patches available that can be dropped into. By default, the tool also disables the --autounmask feature, which makes sense for stabilization work since it could accidentally autounmask ~unstable packages. When the tool is used for version bumps, however, you may want autounmasking to resolve different USE flag combinations automatically. The patches link mentioned above also has an example that enables autounmasking with pkg-testing-tools.

One downside compared to tatt is that pkg-testing-tools doesn't have a way to test reverse dependencies; tatt has a built-in feature for that. You can always find the reverse dependencies of a package from https://qa-reports.gentoo.org/ or simply from https://packages.gentoo.org/. There's a simple script to pair with https://qa-reports.gentoo.org/ but it's not suitable for a stabilization workflow. can be further paired with the two to find stable reverse dependencies of an atom.

Example commands:

Uses specific settings only for the atom under test, does one run with FEATURES="test", and does at most 6 runs with different USE flag combinations.

Uses specific settings only for the atom under test, never does a run with FEATURES="test", does 12 runs with different USE flag combinations, appends USE="-profile" globally, and writes a report into.

tatt
tatt is a tool designed to automate some of the repetitive tasks involved in arch testing. Currently only version 9999 supports working with a git ebuild repository and the Bugzilla atom field.

For each job, tatt produces a series of scripts allowing the user to control exactly what is performed:

Configuration
tatt has a variety of configuration options (see ), but there are a few that must be set to ensure useful operation of all functions.

Sample workflow
First, start a new job:

Now the various scripts are available for use:

Next, build the package and perform whatever testing is necessary:

A report is also produced summarizing the build status of each USE flag combination:

Once everything looks good, commit the keyword change:

Finally, update the bug and clean up the job:

QA violations
Most of these violations will be detected automatically using the testing tool, but are also described here for completeness.


 * Does not respect the CC variable (see Removing native symlinks, ).
 * Does not respect the CFLAGS variable (see ).
 * Does not respect the LDFLAGS variable (see ).
 * Bundled symbols (see Why not bundle dependencies, ).
 * Insecure symbols.
 * Installs documentation outside of
 * ELF files found in
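One possible way to spot a build system that ignores CFLAGS by hand is to add a harmless marker flag to CFLAGS and then look for compile commands in the build log that lack it. The sketch below assumes the build log shows full command lines; the function name, the compiler list, and the -DQA_MARKER flag are hypothetical and not part of any existing QA tool.

```python
import shlex

# Assumed compiler names; prefixed variants such as "x86_64-pc-linux-gnu-gcc" also match.
COMPILERS = ("gcc", "g++", "cc", "c++", "clang", "clang++")
PREFIXED = tuple("-" + name for name in COMPILERS)


def compile_lines_missing_flag(log_lines, marker):
    """Return compile commands (compiler invoked with -c) that do not carry `marker`."""
    missing = []
    for line in log_lines:
        try:
            words = shlex.split(line)
        except ValueError:
            continue  # unbalanced quotes: not a plain command line
        if not words:
            continue
        tool = words[0].rsplit("/", 1)[-1]  # drop any directory part
        is_compiler = tool in COMPILERS or tool.endswith(PREFIXED)
        if is_compiler and "-c" in words and marker not in words:
            missing.append(line)
    return missing


# Hypothetical build log excerpt for illustration only.
log = [
    "x86_64-pc-linux-gnu-gcc -O2 -pipe -DQA_MARKER -c foo.c -o foo.o",
    "gcc -O3 -c bar.c -o bar.o",
    "make[1]: Entering directory '/tmp/x'",
]
for line in compile_lines_missing_flag(log, "-DQA_MARKER"):
    print("CFLAGS not respected:", line)
```

A build that hides its command lines (for example with quiet make output) would need verbose output enabled before a scan like this can find anything.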

Architecture-specific notes
A number of items described in earlier sections, such as checking of reverse dependencies and miscellaneous QA checks, are architecture-neutral. At a stabilization level, the primary responsibility for carrying out these checks rests on the first architecture to stabilize an ebuild. Subsequent architectures may assume that these checks have been completed and skip them if they wish.

The devmanual also covers this topic.

amd64

 * Any developer may perform stabilization - it is not necessary to be on the arch team.
 * must be added to the FEATURES variable in the file.

arm
The ARM project supports four variants: armv4, armv5, armv6, and armv7. Where possible for fragile packages, trying  for each variant is encouraged.

x86

 * Any developer may perform stabilization - it is not necessary to be on the arch team.
 * It is acceptable to stabilize in an chroot on.
 * It is generally acceptable to stabilize a package with only a build test on if it is already stable on.