GLEP:73

Abstract
This GLEP proposes using automated solving to satisfy REQUIRED_USE constraints. It lists the problems with the current handling of REQUIRED_USE, and explains how auto-solving would solve them. It specifies the algorithms that can be used to verify the constraints, automatically solve them and check whether they can be solved. It provides the rationale for all the design decisions, and considers the compatibility with the PMS, and between the constraints and the package managers before and after the GLEP is used.

The issues with REQUIRED_USE
REQUIRED_USE has been introduced in EAPI 4 as a solution to the problem of enforcing specific relations between USE flags. According to the Portage documentation on REQUIRED_USE, it has been specifically targeted as a more data-oriented and machine-friendly alternative to verifying the validity of USE flag choice in ebuild phases.

At the moment of writing, REQUIRED_USE is used in around 25% of the ebuilds in Gentoo. It is an obligatory part of some eclasses, e.g. in the Python ecosystem. Its uses include improving clarity of user choices, simplifying ebuilds via copying upstream feature dependencies and enforcing valid data for USE dependencies. Nevertheless, a number of developers raise strong arguments against using REQUIRED_USE.

The commonly noted disadvantages of REQUIRED_USE are:
 * 1) Unsatisfied REQUIRED_USE constraints unnecessarily (and sometimes frequently) require explicit user action, even if there is no real gain from the user explicitly selecting. For example, if a package supports building either against Qt4 or Qt5, and user has enabled the flags for both, the package manager would request him to disable one of the flags for the package. For most of the cases, just using the newer version would be more friendly.
 * 2) Satisfying REQUIRED_USE usually requires altering flags via permanent configuration. Those alterations can become obsolete over time and without proper maintenance can put system into a suboptimal configuration. For example, if a Python package requires enabling a non-default Python target, then the leftover flag can keep forcing an obsolete Python version when the package gains support for the default target.
 * 3) The machine-oriented form of REQUIRED_USE constraints can result in confusing and unreadable output to the user, especially for complex constructs and cross-dependent constraints. Bad output can result in the user being unable to solve the problem at all, or to solve it in a satisfactory way (i.e. without disabling the features he needs). It can also cause frustration if satisfying REQUIRED_USE requires more than one attempt.

Those arguments have resulted in a number of developers avoiding REQUIRED_USE. For example, the Qt team policies discourage using it unless absolutely necessary. The attempts of avoiding REQUIRED_USE sometimes result in suboptimal descriptions of USE flags or even inconsistent use of them.

The providers problem
A very specific case of a problem where REQUIRED_USE has some use is the providers problem. That is, whenever a package has a feature that can be supplied by more than one library of choice, and the user needs to choose between the providers. The exact form of this problem depends on the number of providers and whether the feature is optional.

The commonly used solutions include:
 * Using one or more binary flags to toggle between the providers (with number of the flags < number of providers). This is most readable with only two providers, e.g. with USE=libressl meaning use LibreSSL instead of OpenSSL, and USE=-libressl meaning use OpenSSL. For packages with optional SSL/TLS feature, there is also an additional USE=ssl to toggle that feature, and with USE=-ssl, the libressl flag is meaningless (ignored). This is usually the least intrusive method but it's unreadable and causes the flags to be confusing.
 * Using unary flags for providers along with REQUIRED_USE. In this case, each provider gets an explicit flag and REQUIRED_USE is used to force selecting exactly one of them. For optional feature, there is either an additional feature flag or it is disabled when all providers are disabled. This is usually the most readable solution although it frequently requires adjusting flags.
 * Using unary flags without REQUIRED_USE. In this case, if user selects more than one provider (or does not select any), the package decides which one is preferred and uses that. For optional feature, again there could be either an additional feature flag or it could be disabled by disabling all the providers. This is less intrusive than the previous solution but it's less readable (it is unclear which provider is actually used) and unsuitable for USE dependencies.

As noted, all of the mentioned solutions have their specific disadvantages. This causes different developers to use different solutions for specific problems. Sometimes, it could go as bad as to have more than one solution applied to a single problem, or different concepts used inconsistently by different developers.

Automatic solving as the solution
This GLEP focuses on the idea of establishing automated solving of REQUIRED_USE as a solution to the forementioned issues. In this context, REQUIRED_USE is extended not only to specify what combinations of USE flags are valid but also how to proceed from a disallowed flag set to one that satisfies the constraints.

This clearly resolves the first two issues with REQUIRED_USE. Since REQUIRED_USE mismatches are solved automatically, there is no explicit user interaction required. No changes are done in configuration files — since the solving is meant to be deterministic, the package manager can recalculate the effective USE flag set using the input USE flag set and the REQUIRED_USE constraint.

The third disadvantage is partially solved. Since there is no necessity for the user to perform any action, there is also no necessity of explaining the constraints to the user. However, for practical uses it may be deemed appropriate to explain to the user why a particular flag has been enabled or disabled.

Solving the most common problems with REQUIRED_USE makes it possible to extend its use cases to the areas where developers so far rejected to use it, or did not even think of using it. This includes working towards a single solution to the providers problem. Given that REQUIRED_USE no longer requires altering the configuration to select between multiple allowed providers, we can reasonably work towards using the middle solution consistently — that is, having clear unary flags for every provider, and using REQUIRED_USE to automatically transform inconclusive input into a single implementation.

Furthermore, the non-intrusive version of REQUIRED_USE could be used extensively to conditionally mask meaningless flags and map equivalent flag sets into a single common set of choice. This can further improve readability (by making flags clearly indicate what it used, e.g. by disabling all SSL/TLS provider flags when SSL/TLS is disabled) and improve compatibility between binary packages (by reducing the number of incompatible USE flag sets).

Restrictions on REQUIRED_USE format
The REQUIRED_USE format is defined by the PMS. This specification requires the following additional restrictions being enforced:


 * An any-of group (||), at-most-one-of (??) and an exactly-one-of (^^) group can contain only flat USE flag items. In particular, no other group can be nested inside it.
 * All-of groups are forbidden inside REQUIRED_USE (they have no use now).

As a result, unlimited nesting is allowed only for use-conditional groups. All other constructs are kept flat. This serves the following goals:


 * avoiding unpredictable results of automatic flag adjustments,
 * improving readability of REQUIRED_USE constraints,
 * keeping the specification and implementation relatively simple.

Processing algorithm
The existing package managers have to validate REQUIRED_USE constraints while evaluating the dependency graph. The current validation action is replaced by the following algorithm:


 * 1) Check whether the REQUIRED_USE constraint is satisfied by the USE flags enabled by the current user configuration. If it is, accept the package (the algorithm stops).
 * 2) Check whether the REQUIRED_USE constraint matches restrictions set in . If it does not, report a REQUIRED_USE mismatch and abort.
 * 3) Find all any-of (||), at-most-one-of (??) and exactly-one-of (^^) groups inside REQUIRED_USE and reorder (sort) them according to the algorithm defined below.
 * 4) Attempt to solve the REQUIRED_USE constraint using the algorithm defined below. If the attempt succeeds, accept the package with the set of USE flags determined by the solver.
 * 5) If the attempt at solving failed, report a REQUIRED_USE mismatch and abort.

REQUIRED_USE verification algorithm
The verification algorithm is implied by the meanings of REQUIRED_USE constructs as defined by the PMS. It is repeated here for completeness and for reuse in further algorithms.

The REQUIRED_USE constraint is considered satisfied if all the top-level items evaluate to true. An item evaluates to true if, depending on the item type:
 * A USE flag name that is not prefixed by an exclamation mark evaluates to true if the named flag is enabled. Accordingly, a USE flag name that is prefixed by an exclamation mark evaluates to true if the named flag is disabled.
 * For a USE-conditional group the condition needs to be tested first (according to the same rule). If the condition evaluates to true, the USE-conditional group is true only if all items in it evaluate to true. If the condition evaluates to false, the USE-conditional group always evaluates to true and the items inside it need not to be tested.
 * An any-of group (||) evaluates to true if at least one of the items in it evaluates to true.
 * An exactly-one-of group (^^) evaluates to true if exactly one of the items in it evaluates to true, and all the remaining items evaluate to false.
 * An at-most-one-of group (??) evaluates to true if at most one of the items in it evaluates to true.

Constraint group reordering algorithm
The constraint solving algorithm is built on prefer leftmost assumption for all any-of, exactly-one-of and at-most-one-of groups. That is, if the constraint is not satisfied by the current set of enabled USE flags, the algorithm prefers enforcing the leftmost constraints and disabling rightmost.

Due to different system profiles, it might be impossible to automatically solve the constraint using the leftmost flag specified by ebuild (e.g. when it is masked). In order to account for this, the specification provides a group reordering (sorting) phase before the solving algorithm.

The reordering applies to any-of, exactly-one-of and at-most-one-of groups. Per the format restriction, each group can only contain flat USE flags.

For each of the items in the group, if the item names a forced/masked USE flag:
 * if the item evaluates to true according to the flag's value, it is moved to the leftmost position in the group,
 * if the item evaluates to false according to the flag's value, it is moved to the rightmost position in the group,

Relative positions of multiple forced/masked flags are of no relevance since those flags are not altered.

This reordering ensures that if a flag is forced, it is always preferred over other choices; and if it is masked, it is never preferred. This makes it possible to easily account for all possible cases without having to provide a detailed algorithm to handle various possible results.

REQUIRED_USE solving algorithm
If the REQUIRED_USE constraint is not satisfied according to the initial set of USE flags implied by the configuration, the package manager attempts to alter the USE flags according to REQUIRED_USE.

Before solving, a set of immutable flags is determined based on forced and masked USE flags. If a flag is either forced or masked, it is marked immutable and the algorithm can not alter its value. If a particular rule would cause the flag to be altered, the solving is aborted and an error is reported.

The solving algorithm is applied at least once, and the REQUIRED_USE is rechecked after each application. The package manager may support running multiple iterations of the algorithm, in which case it needs to either limit the allowed number of iterations or abort after obtaining one of the values previously given by the algorithm (hitting an infinite loop).

In order to enforce REQUIRED_USE, each top-level item in REQUIRED_USE that did not evaluate to true needs to be enforced. All items are enforced in order, left to right. Depending on the item type, enforcing implies:
 * For a USE flag name that is not prefixed by an exclamation mark, the named flag is enabled. If it is prefixed by an exclamation mark, the named flag is disabled.
 * For a USE-conditional group, the condition (LHS) is evaluated first. If the condition evaluates to true, all the items inside the group are enforced, in order. If it evaluates to false, the group is skipped.
 * For an any-of group that did evaluate to false, the first (left-most) item in the group is enforced.
 * For an at-most-one-of group that did evaluate to false, the first (left-most) item that evaluates to true needs to be determined first. Afterwards, all items following it are negatively-enforced (forced to evaluate to false).
 * An exactly-one-of group is equivalent to a conjunction of an at-most-one-of group and an any-of group. That is, if all items evaluate to false, the rule for any-of is applied. If more than one item evaluates to true, the rule for at-most-one-of is applied.

The negative enforcing action can be applied to plain USE flag names only. If the name is not prefixed by an exclamation mark, then the flag is disabled. If the name is prefixed by an exclamation mark, it is enabled appropriately.

Context to QA checks
All of the QA checks are performed in context of a specific set of forced and masked USE flags, called immutable flags. All of the checks need to be repeated for every set. Since they can alter the preferences inside any-of, at-most-one-of and exactly-one-of groups, it may also be necessary to perform a separate transformation for each set.

The complete set of immutable flag combinations can be obtained using the following algorithm:
 * let U be the set of all USE flags (both explicit IUSE and implicit) that are used in REQUIRED_USE,
 * for every enabled profile:
 * let I1 be the effective, , , values that apply to the package and affect flags in U,
 * let I2 be the effective, , , values that apply to the package and affect flags in U,
 * add I1 to the result set,
 * if package has any stable keywords, combine I1 and I2, and add the result to the result set.

Afterwards, all checks should be performed for all unique values in the result set.

Requirements for REQUIRED_USE constraints
In order to verify the ability to solve REQUIRED_USE reliably, the QA check tools should ensure that the following conditions are met:


 * 1) no valid combination of USE flags can result in the constraint requesting the same flag to be simultaneously both enabled and disabled;
 * 2) no valid combination of USE flags (that is, not prohibited by immutable flags) can attempt to alter immutable flags;
 * 3) no constraint in REQUIRED_USE may alter flags in such a way that any of the constraints preceding it would start to apply and change the resulting flags in a second iteration.

Concept for transforming REQUIRED_USE into implications
The algorithms used to verify REQUIRED_USE rely on them being expressed in a flat implication form. In this form, the constraints are expressed as zero or more implications. Each implication specifies zero or more conjunctive conditions, and one or more effects. It is equivalent to a nested USE-conditional group. If all of the conditions are met, the effects are applied.

If a constraint is valid, then the solutions of its transformation are the same as of the original.

By idea, the transformation consists of the following steps:
 * 1) Reordering all any-of (||), at-most-one-of (??) and exactly-one-of (^^) groups according to the reordering algorithm.
 * 2) Replacing all any-of (||), at-most-one-of (??) and exactly-one-of (^^) groups according to the following transformations:
 * 3) Creating an ordered directed graph linking all nested conditions to their effects.
 * 4) Traversing all the paths from the topmost graph nodes to the deepest, in order.
 * 1) Creating an ordered directed graph linking all nested conditions to their effects.
 * 2) Traversing all the paths from the topmost graph nodes to the deepest, in order.
 * 1) Traversing all the paths from the topmost graph nodes to the deepest, in order.

For example, an ordered graph is provided for the following REQUIRED_USE constraint:

a b? ( c? ( d !b ) d? ( e ) ) b? ( f )

Nodes and edges are numbered to explain the ordering. Furthermore, the final (effect) nodes are colored red.



Traversing this graph produces the following paths, in order:
 * 1) a(1)
 * 2) b(2) → c(3) → d(4)
 * 3) b(2) → c(3) → !b(5)
 * 4) b(2) → d(6) → e(7)
 * 5) b(8) → g(9)

Those paths are roughly equivalent to the following USE-conditional group constructs:

Except that the value of b for constraint 4 is considered from the initial value rather than the one possibly altered by constraint 3. Constraint 5 uses a separate condition, and so uses the new value of b.

Algorithm for transforming REQUIRED_USE into implications
Steps 2 through 4 of the fore-mentioned transformation can be performed using the following recursive function. It should be applied to every top-level REQUIRED_USE item, in order.

It should be noted that for the purpose of distinguishing separate branches, all the condition objects need to have an unique identity. In Python this occurs naturally via instantiating an object. In other languages an explicit unique identifier may need to be included.

function transform(item, conditions=[]): if item is a USE flag: append (conditions, item) to the results if item is a USE-conditional group: new_conditions := conditions + [item.condition] for subitem in item.subitems: call transform(subitem, new_conditions) if item is an any-of (||) group: n := len(item.subitems) - 1 # (last index) new_conditions := conditions for f in item.subitems[1..n-1]: new_conditions += [!f] append (new_conditions, item.subitems[0]) to the results if item is an at-most-one-of (??) group: n := len(item.subitems) - 1 # (last index) for i := 0 .. n-1: new_conditions := conditions + [item.subitems[i]] for f in item.subitems[i+1..n]: append (new_conditions, !f) to the results if item is an exactly-one-of (^^) group: apply the logic for an any-of (||) group apply the logic for an at-most-one of (??) group

QA check logic
The logic for the reference algorithm is split into four split functions:
 * 1) verifying that the constraints do not alter immutable flags,
 * 2) verifying that the conditions for the constraints are not self-conflicting,
 * 3) verifying that no two constraints will attempt to force opposite values for a single flag,
 * 4) verifying that no constraint will meaningfully enable any of the constraints preceding it.

All checks but the second are an exact mapping of the requirements. The second check additionally catches meaningless constraints, and enforces a precondition for sanity of the checks following it.

The first two checks can be performed on every flat constraint separately. The third check is symmetric and needs to be performed on all unique pairs of flat constraints. The fourth check needs to be performed for all combinations of constraints and the constrains preceding them. For practical reasons, the same pairs can be used for the third and fourth check.

Verifying that constraints do not alter immutable flags
This check is performed separately for every flat constraint in the list.

A flat constraint is determined to alter immutable flags if both of the following conditions occur:
 * the condition for the constraint can be true,
 * that is, none of the sub-conditions references an immutable flag whose immutable state causes the expression to unconditionally evaluate to false,
 * the enforcement for the constraint references an immutable flag whose immutable state is the negation of the enforcement.

Verifying that constraints are not self-conflicting
This check is performed separately for every flat constraint in the list.

A flat constraint is determined to be self-conflicting if the following condition occurs:
 * for any sub-condition C in condition:
 * there is at least one sub-condition in condition that is the negation of C.

Restrictions for allowed REQUIRED_USE syntax
The specification imposes a number of arbitrary restrictions to REQUIRED_USE syntax, in particular by restricting the possible nesting and disallowing other complex constructs. The main goal is to simplify the algorithms used and make the results more obvious. This is at cost of prohibiting constructs that are rarely used, and usually could be replaced by simpler and more readable constructs.

Nested any-of, at-most-one-of, exactly-one-of groups
The first and most important restriction is that nesting of any-of, at-most-one-of and exactly-one-of groups is forbidden. While technically such constructs could work, some of them are not really meaningful and others are really confusing. At the time of writing, nested ||/??/^^ groups were used in exactly two Gentoo packages. The specific uses were:


 * 1) app-admin/bacula:
 * 2) dev-games/ogre:

The first use is not very complex, and indicates that either exactly one of the database providers need to be selected, or the bacula-clientonly flag needs to be used. However, at a first glance a user might be confused that the database ^^ constraint needs to be applied independently of the bacula-clientonly flag. The same construct can be expressed in a more straightforward way:

!bacula-clientonly? ( ^^ ( mysql postgres sqlite ) )

The second use is much more confusing. It means that both gl3plus and either of the gles2 or gles3 flags can not be enabled at the same time. However, gles2 and gles3 can be enabled simultaneously. The same construct can be expressed in a more straightforward way as:

gl3plus? ( !gles2 !gles3 )

As can be seen, in both cases the alternative constructs were both more readable and shorter than the nested expressions. In the first case, it is also the more natural way of expressing the problem. While replacing expressions that have more than two subexpressions would be harder, there were no uses of such expressions so far, and the potential ambiguity makes them unlikely to appear.

All-of groups
The second restriction imposed by this GLEP is disallowing all-of groups. The PMS allows them anywhere but in reality they are only meaningful inside ||, ?? and ^^ groups (elsewhere they do not have any effect, and can be inlined into parent block). Inside those groups, they imply that the item is considered matched only if all items inside the all-of group match.

The meaning of all-of groups inside || is pretty clear. However, inside ?? and ^^ some confusion may occur. In particular, for a general case of:

?? ( a ( b c ) )

the constraint only affects the combination of all flags inside the all-of group. In this case, enabling a prohibits having the combination of both b and c enabled. However, either b or c can be enabled separately without affecting a. This makes this constraint unlikely to have real use cases, and if it has, they are unlikely to be the most natural way of expressing the problem.

Furthermore, automatic solving of such constraints forces some implicit ambiguity. Since both (multiple) flags have to be enabled together to cause a particular item to match, there are multiple solutions of forcing an item not to match. For the fore-mentioned sample, having a enabled would require the solver to force ( b c ) not to match. To do this, the solver could either disable b, disable c or disable both flags.

There are arguments for both options — disabling only one flag follows the idea of 'smallest change needed'. Disabling both can be considered more consistent. In either case, there will be developers and user confused by the package manager relying on either behavior.

The all-of groups inside || do not suffer from the same issue since solving them does not require disabling anything. However, they also have seemingly low value and banning all-of groups altogether improves symmetry between the different group types.

Furthermore, the nested all-of groups make transformation into implication graph much more complex. Without them, the conditions are purely conjunctive. If we were to support all-of groups inside ||, ??, ^^ we would have to support disjunctive conditions, and transform them into conjunctive form.

The all-of groups were used in 5 different packages at the time of writing. Two of them were outside ||, ??, ^^, rendering them meaningless and probably accidental. The three remaining cases were:


 * 1) sci-chemistry/icm:
 * 2) media-sound/snd:
 * 3) app-i18n/ibus:

Of those cases, the first two can be replaced by pure, flat || and ?? groups appropriately. It furthermore indicates that all uses of all-of groups inside ^^ in Gentoo were purely mistaken.

The third case is potentially valid. It indicates that either deprecated or both gtk3 and introspection flags need to be enabled. However, it does not clearly indicate the preferred course of action. After investigating the ebuild in question, it is most likely that the following constraint would be more correct, and clearer to the user:

|| ( deprecated gtk3 ) gtk3? ( introspection )

That is, if user enables gtk3 and gtk3 requires introspection, then it seems more reasonable to enable introspection than to ignore the gtk3 flag and force deprecated module instead.

USE-conditionals inside ||, ??, ^^ groups
The last restriction forbids using USE-conditional groups inside any-of, at-most-one-of and exactly-one-of groups. Those indicate that some of the items inside the group are to be considered its members only if the relevant flags are enabled. They are logically equivalent to all-of groups, i.e.  and , except they have a different semantic — the latter form suggests enabling both flags, the former suggests considering bar only if foo is already enabled.

Supporting USE-conditional groups properly would most likely require splitting the parent group into multiple variants for different initial values of USE conditionals. Considering the above equality, it would also be inconsistent with the ban on all-of groups. Finally, those groups have little real value.

The only use case in Gentoo was in media-video/mpv:

opengl? ( || ( aqua egl X raspberry-pi !cli? ( libmpv ) ) )

It indicates that the OpenGL video output requires selecting one of the variants, with the libmpv variant being allowed only without CLI enabled. While this may be technically valid, it is confusing. Furthermore, other REQUIRED_USE constraints already require that either cli' or libmpv is enabled, making !cli imply libmpv. Therefore, the USE-conditional in the constraint is redundant.

Solving algorithm
The solving algorithm attempts to enforce REQUIRED_USE in the most natural way, interpreting the constraints as developer suggestions on how to make the constraint apply.

Application of different types of constraints
The algorithm aims to solve mismatched constraints in the most natural way, presuming that this interpretation is the most likely to be correct.

For the USE-conditional groups, it assumes that they mean if X is true, then Y should also be true. Appropriately, the algorithm does not alter the flag in the condition (X); instead, if the condition is true, it enforces the expression inside the group (Y).

For other groups, the algorithm applies the natural interpretation presuming that the items in group are stated in decreasing preference order, with the left-most item being most preferred. That is, if the group evaluates to false, it enforces a solution that either disables all enabled items except for the left-most already enabled, or enables the first item if no item is enabled.

Reordering of ||, ??, ^^ groups
The left-most-preferred assumption about the groups results in the solving algorithm relying on the ability to enable the item and disable other items. This is not possible if the relevant flag is masked, or (in cases of ??, ^^) some other flag is forced. If that were the case, the ordering inside those groups would have to be strictly limited by the 'common denominator' between the profiles. This would sometimes result in less preferred options being encouraged, or even impossible to express constraints — e.g. if the preferred implementation would not be stable but the package were stabilized.

To account for this, the groups are transformed to account for forced/masked (immutable) flags. The transformation is done through reordering the items because this keeps the specification as simple as possible. It does not to cover specifically how to interpret immutable flags in different kind of groups, and how to handle the groups afterwards. Instead, reordering results in the forced flags being preferred naturally, and the masked flags being discouraged naturally.

It also naturally handles the case when forced/masked flags result in impossible to satisfy constraints. Those cases do not need to be detected by the reordering algorithm implicitly, and instead just cause solver to fail early.

Left-to-right constraint application
The solving algorithm applies all changes necessary to enforce the constraints in order, left to right. Enforcing a specific ordering, combined with the PMS specifying how ebuild and eclass values for REQUIRED_USE are combined, makes the algorithm deterministic. Applying left-to-right is also the most natural way of doing it, making it easy for developers to predict the results.

Originally I had considered making the algorithm work independently of constraint order. However, this would clearly defining what the desired solution is, and finding an algorithm to enforce that. To achieve a deterministic solution, we would most likely have to require developers to provide groups that do not overlap. That is, for example:

a? ( !b ) b? ( c )

would be unacceptable since with both a and b flags enabled, the constraint would either enforce c or not, depending on the processing order. The developer would have to write:

a? ( !b ) !a? ( !b? ( c ) )

While this is a possible solution, expressing complex constraints would be very hard. Developers would no longer be able to naturally express the constraints, and instead would have to determine the correct sets of conditions for each requested result.

Single vs multiple iterations
This GLEP does not specifically restrict the implementations to doing simple or multiple iterations. Both options have their advantages.

A single iteration can successfully solve all valid REQUIRED_USE constraints, as long as they are properly ordered. An implementation using a single iteration has simpler error handling — it is only necessary to verify whether the REQUIRED_USE actually matches after enforcing it. It is also reasonable to request developers to order their constraints for a single iteration solving.

The advantage of using multiple iterations is that they can also solve wrongly ordered constraints. However, the implementation needs to account for the possibility of invalid (circular) constraints putting the solver in an infinite loop. For this reason, the solver needs to either limit the maximum number of iterations or store previous results and detect when the algorithm gives one of the previous results again.

For most of the real-life use cases, two iterations should be able to solve all the constraints. A large number of iterations is unlikely to be required by naturally written REQUIRED_USE constraints. It could be artificially caused by writing constructs like:

c? ( d ) b? ( c ) a? ( b )

The necessity of verification
The purpose of REQUIRED_USE constraint verification is to ensure that for all valid combinations of input USE flags, the solver will be able to find a valid solution. This needs to be done explicitly since complex REQUIRED_USE constraints may trigger solving issues with non-obvious USE flag combinations, causing the developers to miss the issue.

Since the solver must be able to deal with non-solvable constraints (by reporting them and letting the user deal with them), verification is not a strict necessity for enforcing REQUIRED_USE. However, it improves the user experience, and so is a worthwhile addition to the QA tools in place.

To provide the best coverage, it is beneficial to integrate the verification into the tools commonly used by developers — repoman and pkgcheck, including the CI runs. For this to be possible, the algorithm must meet two requirements:
 * it must be fast enough not to cause significant increase in repoman/pkgcheck run time for the full repository,
 * it must not trigger a large number of false positives, and if any are triggered, they should be easy to work around.

Context to the checks
As noted in the specification part, all of them checks need to be repeated for all possible sets of the immutable flags. This is necessary since the immutable flags can alter the solutions significantly. In particular:
 * they can alter the preferred choices in the any-of, at-most-one-of and exactly-one-of groups,
 * they can cause some of the constraints to be unable to be satisfied,
 * they can cause some of the USE-conditional groups to be disabled entirely.

To account for that and avoid the case where REQUIRED_USE solving would fail on some of the profiles, the verification should be performed for all combinations of immutable flags found throughout the enabled classes of profiles. Only the flags that apply to the REQUIRED_USE constraint in question need to be considered.

Due to the EAPI 5 stable masking, the immutable flags have to separately be calculated for ~arch and stable keywords. The stable variant does not need to be considered unless the package is actually stable or being stabilized, to avoid unnecessarily cluttering up and/or  for packages that are going to stay in ~arch.

The requirements for REQUIRED_USE
The rules imposed for verification aim to cover most of the common cases of unsolvable constraints. In particular:

no valid combination of USE flags can result in the constraint requesting the same flag to be simultaneously both enabled and disabled
 * If the effective REQUIRED_USE constraint (after collapsing all the groups) contains both foo and !foo, the verification will never consider the constraint met (since logically x ∧ ¬x is always false).

no valid combination of USE flags (that is, not prohibited by immutable flags) can attempt to alter immutable flags
 * This is implied by the immutability of masked/forced flags. An attempt to toggle those flags while solving should be considered a fatal error since //… always takes precedence over regular configuration and package-level toggles. Therefore, if such flags are enforced by an USE-conditional group, their condition should also be masked or forced appropriately.

no constraint in REQUIRED_USE may alter flags in such a way that any of the constraints preceding it would start to apply and change the resulting flags in a second iteration.
 * This is required for reliable single-pass solving. While the solving may work correctly with multiple iterations, the constraints can be reliably (and usually easily) fixed via reordering. More importantly, this also catches the constraints that can not be solved due to circular toggling between the constraints.
 * The addition condition for the second iteration change has been added to account for the common case of . While technically the second clause causes the first to start to apply, the second one already covers that case explicitly, so a second iteration would not change the result.

Transformation into implication form
The transformation of REQUIRED_USE into implication form is used to provide a form of the original constraint that is more convenient for analysis.

Firstly, the diverse (convenience) item types are all converted into a combination of implications and plain USE flags. The latter can express all the original constraints exactly, provided that the any reordering necessary is done prior to the transformation. As a result, we gain both simplified set of items that need to be considered, and a clear logical mapping of behavior associated any-of, at-most-one-of and exactly-one-of groups.

All of the transformed forms are built by definition, from the verification and solving algorithm:
 * Any-of group constraints are satisfied if at least one of the items match. Therefore, the solving only applies if none of them does, in which case the first item is enforced. Appropriately, the result of transformation is the enforcement of first item conditional to the negation of all other items (the condition for the first item is omitted as redundant — enforcing a flag that is already enabled does not change anything).
 * At-most-one-of group constraints are satisfied if no more than one item matches. The solving is applied if more than one item is enabled, in which case all but the first enabled item are forcibly disabled. Since disabling an already disabled flag does not change anything, this can be simplified to disabling all the remaining items if the left-most item is matched. The transformation does exactly that, for each item that can be possibly enabled, left-to-right.
 * Exactly-one-of group constraints are satisfied if exactly one item matches. Logically this is equivalent to both having at least one item and no more than one item matching. Therefore, this constraint can be converted into a combination of an any-of group and an at-most-one-of-group, for which the transforms are already defined.

Secondly, having limited the set of item types to just two, of which only one can be nested, the constraint can be easily converted into a graph. The resulting graph provides a clean visualization of the structure of the nested conditions. All nodes but the final (bottom-most) ones represent conditions, while the final nodes represent enforcements.

A plain graph could be used to visualize relations between different conditions and enforcements. However, the specifics of REQUIRED_USE processing, especially left-to-right processing, require that the transform preserves exact structure of the constraints.

Thirdly, having the graph (tree) of conditions, we can easily traverse them. In doing so, we construct paths that precisely express which conditions need to be met for a particular enforcement to apply. Since the constraints are applied in order, we need to traverse the graph in this specific order, and write the paths down in the same order.

In doing the two last steps, it is important that we preserve the identity of the original condition nodes. This is necessary to distinguish between two cases:

Since the solving algorithm is applied recursively to USE-conditional groups, in the first case the outer a condition is not reevaluated between processing b and c. In the latter case, the use of separate groups causes reevaluation of the condition.

While in this specific example there is no technical difference between the two forms, it becomes apparent when dealing with the following corner case:

In both cases, applying the first sub-item disables a. However, only in the second case will the solver reevaluate the value of a and omit the second group. A plain flattening would cause the same to incorrectly happen for the first case, rendering the transform not equivalent to the original form.

In order to prevent that from happening, the verification algorithms need to be able to determine that the a condition is the same node in both resulting flattened expressions, and appropriately account for the fact that it is not affected by the enforcement. In the reference implementation, this is done via preserving the identity of the node, and doing identity-wide node comparison.

Compliance with the PMS
This GLEP does not break the PMS compliance in any way. The syntax used by the constraints is a subset of the REQUIRED_USE syntax allowed by the PMS. The semantic extends the one defined in the PMS in non-conflicting way.

The PMS does not require a very specific behavior for REQUIRED_USE. The USE state constraints section requires that the package manager does not use (build/install) package versions where REQUIRED_USE constraints are not met.

However, it does not require the package manager to verbosely report the conflict which the package managers actually do. That considered, it should not cause any non-compliance if this verbose reporting is (partially) replaced by automatic solving. If the solving succeeds, the constraints are met and the package manager can proceed with building/installing the package. If it does not, the existing behavior of reporting the issue is preserved.

New constraints vs non-compliant package managers
This GLEP preserves full syntax compatibility with the existing package managers. The constraints written for auto-solving will still work correctly in the package managers not supporting it, resulting in regular REQUIRED_USE mismatch. Furthermore, the extended semantic meaning can result in improved readability of constraints, and therefore the messages issued by the package managers. Users aware of the auto-solving rules will have a suggested algorithm for satisfying REQUIRED_USE.

The only potential danger is that the auto-solving will result in more extensive use of REQUIRED_USE and less concern for whether they are satisfied by default, resulting in more frequent REQUIRED_USE mismatches. Avoiding this problem should be done on policy level, requiring the developers not to rely purely on auto-solving through a migration period.

Old constraints vs auto-solving
Most of the existing REQUIRED_USE constraints are already compatible with auto-solving. There are three problematic cases:


 * 1) constraints that are disallowed per the restrictions on REQUIRED_USE format,
 * 2) constraints that can not be solved by the algorithm,
 * 3) constraints that give sub-optimal (non-preferred) solutions.

While the impact and details differ for each case, it can be commonly noted that all of them can be reliably fixed before implementing auto-solving, and — as noted above — the fixes will not break existing package managers.

Constraints disallowed in this GLEP
For simplification, this GLEP will reject some of the REQUIRED_USE forms that are valid per the PMS. They will be rejected for all combinations of USE flags that do not satisfy the constraint. However, this is not a major issue for three reasons:


 * 1) The unsupported constraints are extremely rare, of low value and fixing them improves readability. As listed in rationale for the restrictions, there were a total of 8 packages being affected at the time of writing, and fixing them was already in progress.
 * 2) The constraints are only rejected for auto-solving but are still supported for REQUIRED_USE verification. The package manager will therefore just report the unsolvable REQUIRED_USE to the user, making this not a regression from the previous state.
 * 3) This GLEP does not strictly disallow the package manager from solving those constraints, only does not specify the solutions for them. Therefore, the package managers may implement custom extensions to solve them. However, they should still warn that this is non-portable and unreadable.

Constraints that can not be solved
Not all valid REQUIRED_USE constraints can be reliably solved. There are two major cases for that:


 * 1) Constraints that toggle flags that caused previous conditions not to apply. Solving those may require more than one iteration of the solving algorithm. However, they usually can be fixed easily by reordering.
 * 2) Constraints that have conflicts between flags. Solving those will result in repeated results where the constraint is unsatisfied. With multi-iteration solving, they can cause infinite loops. They have no trivial solution.

However, the problem usually applies to only some of the disallowed USE flag combinations. The verification algorithm should be able to detect most of those cases.

Constraints with sub-optimal solutions
While this specification uses an algorithm that attempts to read REQUIRED_USE constraints in the most natural way, not all constraints in Gentoo are written in this manner. Especially, many any-of, at-most-one-of and exactly-one-of groups are written with no specific ordering in mind. In some cases, they are used interchangeably with USE-conditional groups. Some USE-conditional groups are written without concern for clearly establishing the relation between the condition and the items inside the group.

While the auto-solving algorithm is able to solve many of those constraints, the solution can be considered sub-optimal as they do not follow the solution that the developers would knowingly suggest. For example, per the current rules the two following constraints are equivalent:

feature? ( dep ) !dep? ( !feature )

However, per the auto-solving semantic the first one will favor enabling the dependency, while the second one will favor disabling the feature.

This is probably the most important issue since there is no easy way to automatically detect that.

Proof-of-concept code
The reference implementation of various algorithms and the scripts used to test them are included in the required-use project on GitHub. However, it needs to be noted that they are still work-in-progress.

PkgCore
The implementation of the following parts of the specification have been submitted to the PkgCore package manager for inclusion:
 * validation of REQUIRED_USE constraints for compliance with the restricted syntax: pkgcheck PR#58.

All of those bits are actively used in the pkgcheck fork used for the Repository mirror & CI project for CI.

Copyright
This work is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/.