Future EAPI/Alternate EAPI mechanisms

This is a working page for collecting the different proposals on how the EAPI should be specified in the future. See also the previous discussion.

= Parse the EAPI assignment statement =

This first proposal would require that the syntax of the EAPI assignment statement in ebuilds matches a well-defined regular expression. A scan of the Portage tree shows that the statement only occurs in the following variations (using EAPI 4 as an example):
 * EAPI=4
 * EAPI="4"
 * EAPI='4'

Sometimes this is followed by whitespace or a comment (starting with a # sign). Also, with very few exceptions, the EAPI assignment occurs within the first few lines of the ebuild. For the vast majority of ebuilds it is on line 5.

Written in a more formal way, appropriate for a specification:
 * Ebuilds must contain at most one EAPI assignment statement.
 * It must occur within the first N lines of the ebuild (N=10 and N=30 have been suggested).
 * The statement must match the following regular expression (extended regexp syntax):
   ^[ \t]*EAPI=(['"]?)([A-Za-z0-9._+-]*)\1[ \t]*(#.*)?$

Note: The first and the third point are already fulfilled by all ebuilds in the Portage tree. The second point will require very few ebuilds to be changed (9 packages for N=10, or 2 packages for N=30).
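
The three rules above could be checked roughly as follows. This is only a sketch in bash, not taken from any package manager: the function name probe_eapi and the default of N=30 are assumptions. The backreference in the regular expression is written out as three alternatives, because backreferences are not portable in bash's extended regexps. An ebuild without any assignment is reported as EAPI 0.

```shell
# Sketch only: probe_eapi and the default N=30 are illustrative choices.
probe_eapi() {
    local file=$1 max_lines=${2:-30}
    local token='[A-Za-z0-9._+-]*'
    # Same language as the proposed regexp; the \1 backreference is expanded
    # into the double-quoted, single-quoted and unquoted alternatives.
    local re="^[[:blank:]]*EAPI=(\"(${token})\"|'(${token})'|(${token}))[[:blank:]]*(#.*)?\$"
    local line count=0 found=0 eapi=0

    while IFS= read -r line || [[ -n $line ]]; do
        count=$((count + 1))
        if (( count > max_lines )); then break; fi
        if [[ $line =~ $re ]]; then
            found=$((found + 1))
            # More than one assignment violates the first rule.
            if (( found > 1 )); then return 1; fi
            # Exactly one of the three inner groups is non-empty.
            eapi=${BASH_REMATCH[2]}${BASH_REMATCH[3]}${BASH_REMATCH[4]}
        fi
    done < "$file"

    printf '%s\n' "$eapi"
}
```

Calling probe_eapi on an ebuild prints the probed EAPI, and the function returns a non-zero status if more than one assignment is found within the first N lines.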

The package manager would determine the EAPI by parsing the assignment with the above regular expression. As a sanity check, it would compare this parsed value with the EAPI variable after sourcing the ebuild. Citing Zac Medico: "The fact that we can compare the probed EAPI to the actual EAPI variable after the ebuild is sourced seems like a perfect sanity check. We could easily detect inconsistencies and flag such ebuilds as invalid, providing a reliable feedback mechanism to ebuild developers."
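
A minimal sketch of such a sanity check in bash is shown below. The helper name check_eapi_consistency is hypothetical, and a real package manager would source the ebuild inside its own prepared environment rather than a bare subshell.

```shell
# Sketch: compare a probed EAPI with the value bash sees after sourcing.
# check_eapi_consistency is a hypothetical helper, not part of any PM.
check_eapi_consistency() {
    local probed=$1 ebuild=$2 actual
    # Source in a subshell so the ebuild cannot alter the caller's environment.
    actual=$(
        unset EAPI
        source "$ebuild" >/dev/null 2>&1
        printf '%s' "${EAPI-0}"    # no assignment at all means EAPI 0
    )
    if [[ $probed != "$actual" ]]; then
        printf 'invalid: probed EAPI "%s" but sourcing set "%s"\n' \
            "$probed" "$actual" >&2
        return 1
    fi
}
```

An ebuild that assigns EAPI=4 near the top but reassigns it later (for example inside a conditional) would be flagged as invalid, because the probed and the sourced values differ.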

Upgrade path

 * For a transition period, new ebuilds must still be parseable by old package managers. Changes to global-scope behaviour must account for this.

Pros

 * No modifications to the tree required.
 * Uniform style of assignment throughout all EAPIs.
 * Once the transition period is past and this is the required method, it allows deploying new bash version requirements, changes in default bash shopt settings, etc.

Cons

 * Cannot be used in languages other than bash (unless they allow for multiline comments).
 * Bash allows the EAPI variable to be assigned more than once. The mechanism therefore has to specify search rules (for instance, that the last matching assignment is the one used), and each PM has to check after sourcing that the EAPI set in bash matches the parsed value. There is definite potential for implementation bugs, and for divergent behaviour between the three PMs in this case.
 * File/libmagic would have to duplicate the regex to pull the EAPI out of the content in order to identify a file as an ebuild. Not the simplest thing, and there is definite potential for bugs.

= EAPI in header comment =

A different approach would be to specify the EAPI in a specially formatted comment in the first line of the ebuild's header. The syntax could be one of the following:
 * The word "ebuild", followed by whitespace, followed by the EAPI, followed by end-of-line or whitespace. Corresponding regexp:
   ^\# *ebuild[ \t]([A-Za-z0-9._+-]*)($|[ \t])
 * The word "EAPI", followed by an equals sign ("="), followed by the EAPI, followed by end-of-line or whitespace. Corresponding regexp:
   ^\# *EAPI=([A-Za-z0-9._+-]*)($|[ \t])
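
As an illustration, a package manager could probe the first line roughly like this. The sketch is in bash and accepts either of the two candidate syntaxes; the function name probe_header_eapi is an assumption, and only one syntax would be chosen in an actual specification.

```shell
# Sketch only: probe_header_eapi is a hypothetical name; it accepts both
# candidate header syntaxes, whereas a real spec would settle on one.
probe_header_eapi() {
    local file=$1 first_line
    local token='[A-Za-z0-9._+-]*'
    local re_ebuild="^# *ebuild[[:blank:]](${token})(\$|[[:blank:]])"
    local re_eapi="^# *EAPI=(${token})(\$|[[:blank:]])"

    # Only the first line of the file is ever inspected.
    IFS= read -r first_line < "$file" || [[ -n $first_line ]] || return 1
    if [[ $first_line =~ $re_ebuild || $first_line =~ $re_eapi ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
        return 0
    fi
    return 1    # no EAPI header on the first line
}
```

Because the file is never sourced, the same probe would work for ebuilds written in any language that uses '#' for comments.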

Upgrade path

 * The usual EAPI assignment statement in the ebuild would still be required for a transition period. A sanity check similar to the one mentioned above would be added.
 * Alternatively, the EAPI variable could be made read-only immediately, if a statement like one of the following was added to ebuilds:
   : ${EAPI=5}
   ${EAPI} || { EAPI=-1; return; }
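
The first of these statements works because the ${EAPI=5} expansion only assigns when the variable is unset. The following bash sketch shows both cases; how a new package manager would actually set up the read-only variable is an assumption here.

```shell
# Old PM: EAPI is unset when the ebuild is sourced, so the default is assigned.
old_pm=$( unset EAPI; : ${EAPI=5}; printf '%s' "$EAPI" )

# New PM (assumed behaviour): EAPI was taken from the header comment and made
# read-only before sourcing; the expansion then performs no assignment at all,
# so no "readonly variable" error is triggered.
new_pm=$( EAPI=5; readonly EAPI; : ${EAPI=5}; printf '%s' "$EAPI" )

printf 'old PM sees EAPI=%s, new PM sees EAPI=%s\n' "$old_pm" "$new_pm"
```

Both package managers end up with the same value, which is what makes the statement usable during a transition.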

Pros

 * Clean and well-defined solution when the transition is finished.
 * Can be used not only for bash, but also for other languages that use '#' based commenting.
 * New atom syntax can be deployed.
 * Once the transition period is past and this is the required method, it allows deploying new bash version requirements, changes in default bash shopt settings, etc.
 * File/libmagic would be able to reliably look for the header marker to identify the content as an ebuild.

Cons

 * Duplicate EAPI specification (header comment and bash assignment) in ebuilds during a transition period.
 * Current-day PMs can invalidly process this and assume they know the EAPI: under current rules, lack of an EAPI assignment means EAPI=0. After the transition period of duplicate assignments is over, this will break any older PMs that try processing ebuilds using just an EAPI header comment.
 * It may be counterintuitive to developers that editing a comment can break things if the EAPI line is not formatted appropriately.
 * Backwards incompatible modifications to version specification cannot be deployed via this.
 * EAPI is currently defined as a string, meaning it can contain spaces, random characters, etc. EAPI would now have to be limited to just what the regex allows. Not a huge con, but it's a reduction from existing behaviour (and some user may have EAPI='my special eapi' in use).

= EAPI in header comment and one-time change of file extension =

As before, but combined with a one-time change of the file extension, like .ebuild → .eb. (Note: It was pointed out that "eb" should be avoided because of its meaning in Russian.)

The EAPI variable could be made read-only in bash before sourcing the ebuild.

Pros

 * Allows for changes of global scope behaviour.
 * Can be used not only for bash, but also for other languages.
 * The transition period is effectively closed-ended, rather than open-ended as in the previous form.
 * New atom syntax can be deployed.
 * File/libmagic would be able to reliably look for the header marker to identify the content as an ebuild.

Cons

 * Two different file extensions for ebuilds. After a lengthy transition period, the old .ebuild extension could be phased out, reducing it to just one.
 * As with the previous form, it may be counterintuitive to a developer that modifying a comment (if not appropriately structured) can break things.

= EAPI specified by a function =

For new EAPIs (5 and up), the syntax used is: eapi 5 || die

For existing EAPIs, we leave it as is, or require them to convert over after a couple of years if we truly care.

Upgrade path

For existing EAPIs, no modification is required. Longer term we may wish to convert them over, but it's not strictly required.

To get this deployed and usable, the '|| die' snippet ensures that current-day PMs won't invalidly use EAPIs that rely on this mechanism. In portage, for example, the die results in user-visible warnings. This is ugly, but portage ultimately masks the ebuild and thus behaves correctly.

Future versions of PMs should deploy the eapi function before EAPI 5 is introduced, so that the PM exits cleanly from sourcing and the warnings are avoided. This is not required, but it is advisable.
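
A package manager's eapi function might look roughly like the following bash sketch. The names SUPPORTED_EAPIS and eapi_probe, as well as the "unsupported" marker printed on stdout, are hypothetical and only serve the illustration; real package managers record metadata differently.

```shell
# Sketch only: SUPPORTED_EAPIS, eapi_probe and the output format are
# hypothetical, not the interface of any existing package manager.
SUPPORTED_EAPIS="0 1 2 3 4 5"

eapi_probe() {
    local ebuild=$1
    (
        eapi() {
            local known
            for known in $SUPPORTED_EAPIS; do
                if [[ $1 == "$known" ]]; then
                    EAPI=$1
                    return 0
                fi
            done
            # Unknown EAPI: leave the sourcing subshell quietly, so that
            # neither "|| die" nor any later syntax is ever evaluated.
            printf 'unsupported %s\n' "$1"
            exit 0
        }
        die() { printf 'die called\n'; exit 1; }
        unset EAPI
        source "$ebuild"
        printf 'EAPI %s\n' "${EAPI-0}"
    )
}
```

A package manager without the eapi function would instead fail on the unknown command and run the '|| die' branch, which is the source of the warnings mentioned above.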

Pros

 * Uses standard bash syntax.
 * For PMs that support the eapi function, the manager can exit quietly w/out issue if it doesn't support that EAPI.
 * If the PM does support that EAPI (the common path), no extra work was expended since it's checked during metadata regeneration (which already requires sourcing the ebuild).
 * Current-day package managers will properly skip over this when they see it; there is no required transition period, nor potential for older PMs to see it and invalidly think they know the EAPI.
 * New atom syntax can be deployed.
 * Bash shopt settings, new bash version requirements, etc, can be deployed w/out a transition period as long as that functionality is used *after* the eapi function invocation.

Cons

 * While current-day package managers will properly handle this and mask the ebuild, it does induce ugly warnings. Not required, but it would be advisable to deploy the eapi function before eapi5 to minimize people seeing those warnings.
 * Backwards incompatible modifications to raw version specification cannot be deployed via this.
 * If a developer uses latest-bash version syntax prior to the eapi invocation, the syntax complaints can be visible to users. Addressable via enabling 'set -e' till the eapi function has been called (although this solution requires discussion/commentary from the community at large).
 * File/libmagic would have to look for the bash invocation '^ *eapi [^\n ]+ *\|\| *die *$' to flag this as an ebuild. Not the simplest check.

= GLEP 55/EAPI in filename =

The relevant specification is GLEP 55. For the sake of accuracy, this is included since it is a technically viable alternative.

Summarizing, the proposal is that the EAPI be folded into the file name. Rather than portage-2.2.ebuild with an internal variable assignment of EAPI=3, we would instead name the file portage-2.2.ebuild-3.
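
Extracting the EAPI from such a file name needs no access to the file contents at all, as the following bash sketch shows. The helper name eapi_from_filename, its return codes, and the set of characters allowed in the EAPI suffix (borrowed from the regexps earlier on this page) are assumptions; GLEP 55 itself is the authority on the exact naming rules.

```shell
# Sketch: read the EAPI from a GLEP 55 style file name.
# eapi_from_filename and its return codes are hypothetical.
eapi_from_filename() {
    local name=${1##*/}                      # drop any directory part
    local re='^(.+)\.ebuild-([A-Za-z0-9._+-]+)$'
    if [[ $name =~ $re ]]; then
        printf '%s\n' "${BASH_REMATCH[2]}"
    elif [[ $name == *.ebuild ]]; then
        return 2   # classic name: the EAPI must come from inside the file
    else
        return 1   # not an ebuild name at all
    fi
}
```

For portage-2.2.ebuild-3 this prints 3, while a plain portage-2.2.ebuild yields return code 2 so that the caller can fall back to the existing in-file mechanism.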

Upgrade path

Assuming an audit of portage's manifest verification doesn't turn up anything, this is deployable effectively immediately, with a few niggles (see Cons).

Pros

 * While an audit is required to verify past portage behaviour for manifest validation, this should be deployable for users immediately.
 * New atom syntax can be deployed.
 * New versioning specifications can be deployed; for example, having '-r1' be allowed to be '-r1.0'.
 * New bash version requirements, changes in default bash shopt settings, etc, can be immediately deployed in an EAPI.
 * Paludis/exherbo have been using a variation of this for ~3 years (they use a .exheres extension rather than .ebuild). The basics have been deployed in paludis for a long while.

Cons

 * Breaks current-day repoman manifest support. Fixable, but there is a flag day there. An audit is required to verify that this (specifically, introducing new files into the $PN directory) doesn't break older versions of portage doing verification.
 * .ebuild is a well-known extension.
 * Was already rejected by a previous council.
 * Subjectively speaking, there is no agreement as to whether or not EAPI in a filename is 'correct' or not. Best summed up as a heated difference of subjective design views.
 * While this has been proposed for 5 years, discussion is fairly heated and divided on this proposal to this day. Achieving community-level agreement on this is exceedingly unlikely; this mechanism is likely to be agreed to in gentoo only via council decree.
 * Existing PMs that are able to report "unable to use ebuild xyz due to eapi" no longer would report it; more generally, current-day PMs wouldn't even know the ebuild is there. There may be unknown implications to this lingering in the implementations of the 3 main PMs.
 * File/libmagic would have to look for other metadata vars to identify this as an ebuild. Doable, but it would make the patterns sensitive to our variable naming and has the potential for bugs due to it relying on regex'ing bash source.
 * While paludis has supported this for a long while, it's worth noting that exherbo actually has only one version, exheres-0, which they've modified rather than cutting new versions (akin to the pre-EAPI days for portage where new functionality was deployed unversioned). Real-world usage of this, and experience with its implications, is likely limited (paludis/exherbo devs should expand this section as appropriate).