Project:Python/Adding and Removing Python implementations

From Gentoo Wiki

Adding new implementations to PYTHON_COMPAT

Initial notes on testing Python packages

To understand the problems of testing Python packages for new implementations, a few major points should be noted regarding the nature of the language:

  1. Python rarely catches issues at compile time. In fact, loading a module fails only on major syntax errors and on exceptions thrown during the execution of global-scope code, and only in the code branches actually visited. Even this applies only to files that are actually processed, e.g. byte-compiled Python modules and scripts that are actually run. Other Python files are usually not syntax-checked at all.
  2. Some degree of static code checking can be done using dev-python/pyflakes. This package can detect some common mistakes without actually executing any code. However, it is limited to mistakes that can be detected unambiguously; it cannot cover e.g. wrong types, incorrect parameters, or invalid attribute references.
  3. Most issues with Python code can be detected only through exceptions thrown while the particular code path is executed. Some issues occur only with very specific input (e.g. non-ASCII data, non-UTF8 data) and/or in a very specific environment (e.g. a non-UTF8 locale, or the Estonian or Turkish locales).
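
The first and third points can be illustrated with a minimal sketch. The source strings and the names `rarely_called`, `decode_name`, `compiles` and `failure_of` are made up for this illustration; the snippet assumes nothing beyond a Python 3 interpreter.

```python
# Hypothetical module source: syntactically valid, but partly broken.
VALID_SYNTAX_BROKEN_LOGIC = """
def rarely_called():
    # NameError, but only once this line is actually executed
    return undefined_name.missing_attribute

def decode_name(raw):
    # Works for UTF-8 input, fails only for very specific data
    return raw.decode("utf-8")
"""

INVALID_SYNTAX = "def broken(:\n    pass\n"


def compiles(source):
    """Return True if the source byte-compiles, False on a syntax error."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


def failure_of(func, *args):
    """Call func and return the name of the exception raised, or None."""
    try:
        func(*args)
    except Exception as exc:
        return type(exc).__name__
    return None


# Compiling and executing the global scope both succeed for the broken module.
namespace = {}
exec(compile(VALID_SYNTAX_BROKEN_LOGIC, "<example>", "exec"), namespace)

print(compiles(VALID_SYNTAX_BROKEN_LOGIC))                    # True
print(compiles(INVALID_SYNTAX))                               # False
print(failure_of(namespace["rarely_called"]))                 # NameError
print(failure_of(namespace["decode_name"], b"caf\xc3\xa9"))   # None
print(failure_of(namespace["decode_name"], b"caf\xe9"))       # UnicodeDecodeError
```

Only the outright syntax error is rejected up front; the other two problems appear only when the affected code path runs, and one of them only for non-UTF8 input.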

Furthermore, it should be noted that minor CPython releases often contain multiple hard-to-catch incompatibilities that can rarely be caught through the means described in points 1 and 2.

Implications of adding an implementation to PYTHON_COMPAT

The following implications of adding a new implementation to PYTHON_COMPAT should be noted:

    • In python-r1 and python-single-r1 packages, new USE flags are added to allow selecting the implementation. Depending on the user's configuration and how the package manager is invoked, this may cause an unnecessary rebuild. A revision bump is unnecessary since --changed-use or --newuse will detect the change and cause a rebuild if necessary.
    • In python-any-r1 packages, the change is not directly visible to the user; however, the dependency graph is altered to allow the new implementation. A revision bump is generally considered unnecessary since the dependencies should be build-time only.
    • The PYTHON_USEDEP variable (if applicable) and the results of python_gen_*dep start including the new implementation, requiring it to be consistently present on all dependencies. pkgcheck should be used to check that the dependencies are still satisfied (i.e. that all dependencies support the new implementation).
    • In python-r1, the per-implementation code will be executed with the new implementation if it is enabled. Furthermore, the common code (python_setup, python_*_all()) may use it. Implementation restrictions (passed to python_setup and python_gen_cond_dep) may need to be revised.
    • In python-single-r1, the new implementation will be used if it is selected via PYTHON_SINGLE_TARGET.
    • In python-any-r1, the new implementation may be used if it happens to satisfy the dependencies and is preferred by the eclass logic. You should not rely on any particular implementation being used.
    • The package will expose support for the new implementation to its reverse dependencies, making it possible for them to depend on it. Once they do, removing the implementation will break the reverse dependencies. If you are in doubt whether the implementation should actually be added, think twice before other packages start relying on it.
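
The consistency requirement on dependencies can be pictured with a toy model. This is only a sketch: the package names, the `PACKAGES` table and the `missing_support` helper are hypothetical, and it does not reflect how pkgcheck or the eclasses actually perform the check.

```python
# Hypothetical data: package -> (supported implementations, dependencies).
PACKAGES = {
    "dev-python/foo": (
        {"python3_5", "python3_6"},
        ["dev-python/bar", "dev-python/baz"],
    ),
    "dev-python/bar": ({"python3_5", "python3_6"}, []),
    "dev-python/baz": ({"python3_5"}, []),
}


def missing_support(package, impl, packages=PACKAGES):
    """Return the sorted dependencies of package that do not list impl."""
    _, deps = packages[package]
    return sorted(dep for dep in deps if impl not in packages[dep][0])


# Adding python3_6 to foo is only consistent once baz supports it too.
print(missing_support("dev-python/foo", "python3_6"))  # ['dev-python/baz']
print(missing_support("dev-python/foo", "python3_5"))  # []
```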

Testing support for the implementation

All additions to PYTHON_COMPAT should be tested. As explained in detail above, the changes between different versions of CPython and different interpreters can be tricky, and a mistakenly added implementation can cause a lot of havoc once other packages start to depend on it.

While testing, you should ensure that all code is reliably tested with the newly added implementation. The best way to do this is to explicitly disable all other implementations, as this ensures that the common blocks of ebuild code are tested as well. For python-r1 and python-single-r1, this can be done using USE flags. For python-any-r1, this can be done by setting the EPYTHON variable to the desired implementation and ensuring that the dependencies will be satisfied. Alternatively, a single implementation can be forced using the special PYTHON_COMPAT_OVERRIDE variable:

root # PYTHON_COMPAT_OVERRIDE="python3_6" emerge -1v dev-python/foo

Using PYTHON_COMPAT_OVERRIDE causes the packages to ignore their value of PYTHON_COMPAT at build time, and use the one specified (split on whitespace). Please note that this affects all packages built in a batch (i.e. dependencies too, if any of them are being built). It does not enforce correct dependencies, and the eclass prints a big warning about it (if you don't see the warning, it did not work for some reason).

When testing installed scripts or modules, please make sure to use the correct interpreter. In the case of python-exec wrapped scripts, the EPYTHON variable can be used to override the implementation used, or the underlying script can be called directly:

user $ EPYTHON=python3.6 pyfrobnicate
user $ /usr/lib/python-exec/python3.6/pyfrobnicate
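
When these checks are driven from a Python test helper rather than typed by hand, the same two approaches might be sketched as follows. The helper names `epython_env` and `direct_script_path` are hypothetical; the only facts taken from the commands above are the EPYTHON variable and the /usr/lib/python-exec/<implementation>/<script> layout.

```python
import os

# Directory used by the direct invocation shown above.
PYTHON_EXEC_ROOT = "/usr/lib/python-exec"


def epython_env(impl, base_env=None):
    """Return a copy of the environment with EPYTHON set to impl."""
    env = dict(os.environ if base_env is None else base_env)
    env["EPYTHON"] = impl
    return env


def direct_script_path(impl, script):
    """Return the path of the per-implementation copy of a wrapped script."""
    return f"{PYTHON_EXEC_ROOT}/{impl}/{script}"


print(epython_env("python3.6", base_env={})["EPYTHON"])   # python3.6
print(direct_script_path("python3.6", "pyfrobnicate"))
```

The returned environment can then be passed as the `env` argument of `subprocess.run()` when invoking the wrapped script.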

The actual tests depend on the package in question, and may include:

  • Running the test suite if one is provided for the Python part of the package. It is preferable that the test suite passes; however, it is acceptable if the new implementation does not cause any new issues compared to the existing implementations. If the test suite is restricted for some reason (e.g. because it relies on network access or external resources), it is usually reasonable to run it manually outside the ebuild environment.
  • If the package installs Python modules that are not covered by tests, it is reasonable to at least attempt a few common API calls. If the package has reverse Python module dependencies, it is reasonable to test at least one of them as well.
  • If the package installs scripts that are not covered by tests, it is reasonable to attempt to use the scripts in the common way.
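
For modules without a test suite, a few common API calls can be exercised with a small smoke-test helper, run under the interpreter being added. The following is a minimal sketch: `smoke_test` and `CHECKS` are hypothetical names, and the standard library `json` module merely stands in for the package under test.

```python
import importlib
import sys


def smoke_test(module_name, calls):
    """Import module_name and run each (function, args, expected) check.

    Returns a list of human-readable failure descriptions.
    """
    failures = []
    module = importlib.import_module(module_name)
    for func_name, args, expected in calls:
        try:
            result = getattr(module, func_name)(*args)
        except Exception as exc:  # report the failure instead of aborting
            failures.append(f"{func_name}{args!r} raised {type(exc).__name__}")
            continue
        if result != expected:
            failures.append(f"{func_name}{args!r} returned {result!r}")
    return failures


# The stdlib json module stands in for the package under test.
CHECKS = [
    ("dumps", ({"a": 1},), '{"a": 1}'),
    ("loads", ("[1, 2, 3]",), [1, 2, 3]),
]

print(sys.version_info[:2])  # confirm which interpreter is actually running
print(smoke_test("json", CHECKS))  # [] when every call behaves as expected
```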

Adding the new implementation

If testing has shown that the new implementation can be added, you can add it to the PYTHON_COMPAT variable. Usually the new implementation is only added to the newest ebuild. As detailed above, a revision bump is usually unnecessary since all runtime dependency changes are conditional on the new USE flags; therefore, --changed-use or --newuse will catch the change if it is relevant to the user.

If the package in question is stable but the implementation is not, then the USE flag is in the use.stable.mask file already. You can therefore add it without worrying about adding dependencies on ~arch (unstable) packages. Furthermore, since the flag does not affect the users of stable, you do not have to worry about new features being added to a stable package (and you don't have to revbump it to ~arch).

If you are adding a stable implementation to a stable package, a revbump to ~arch might be desired. However, that decision is left to the developer's common sense.

When adding implementations to PYTHON_COMPAT for multiple packages, the gpy-upgrade-impl tool (with the --fix option) from app-portage/gpyutils may be useful.

Removing implementations from PYTHON_COMPAT

Dead implementation cleanup

Normally, old Python implementations are removed as a global change. It is therefore not necessary to manually clean up old implementations from packages. The Python team handles all the work related to that.

In this case, the implementation is marked as dead directly in the eclass. As a result, the listings of this implementation in PYTHON_COMPAT are ignored and all packages consistently stop using it. The PYTHON_COMPAT entries can be safely removed afterwards without risking breaking the dependency graph.

However, this is also unnecessary since the Python team will afterwards use the gpy-drop-dead-impls tool from app-portage/gpyutils to automatically remove them from PYTHON_COMPAT.

Removing unsupported implementations

If a new package version stops supporting a particular Python implementation, special care needs to be taken when removing it. Since the package exposed support for the implementation, its reverse dependencies may have relied on it. It is therefore necessary to handle the reverse dependencies first.

Usually, it is possible to just remove support for the old implementation from the reverse dependency as well, in which case it is important to also check its reverse dependencies recursively. The PYTHON_COMPAT updates should start from the final reverse dependencies and work back towards the dependency, so as to ensure that consistency is maintained throughout the commits (if multiple commits are used).
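
The update order described above amounts to a topological sort of the reverse dependency graph. A minimal sketch, assuming a hypothetical three-package chain and using `graphlib` from the Python standard library (available since Python 3.9):

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical graph: package -> packages that depend on it (reverse deps).
REVERSE_DEPS = {
    "dev-python/foo": {"dev-python/bar"},
    "dev-python/bar": {"app-misc/baz"},
    "app-misc/baz": set(),
}


def update_order(reverse_deps):
    """Order packages so each comes after all of its reverse dependencies."""
    # TopologicalSorter treats the mapped values as predecessors, so reverse
    # dependencies are emitted before the package they depend on.
    return list(TopologicalSorter(reverse_deps).static_order())


print(update_order(REVERSE_DEPS))
# ['app-misc/baz', 'dev-python/bar', 'dev-python/foo']
```

Here the implementation would be dropped from app-misc/baz first and from dev-python/foo last, so that no intermediate commit leaves a package requiring an implementation that its dependency no longer offers.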

In some cases, removing support for the old implementation from a reverse dependency is not deemed feasible, for example when its dependency on the package in question is only optional. In this case, the alternative might be either to mask (and eventually remove) the relevant USE flag unconditionally, or to restrict it to other Python implementations (using a reduced python_gen_usedep + REQUIRED_USE).

Please note that failing to update reverse dependencies will not result in an immediate error from pkgcheck. However, package managers will refuse to upgrade the dependency for users who have the reverse dependency installed with support for the old implementation. Explicit dependency graph errors will be reported only once all versions of the dependency that still support the old implementation have been removed.
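
Why the error shows up so late can be seen in a toy model. This is only a sketch with hypothetical version numbers and a hypothetical `satisfying_versions` helper; real dependency resolution is considerably more involved.

```python
# Hypothetical versions of a dependency and the implementations each supports.
FOO_VERSIONS = {
    "1.0": {"python2_7", "python3_6"},
    "2.0": {"python3_6"},  # support for python2_7 dropped
}


def satisfying_versions(versions, required_impl):
    """Return the sorted versions that still support required_impl."""
    return sorted(v for v, impls in versions.items() if required_impl in impls)


# A reverse dependency built for python2_7 keeps users on the old version...
print(satisfying_versions(FOO_VERSIONS, "python2_7"))  # ['1.0']

# ...and the graph only breaks outright once that version is removed.
remaining = {v: impls for v, impls in FOO_VERSIONS.items() if v != "1.0"}
print(satisfying_versions(remaining, "python2_7"))  # []
```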