Project:Infrastructure/Mirrors/Distfile Mirroring System

From Gentoo Wiki
This guide describes how our distfile process flow works, including how a new tarball gets added to the distfile mirrors.

Placing files on the Gentoo Mirror system

The mirror system automatically fetches any distfile that is referenced in the ebuild tree; developers don't have to do anything unless an error occurs. Mirror nodes use cron jobs to pull from the Master Private Dist Mirror, and the system is designed to propagate files to all nodes within 4 hours of their arrival on the Master. In practice, various issues can delay propagation to some nodes by as long as 24 hours. If you suspect that your file is not being fetched, check the failure report.

If your ebuild contains RESTRICT="mirror", the file will not be mirrored. The only exception is mirror://gentoo/. The mirror system handles this automatically; no manual intervention is required.
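As an illustration of the rule above, a minimal ebuild fragment might look like the following. The upstream URL is a placeholder; only the mirror restriction and the mirror://gentoo/ prefix come from the text.

```shell
# Hypothetical ebuild fragment; example.org is a placeholder upstream host.

# With RESTRICT="mirror" set, this upstream file is not picked up by the
# mirror system.
SRC_URI="https://example.org/releases/${P}.tar.gz"
RESTRICT="mirror"

# A mirror://gentoo/ URI is the stated exception to the restriction:
# SRC_URI="mirror://gentoo/${P}.tar.gz"
```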

Files referenced in the distfiles-whitelist are retained for as long as they are referenced in a properly formatted whitelist file. List there any file that you want retained on the mirror system, even if no ebuild refers to it (anymore). Keep in mind that the mirror system already retains a file for three weeks after it is last referenced in an ebuild, so use distfiles-whitelist only if absolutely necessary.

All entries in the whitelist MUST come with a comment in the same format as profiles/package.mask. If you wish to whitelist a lot of files, create a separate file in the same directory instead.
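The comment requirement can be checked mechanically. The sketch below assumes that whitelist entries are grouped into blocks separated by blank lines and that each block starts with at least one `#` comment line, as in profiles/package.mask. The function name `check_whitelist` is hypothetical and is not part of any Gentoo tooling.

```shell
# Hypothetical check: every blank-line-separated block in a whitelist file
# must begin with a comment line, e.g.
#
#   # Jane Dev <jane@example.org> (2024-01-01)
#   # Still needed by an out-of-tree consumer.
#   foo-1.0.tar.gz
#
check_whitelist() {
    # RS="" puts awk in paragraph mode; FS="\n" makes each line one field,
    # so $1 is the first line of the block.
    awk 'BEGIN { RS = ""; FS = "\n"; bad = 0 }
         $1 !~ /^#/ {
             printf "block %d lacks a leading comment: %s\n", NR, $1
             bad = 1
         }
         END { exit bad }' "$1"
}
```

The function exits non-zero and names the first line of each offending block, so it could be run before committing a whitelist change.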

Placing files in the distfiles-whitelist takes them out of the control of the Mirror System. If you remove a file's entry from the whitelist, the system automatically takes back control and cleans up the file as usual.

Automatic fetch failure

When the automatic fetch fails, it is the responsibility of the package maintainer to manually retrieve the file from the original location and place it in /space/distfiles-local. This directory is published over rsync; the private master distfile mirror connects to it and retrieves any files it contains, so they are synchronized to the private master distfile mirror. Files placed in distfiles-local are automatically removed after 7 days, at which point the Mirror System takes control of the file.
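To see which manually placed files are due for the 7-day cleanup, a listing along these lines may help. It is only a sketch: the helper name `list_expired` is hypothetical, and using the modification time as the age criterion is an assumption, since the actual cleanup rule is not documented here.

```shell
# Hypothetical helper: list regular files in a distfiles-local directory
# whose modification time is at least 7 days in the past.
list_expired() {
    local dir="$1"
    # find counts age in whole 24-hour periods; "+6" means "more than 6",
    # i.e. 7 or more full days.
    find "$dir" -maxdepth 1 -type f -mtime +6 -print
}
```

For example, `list_expired /space/distfiles-local` prints one path per line for each candidate.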

Files placed in distfiles-local will override existing files of the same name.

The mirror system only downloads the first instance of a file name. If a subsequent ebuild references the same file name, the checksums for the two URIs are compared; if they do not match, the second file is not fetched, the mirror system reports an error, and human intervention is required. Please check file names carefully.
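Before reusing a file name, you can compare the two candidate files yourself. The following is a minimal sketch, not the mirror system's own logic: the helper name `same_checksum` is hypothetical, and SHA-512 is chosen here only for illustration.

```shell
# Hypothetical helper: succeed (0) if both files have the same SHA-512
# digest, fail (1) if they differ, and return 2 if either is unreadable.
same_checksum() {
    [ -r "$1" ] && [ -r "$2" ] || return 2
    local a b
    # Reading from stdin makes sha512sum print "<digest>  -" for both files,
    # so the two outputs can be compared as whole strings.
    a=$(sha512sum < "$1")
    b=$(sha512sum < "$2")
    [ "$a" = "$b" ]
}
```

A typical use is `same_checksum old/foo-1.0.tar.gz new/foo-1.0.tar.gz || echo "name collision: rename one of the files"`.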

Common fetch errors:

  • URI port must be 80, 443, or 23
  • URI is malformed (mirrors:// is a common mistake, mirror:// is proper)
  • Mirror target isn't valid (doesn't specify a valid tier)
  • Checksum conflict with another ebuild in the tree - check your file name
  • Upstream host timeout while attempting to connect - Mirror System will reattempt at next pass
  • Upstream host isn't valid - check the host name in your URL
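The first two errors in the list can be caught locally before committing. The function below is a rough sketch under stated assumptions: `check_uri` is a hypothetical name, the accepted ports are copied from the list above, and URIs with user information or IPv6 literals are not handled.

```shell
# Hypothetical pre-check for a single SRC_URI entry.
check_uri() {
    local uri="$1" hostport port
    case "$uri" in
        mirrors://*) echo "malformed: use mirror:// instead of mirrors://"; return 1 ;;
        mirror://*)  echo "ok"; return 0 ;;   # mirror tiers are not validated here
        *://*)       ;;                       # ordinary URI, check the port below
        *)           echo "malformed: no scheme"; return 1 ;;
    esac
    # Isolate the host[:port] part between the scheme and the first slash.
    hostport=${uri#*://}
    hostport=${hostport%%/*}
    case "$hostport" in
        *:*) port=${hostport##*:} ;;
        *)   echo "ok"; return 0 ;;           # no explicit port given
    esac
    case "$port" in
        80|443|23) echo "ok" ;;
        *)         echo "bad port: $port"; return 1 ;;
    esac
}
```

For instance, `check_uri 'http://example.org:8080/foo.tar.gz'` prints `bad port: 8080` and returns a non-zero status.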

Technical Details

Master private distfile mirror

Every hour, a script that roughly contains this command is run:

/usr/bin/emirrordist \
        --distfiles=${DATADIR}/distfiles/ \
        --delete --jobs=10 --repo=gentoo \
        --deletion-delay=${DELAY} \
        --failure-log=${LOGDIR}/failure.log \
        --success-log=${LOGDIR}/success.log \
        --scheduled-deletion-log=${LOGDIR}/deletion.log \
        --deletion-db=${LOGDIR}/deletion-db.bdb \
        --distfiles-db=${LOGDIR}/distfile-db.bdb \
        --temp-dir=${DATADIR}/tmp/ \
        --whitelist-from=${DATADIR}/tmp/whitelist-master.txt \
        --distfiles-local=${DATADIR}/distfiles-local \