Replace distfiles.g.o and bouncer.g.o with a new service, redirector.g.o, that solves the problems of both old services, which overlap heavily.
- Replace distfiles.g.o with a CDN service
- This may happen in parallel in future, but very high bandwidth costs should be considered
- Replace with a caching proxy that serves distfiles from an LRU cache
- High bandwidth cost and hosting costs
- bouncer.g.o needs many things fixed:
- Bouncer is very obsolete and not maintained upstream
- Bouncer DID do a good job checking w/ HTTP HEAD requests that files existed before redirecting users to them.
- Bouncer itself has very high latency from some parts of the world.
- Prior Bouncer replacement attempts failed
- Mirrorbrain is tightly coupled with Apache and has specific requirements for mirrors to run checks locally
- Configuration for bouncer does not scale:
- Mirror configuration is manual & painful
- File configuration is painful
- distfiles.g.o needs many things fixed:
- How it works: distfiles.g.o is run as a DNS round-robin across a small subset of Gentoo mirrors that agreed to respond to HTTP vhost requests for Host: distfiles.gentoo.org.
- Despite the name, it serves distfiles, releases, experimental, snapshots
- Performance is terrible: very small set of mirrors involved
- HTTPS is becoming mandatory in browsers, breaking access to http://distfiles.gentoo.org
- Unless we provide & maintain SSL certificates for each participating mirror, the service will break in the near future. Mirrors have indicated an unwillingness to raise their maintenance cost.
A user should be able to use a single service that redirects to a local mirror for fetching distfiles, releases, and snapshots that exist on that mirror.
- Lightweight on-demand HTTP redirection
- Scale-to-zero required
- checking objects:
- validate existence of objects: most important
- validate size of objects: important
- validate content of objects: not important, other validation paths exist
- validate non-existence of old objects: least important, mirrors sometimes delete at a slower rate
- re-validate prior objects
- Able to efficiently check large numbers of objects on every mirror
- distfiles: ~73000 files, ~73000 symlinks-to-files, ~300 directories
- releases,experimental,snapshots: ~2400 files, 350 directories, ~70 symlinks-to-files, ~80 symlinks-to-directory
- Able to handle a large number of mirrors
- ~60 mirrors (excluding protocol differences)
- ~155 different mirror access points (~40 ftp, ~60 http, ~30 https, ~25 rsync)
TL;DR: Run a service at CDN-edge-like locations that generates HTTP temporary redirects to objects, with a backing service that checks existence.
- Populate storage w/ expected state of all objects
- Type (file/symlink**/directory)
- Optional: Checksums (**etag format for some mirrors, expensive to check)
- Retain knowledge of old objects, do not delete
- Probably runs on the master distfiles central node w/ emirrordist
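
A minimal sketch of how the expected-state store could be updated without ever deleting old objects. The record fields (`first_seen`, `last_seen`, `current`) and the function name `merge_expected` are hypothetical, and a plain dict stands in for whatever storage is actually chosen.

```python
from typing import Dict


def merge_expected(store: Dict[str, dict], listing: Dict[str, dict], now: float) -> Dict[str, dict]:
    """Merge the current master listing into the store, never deleting records.

    store:   path -> record kept across runs (hypothetical layout)
    listing: path -> {"type": ..., "size": ...} as seen on the master right now
    """
    for path, meta in listing.items():
        # New objects get a first_seen stamp; known objects keep theirs.
        record = store.setdefault(path, {"first_seen": now})
        record.update(meta)
        record["last_seen"] = now
        record["current"] = True
    for path, record in store.items():
        if path not in listing:
            # Object left the master: keep the record, mark it as old.
            record["current"] = False
    return store
```

Keeping `first_seen` makes it possible to prioritize new objects later, and the `current` flag separates live objects from retained old ones.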
- Concept: fetch metadata for every object and compare to expected state
- Run HTTP or rsync requests for every object.
- HTTP HEAD must be done for every object. HTTP Pipelining will have large benefits here.
- rsync -n: should be able to just run for the entire mirror in one pass and parse output.
- Must: Prioritize checking new objects on all mirrors
- Should: Run checkers regionally/close to each mirror, because checks are latency-bound
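
The checker concept above could look roughly like the sketch below for the HTTP case. The names `Expected`, `Observed`, `classify`, `check_order` and the result labels are assumptions made for illustration. `head_object` reuses one `http.client` connection (keep-alive), which is not true pipelining but avoids a new connection per object.

```python
import http.client
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Expected:
    path: str
    size: int
    first_seen: float  # when the object entered the expected state


@dataclass(frozen=True)
class Observed:
    status: int                    # HTTP status of the HEAD response
    content_length: Optional[int]  # None if the mirror sent no usable length


def head_object(conn: http.client.HTTPConnection, path: str) -> Observed:
    """Issue one HEAD request on an already-open connection."""
    conn.request("HEAD", path)
    resp = conn.getresponse()
    resp.read()  # drain so the connection can be reused
    length = resp.getheader("Content-Length")
    size = int(length) if length is not None and length.isdigit() else None
    return Observed(resp.status, size)


def classify(expected: Expected, observed: Observed) -> str:
    """Compare observed metadata with expected state: existence first, then size."""
    if observed.status == 404:
        return "missing"
    if observed.status != 200:
        return "error"
    if observed.content_length is None:
        return "present"  # exists, but size could not be validated
    if observed.content_length != expected.size:
        return "size_mismatch"
    return "ok"


def check_order(objects: List[Expected]) -> List[Expected]:
    """Newest objects first, so new files are checked on all mirrors early."""
    return sorted(objects, key=lambda e: e.first_seen, reverse=True)
```

The rsync path would produce the same `Observed`-style records by parsing one `rsync -n` run per mirror instead of one request per object.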
- Redirection Data Builder:
- For each region, build a redirection map, object -> nearest mirror(s)
- Rebuild maps on some regular cadence (hourly?)
- Store check data
- Store redirection maps
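
A possible shape for the redirection data builder, as a sketch only. It assumes each mirror is tagged with exactly one region and treats "nearest" as "same region"; the function name `build_maps` and the input layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_maps(
    mirror_region: Dict[str, str],
    results: Iterable[Tuple[str, str, bool]],
) -> Dict[str, Dict[str, List[str]]]:
    """Build region -> object path -> mirrors that passed the last check.

    mirror_region: mirror base URL -> region label (assumed one region per mirror)
    results:       (mirror base URL, object path, check passed) tuples
    """
    maps: Dict[str, Dict[str, List[str]]] = defaultdict(lambda: defaultdict(list))
    for mirror, path, ok in results:
        region = mirror_region.get(mirror)
        if ok and region is not None:
            maps[region][path].append(mirror)
    # Plain dicts with sorted mirror lists give stable, serializable output.
    return {
        region: {path: sorted(mirrors) for path, mirrors in paths.items()}
        for region, paths in maps.items()
    }
```

Running this on the stored check data at each rebuild interval would yield one map per region, ready to be shipped to the redirector for that region.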
- Run using AWS Lambda@Edge OR OpenFaaS at many points (prefer Lambda@Edge for better scaling & locality)
- Load redirection map for that region
- If object is known in map, pick one of the valid mirrors in that region to send traffic to
- Valid: object exists on that mirror && mirror online
- If object is not known in map, redirect to special fallback host
- Maybe: rank mirrors?
- Maybe: set headers for caching proxies to not cache the redirect?
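
The redirector logic above can be sketched as a plain function, independent of whether it ends up on Lambda@Edge or OpenFaaS. The fallback host name, the choice of status 302, and the `Cache-Control: no-store` header are assumptions; the notes only call for a temporary redirect and leave the caching header as an open question.

```python
import random
from typing import Dict, List, Set, Tuple

# Hypothetical fallback host for objects missing from the map.
FALLBACK = "https://fallback.example.org"


def redirect(
    path: str,
    region_map: Dict[str, List[str]],
    online: Set[str],
    rng=random,
) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a temporary redirect to a valid mirror.

    A mirror is valid if the map lists it for the object and it is online.
    """
    candidates = [m for m in region_map.get(path, []) if m in online]
    base = rng.choice(candidates) if candidates else FALLBACK
    headers = {
        "Location": base.rstrip("/") + "/" + path.lstrip("/"),
        # Assumption: ask intermediaries not to store the redirect.
        "Cache-Control": "no-store",
    }
    return 302, headers
```

Random choice spreads load evenly across valid mirrors; ranking mirrors would replace `rng.choice` with a weighted selection.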