Upstream repository shutdowns/Google code
A wrong URL in an ebuild is a bug. It is necessary to prevent a flood of manually written tickets in Bugzilla. We do not have the manpower and the tools to manage 500 new, manually written bugs plus duplicates efficiently. Hence it is important that the ebuilds are fixed soon.
"These tarballs will be available throughout the rest of 2016."
List of packages
Use the md5 cache:
cd /var/db/repos/gentoo/metadata/md5-cache; grep "URI=" -R | grep "googlecode\.com"
Output (googlecode-shutdown.txt) of the command as run on 2016-11-05.
Who maintains how many broken packages and what are their names?
( curl https://dev.gentoo.org/~jstein/googlecode-shutdown.txt | cut -f1 -d":" | while IFS="" read -r arg; do echo -n "$arg: " ; equery meta -mH $arg 2>/dev/null | tr "\n" " "; echo ; done ) | tee /tmp/package_owners.txt; sed 's/:\s*$/: maintainer-needed/;s/\@/<at>/g' < /tmp/package_owners.txt > /tmp/owners_obfu.txt
cut -d" " -f 2- < /tmp/owners_obfu.txt | tr " " "\n" | grep "^\w" | sort | uniq -c | sort -n -k 1 > /tmp/owner_histogram.txt
sort -k 2 < /tmp/owners_obfu.txt > /tmp/packages_by_owner.txt
- Who maintains how many broken packages? owner_histogram.txt (NumberOfPackages Contact)
- Who maintains which broken package? packages_by_owner.txt (PackageName Contact)
- Updated list online: http://gentoo.levelnine.at/wwwtest/sort-by-filter/code.google.com.txt
This script only checks the SRC_URI fields and nothing else.
Download the script:
chmod +x ./src_uri_match.py
To list all the ebuilds with maintainers:
To list all the packages for a single maintainer:
./src_uri_match.py -n -m email@example.com
- First step: just 'clone' all SRC_URI to a random web server, and fix the current one to point at it.
- Second step: figure out if the project has moved or has been frozen.
Q: Are there sources on googlecode which we must not mirror?
cd /var/db/repos/gentoo/metadata/md5-cache; grep "URI=.*googlecode\.com" -R -l | xargs grep "RESTRICT"
Returns all RESTRICT variables of the googlecode ebuilds. They do not use restrictions which would forbid mirroring in general.
In the case of BerliOS it took about one year to fix 50% of the ebuilds. Perhaps it is interesting to keep some statistics.
Command to collect date and number of packages with googlecode in the *URI variable:
printf "%s,%s\n" "$(date --iso)" "$(grep "URI=.*googlecode\.com.*" -R "$(portageq get_repo_path / gentoo)"/metadata/md5-cache | wc -l)"
- DONE 2015-03-22 bug #544092 create tracker bug on bugzilla (jonasstein)
- DONE 2016-10-28 Find out which packages need to be fixed exactly. Need bash scripts. (jonasstein)
- DONE 2016-10-28 Create statistics on the progress. (dilfridge,jonasstein)
- DONE 2016-11-05 send mail to gentoo-dev (jonasstein)
- DONE 2016-11-05 create script to generate maintainer:package lists (kentnl)
- DONE 2016-11-12 create the statistics on some historical dates for the plot. (dilfridge)
- DONE 2016-11-24 send the mail to maintainers of remaining broken ebuilds (jonasstein)
- DONE 2016-12-20 find out, which packages do not allow a mirror in the license: none (jonasstein)
- TODO after 2016-12-01 create bug tickets for remaining broken ebuilds
- DONE verify, that all ebuilds are mirrored. Mirror all source files (tar balls...), add link here, add link to log which packages failed to mirror.
- TODO find out how to proceed with mirrors if upstream is dead. Do we have a policy on that?
- TODO a google code repository makes repoman really sad, so repoman should tell the user about its feelings ;-) bug #601476