Project:Infrastructure/Git migration

Resources

 * rich0's validation code: https://github.com/rich0/gitvalidate
 * ferringb's generation code: git://pkgcore.org/git-conversion-tools

People
This list is in roughly chronological order; apologies to anybody who was left out.


 * Alec Warner (antarus) - did the GSoC 2006 migration tests
 * Robin H. Johnson (robbat2) - infra guy, herding this project
 * Nguyen Thai Ngoc Duy (pclouds) - Former Gentoo developer, wrote Git features for the migration
 * Michael Haggerty - upstream cvs2svn author
 * Brian Harring (ferringb) - wrote much Python to improve cvs2svn
 * Michael G. Schwern - Perl hacker, fixed git-svn for SVN 1.7 support
 * Rich Freeman (rich0) - validation scripts
 * Patrick Lauer (patrick) - Gentoo dev, running the new 2014 migration work

Goals

 * Each Git commit should be mapped to one or more CVS commits
 * Portage two-phase commits (commit 1: ebuilds/files/Manifest, commit 2: Manifest regenerated from $Header$ changes, optionally GPG-signed) should be mapped to a single commit
 * Portage trailer data in the CVS commit log should be converted to newline-separated format in the Git log
 * As the validation settles, it should become possible to have CVS commits generate known Git commit IDs
 * Start a list of validated commit IDs
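
The two-phase commit goal above could be approached roughly as follows. This is only a sketch, not the conversion tools' actual logic: the `Commit` structure, the `max_gap` window, and the matching heuristic (same author, Manifest-only follow-up in the same package directory) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    """Simplified CVS changeset (hypothetical structure, for illustration only)."""
    author: str
    timestamp: int  # seconds since the epoch
    message: str
    paths: list = field(default_factory=list)


def package_dir(path):
    """Directory part of a repository-relative path ('' for top-level files)."""
    return path.rsplit("/", 1)[0] if "/" in path else ""


def is_manifest_only(commit):
    """True if the commit touches at least one file and only Manifest files."""
    return bool(commit.paths) and all(
        p.rsplit("/", 1)[-1] == "Manifest" for p in commit.paths
    )


def squash_two_phase(commits, max_gap=300):
    """Fold a Manifest-only follow-up commit into the commit just before it.

    Assumed heuristic: same author, at most max_gap seconds apart, and the
    follow-up only touches directories the first commit already touched.
    """
    result = []
    for c in commits:
        prev = result[-1] if result else None
        if (
            prev is not None
            and is_manifest_only(c)
            and c.author == prev.author
            and 0 <= c.timestamp - prev.timestamp <= max_gap
            and {package_dir(p) for p in c.paths}
            <= {package_dir(p) for p in prev.paths}
        ):
            # Keep the first commit's metadata; add any paths it did not list.
            extra = [p for p in c.paths if p not in prev.paths]
            result[-1] = Commit(
                prev.author, prev.timestamp, prev.message, prev.paths + extra
            )
        else:
            result.append(c)
    return result
```

A real conversion would also have to decide which commit message and timestamp to keep and how to treat the GPG-signed Manifest; the sketch simply keeps the first commit's metadata.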

Pseudocode
do {
    do {
        adjust conversion scripts
        do test conversion
        validate all newly converted commits
    } while (not validation passed on all commits)
    switch CVS to read only
    do final conversion
    final validation
    if (final validation passed) {
        activate Git repo for public commits
        lock CVS permanently
    } else {
        unlock CVS
    }
} while (still using CVS)
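
The same control flow can be written as a small Python skeleton. Every step is passed in as a callable, and all of the parameter names are hypothetical; the `max_rounds` cap is an added safety limit that is not part of the plan above.

```python
def migrate(adjust_scripts, test_conversion, validate, set_cvs_read_only,
            final_conversion, activate_git, max_rounds=10):
    """Run the convert/validate loop; every step is a caller-supplied callable."""
    for _ in range(max_rounds):          # outer loop: "while still using CVS"
        while True:                      # inner loop: repeat until validation passes
            adjust_scripts()
            commits = test_conversion()
            if validate(commits):
                break
        set_cvs_read_only(True)          # freeze CVS for the final conversion
        final = final_conversion()
        if validate(final):
            activate_git()               # CVS stays locked permanently
            return True
        set_cvs_read_only(False)         # unlock CVS and go around again
    return False
```

Passing the steps in as callables keeps the skeleton testable with stubs before any real conversion script exists.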

Validation
Quick notes on how to test:

 * Source for the validation scripts: https://github.com/rich0/gitvalidate.git
 * Clone the git bundle into a directory
 * Extract the cvs root into a directory
 * Check out the cvs gentoo-x86 module into another directory
 * Use git log to obtain the hash of the last git commit
 * Point TMPDIR at a location with ~10GB of space (/tmp on tmpfs may not cut it and sort will fail)
 * Run gitdump/gitprocesstree.sh > g
 * Run cvsdump/cvsprocesstree.sh  . > c
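
Two parts of these notes lend themselves to a small helper: checking that the temporary directory really has about 10GB free before the dump scripts run, and comparing the resulting g and c files. The notes do not describe the dump format or the comparison step, so the sketch below assumes one record per line and a plain set difference; both function names are hypothetical.

```python
import shutil
import tempfile


def tmpdir_has_space(min_bytes=10 * 1024**3, path=None):
    """Return True if the temporary directory has at least min_bytes free.

    tempfile.gettempdir() honours the TMPDIR environment variable.
    """
    path = path or tempfile.gettempdir()
    return shutil.disk_usage(path).free >= min_bytes


def diff_dumps(git_dump, cvs_dump):
    """Return (only_in_git, only_in_cvs) for two line-oriented dump files.

    Assumes one record per line; both files are loaded into memory.
    """
    with open(git_dump) as g, open(cvs_dump) as c:
        git_lines = {line.rstrip("\n") for line in g}
        cvs_lines = {line.rstrip("\n") for line in c}
    return sorted(git_lines - cvs_lines), sorted(cvs_lines - git_lines)
```

For example, `diff_dumps("g", "c")` returning two empty lists would mean that both dumps contain exactly the same records. For dumps too large to hold in memory, a streaming comparison of sorted files would be the better choice.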

2006

 * The first major work on the VCS migration was done as a GSoC 2006 project (Project:Infrastructure/Git_migration/GSoC2006) by User:Antarus.
 * At this point Git was mostly too resource-intensive for serious consideration, and was slower than CVS.
 * Conversion took more than 7 days.
 * Decision to stay on CVS

2009

 * April:
   * Converting a recent CVS copy - Item 1: mailmap fun
   * Converting a recent CVS copy - Item 2: statistics
   * Conversion time: 18.5 hours
 * June:
   * Progress summary, 2009/06/01
   * Conversion time: 9 hours
   * Bug in cvs2svn/cvs2git causes lines of files to be lost
   * ExternalBlobGenerator module created by upstream author, originally closed source, and non-public: improves pass1 from 36204 seconds to 1598 seconds


 * October: Gentoo meeting at the GSoC Mentor Summit
   * All Gentoo developers present held a meeting; one of the major topics was blockers and plans for the Git migration.
   * Shawn Pearce, one of the major Git developers, and author of the Repo tool.
   * Decision between a monolithic repo, per-category repos, and per-package repos: the monolithic repo wins.

2010

 * User:ferringb takes on Python improvements with snakeoil and Unladen Swallow
 * Gentoo SCM conversion status report, 2010/01/27
 * Conversion time: 110 minutes
 * Commit Signing & Sparse Trees identified as requirements

2011

 * August:
   * Re: gentoo-dev Progress on cvs->git migration (status report)
   * Unresolved items: commit signing, thin Manifests, merge policies
 * September:
   * Portage gets thin Manifest support
 * October:
   * commit: teach --gpg-sign option

2012

 * May-July:
   * Bug #418431: (git-svn is broken with SVN 1.7 and can corrupt data) causes a hassle for Git work (part of the migration process at this time relies heavily on the cvs2svn codebase)
 * October:
   * Email [gentoo-scm] Fwd: [gentoo-dev] CIA replacement on 2012/10/01 by rich0.
   * Bug #333531: portage migration to git (tracker bug)
   * Outstanding items: pre-upload hook, git2rsync scripts, validation, documentation
   * Email [gentoo-scm] CVS -> git, list of where non-infra folk can contribute on 2012/10/01 by ferringb
   * Lays out the many tasks well
   * http://git.stuge.se/?p=portage.git;a=commitdiff;h=thickandthin mentioned for merging, still not done?