Project:Infrastructure/Git migration/GSoC2006
This page was created to track progress regarding the migration of Gentoo's CVS repositories to another version control system. This is being done in conjunction with a Google Summer of Code project.
CVS Migration
Reasons for Migration
- CVS does not support branching in a sane manner.
- CVS commits are not atomic.
- CVS has a lot of overhead when working with a remote checkout.
Milestones for success
Task | Status | Due Date | Date of Completion | Notes |
---|---|---|---|---|
Select VCS systems for testing. | We have selected Mercurial, Subversion, GIT, and our control, CVS. | May 22nd, 2006 | May 22nd, 2006 | Other VCS systems may be added provided a repository snapshot is provided by June 19th 2006. |
Convert gentoo-x86 to each VCS System. | Completed git, svn, cvs | June 5th, 2006 | July 14th, 2006 | Completed a migration to svn and git, total time ~30 days! |
Design a set of stress tests and analysis tools for Version Control Systems. | Skipped, see notes | June 12th, 2006 | Completed June 19th, 2006 | I was hoping to have a design prior to starting, but work commitments forced me to accelerate my project a bit. This phase was more me scratching things out on a napkin ;) |
Implement a set of stress tests and analysis tools for Version Control Systems. | Dropped | June 19th, 2006 | July 14th | I basically broke down and used dstat. I had spent about 12 hours on the code for this tool when I asked myself why I should spend more time when a superior tool already exists, so I decided to use the better tool instead of writing a replacement. |
Run the stress tests on each VCS system in order to generate a useful data set. | July 26th, 2006 | August 3rd, 2006 | Completed August 15th | This was completed around August 3rd |
Analyze the data and present this to the Gentoo Community. Attempt to have the community choose a VCS system. In the event that the community takes too long in determining their future VCS system, discuss with Lance and pick a VCS to continue the project with. | Started | Aug 10th, 2006 | September 4th | The code was ported to all three systems instead of choosing just one. |
Compose a Migration Plan to migrate to the new VCS System. | Started August 21st | August 30th | Sept 4th | GLEP XX is currently in the submittal process. |
Update and author developer documentation related to the VCS system. This will include updating any tools that are focussed on VCS systems such as echangelog, repoman, and the cvs->rsync scripts. | Start July 14th | Aug 8th, 2006 | Pending Completion | Repoman and echangelog have been released but need thorough testing. Please see here |
Set up a test environment and give developers a chance to use the new VCS when it is not live. This is also a chance to make sure all tools work properly. | Not yet started | Sept 1st, 2006 | Pending Completion | GLEP XX must be approved before this can start. |
Test in the testing environment for up to one month, ensure sufficient hardware requirements and also ensure that real world data matches data collected during the data mining. | Not yet started | Oct 1st, 2006 | Pending Completion | GLEP XX must be approved before this can start |
Set up the live system and migrate to it. | Not yet started | Nov 1st, 2006 | Pending Completion | GLEP XX must be approved before this can start. |
Version Control Systems under consideration
System | Pros | Cons | Migration | Full Checkout | Space Considerations | Bandwidth Usage | Memory Usage | Others? |
---|---|---|---|---|---|---|---|---|
Subversion | Atomic Commits, Merging, Tagging, Branching is a copy operation, Versioned Metadata, Directory Versioning, Annotation | Twice the disk space | Migration Complete (cvs2svn) (conversion stats) | 17 minutes, 3 seconds. | Server Usage (7.3gb) Client Usage (2.8gb) | 21.8mb/s | 20mb per checkout | server stats, client stats |
GIT | Annotation. Two interchangeable on-disk formats are used: an efficient packed format that saves space and network bandwidth, and an unpacked format optimized for fast writes and incremental work. Merging, tagging, branching, very fine-grained control. | Being a distributed VCS means it may be difficult for us to use; has high server spec requirements. | Migration complete, minus the Authors file. | Note: A smart clone is a checkout over a smart protocol, one that generates the packs for you; this generally pains the server (lots of RAM and CPU usage), but the cloned repo will be all ready for you to use. A dumb clone is one over http or rsync, where the server just transfers files and the client does all the work to prep the repository. | Packed (1.1gb), Unpacked (1.6gb) | 1.72mb/s (smart), 1.2mb/s (dumb) | | Smart clone Stats, Dumb clone Stats (Server), Dumb clone Stats (Client) |
CVS (Stats on the current usage) | Already converted, status quo, no migration, no training, does what we need 90% of the time. | Sucks at branching, merging branches back in. | Migration Unnecessary | 8 minutes, 54 seconds. | Server (1.6gb) Checkout (~880Mb) | 13.18mb/s | 15mb per checkout | server stats, client stats |
This page is based on a document formerly found on our main website gentoo.org.
The following people contributed to the original document: Alec Warner
They are listed here because wiki history does not allow for any external attribution. If you edit the wiki article, please do not add yourself here; your contributions are recorded on each article's associated history page.