Gentoo git workflow

This article outlines some rules and best practices regarding the typical workflow for ebuild developers using git.

Commit policy

 * Atomic commits (one logical change).
 * All commits must be repoman-valid (exceptions only on right-hand branches).
 * Commits may span across multiple ebuilds/directories if it's one logical change.
 * Every commit on the left-most line of the history (that is, all the commits following the first parent of each commit) must be GPG signed by a Gentoo developer.
 * repoman must be run from all related ebuild directories (or related category directories or top-level directory) on the tip of the local master branch (as in: right before you push and also after resolving push-conflicts).

Atomicity

 * Commits in git are cheap and local, so use them often.
 * Split version bumps and ebuild cleanup into separate commits (makes reverting ebuild removals much easier).
 * One may break atomicity in order to make a commit repoman-valid (or you should reconsider the order you commit in, e.g. license first, then the new ebuild).

Commit message format

 * All lines max 75 characters.
 * If CATEGORY/PN is so long that the 75-character limit cannot be met, exceeding the limit is acceptable.
 * First line brief explanation.
 * Second line always empty.
 * Optional detailed multiline explanation must start at the third line.
 * Commits that primarily affect a particular subsystem should prepend the matching prefix to the first line of the commit message:
 * Single package -> CATEGORY/PN:
 * Profile directory ->
 * Eclass directory ->
 * Licenses directory ->
 * Metadata directory ->
 * A whole category ->
 * If the change affects multiple directories, but is mostly related to a particular subsystem, then prepend the subsystem which best reflects the intention (e.g. you add a new license, but also modify profiles/license_groups).
 * It is also encouraged to use tags such as Acked-by: and Reported-by: (both appear in the example that follows). Please review the kernel patch guideline for additional tag variants.

Example

app-misc/foo: version bump to 0.5

Bump to 0.5. This also fixes security bug 93829 and introduces the new USE flag 'bar'.

Acked-by: Hans Wurst
Reported-by: Alice Wonderland
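
The layout rules above lend themselves to a mechanical check. The following is a minimal sketch, not an official Gentoo tool: check_commit_msg is a hypothetical helper name, and it only checks the 75-character limit and the empty second line. It does not know about the exception for very long CATEGORY/PN prefixes.

```shell
# check_commit_msg FILE: hypothetical helper, not part of any Gentoo tool.
# Returns 0 if the message stored in FILE follows the layout rules,
# otherwise prints each problem found and returns 1.
check_commit_msg() {
    awk '
        length($0) > 75     { printf "line %d: longer than 75 characters\n", NR; bad = 1 }
        NR == 1 && $0 == "" { print "line 1: summary is empty"; bad = 1 }
        NR == 2 && $0 != "" { print "line 2: must be empty"; bad = 1 }
        END                 { exit bad + 0 }
    ' "$1"
}
```

Git passes the path of the message file as the first argument to a commit-msg hook, so a function like this could be called from .git/hooks/commit-msg.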

Branching model

 * The primary production-ready branch is master (users will pull from here); only fast-forward pushes are allowed.
 * There may be developer-specific, task-specific, project-specific branches, etc.

Branch naming convention

 * Developer branches:
 * Project branches:
 * If in doubt, or if the branch could be useful to others, discuss the naming on-list beforehand.

About rebasing

 * Primary use case: in case of a non-fast-forward push conflict to remote master, try rebasing your local commits onto the updated remote master first; if that yields complicated conflicts, abort the rebase and continue with a regular merge (if the conflicts are trivial or even expected, e.g. arch teams keywording/stabilizing stuff, then stick to the rebase).
 * To preserve merges during a rebase use --preserve-merges (if appropriate, e.g. for user branches); newer Git releases replace this option with --rebase-merges.
 * Don't use --preserve-merges if you do an interactive rebase (see BUGS in git-rebase manpage).
 * Commits that are not on the remote master branch yet may be rewritten, squashed, split, etc. via interactive rebase; however, the rebase must never span beyond those commits.
 * Never rebase on already pushed commits.
 * There are no particular rules for rebasing on non-master remote branches, but be aware that others might base their work on them.
 * There are no particular rules for rebasing non-remote branches, as long as they don't clutter the history when merged back into master.
 * Don't do complicated rebases to avoid a merge commit at all cost.
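
The "rebase first, merge if that gets complicated" rule can be sketched as a small shell function. This is only an illustration under stated assumptions: sync_with_master is a hypothetical name, the remote is assumed to be called origin and the branch master, and the function gives up on the rebase at the first conflict instead of judging whether the conflict is trivial.

```shell
# sync_with_master: hypothetical helper, not an official Gentoo script.
# Assumes the remote is called "origin" and the branch is "master".
sync_with_master() {
    git fetch origin || return 1
    # First choice: replay the unpushed local commits on top of remote master.
    if git rebase origin/master; then
        return 0
    fi
    # The rebase stopped on conflicts: undo it and do a regular merge instead.
    git rebase --abort
    git merge --no-edit origin/master
}
```

If the merge also hits conflicts, the function returns non-zero and leaves them in the working tree to be resolved by hand.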

About merging

 * Never commit the implicit merges created by git pull. You may want to set pull.ff to only so that git refuses to create them.
 * If a rebase fails or is too complicated, do a regular merge instead.
 * Do a merge if the information is useful (e.g. pulled from a foreign remote user branch or merged a non-trivial eclass conversion back into master) and force a merge commit (non-fast-forward merge via git merge --no-ff).
 * To avoid a merge commit when merging local branches back to master (e.g. the information is not useful), you may try to force a fast-forward merge by first rebasing the local branch against master and then merging it into master.
 * Extend merge commit messages with useful information, e.g. how conflicts were solved.
 * Keep in mind that all commits of the first parent of the history must be GPG signed by a Gentoo developer, so you may want to force merge commits especially for user branches.

Remote model
We have a main developer repository where developers work and commit (every Gentoo ebuild developer has direct push access). For every push into the repository, automated jobs merge the changes into the user sync repository and update the metadata cache there.

The user sync repository is for power users who want to fetch via git. It's quite fast and efficient for frequent updates, and also saves space by being free of the ChangeLog files.

On top of the user sync repository, rsync is propagated. The rsync tree is populated with all old ChangeLogs copied from CVS (stored in a 30 MB git repo); new ChangeLogs are generated from git logs and Manifests are expanded.

Best practices

 * Before starting work on your local master, it's good to pull the latest changeset from the remote master.
 * It might be a good idea for projects/developers to accumulate changes either in their own branch or a separate repository and only push to remote master in intervals (that decreases the push rate and potential conflicts).

Cloning
Clone the repository. This will make a shallow clone and speed up clone time:

Using the git+ssh:// protocol of course requires the user's SSH public key on the server. If you want the full history, just omit the --depth option from the above command.
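
As a rough illustration of the difference between a shallow and a full clone, here is a small wrapper around git clone. The function name shallow_clone and the default depth of 50 are assumptions made for this sketch; the repository URL is left for the caller to supply.

```shell
# shallow_clone URL DIR [DEPTH]: hypothetical wrapper around git clone.
# DEPTH defaults to 50, an arbitrary value chosen for illustration.
shallow_clone() {
    # --depth limits the history that is downloaded to the newest DEPTH commits.
    git clone --depth "${3:-50}" "$1" "$2"
}
```

A shallow clone can be turned into a full one later with git fetch --unshallow.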

If you're going to make contributions via pull-requests, fork the repository (e.g. via GitHub) and then add it as a remote to your local git clone:

Configuration
All developers should at least have the following configuration settings in their local developer repository. These settings are written to the repository's .git/config file and can also be edited there manually. Run these from within the repository you've just cloned in the step above:

GPG Configuration
To get your GPG key run this command. It should be the top line (starting with pub). If you have more than one key with the UID you will need to select the correct key yourself (from the list of returned keys).

In addition, you need to tell git which key to use for committing:
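
A possible way to apply identity and signing settings in one go is sketched below. The function name configure_dev_repo and its arguments are hypothetical, and the selection of settings is an assumption: they are standard git options that match the policies described in this article (signed commits and pushes, no implicit merge commits from git pull), not a verbatim copy of an official list.

```shell
# configure_dev_repo REPO NAME EMAIL KEYID: hypothetical helper.
configure_dev_repo() {
    git -C "$1" config --local user.name "$2"
    git -C "$1" config --local user.email "$3"
    # Key used for signing, e.g. the long key ID printed by
    #   gpg --list-public-keys --keyid-format long
    git -C "$1" config --local user.signingkey "$4"
    # Sign every commit and every push without passing -S / --signed each time.
    git -C "$1" config --local commit.gpgsign true
    git -C "$1" config --local push.gpgsign true
    # Refuse pulls that would create an implicit merge commit.
    git -C "$1" config --local pull.ff only
}
```

The values end up in REPO/.git/config and can be inspected with git config --local --list.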

Workflow walkthrough
These are just examples and people may choose different local workflows (especially in terms of when to stage/commit) as long as the end result works and is repoman-checked. These examples try to be very safe, but basic.

Before doing anything, make sure your git tree is in a correct state and you are in the branch you actually want to work on.

Common ebuild work

 * 1) Pull in the latest changeset, before starting your work:
 * 2) Do the work (including removing files)
 * 3) Make sure you are in the ebuild directory
 * 4) Create the manifest:
 * 5) Stage files (including removed ones), if any:
 * 6) Check for errors:
 * 7) If errors occur, fix them and continue from point 4
 * 8) Commit the files and enter a meaningful commit message
 * 9) Push to the dev repository:
 * 10) If updates were rejected because of a non-fast-forward push, first pull with a rebase, then re-run the repoman checks and continue from point 8.
 * 11) If the rebase fails, but the conflicts are trivial and don't contain useful information (such as keyword stabilization), fix the conflicts, finish the rebase, and continue from point 4
 * 12) If the rebase fails and the conflicts are complicated or you think the information is useful, continue with a regular merge:
 * 13) If merge conflicts occur, fix them and continue from point 4
 * 14) If no merge conflicts occur, re-run the repoman checks and continue from point 8 above.
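
Steps 4 to 9 of the list above can be condensed into one function. This is a hedged sketch rather than the canonical command sequence: ebuild_commit_cycle is a hypothetical name, the remote and branch are assumed to be origin and master, and a plain git commit is used on the assumption that commit signing is already enabled in the repository configuration.

```shell
# ebuild_commit_cycle: hypothetical helper; run it from the ebuild directory
# once the work is done. Assumes remote "origin" and branch "master".
ebuild_commit_cycle() {
    repoman manifest || return 1      # step 4: create the manifest
    git add -A . || return 1          # step 5: stage new, changed and removed files
    repoman full || return 1          # step 6: check for errors
    git commit || return 1            # step 8: opens the editor for the message
    git push --signed origin master   # step 9: push to the dev repository
}
```

Each step aborts the sequence on failure, which corresponds to "fix the errors and continue from point 4" in the list.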

Using pram
pram is a tool to help merge pull requests easily.
 * 1) Open the pull request assigned for you, or whatever you want to work on. We use https://github.com/gentoo/gentoo/pull/12345 as an example.
 * 2) Make sure that everything is done correctly and that every commit has proper sign-off included.
 * 3) Apply and automatically close the pull request to ::gentoo repo via

git am method

 * 1) Find a pull request you wish to merge, https://github.com/gentoo/gentoo/pull/1 as an example.
 * 2) Ensure your checkout is up to date:
 * 3) Fetch and apply chosen commit:
 * 4) You can review the changes with:
 * 5) Make sure chosen commit didn't break things
 * 6) Push your changes

git cherry-pick method

 * 1) Identify the remote URL and the branch to be merged.
 * 2) Add a new remote:
 * 3) Fetch the changes:
 * 4) You may review the changes manually first:
 * 5) Checkout the remote branch:  (you are now in detached HEAD mode)
 * 6) Test the ebuilds and run repoman:
 * 7) If everything is fine, switch back to master:
 * 8) Ensure your master branch is current:
 * 9) If it's just one commit, do:
 * 10) If there are multiple commits in one consistent range, do:
 * 11) Push your changes
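
Steps 9 and 10 differ only in whether one commit or a range is picked. A minimal sketch, with pick_commits as a hypothetical helper name; it relies on commit signing being configured in the repository rather than passing a signing option itself.

```shell
# pick_commits FIRST [LAST]: hypothetical helper.
# With one argument it picks a single commit, with two it picks the
# whole range from FIRST to LAST (both included).
pick_commits() {
    if [ -n "${2:-}" ]; then
        # FIRST^..LAST means: everything after the parent of FIRST up to LAST.
        git cherry-pick "$1^..$2"
    else
        git cherry-pick "$1"
    fi
}
```

The range form only works for one consistent, linear range; commits scattered across a branch have to be picked one by one.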

Working with git
Staging files in particular can be tedious on the CLI, so you may want to use the graphical clients gitk (for browsing history) and git gui (for staging/unstaging changes etc.). You have to enable the tk USE flag for dev-vcs/git.

Diffs for revision bumps
Suppose you are making a new revision of a package: you remove the old ebuild and replace it with the new revision. In that case git, unless rename detection is active, sees two changes: removal of the old file and addition of the new one. If you then view your changes, you'll get a useless diff containing the entirety of both files.

Git can be coerced into showing you the diff between the old and new revisions even though they live in separate files. The following command-line options specify how hard git should try to find files that have been copied or renamed:

In the common case, searching for renames is enough. You can make that behavior the default easily:
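
The effect of rename detection can be tried out in a throwaway repository. The sketch below uses made-up file names (foo-1.0.ebuild, foo-1.0-r1.ebuild) and a hypothetical function name; it stages a revision bump and prints the staged changes once with rename detection switched off and once with -M.

```shell
# demo_revbump_diff: hypothetical demo in a throwaway repository.
demo_revbump_diff() {
    repo=$(mktemp -d) || return 1
    git -C "$repo" init -q
    git -C "$repo" config user.name "Example Dev"
    git -C "$repo" config user.email "dev@example.org"
    git -C "$repo" config commit.gpgsign false
    # Stand-in for an ebuild; names and contents are made up for the demo.
    printf 'EAPI=8\nDESCRIPTION="demo"\nSLOT="0"\nKEYWORDS="~amd64"\nIUSE=""\n' \
        > "$repo/foo-1.0.ebuild"
    git -C "$repo" add foo-1.0.ebuild
    git -C "$repo" commit -q -m "app-misc/foo: add 1.0"
    # Revision bump: rename the file and change one line.
    git -C "$repo" mv foo-1.0.ebuild foo-1.0-r1.ebuild
    sed 's/IUSE=""/IUSE="bar"/' "$repo/foo-1.0-r1.ebuild" > "$repo/new" &&
        mv "$repo/new" "$repo/foo-1.0-r1.ebuild"
    git -C "$repo" add foo-1.0-r1.ebuild
    echo "without rename detection:"
    git -C "$repo" diff --cached --no-renames --name-status
    echo "with -M:"
    git -C "$repo" diff --cached -M --name-status
    rm -rf "$repo"
}
```

Without rename detection the bump shows up as one added and one deleted file; with -M it is reported as a single rename with a similarity score. The persistent equivalent of -M is the diff.renames configuration option.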

Using the gentoo git checkout as your local tree
If you want to use your git development checkout as your local tree, you have to do two things:
 * 1) make sure the directory has the correct access rights
 * 2) generate/get metadata-cache, dtd, glsa, herds.xml and news yourself

For the latter, there are example hooks for portage and paludis. You should probably skip the respective files, so that portage/paludis doesn't mess up your checkout (as in: disable auto-sync).

GitHub pull requests made easy
Change into your gentoo Git repository and add these lines to the repo's .git/config file:

[remote "github"]
    url = git@github.com:gentoo/gentoo.git
    fetch = +refs/heads/*:refs/remotes/github/*
    fetch = +refs/pull/*/head:refs/remotes/github/pr/*

Now type:

git should go and fetch all pull requests, and list them as branches.

Each pull request now appears as a branch with its own name, such as "remotes/github/pr/123". For instance, if you wish to fetch the changes brought by https://github.com/gentoo/gentoo/pull/105, simply type:

to checkout the pull request.
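
The fetch and checkout can be wrapped in a small function. This is a sketch under assumptions: checkout_pull_request is a hypothetical name, and it expects a remote called github whose fetch refspecs include the pull-request line from the configuration shown above.

```shell
# checkout_pull_request NUMBER: hypothetical helper. Assumes a remote named
# "github" that maps refs/pull/*/head to refs/remotes/github/pr/*.
checkout_pull_request() {
    git fetch github || return 1
    # The fetched pull requests can be listed with:
    #   git branch -r --list 'github/pr/*'
    # Checking out a remote-tracking ref leaves you in detached HEAD mode.
    git checkout "github/pr/$1"
}
```

For the pull request mentioned above this would be called as checkout_pull_request 105.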

Grafting Gentoo history onto the active repository
To graft the historical Gentoo repository onto your current one simply run:

The general syntax of the last command is:

(This may be useful if you're not using the official versions of the repositories.)

Once you've merged the history, git will behave as if it is one big repo, and you should still be able to push from that to Gentoo as the head is untouched. Merging the history will remove the gpg signature on the initial gentoo commit (since it is being modified to do the graft).

Preventing 'git repack -a' from touching huge packs
Normally git repack -a (invoked by git gc) tries to repack everything into a single pack. If your repository contains a few huge packs already (commits from initial clone, grafted history pack), the repack can take really long and be truly memory-consuming.

To avoid that, you can mark the huge pack files for keeping. In this case, git repack -a won't touch those packs and will instead focus on repacking everything else. As a result, you get most of the benefits of repacking while saving considerable time.

Once you identify the huge packs you'd like to keep (05c10cc8ef170fd182619ef32ec4007e5f32f46d and 412c2dda845a79854872afaf5f9cd4ea896aef38 in this case), create .keep files for them:
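
A .keep marker is simply an empty file that sits next to the pack and shares its base name. The following sketch wraps that in a function; keep_pack is a hypothetical name, and the repository path and pack hash are supplied by the caller.

```shell
# keep_pack REPO HASH: hypothetical helper that marks one pack for keeping.
keep_pack() {
    pack="$1/.git/objects/pack/pack-$2.pack"
    [ -f "$pack" ] || { echo "no such pack: $pack" >&2; return 1; }
    # The marker is an empty file named like the pack, with a .keep suffix.
    touch "${pack%.pack}.keep"
}
# Packs can be listed by size, largest first, with:
#   ls -lS REPO/.git/objects/pack/*.pack
```

Removing the .keep file again makes the pack eligible for repacking.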

Retaining commit author information
It is important that appropriate credit is given when committing on behalf of others. Git natively distinguishes between who authored a commit and who committed it, and most methods of merging handle this automatically.

If you have applied someone else's changes manually or made local edits to a PR, check that the author field is accurate:

If necessary, amend the commit to give credit where it is due:
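
Checking and correcting the author of the most recent commit could look like the sketch below. Both function names are hypothetical, and the author string in the usage note is a placeholder. Since amending rewrites the commit, this is only appropriate for commits that have not been pushed yet.

```shell
# show_commit_author: print "Name <email>" of the author of HEAD.
show_commit_author() {
    git log -1 --format='%an <%ae>'
}

# set_commit_author "Name <email>": hypothetical helper that rewrites the
# author of HEAD while keeping the commit message unchanged.
set_commit_author() {
    git commit --amend --no-edit --author="$1"
}
```

For example, set_commit_author "Jane Doe <jane@example.org>" followed by show_commit_author confirms the change; the committer field still records who created the commit.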

If you need to credit several authors, add trailer lines like Co-authored-by: (as currently supported by major Git hosting services like GitHub and GitLab) to the long commit description part:

app-misc/foo: version bump to 0.5

Bump to 0.5. This also fixes security bug 93829 and introduces the new USE flag 'bar'.

Co-authored-by: Jane Doe 

Acked-by: Hans Wurst
Reported-by: Alice Wonderland

A CLI tool to crawl through pull requests
gengee is a tool that queries pull requests. It can show "outdated" pull requests, i.e. those where a newer version has been added to the ::gentoo tree, where a linked bug is already closed, or where the package in question has been removed from ::gentoo. You can use it to query pull requests assigned to certain maintainers, or opened by certain authors.

and so much more.

Check the homepage for an ebuild, info on how to set it up and documentation on how to use it.

External resources

 * Git Tutorial by Lars Vogel
 * Tips and Tricks on gitready
 * kernel patch guideline
 * gentoo gitweb
 * gentoo github mirror
 * https://github.com/gentoo-mirror/gentoo - Gentoo repository on Gentoo repository mirrors

And summarizing it all, ...
http://xkcd.com/1597/