User:Ulfox/GSoC-GSE

GSE: Gentoo Stateless Environment is a set of scripts and configuration files that take special advantage of catalyst and other Gentoo features for the purpose of creating a Gentoo system that can function under stateless conditions.

The project aims to automate the building and configuration process of a Gentoo system that will produce a flexible system that can be transformed to anything one wishes. Because the foundation is Gentoo, such a flexibility can easily be achieved and the process itself represents a way of unlimited possibilities and configurations, since Gentoo provides the means for those.

USE Flags
To install GSE, run:

To use distcc and ccache, one has to enable those flags, otherwise gse will read no option regarding distcc/ccache, with exeption the catalyst part which is independent of the process.

Building the System
In general, as stated above, GSE consists of scripts, meaning that most of the time, it reads configuration files. Those files can be accessed either by the script's main menu, or manually from /usr/lib64/gse/config.d. During the built process, two products are created, I) a Gentoo system being one, II) a set of a kernel with an initramfs being the second. The separation of those is important, since the kenrel and initramfs will be distributed alone, while the system will be fetched from the the machine that runs those, and further configured in the process.

Bellow are presented the part's of GSE building sequence. Those part's refer only to the system product(I), meaning that the controller has a completely different stage, separated from the process. This ensures that the controller is mutually excluded from the rest of the system. The kernel exists with it's optional(sometimes) initramfs, and everything around is controlled by it.

Description per part Part A. Verifies the sources and creates a stage 3 system using catalyst Part B. Creates the mount points and copies the essential files for the chroot stage Part C. Updates portage and Installs eix Part D. Prompts for a world rebuild, if the system is not created from catalyst(locally) Part E. This part reads all the configuration files and applies changes to the system Part F. Essential to make the system further functional and requested user packages Part G. Reads the conf file containing runlevel instructions and executes a loop of rc-update commands Part H. {optional} Reads the .config file, prompts for extra configuration and builds a modular kernel Part I. {optional} This part creates an initramfs with dracut or genkernel Part J: Removes all packages that are no longer required Part K: Exits chroot, further clean the system, archive it and sign it             Part L: Mark the products complete and ready to be distributed. From now, all clients that request a version check, will be notified for this product and prompted to begin fetching

The optional part's are included if someone wishes to built a system using the gse process, but does not wish to include the controller. This disables the functions provided by the controller. Before initiating a build sequence, one has to edit the configuration files at /usr/lib64/gse/config.d. The files can be edited from the GSE main menu, which is recommended, since all are listed there, under special categories, which gives a more condensed view and assures that no configuration areas will be missed.

To bring up the GSE main menu script, run:

or simply

The second entry instructs gse to just bring the main menu up, and from then on, all configurations and instructions will be enabled through the sub-menus entries, while the first entry instructs gse to bring up the main menu while at the same time exporting optional flags for the build process.

For a list of the available gse flags and arguments see gse manpage(1), which holds more details and examples.

After the configuration files have been edited, the building sequence is ready to begin. Currently there are two building options supported by gse. The first one is catalyst while the second one is precomp (as precompiled). The latest one is nothing more than a latest stage3 tarball that is distributed by the Gentoo Release Engineering team. It should be noted that the RelEng team uses catalyst to build the above tarball.

Because most of the gse scripts on stage A are built to support catalyst, we will assume that catalyst is going to be used as the base option for our examples.

The build can be initiated either from the main menu: System menu -> Build a system -> Catalyst -> Initiate Build or by:

The --base flag is the only flag with its argument, required to initiate the building sequence, everything else is considered optional.

Among the supported flags and arguments, the most famous are: --lawful-good="arguments" and --enforce="arguments". Both flags use the same arguments, which are simply hooks in the build sequence. They start with a [g] followed by a name regarding the part. Example: gcat refers to catalyst part inside the StageA, while the gupdate refers to the portage update part inside the {chroot} StageB. Please see man 1 gse for more parts.

The differences of --lawful-good and --enforce are two. First one is that they enable opposite functions. That is, the --lawful-good flag will automatically say 'pass' to the referred hook point (excluding it from the process) while --enforce flag will automatically say yes, and additionally use force on the referred part. The second difference is that --lawful-good suppresses --enforce, meaning that if someone uses:

then, gse will automatically pass all parts of partA. For more information about parts and stages see gse manpage(5):

Building with distcc, ccache
Given that gse has been emerged with distcc flag, the process can take advantage of those during the chrooting phase. To initiate gse build with distcc, run:

The above command will instruct gse to read the distcc/ccache configuration files and use distcc for the process.

If one wishes to use distcc with pump mode, then the command becomes

Pump implies distcc, therefore there is no need to use the on argument.

Requested Commands
Gse provides a way to source custom scripts, before and after each part. If --sdir flag is passed, then a set of functions is enabled that, injects those files at the requested hook points.

The basic syntax is:

The custom scrip functions are enabled by --sdir. Additionally --sdir's argument is the scripts directory. For the rest 2 flags, the --do export's the script's names while the -g indicates at which point a script will be sourced. The relation --do/-g is 1:1 between spaces and commas.

Example:

In the above example, gse will search for the scripts at /root/myscripts directory. If the directory exists, and the scripts do exists too, then gse will proceed with the hook points verification. As stated above, the relation of scripts between spaces and hook points between commas, is one to one. That means that gse will source script1 and then script2 before initiating Part: A. After that, it will source script3 before initiating the network seed function. Last, gse will source script4 and then script5 after the Part: Catalyst has finished. The above command generates a file inside "${CLOCALLG}/doscripts".

As we can see, each script is related to a hook point, exactly the way it's described above.

N Looking inside the ${sdir}.

The script files can be any script that can be sourced, however it is recommended to stay with bash, since it's the one supported by this project.

The hook points are the same with the --lawful-good/--enforce arguments. Therefore for our example (after portage update), the hook point would be gupdate with a + sign. The mines sign indicates that the command or chain of commands will be executed before the specific part is activated.

This feature is probably one of gse most important features, since it can transform gse from a predetermined building process, to a completely different process.

The next example will further reveal how strong this feature can be.

What is happening here is that, gse is instructed to automatically pass the portage update, installation and configuration parts. At the same time gse is instructed to execute the commands in file0 at the portage update part, the commands in file1 at the installation part and last the commands from file 2 at the configuration part.

Each of these commands could source other scripts, and the customization can go on and on. You can even go completely gse building process free while only using the hook points that are provided.

Part: A Fundamentals
From part A the only configuration files that are used, are the spec files that catalyst will use for building the stage3 tarball. Apart from those, the rest of the process will try to locate the portage snapshot or create it (for catalyst option) then build the stages. When done, extraction sub-part is enabled and a "${CDISTDIR}/workdir-catalyst" is created to host the extracted tarball. For precomp base, the process simply downloads the latest stag3 tarball, verifies the origin and the file's integrity before extracting the tarball to "${CDISTDIR}/workdir-precomp".

PART: B Preparing to enter the new system
Here, the chroot preparation function is enabled. This function create all the mount points required for the chroot process, copy the files required by the chroot scripts, copy extra files that are indicated inside the inject-custom configuration file and last pass all the parameters required for StageB before chrooting happens.

Part: C Syncing Portage
Chroot has happen, and the chroot_init script has been enabled. This part read the imported environmental variables and proceeds with probably the most important part of the script, updating portage.

Three Control Functions
Stage B scripts come together with three control functions. Those functions provide three basic tools for ensuring that failed parts will not break the process. The first one is called chroot_master_loop and as its name states, it does nothing more than initiating a loop when called. The loop can only call the other two functions or exit when one of those is satisfied.

The second function is called emerge_master_loop and its job is to read if there is a emerge resume list, and prompt for resume or if there is a last failed command and prompt for a reaction. Among the resume/reaction options, the user can select SHELL to manually resolve the issue (e.g. add USE Flags in the package.use directory to fix an emerge issue) and then exit the shell so the main script can continue resume the process by doing a resume/reaction.

The third function is called _ask_for_shell, and is a simple function which is called when a part that is not emerge related fails. Because this case is harder to resolve automatically (e.g. a copy action trying to copy a file that is missing), the user is asked instead to resolve this manually by being provided with the last failed command, the stderr log and the probably the items that where related with. For example, for the command that creates the fstab entries, the provided logs would indicate the failed command, the stderr and the entries that were intended to be created in the fstab.

The above functions are called by a function which checks the exit code of each step. If a fail is detected, then the related loop function is called.

Part Portage
This part is a sub-part of Part: C. Here the only entries worth mentioning are the GSE Profile creation and the apply new. The apply new entry will apply all the custom USE Flags and then emerge those changes.

The GSE Profile
The GSE profile is a custom experimental profile. The aim of the profile is to collect useful flags and packages for aiding systems intended to be used on Research labs and University computer labs. For more details about the flags and the packages it enables, see man 5 gse (GSE Profile section).

Part: D Rebuilding the system
This part simply prompts for a emerge -eq @system command, in case the precomp base is detected.

Part: E Configuration
This part enables the functions to read all and apply all configuration files. Locales, timeszone, consolefonts, fstab entries, ssh-pub keys and more configuration files under /etc are configured here.

Part: F Emerge Essentials
Here, packages indicated in the configuration files are emerged. Optional to those are genkernel, gentoo-sources, dracut, grub:2 since those parts are optional too.

Part: G Runlevels
This part belongs to the configuration Part: E, however it has been left as a separate entry because of the importancy the runlevels play in the systems boot and functionality.

Parts: H/I Kernel/Initramfs
The entries on these parts prompt for a kernel/initramfs build. If force is active on this part, the prompt becomes a yes and genkernel builds a general modular kernel. For manual configuration --enforce=gker should be avoided.

You can tell gse to use an already built kernel if one wishes to. To include a pre-built kernel, run:

Same holds for initramfs

Part: J Chroot cleanup
Deselection of tools that were used during the chroot configuration process.

Part: K Cleanup
The chroot has exited with 0 and everything looks fine. The last cleanup function is called. It will further clean the systems /var /tmp /usr/share/{requested entries}. If --build-minimal is enabled, this part will remove many more entries. For more information about --build-minimal see man 1 gse. After the cleanup returns 0, the system is put back in a tarball, sha512sum is created and last the tarball is signed.

Part: L Mark for Distribution
The generated archive is marked as completed and a prompt asks to either mark it as default or simply exit. Marking the system as default, instructs gse to create a new version related with the above system. Now all clients asking for version, will fetch this one and prompted to sync the new image.

The Controller
The controller consists from a set of scripts, functions and files that lie inside the initramfs. The concept of it, derives from the need to control and make changes to multiple systems that host the images created from the builder. By names definition, the controller is responsible making decisions before the system begins booting, that is, before the initramfs handles the control to the main system.

The most important part of the initramfs apart from the config.d directory, which is a way of indirect server-client communication, is the _etc_misc script, which is responsible for mounting /etc as tmpfs and other entries requested from the configuration files.

To build the controller, run:

Structure
Controller's functions {as described in man 1 gse}
 * Read /config.d/sources configuration files and export them
 * Check for network connection
 * Attempt to establish connection with the server
 * Fetch new /config.d files {if any} from the server
 * Export the newly fetched files
 * Check the health integrity of SYSFS and BACKUPFS
 * Apply new configuration files to the SYSFS
 * Update runlevels
 * Create new drive interfaces
 * Create filesystems
 * Create and modify LABELS
 * Switch bootflags
 * Mount /etc and other directories as tmpfs
 * Decide which partition will be named SYSFS
 * Create,delete and modify subvolumes
 * Even wipe the whole setup and start new

The above features can be accessed and modified, while not recommended from the controller modules. The modules are located at "$GSE/config.d/controller/modules" and are organized by categories.

Export config.d
At the very beginning of the init script, the first controller's script is sourced. This script will simply read the indicated configuration files inside /config.d directory and export important environmental variables for the rest of the process. Some of the most important variables are the gse sources {server's name, keys,...} and the devices labels. The last are probably the most important variables, since they indicate which partition is holding the real system and which the backup system.

Networking part
On this part the networking check/fetch scripts are sourced in the written order. The network check script will simply check if there is a network connection available. This step is quite crucial, since the return value will completely change the order of the sourced scripts to come. For example a return value of 1 means that there is no network connection, hence the fetch script, version check script and export new script will never be sourced. However a return value of 0 will enable the above scripts.

With network established, the client requests new configuration files from the server. Each time a new configuration file is fetched, it is immediately read and exported. After that, a version check script will compare the version that was fetched from the server and compare it with the one that lies in the SYSFS partition. A miss match will prompt for a new fetch.

Fetching a new system will wipe the SYSFS partition, create a new filesystem if it's indicated by the configuration files and last move the new system in it.

Health check
For this part, we will assume that the version check returned 0 or the answer for a fetch new was set to no. The health check script is a list of conditions that check whenever the system completed a healthy life cycle the last time.

Since the system is stateless, a life cycle for gse is considered the set of healthy startup and healthy shutdown. If all conditions return 0, then the last life cycle is marked as a healthy and the process continues. However if a condition fails, then the life cycle is marked as unhealthy.

An unhealthy mark will instantly enable new functions which will rename the SYSFS label of the rootfs to a DEPRIVED label. Since there is no system to be mounted, the BACKUPFS is renamed to SYSFS and the process resumes. The backup filesystem is the last cloned filesystem of the old system, which completed a healthy cycle.

Configurations
All (system) configuration files that were fetched during the network phase are applied here. The SYSFS is mounted as rw and changes are applied.

Clean up
After the configuration has finished, a clean up script is sourced. This script will clear the workdirs and remount SYSFS as ro.

Mount Directories
This is the last and final phase. The last script is sourced, which mounts /etc, /var/log, /tmp and any other indicated directories, as tmpfs and finally resumes the process of the initramfs.