This article tracks techniques for making Gentoo consume less disk space by shrinking the size of the Gentoo base installation. This is useful when trimming a Gentoo image down to its smallest practical size.
Limit installed packages
To prevent bloat, the image should contain as few packages as possible. It will be on a strict diet so that it can stay skinny.
Bare minimums may change based on the purpose of the image. Containers do not need to contain a kernel, whereas a kernel is a necessary part of virtual machine images.
For containers, no kernel is necessary, which reduces space requirements significantly.
Virtual machine images
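The kernel configuration should be kept to the lowest possible number of features. It may also be worth reviewing the Gentoo profile structure.
The FEATURES variable in /etc/portage/make.conf can be adjusted so that Portage does not install files that are unnecessary in a small image, such as documentation, info pages, and man pages.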
FEATURES="nodoc noinfo noman"
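Then recompile the @world set (for example with emerge --emptytree @world) so that the packages that are already installed are reinstalled without these files.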
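To see how much space these FEATURES can reclaim, the affected directories can be measured before and after the rebuild. The following is a minimal sketch, not part of Portage: the function name doc_usage is hypothetical, and it assumes the usual locations /usr/share/doc, /usr/share/info, and /usr/share/man below the image root.

```shell
#!/bin/sh
# Hypothetical helper: print the disk usage (in KiB) of the directories
# that FEATURES="nodoc noinfo noman" keeps Portage from populating.
# $1 is the root of the image to inspect (defaults to /).
doc_usage() {
    root="${1:-/}"
    for dir in usr/share/doc usr/share/info usr/share/man; do
        path="${root%/}/${dir}"
        if [ -d "$path" ]; then
            # du -sk prints "<KiB><TAB><path>" for the whole directory tree.
            du -sk "$path"
        else
            # Report missing directories as zero so the output always has three lines.
            printf '0\t%s\n' "$path"
        fi
    done
}
```

Running doc_usage / on the live system, or doc_usage with the path of a mounted image, before and after rebuilding @world gives a rough idea of the savings.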
Replacing userspace binaries with a single binary implementation
Many of the functional core utilities in Gentoo can be replaced with a smaller, single binary implementation such as sys-apps/busybox (with the make-symlinks USE flag enabled) or sys-apps/toybox.
Most of the files in /bin, /usr/bin, and /usr/sbin should be symlinks to the single binary.
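A quick way to check how far the replacement has gone is to count how many entries in a directory are symlinks pointing at the single binary. The sketch below is only an illustration: count_applet_links is a hypothetical name, and it assumes each symlink's target ends in the name of the binary (for example busybox).

```shell
#!/bin/sh
# Hypothetical helper: report how many entries in directory $1 are
# symlinks whose target file name equals $2 (for example "busybox").
count_applet_links() {
    dir="$1"
    binary="$2"
    total=0
    linked=0
    for entry in "$dir"/*; do
        # Skip the unexpanded glob pattern of an empty directory.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        total=$((total + 1))
        if [ -L "$entry" ]; then
            target=$(readlink "$entry")
            # Compare only the last path component of the link target.
            if [ "${target##*/}" = "$binary" ]; then
                linked=$((linked + 1))
            fi
        fi
    done
    echo "$linked of $total entries in $dir link to $binary"
}
```

For example, count_applet_links /usr/bin busybox prints a one-line summary for /usr/bin. Entries that are not links to the binary are candidates for a closer look.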
Changing out the C lib
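Use sys-libs/musl instead of sys-libs/glibc. This is possible by selecting the default/linux/amd64/17.0/musl profile. Profile names change between releases, so check the output of eselect profile list for the current name.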
eselect profile set default/linux/amd64/17.0/musl
Removing transient cache files
The following directories are ephemeral and their contents can be removed before shipping an image: /var/tmp/portage, /var/db/repos/*, /var/cache/distfiles/*
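The cleanup can be scripted so that it runs as the last step of an image build. The following is a minimal sketch under the assumption that the image is mounted or unpacked at a known path. The function name clean_image_root is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: remove Portage's transient files below an image root.
# $1 is the image root and is mandatory, to avoid cleaning / by accident.
clean_image_root() {
    root="${1:?usage: clean_image_root IMAGE_ROOT}"
    # The build directory can be removed entirely.
    rm -rf "${root%/}/var/tmp/portage"
    # For the repositories and distfiles, remove the contents
    # (including dotfiles) but keep the directories themselves.
    for dir in var/db/repos var/cache/distfiles; do
        path="${root%/}/${dir}"
        [ -d "$path" ] || continue
        find "$path" -mindepth 1 -maxdepth 1 -exec rm -rf {} +
    done
}
```

Note that after /var/db/repos has been emptied, the ebuild repository has to be synced again before packages can be installed in the image.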
Experimental: Removing the package manager
Problem: Portage and its dependencies (such as the Python implementation) are likely among the biggest disk space consumers by percentage in a skinny image. Is it possible to remove Portage, and then reinstall it via a simple scripted function later? The idea would be to leave the existing package database in place (and potentially compress it to squashfs or another readable format), then reverse this operation if/when the package manager is reinstalled. This is essentially gutting Gentoo of what makes it Gentoo, but should be very beneficial in terms of disk space gains.
Size of image with Portage and its dependencies: TODO
Size of image with Portage and its dependencies stripped: TODO
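One way to estimate these numbers without building two images is to sum the sizes of the files that a package recorded in its CONTENTS file under /var/db/pkg. The sketch below is an assumption-laden illustration, not a Portage tool: pkg_installed_size is a hypothetical name, and it assumes regular files are recorded as lines of the form "obj PATH MD5 MTIME".

```shell
#!/bin/sh
# Hypothetical helper: print the total size in bytes of the regular files
# listed in a CONTENTS file ($1), resolved below an image root ($2, default /).
pkg_installed_size() {
    contents="$1"
    root="${2:-/}"
    # Keep only "obj" lines and strip the trailing MD5 and mtime fields,
    # so that paths containing spaces survive intact.
    sed -n 's/^obj \(.*\) [0-9a-f]\{32\} [0-9]\{1,\}$/\1/p' "$contents" |
    {
        total=0
        while IFS= read -r file; do
            path="${root%/}${file}"
            # Ignore files that are listed but no longer present.
            [ -f "$path" ] || continue
            size=$(wc -c < "$path")
            total=$((total + size))
        done
        echo "$total"
    }
}
```

Calling the function once for the CONTENTS file of Portage and once for each of its dependencies, then adding the results, gives a rough lower bound. Hard links are counted once per path, and directories and symlinks are ignored.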
Experiments to explore:
- Can Portage remove its dependencies and then remove itself?
- Would a secondary package manager be needed to perform this step? Something that can read the EAPI and the package database.
- Can Portage (or a similar tool) operate (mainly interested in --search) from a read-only package database (/var/db/pkg)?
- Do vulnerability tools (such as OpenVAS) read version information from /var/db/pkg directly, or do they mostly scan the filesystem and interact with in-image binaries directly?
- Can anything be done to shrink Python code on disk? That is, instead of shipping .py files, can these files be compiled or compressed?
- Change out the interpreter to PyPy, Pyston, or something else?
- Can a new version of