User:Gso321/Benchmarks
These are custom benchmarks that I ran on my Gentoo Linux machine.
LLVM and Clang with BOLT-PGO.cmake
Building LLVM took more than 4 hours with 8 jobs on my machine. The output of uname -a was: Linux tux 6.6.67-gentoo-gentoo-dist #1 SMP PREEMPT_DYNAMIC Wed Jan 8 20:53:16 EST 2025 x86_64 Intel(R) Core(TM) i5-1035G1 CPU @ 1.00GHz GenuineIntel GNU/Linux.
Install LLVM:
user $
git clone https://github.com/llvm/llvm-project.git
user $
cd llvm-project
user $
mkdir build
user $
cd build
Run the commands to build LLVM:
cmake -S ../llvm -G Ninja -C ../clang/cmake/caches/BOLT-PGO.cmake \
    -DBOOTSTRAP_LLVM_ENABLE_LLD=ON \
    -DBOOTSTRAP_BOOTSTRAP_LLVM_ENABLE_LLD=ON \
    -DPGO_INSTRUMENT_LTO=Thin \
    -DCMAKE_INSTALL_PREFIX="/home/kael/llvm-project/bin" \
    -DLLVM_TARGETS_TO_BUILD=X86 \
    -DLLVM_ENABLE_ASSERTIONS=OFF \
    -DLLVM_ENABLE_PROJECTS="bolt;clang;lld;polly" \
    -DCMAKE_C_FLAGS="-O3 -march=native -pipe -fmerge-all-constants -fpointer-tbaa" \
    -DCMAKE_CXX_FLAGS="-O3 -march=native -pipe -fmerge-all-constants -fpointer-tbaa" \
    -DCMAKE_C_COMPILER="clang" \
    -DCMAKE_CXX_COMPILER="clang++" \
    -DLLVM_ENABLE_LTO="Thin" \
    -DLLVM_ENABLE_LLD="true"
ninja -j8
To compare the two compilers, I built the kernel with each of them, starting from a clean tree both times:
user $
make clean
user $
time make LLVM=1
The baseline LLVM is llvm-core/llvm 19.1.4, built with -O2 -march=native -pipe. When building the 6.6.67-gentoo kernel with it, the time command showed:
real 53m3.855s
user 55m38.382s
sys 2m34.281s
Building the same kernel with the new LLVM 20.0.0git (commit 14b44179cb61dd551c911dea54de57b588621005) showed:
real 48m18.362s
user 50m38.903s
The new LLVM cuts real (wall-clock) build time by about 9%, from 53m3.855s to 48m18.362s.
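The 9% figure can be reproduced from the two real times. The following is a minimal sketch, assuming a POSIX shell with awk available; the helper names to_seconds and percent_saved are made up for illustration and are not part of any tool used above.

```shell
# Convert a `time`-style duration such as 53m3.855s into seconds.
to_seconds() {
    printf '%s\n' "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print the percentage of the baseline time ($1) saved by the new time ($2).
percent_saved() {
    awk -v base="$(to_seconds "$1")" -v new="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", (base - new) / base * 100 }'
}

# Real times from the two kernel builds above.
percent_saved 53m3.855s 48m18.362s
```

The two real times are 3183.855 s and 2898.362 s, so the difference of 285.493 s is just under 9% of the baseline.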
BOLT Zig
user $
git clone https://github.com/ziglang/zig.git
user $
cd zig
user $
mkdir build
user $
cd build
user $
cmake .. -DZIG_NO_LIB=ON -GNinja -DCMAKE_BUILD_TYPE=Debug
user $
ninja install
Remove .zig-cache before every zig run so that each run starts without cached build artifacts.
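The cache removal can be folded into a small wrapper so that it is never forgotten between runs. This is only a sketch: run_cold is a hypothetical helper, and it is not the content of the zig-normal.sh or zig-bolt.sh scripts benchmarked on this page, which are not shown here.

```shell
# Hypothetical helper: delete a cache directory, then run the given command.
# Usage: run_cold CACHE_DIR COMMAND [ARGS...]
run_cold() {
    cache_dir="$1"
    shift
    rm -rf -- "$cache_dir"   # make sure the run starts with a cold cache
    "$@"                     # run the command under test, keeping its exit status
}
```

A benchmark script could then call run_cold .zig-cache followed by whatever zig command is being measured.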
Results:
Benchmark 1 (3 runs): ./zig-normal.sh
measurement         mean ± σ            min … max            outliers    delta
wall_time           399s ± 13.8s        385s … 412s          0 ( 0%)     0%
peak_rss            5.67GB ± 590MB      4.99GB … 6.01GB      0 ( 0%)     0%
cpu_cycles          1.38T ± 4.15G       1.37T … 1.38T        0 ( 0%)     0%
instructions        2.42T ± 591M        2.42T … 2.42T        0 ( 0%)     0%
cache_references    24.7G ± 166M        24.6G … 24.9G        0 ( 0%)     0%
cache_misses        2.05G ± 5.98M       2.05G … 2.06G        0 ( 0%)     0%
branch_misses       3.57G ± 5.76M       3.56G … 3.57G        0 ( 0%)     0%
Benchmark 2 (3 runs): ./zig-bolt.sh
measurement         mean ± σ            min … max            outliers    delta
wall_time           344s ± 3.62s        341s … 348s          0 ( 0%)     ⚡- 13.8% ± 5.7%
peak_rss            6.02GB ± 13.4MB     6.01GB … 6.04GB      0 ( 0%)     + 6.2% ± 16.7%
cpu_cycles          1.21T ± 309M        1.21T … 1.21T        0 ( 0%)     ⚡- 12.0% ± 0.5%
instructions        2.32T ± 249M        2.32T … 2.32T        0 ( 0%)     ⚡- 4.1% ± 0.0%
cache_references    16.7G ± 50.6M       16.6G … 16.7G        0 ( 0%)     ⚡- 32.6% ± 1.1%
cache_misses        2.05G ± 15.4M       2.03G … 2.06G        0 ( 0%)     - 0.2% ± 1.3%
branch_misses       3.36G ± 8.38M       3.35G … 3.37G        0 ( 0%)     ⚡- 5.9% ± 0.5%