Project:Mozilla/Firefox Benchmarks 2025 Q1

From Gentoo Wiki

Mozilla Firefox benchmarks 2025/Q1 (2025-03-25)

Motivation

Nowadays the web browser is a big part of my system, and probably the part where I spend most of my time. Ever since I first laid eyes on the related SuSE benchmarks [1][2], I've been curious about how browser performance develops.

Unfortunately SuSE has stopped providing results for Firefox. It's not hard to find individual opinions on the subject, but I wanted to find out, and prove, whether there's any reason to compile Firefox yourself when looking for better performance. User:Immolo's YouTube video about optimising Firefox gave me the final push (has it really been 11 months since the video was uploaded...).

Another point of interest has always been GCC vs. Clang. Firefox has been pulling in Clang unconditionally for a long time now, because the Web Developer and Debugging Tools in Firefox rely on libclang, and upstream focuses its programming efforts Clang-first, so it's interesting to compare the performance between GCC and Clang. It was indeed this unconditional dependency, together with the better Firefox performance provided by Clang, that made me switch my desktop to a "fully-Clang'd" system where every package is mainly compiled with llvm, clang and lld. Running the benchmarks for the first time last December also showed that in Gentoo, for some time, GCC with PGO was broken and didn't provide any performance boost [3][4][5]. Testing regularly will help make sure the optimizations keep working as intended.

While it'd be great to know whether browser performance increases or decreases between versions, it must also be acknowledged that the testing platform gets updated, and tests run with different tool versions may not be comparable. There may be a case for comparing toolchain versions, like clang-19 vs. clang-20, but it takes a long time to run these benchmarks and I don't know whether it's worth it. However, I do intend to make this a yearly thing, and at that interval major toolchain updates should be visible.

In the end it's my own curiosity that drove me to finally do this. Even if there are no drastic differences, I'll at least try to archive the results for myself.

Background, Technical specs

The main benchmarks were done on 2025-03-25, within a period of 11 hours, on the same testing desktop PC. Nothing else was being done on the host desktop during that time. The testing setup was an up-to-date ~amd64 Gentoo desktop running Plasma on Wayland. The system itself was a regular one: no weird CFLAGS, no lto, no pgo, fully built with gcc, with binpackages used where available. Gentoo specs were:

  • Firefox 136.0.2
  • Firefox-bin 136.0.2
  • Google Chrome 134 (134.0.6998.117)
  • clang/llvm 19.1.7
  • gcc 14.2.1_p20250301
  • mesa 25.0.2
  • rust-bin 1.85.1
  • gentoo-kernel-bin 6.13.8

BrowserBench was used as the testing platform. All 3 major tests were run there, 3 times each (9 runs in total) for a single browser build. Every round was started with a completely new browser profile. So for a single browser configuration the cycle goes:

Compile Firefox -> Run Firefox (with a new profile) -> Run Speedometer, JetStream and MotionMark (individually, not simultaneously) -> Close browser -> Delete ~/.cache -> Delete ~/.mozilla -> Repeat 2 more times before compiling the browser with a different configuration.

So for a single configuration, 3 results from each test were recorded, and the "middle" result (the median by value) was taken. 3 rounds were done each time to boost confidence in the results and to reduce the effect of run-to-run fluctuation.
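As an illustration of the "middle" result selection described above, here is a minimal sketch. The scores are made-up placeholders, not results from these benchmarks, and the names `rounds` and `middle_results` are only for this example.

```python
from statistics import median

# Hypothetical scores from three rounds of one browser configuration.
# These numbers are placeholders, not measurements from this page.
rounds = {
    "Speedometer": [20.1, 19.8, 20.4],
    "JetStream": [181.0, 179.5, 183.2],
    "MotionMark": [1450.0, 1502.0, 1467.0],
}


def middle_results(rounds):
    """Return the median ("middle") score of the recorded rounds per test."""
    return {test: median(scores) for test, scores in rounds.items()}


print(middle_results(rounds))
```

With three rounds the median is simply the value that is neither the best nor the worst, so a single outlier round does not end up in the comparison.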

This is just a hobbyist benchmark utilizing BrowserBench, not a professional one. Room temperature wasn't recorded...

There are many other benchmarks out there, but to me this did prove insightful and I'm confident about my own results, with my system. I personally value browser responsiveness (Speedometer) above the rest, so responsiveness is given a higher weight in the final value. I will not post the individual test scores, since the point is simply to have the same stable environment between benchmarks to make the comparison credible.
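The exact weighting is not stated here, so the following is only a sketch of one way such a weighted total could be computed: a weighted mean of per-test percentage changes against a baseline. The weights, the scores and the function name `weighted_change` are all assumptions for illustration and may not match the rating actually used for the graphs.

```python
# Hypothetical weights: the text only says Speedometer counts for more,
# it does not give the actual numbers.
WEIGHTS = {"Speedometer": 2.0, "JetStream": 1.0, "MotionMark": 1.0}


def weighted_change(baseline, candidate, weights=WEIGHTS):
    """Weighted mean of per-test percentage changes versus a baseline."""
    total_weight = sum(weights.values())
    change = 0.0
    for test, weight in weights.items():
        pct = (candidate[test] - baseline[test]) / baseline[test] * 100.0
        change += weight * pct
    return change / total_weight


# Placeholder scores, not measurements from this page.
baseline = {"Speedometer": 20.0, "JetStream": 200.0, "MotionMark": 1000.0}
candidate = {"Speedometer": 22.0, "JetStream": 210.0, "MotionMark": 1000.0}

print(round(weighted_change(baseline, candidate), 2))
```

Working on percentage changes rather than raw scores keeps the three tests comparable even though their score ranges differ by orders of magnitude.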

Benchmarks

GCC/Clang, lto, pgo and "optimized flags" testing was done on 2025-03-25 with Firefox 136.0.2. There is a set of miscellaneous test results as well (like using system libs vs. bundled libs), and those tests were carried out in December 2024. The install size comparison was done with Firefox 136.0.2.

Base settings are simply CFLAGS="-march=native -O2 -pipe".

"Optflags", or "optimized CFLAGS" translates into:

  • GCC:
      • CFLAGS="-march=native -O3 -pipe -flto=auto -fno-sized-deallocation -fno-aligned-new -fno-strict-aliasing -fPIC -fno-exceptions -fno-rtti -fno-math-errno -fno-omit-frame-pointer"
  • Clang:
      • CFLAGS="-march=native -O3 -pipe -flto=thin -fno-sized-deallocation -fno-aligned-new -fno-strict-aliasing -fPIC -fno-exceptions -fno-rtti -fno-math-errno -fomit-frame-pointer -funwind-tables -ffunction-sections -fdata-sections"
      • RUSTFLAGS="-C target-cpu=native -C opt-level=3 -Clinker=clang -Clinker-plugin-lto -Clink-arg=-fuse-ld=lld"

Note that since GCC can't do lto with rust, no RUSTFLAGS were provided when using GCC, although the GCC builds could have benefited from e.g. -C target-cpu=native -C opt-level=3.
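The two "optflags" CFLAGS sets are long and mostly identical, so the differences are easy to miss. A small sketch that lists the flags unique to each set (the helper name `flag_diff` is just for this example):

```python
GCC_CFLAGS = (
    "-march=native -O3 -pipe -flto=auto -fno-sized-deallocation "
    "-fno-aligned-new -fno-strict-aliasing -fPIC -fno-exceptions -fno-rtti "
    "-fno-math-errno -fno-omit-frame-pointer"
)
CLANG_CFLAGS = (
    "-march=native -O3 -pipe -flto=thin -fno-sized-deallocation "
    "-fno-aligned-new -fno-strict-aliasing -fPIC -fno-exceptions -fno-rtti "
    "-fno-math-errno -fomit-frame-pointer -funwind-tables "
    "-ffunction-sections -fdata-sections"
)


def flag_diff(a, b):
    """Return (flags only in a, flags only in b), keeping original order."""
    a_flags, b_flags = a.split(), b.split()
    only_a = [flag for flag in a_flags if flag not in b_flags]
    only_b = [flag for flag in b_flags if flag not in a_flags]
    return only_a, only_b


only_gcc, only_clang = flag_diff(GCC_CFLAGS, CLANG_CFLAGS)
print("GCC only:  ", " ".join(only_gcc))
print("Clang only:", " ".join(only_clang))
```

Apart from the LTO mode (auto vs. thin), the sets differ in frame pointer handling, and the Clang set additionally carries -funwind-tables, -ffunction-sections and -fdata-sections.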

USE flags for most configurations are: USE="X clang dbus gmp-autoupdate jumbo-build system-av1 system-harfbuzz system-icu system-jpeg system-libevent system-libvpx system-webp telemetry wayland -eme-free -gnome-shell -hardened -hwaccel -jack -libproxy -openh264 -pgo -pulseaudio -sndio -system-png -wasm-sandbox -wifi". The pgo USE flag is enabled where +pgo is shown in the graphs. Where +lto is shown, lto is enabled by passing -flto in CFLAGS.

Result graphs are in percentages, except in the Install size comparison.

GCC

Firefox-136.0.2-benchmarks-2025-q1-gcc.png

Results and performance increases are in percentages. The baseline is the default build with GCC.

The increase from either pgo or lto alone is quite mild, while the jump with "optflags" is more noticeable. After discussing this with sam, my most educated guess is that GCC's lto/pgo benefits most when -O3 optimization is specified through CFLAGS. SuSE's benchmark spec seems to confirm some of it (Figures 27 & 28).

While I didn't test -O3 separately to confirm this on my own, right now simply relying on PGO or LTO doesn't seem worth the effort when compiling with GCC. You want to add at least -O3, and probably the rest of CFLAGS="-march=native -O3 -pipe -flto=auto -fno-sized-deallocation -fno-aligned-new -fno-strict-aliasing -fPIC -fno-exceptions -fno-rtti -fno-math-errno -fno-omit-frame-pointer".

Clang

Firefox-136.0.2-benchmarks-2025-q1-clang.png

Unit increases are in percentages. The comparison is done against the "default" build. Higher is better. Clang's performance gains are steadier across the different configurations. Enabling LTO, PGO, or both is an easy way to get a noticeable performance boost when using Clang.

Clang's "optflags" resolve to:

  • CFLAGS="-march=native -O3 -pipe -flto=thin -fno-sized-deallocation -fno-aligned-new -fno-strict-aliasing -fPIC -fno-exceptions -fno-rtti -fno-math-errno -fomit-frame-pointer -funwind-tables -ffunction-sections -fdata-sections"
  • RUSTFLAGS="-C target-cpu=native -C opt-level=3 -Clinker=clang -Clinker-plugin-lto -Clink-arg=-fuse-ld=lld"

Clang vs. GCC

Firefox-136.0.2-benchmarks-2025-q1-gcc-vs-clang-comparison.png

The default configuration built with Clang is the baseline here. Unit increases are in percentages. Higher is better. GCC with every optimization enabled gets to the same level as Clang with none. Clang has a noticeable performance lead over GCC in total score with every configuration. GCC does hold its own in the responsiveness tests (Speedometer), and the result from the "LTO+PGO+optflags" configuration is even between Clang and GCC. GCC is just slightly behind in the JavaScript/Wasm tests (JetStream), but the real difference comes from the graphics (MotionMark) test. I don't know why, but my guess is that rust plays a major part in that. Upstream also defaults to Clang, and all coding efforts are mainly tested with Clang.

Responsiveness comparison (Speedometer) between Clang and GCC. Units in percentage. Higher is better.

Clang vs. Firefox-bin

Firefox-136.0.2-benchmarks-2025-q1-firefox-bin-vs-source-built-with-optflags-comparison.png

Firefox-bin was compared against the self-built Firefox with best performance, i.e. Clang+LTO+PGO+optflags. Comparison is in percentages. Higher is better.

Firefox-bin was just slightly better in responsiveness (Speedometer), the JavaScript/Wasm tests (JetStream) were even, and the self-built version won in the graphics section (MotionMark). Firefox-bin is roughly on par with a regular build with Clang+lto+pgo.

Now I don't know all of the black magic that happens upstream when they build a browser release, even though their release scripts are present in the source tarball, but one explanation for the difference could be the RUSTFLAGS used in the self-built version. However, this is probably subject to change really soon, since Mozilla will also pass optimization flags (including lto) to rust when building Firefox, starting in version 138. [6]

Firefox-bin vs. Google-chrome

Firefox-136.0.2-benchmarks-2025-q1-firefox-bin-vs-google-chrome-comparison.png

Comparison in percentages. Higher is better.

While Google-Chrome had ~10 % better performance in responsiveness (Speedometer) and absolutely demolished Firefox-bin in the JavaScript/Wasm tests (JetStream) with a ~25 % increase, it also lost quite massively in the graphics tests (MotionMark). But since I value responsiveness over the others, in my rating approach Google-Chrome provides ~19 % better performance overall compared to Firefox-bin.

Note that when comparing different browsers, make sure your browser window is of the same size (fullscreen) and that the monitor refresh rate is identified correctly and similarly in each browser and test.

Miscellaneous

All kinds of miscellaneous tests were also performed during December 2024. Here are some interesting results.

Bundled libs vs. System libs

The motivation for this comparison was to figure out whether the bundled libs provide any kind of performance boost over the system libs, where a system-* USE flag is available. Since Mozilla heavily patches some of the bundled libs, and function calls into them can be inlined, I figured there could be a difference. However, there wasn't. Keep in mind that the available system-* libs represent only a small subset of the libraries used in the browser.

-O2 vs. -O3

I mentioned earlier that using -O3 seems pretty much mandatory when trying to gain performance with GCC. On its own, however, simply switching CFLAGS="-march=native -O2 -pipe" to CFLAGS="-march=native -O3 -pipe" did nothing. GCC's -O3 might indeed benefit hugely from LTO and/or PGO.

"Wasm-sandbox" use flag

There are rumors that enabling the "wasm-sandbox" USE flag causes a performance decrease. In my tests there was no difference whether the flag was enabled or disabled.

Install size

This was recorded in March 2025 with Firefox 136.0.2.

Firefox-136.0.2-benchmarks-2025-q1-install-sizes-comparison.png

Units are in MiB as recorded by portage. Exact size logged.

GCC with full lto, and even pgo, reduces the install size noticeably. Clang with any and every optimization doesn't affect install size as much.

Conclusions, insights and summary

First of all, while I've used the term "noticeable increase", I mean it in terms of numbers. I used Firefox built with GCC +lto +pgo during the time it was broken in Gentoo, and when I switched to Clang +lto +pgo +optflags, I didn't feel or see any difference when actually using the browser. I used to build Firefox for my laptop (no lto, no pgo) before embracing Firefox-bin, and didn't see or feel any difference between those two either. So while numbers don't lie, avoid falling for placebo too.

The second big takeaway, and somewhat of a surprise for me personally, is the current difference between Clang and GCC. And honestly, it doesn't look like things are getting better for GCC.

Unfortunately Google-Chrome beats Firefox-bin when it comes to performance in areas that matter.

But when choosing to use Firefox, Firefox-bin is an easy way to get high performance with little investment. While the self-built version can currently provide a tiny edge over the -bin version, it will most likely even out in the near future.

References