GCC optimization

This guide provides an introduction to optimizing compiled code using safe, sane values for the CFLAGS and CXXFLAGS variables. It also describes the theory behind optimization in general.

What are CFLAGS and CXXFLAGS?
CFLAGS and CXXFLAGS are environment variables used to tell the GNU Compiler Collection (GCC) which options to enable when compiling source code. CFLAGS applies to source code written in C, while CXXFLAGS applies to source code written in C++.

Because such a large proportion of the packages that make up most Gentoo systems are written in C and C++, these are two variables administrators will definitely want to set correctly as they will greatly influence the way much of the system is built.

They can be used to reduce the amount of debug messages, to raise warning levels, and of course to optimize the code produced. The GCC manual contains a complete list of the available options, with an explanation of what each one is for.

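How are they used?
CFLAGS and CXXFLAGS can be used in two ways. The first is per program, through the Makefiles generated by the program's build system. The second is system-wide, by defining them in /etc/portage/make.conf so that every package is compiled with the options specified there.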

The CXXFLAGS variable is normally configured to use the same options that are present in CFLAGS. Most systems should be configured this way. Additional options in CXXFLAGS are extremely rare in common use cases.
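As a minimal sketch of this arrangement, a make.conf fragment might look like the following. The specific flag values are illustrative assumptions, not a recommendation for any particular machine.

```shell
# Sketch of a make.conf fragment (shell-style assignments); values are illustrative.
CFLAGS="-march=native -O2 -pipe"
# Reuse the C options for C++ instead of maintaining a second list.
CXXFLAGS="${CFLAGS}"
```

Keeping CXXFLAGS defined in terms of CFLAGS means a later change to one list cannot leave the other out of date.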

Misconceptions
While CFLAGS and CXXFLAGS can be a very effective way to produce smaller or faster binaries, when used incorrectly they can also impair the function of the code, bloat its size, and drastically reduce its performance. Setting them incorrectly can even cause compilation failures!

Remember, the global CFLAGS configured in /etc/portage/make.conf will be applied to every package on the system, so administrators typically only set general, widely applicable options. Individual packages further modify these options, either in the ebuild or in the build system itself, to generate the final set of flags used when invoking the compiler.

Ready?
With the risks in mind, take a look at some sane, safe optimizations. These will stand the system in good stead and will endear the user to developers the next time a problem is reported on Bugzilla. (Developers will usually ask the user to recompile a package with minimal CFLAGS to see if the problem persists. Remember: aggressive flags can ruin code!)

The basics
The goal behind CFLAGS and CXXFLAGS is to create code tailor-made to the system; it should function perfectly while being lean and fast, if possible. Sometimes these conditions are mutually exclusive, so this guide will stick to combinations known to work well. Ideally, they are the best available for any CPU architecture. For informational purposes, aggressive flag use will be covered later. Not every option listed in the GCC manual (there are hundreds) will be discussed, but the basic, most common flags will be reviewed.

-march
The first and most important option is -march. This tells the compiler what code it should produce for the system's processor architecture (or arch); it tells GCC that it should produce code for a certain kind of CPU. Different CPUs have different capabilities, support different instruction sets, and have different ways of executing code. The -march flag will instruct the compiler to produce specific code for the system's CPU, with all its capabilities, features, instruction sets, quirks, and so on.

Even though the CHOST variable in /etc/portage/make.conf specifies the general architecture used, -march should still be used so that programs can be optimized for the system's specific processor. x86 and x86-64 CPUs (among others) should make use of the -march flag.

What kind of CPU does the system have? To find out, run the following command:
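One common way to do this on Linux is to read the model name from /proc/cpuinfo. The sketch below assumes that file exists and contains a "model name" field, which is not the case on every architecture; the function name cpu_model is made up for illustration.

```shell
# Print the CPU model as reported by the Linux kernel.
# Assumes /proc/cpuinfo exists and has a "model name" line; prints "unknown" otherwise.
cpu_model() {
    model=""
    if [ -r /proc/cpuinfo ]; then
        # Take the first "model name" line and keep only the value after the colon.
        model=$(grep -m1 'model name' /proc/cpuinfo 2>/dev/null | cut -d: -f2- | sed 's/^[[:space:]]*//' || true)
    fi
    echo "${model:-unknown}"
}

cpu_model
```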

To get more details, including the values for -march and -mtune, two commands can be used:


 * The first command tells the compiler not to do any linking (-c) and, instead of interpreting the --help option as a request to describe the command-line options, to show whether certain options are enabled or disabled (-Q). In this case, the options shown are those enabled for the selected target:


 * The second command will show the compiler directives for building a header file, but without actually performing the steps, instead showing them on the screen (-###). The final output line is the command that holds all the optimization options and architecture selection:
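The two commands described above are commonly written along the following lines. This is a sketch that assumes gcc is installed and supports -march=native on the local architecture; the wrapper function names are made up for illustration, and /usr/include/stdlib.h is only an example header path.

```shell
# Show which target-specific options GCC enables for the local CPU.
# -c: do not link; -Q together with --help=target: report option state instead of help text.
show_target_options() {
    gcc -c -Q -march=native --help=target
}

# Show the commands GCC would run (-###) without executing them.
# The output goes to stderr, so it is redirected to stdout here.
show_native_invocation() {
    gcc -### -march=native /usr/include/stdlib.h 2>&1
}
```

Piping the first function through `grep march` narrows the output to the architecture selection.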

Now see -march in action. This is an example for an old Pentium III processor:

This is an example for a 64-bit AMD CPU:
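As a sketch of what such settings might look like, the two cases above could be written as follows. The variable names are hypothetical and only keep the two examples apart; in make.conf a single CFLAGS line would carry the chosen value. Treating the 64-bit AMD CPU as an Athlon 64 is an assumption.

```shell
# Illustrative -march values for the two CPUs mentioned above.
# Hypothetical variable names; a real make.conf would set CFLAGS once.
PENTIUM3_CFLAGS="-march=pentium3"
ATHLON64_CFLAGS="-march=athlon64"
```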

If the type of CPU is undetermined, or if the user does not know which setting to choose, it is possible to use the -march=native option. With this option GCC will try to detect the processor automatically and set the most appropriate options for it. However, this option should not be used when the intention is to compile packages for a different CPU!

When compiling packages on one computer in order to run them on a different one (for example, when a fast computer builds packages for an older, slower machine), do not use -march=native. "Native" means that the code produced will run only on that type of CPU. Applications built with -march=native on an AMD Athlon 64 CPU will not be able to run on an old VIA C3 CPU.

Also available are the -mtune and -mcpu flags. These flags are normally only used when there is no available -march option; certain processor architectures may require -mtune or even -mcpu. Unfortunately, GCC's behavior isn't very consistent with how each flag behaves from one architecture to the next.

On x86 and x86-64 CPUs, -march will generate code specifically for that CPU using its available instruction sets and the correct ABI; it will have no backwards compatibility for older/different CPUs. Consider using -mtune when generating code for older CPUs such as i386 and i486. -mtune produces more generic code than -march; though it will tune code for a certain CPU, it does not take into account available instruction sets and ABI. Do not use -mcpu on x86 or x86-64 systems, as it is deprecated for those arches.

Only non-x86/x86-64 CPUs (such as Sparc, Alpha, and PowerPC) may require -mtune or -mcpu instead of -march. On these architectures, -mtune / -mcpu will sometimes behave just like -march (on x86/x86-64) but with a different flag name. Again, GCC's behavior and flag naming are not consistent across architectures, so be sure to check the GCC manual to determine which one should be used.

-O
Next up is the -O variable. This variable controls the overall level of optimization. Changing this value will make the code compilation take more time and will use much more memory, especially as the level of optimization is increased.

There are seven -O settings: -O0, -O1, -O2, -O3, -Os, -Og, and -Ofast. Only use one of them in /etc/portage/make.conf.

With the sole exception of -O0, each of the -O settings activates several additional options. Be sure to read the GCC manual's chapter on optimization options to learn which options are activated at each -O level and what they do.

Let us examine each optimization level:


 * -O0: This level (that is the letter "O" followed by a zero) turns off optimization entirely and is the default if no -O level is specified in CFLAGS or CXXFLAGS. This reduces compilation time and can improve debugging info, but some applications will not work properly without optimization enabled. This option is not recommended except for debugging purposes.


 * -O1: the most basic optimization level. The compiler will try to produce faster, smaller code without taking much compilation time. It is basic, but it should get the job done all the time.


 * -O2: A step up from -O1. The recommended level of optimization unless the system has special needs. -O2 will activate a few more flags in addition to the ones activated by -O1. With -O2, the compiler will attempt to increase code performance without compromising on size, and without taking too much compilation time.


 * -O3: the highest level of optimization possible. It enables optimizations that are expensive in terms of compile time and memory usage. Compiling with -O3 is not a guaranteed way to improve performance, and in fact, in many cases, can slow down a system due to larger binaries and increased memory usage. -O3 is also known to break several packages. Using -O3 is not recommended.


 * -Os: optimizes code for size. It activates all -O2 options that do not increase the size of the generated code. It can be useful for machines that have extremely limited disk storage space and/or CPUs with small cache sizes.


 * -Og: This option was introduced in GCC 4.8. It addresses the need for fast compilation and a better debugging experience while still providing a reasonable level of runtime performance. Overall, the development experience with -Og should be better than with -O0. Note that -Og does not imply -g; it simply disables optimizations that may interfere with debugging.


 * -Ofast: This option was introduced in GCC 4.7. It consists of -O3 plus -ffast-math, -fno-protect-parens, and -fstack-arrays. -Ofast breaks strict standards compliance and is therefore not recommended.

As previously mentioned, -O2 is the recommended optimization level. If package compilation fails while not using -O2, try rebuilding with that option. As a fallback option, try setting the CFLAGS and CXXFLAGS to a lower optimization level, such as -O1 or even -O0 (for error reporting and checking for possible problems).

-pipe
A common flag is -pipe. This flag has no effect on the generated code, but it makes the compilation process faster. It tells the compiler to use pipes instead of temporary files during the different stages of compilation, which uses more memory. On systems with low memory, GCC might get killed. In those cases do not use this flag.

-fomit-frame-pointer
This is a very common flag designed to reduce generated code size. It is turned on at all levels of -O (except -O0) on architectures where doing so does not interfere with debugging (such as x86-64), but elsewhere it may need to be activated explicitly by adding it to the flags. The GCC manual does not list every architecture on which the -O option turns it on. It is still necessary to enable -fomit-frame-pointer explicitly on x86-32 with GCC up to version 4.6, or when using -Os on x86-32 with any version of GCC. However, using -fomit-frame-pointer will make debugging hard or impossible.

In particular, it makes troubleshooting applications written in Java much harder, though Java is not the only code affected by using this flag. So while the flag can help, it also makes debugging harder; backtraces in particular will be useless. When not doing software debugging and no other debugging-related CFLAGS such as -ggdb have been used, then try using -fomit-frame-pointer.

-msse, -msse2, -msse3, -mmmx, -m3dnow
These flags enable the Streaming SIMD Extensions (SSE), SSE2, SSE3, MMX, and 3DNow! instruction sets for x86 and x86-64 architectures. These are useful primarily in multimedia, gaming, and other floating point-intensive computing tasks, though they also contain several other mathematical enhancements. These instruction sets are found in more modern CPUs.

Normally none of these flags need to be added to /etc/portage/make.conf, as long as the system is using the correct -march (for example, -march=nocona implies -msse3). Some notable exceptions are newer VIA and AMD64 CPUs that support instructions not implied by -march (such as SSE3). For CPUs like these, additional flags will need to be enabled where appropriate after checking /proc/cpuinfo.
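A quick way to check whether the CPU reports a given instruction set on Linux is to look at the flags line of /proc/cpuinfo. The sketch below assumes an x86-style cpuinfo layout; the function name cpu_has_flag is made up for illustration. Note that the kernel lists SSE3 under the name "pni".

```shell
# Report whether a CPU feature flag is listed by the kernel.
# $1: flag name as spelled in /proc/cpuinfo (SSE3 appears there as "pni").
# $2: optional path to a cpuinfo-style file (defaults to /proc/cpuinfo).
cpu_has_flag() {
    grep -m1 '^flags' "${2:-/proc/cpuinfo}" 2>/dev/null | grep -qw -- "$1"
}

if cpu_has_flag pni; then
    echo "SSE3 is reported by the CPU"
else
    echo "SSE3 is not reported (or /proc/cpuinfo is unavailable)"
fi
```

The whole-word match (-w) keeps a query for "sse" from matching "sse2" or "ssse3".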

But I get better performance with -funroll-loops -fomg-optimize!
No, people only think they do because someone has convinced them that more flags are better. Aggressive flags will only hurt applications when used system-wide. Even the GCC manual says that using -funroll-loops and -funroll-all-loops will make code larger and run more slowly. Yet for some reason, these two flags, along with -ffast-math, -fforce-mem, -fforce-addr, and similar flags, continue to be very popular among ricers who want the biggest bragging rights.

The truth is that these are dangerously aggressive flags. Take a good look around the Gentoo Forums and Bugzilla to see what these flags do: nothing good!

These flags are not needed globally in CFLAGS or CXXFLAGS. They will only hurt performance. They might bring on the idea of having a high-performance system running on the bleeding edge, but they don't do anything but bloat the code and get bugs marked INVALID or WONTFIX.

These flags are not needed. Don't use them. Stick to the basics: -march, -O, and -pipe.

-O levels higher than 3
Some users boast about even better performance obtained by using -O4, -O9, and so on, but the reality is that -O levels higher than 3 have no effect. The compiler may accept CFLAGS like -O4, but it actually doesn't do anything with them. It only performs the optimizations for -O3, nothing more.

Need more proof? Examine the source code:

As can be seen, any value higher than 3 is treated as just -O3.
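The claim can also be checked empirically by compiling the same file at -O3 and at a higher level and comparing the generated assembly. This is only an experiment on one small function, not a proof; it assumes gcc is installed, and the function and file names are made up for illustration.

```shell
# Compile one file at -O3 and at -O<N>, then compare the assembly output.
# $1: the numeric level to compare against -O3 (for example 9).
# Returns 0 when the two assembly files are byte-identical.
same_as_O3() {
    dir=$(mktemp -d) || return 2
    printf 'int sum(const int *a, int n)\n{\n    int s = 0;\n    for (int i = 0; i < n; i++)\n        s += a[i];\n    return s;\n}\n' > "$dir/t.c"
    gcc -O3 -S -o "$dir/o3.s" "$dir/t.c" &&
    gcc "-O$1" -S -o "$dir/on.s" "$dir/t.c" &&
    cmp -s "$dir/o3.s" "$dir/on.s"
    status=$?
    rm -rf "$dir"
    return "$status"
}

if command -v gcc >/dev/null 2>&1; then
    if same_as_O3 9; then
        echo "-O9 produced the same assembly as -O3"
    else
        echo "-O9 and -O3 output differ (or compilation failed)"
    fi
fi
```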

What about compiling outside the target machine?
Some readers might wonder if compiling outside the target machine with a strictly inferior CPU or GCC sub-architecture will result in inferior optimization results (compared to a native compilation). The answer is simple: No. Regardless of the actual hardware on which the compilation takes place and the CHOST for which GCC was built, as long as the same arguments are used (except for -march=native) and the same version of GCC is used (although the minor version might be different), the resulting optimizations are strictly the same.

To exemplify, if Gentoo is installed on a machine whose GCC's CHOST is i686-pc-linux-gnu, and a Distcc server is set up on another computer whose GCC's CHOST is i486-linux-gnu, then there is no need to be afraid that the results would be less optimal because of the strictly inferior sub-architecture of the remote compiler and/or hardware. The result would be as optimized as a native build, as long as the same options are passed to both compilers (and the -march parameter doesn't get a native argument). In this particular case the target architecture needs to be specified explicitly as explained in Distcc and -march=native.

The only difference in behavior between two GCC versions built targeting different sub-architectures is the implicit default argument for the -march parameter, which is derived from the GCC's CHOST when not explicitly provided on the command line.
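To see which implicit default a particular GCC build uses, its target options can be queried without passing any -march. This is a sketch assuming a gcc that supports -Q --help=target; the function name default_march is made up, and the exact output layout varies between GCC versions.

```shell
# Print the -march line GCC reports when no -march is given on the command line.
default_march() {
    gcc -Q --help=target 2>/dev/null | grep -- '-march=' | head -n 1
}
```

Running the function on both the local machine and the remote compiler host shows whether their defaults differ, which is why an explicit -march matters in that setup.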

Redundant flags
Oftentimes CFLAGS and CXXFLAGS that are turned on at various -O levels are specified redundantly in /etc/portage/make.conf. Sometimes this is done out of ignorance, but it is also done to avoid flag filtering or flag replacing.

Flag filtering/replacing is done in many of the ebuilds in the Portage tree. It is usually done because packages fail to compile at certain -O levels, or when the source code is too sensitive for any additional flags to be used. The ebuild will either filter out some or all CFLAGS and CXXFLAGS, or it may replace -O with a different level.

The Gentoo Developer Manual explains in detail how flag filtering and replacing work and where they take place.

It is possible to get around the filtering of the options turned on at a certain -O level, such as -O3, in the following way:
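As an illustration of the pattern only, the redundant listing might look like the sketch below. The two extra flags are chosen as an assumption because the GCC manual lists them among the options that -O3 already enables.

```shell
# Discouraged pattern, shown only to illustrate what "redundant" means:
# -finline-functions and -funswitch-loops are already turned on by -O3,
# so they survive even if an ebuild replaces -O3 with a lower level.
CFLAGS="-O3 -finline-functions -funswitch-loops"
```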

However, this is not a smart thing to do. CFLAGS are filtered for a reason! When flags are filtered, it means that it is unsafe to build a package with those flags. Clearly, it is not safe to compile the whole system with -O3 if some of the flags turned on by that level will cause problems with certain packages. Therefore, don't try to "outsmart" the developers who maintain those packages. Trust the developers. Flag filtering and replacing is done to ensure stability of the system and application! If an ebuild specifies alternative flags, then don't try to get around it.

Building a package with unacceptable flags will most likely lead to further problems later on. When the user reports a problem on Bugzilla, the flags used in /etc/portage/make.conf will be readily visible to everyone, and developers will certainly ask for the package to be recompiled without the problematic flags. The trouble of recompiling can be avoided by not using redundant flags in this way in the first place. Do not assume to know better than the developers.

LDFLAGS
The Gentoo developers have already set basic, safe LDFLAGS in the base profiles, so they do not need to be changed.

Per-package flags
Information on how to use per-package environment variables (including CFLAGS) is described in the Gentoo Handbook, "Per-Package Environment Variables".

Resources
The following resources are helpful for further understanding optimization:


 * The GCC documentation


 * Gentoo Handbook - Configuring compile options


 * man make.conf


 * Wikipedia


 * The Gentoo Forums