Compiler Block Reordering and Memory Layout Optimization
GCC, when invoked with -freorder-blocks and an optimization level greater than 1, reorders instructions at the basic-block level. The main purpose of this optimization is to group correlated code together and so produce a cache-aware memory layout. Because of some Linux kernel hacking, I was forced to dig into the details of when and where GCC’s optimizations kick in. Nowadays the most effective approach for userland programs that lack branch-taken knowledge is profile-guided optimization. But this is not possible in every setup (lack of realistic input data, …)....
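Where no profile data is available, branch-taken knowledge can be supplied by hand. The following is a minimal sketch, assuming GCC's `__builtin_expect` builtin; the `likely`/`unlikely` macro names mirror Linux kernel convention, and the function `sum_checked` is a hypothetical example, not taken from any real code base. Marking the error branch as unlikely gives the block reordering pass a hint that the early return is cold, so it may be placed away from the hot loop body.

```c
#include <stddef.h>

/* Hypothetical convenience macros around GCC's __builtin_expect.
 * The !! normalizes any truthy value to exactly 0 or 1. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Sums all entries of values[0..count-1].
 * Returns -1 as soon as a negative entry is seen; this case is
 * assumed to be rare, which is what the hint tells the compiler. */
long sum_checked(const int *values, size_t count)
{
    long total = 0;

    for (size_t i = 0; i < count; i++) {
        if (unlikely(values[i] < 0)) {
            /* Cold path: a candidate for being moved out of the
             * straight-line hot code by block reordering. */
            return -1;
        }
        total += values[i]; /* Hot path: expected fall-through. */
    }
    return total;
}
```

Compiling this with `gcc -O2 -S` and comparing the assembly with and without the hint is one way to check whether the cold block was actually moved; the resulting layout depends on the GCC version and target, so the hint is a suggestion rather than a guarantee.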