Watch out for Compiler Optimizations

Moshe Beladev
4 min read · Jul 2, 2019

Every compiler, including Go’s, optimizes our code to some extent. The result is usually a faster, and sometimes smaller, binary.

However, I believe that sometimes you should take a look under the hood to understand how things work. It will surely step up your programming and debugging skills.

Recently I ran into some really strange behavior while benchmarking one of the functions I wrote:

0.31 ns/op is extremely fast, even for our super-efficient function 😆

Although it made me happy to think my code runs so fast, I suspected that something had gone wrong and that the benchmark was misleading, so I started digging for an explanation.

Let’s start with a simple feature: Function Inlining.

What Does It Mean?

The compiler takes our function’s code and substitutes it for the calls to that function, so each call site contains a copy of the function body instead of an actual call.
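As a rough illustration (the function names are made up for this sketch and are not from the article's code), inlining conceptually turns the first form into something like the second. The real transformation happens inside the compiler, not in the source:

```go
package main

import "fmt"

// square is short and simple, so the compiler is likely to inline it.
func square(x int) int {
	return x * x
}

// sumOfSquares is what we write: two ordinary calls to square.
func sumOfSquares(a, b int) int {
	return square(a) + square(b)
}

// sumOfSquaresInlined is roughly what the compiler works with after
// inlining: each call has been replaced by the body of square.
func sumOfSquaresInlined(a, b int) int {
	return a*a + b*b
}

func main() {
	fmt.Println(sumOfSquares(3, 4), sumOfSquaresInlined(3, 4)) // prints "25 25"
}
```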

Why Inline At All?

A function call carries the overhead of creating a new stack frame, which generally includes:

  • Return address
  • Argument variables passed on the stack
  • Local variables
  • Modified registers

All of the above add extra operations to our program’s execution.

To make our application even faster, the compiler comes to the rescue and changes the code to include the function’s content at each call site.

There are always strings attached, though: in our case, inlining produces a larger binary, which has its own drawbacks.

Hence, most compilers define a threshold (which you can play with) that determines whether or not to inline a function.

From Go Wiki:

Only short and simple functions are inlined. To be inlined a function must contain less than ~40 expressions and does not contain complex things like function calls, loops, labels, closures, panic’s, recover’s, select’s, switch’es, etc.

Talk Is Cheap, Show Me The Code!

First, let’s implement a really cool and efficient sine approximation called Bhaskara I’s:

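Passing the -m option through -gcflags to go build (for example, go build -gcflags '-m' main.go) reveals some of the compiler’s internal decisions, including which functions it inlines: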

I found many useful Go tools tricks here

Cool, the compiler inlined our function. Now we can dive deeper and inspect the generated assembly using the -S flag to see how our functions are translated, and disable compiler optimizations using the -N flag to reduce uncertainty.

$ go build -gcflags '-S -N' main.go 2>&1

You can view the full output in your terminal, but for the sake of readability I’ll filter it a bit so we only look at the parts that matter to us:

We can see part of the disassembled main function’s code

The command output is not final machine code. For example, we can see the FUNCDATA and PCDATA directives, which the compiler emits as information for the garbage collector. For our purpose, we can simply ignore them.

To understand how the optimizations affect our code, I’ll redo the previous step without disabling them (that is, without the -N flag). Let’s see what happens:

The operations that were in the main function vanished and only an immediate return remained

If you already checked the output, you probably noticed something strange: our main function is empty!

Why Does This Happen?

The compiler sees that the call has no side effects (e.g., it neither calls any other function nor changes a global variable) and that its result is never used, so after inlining it removes the whole computation as dead code.

Let’s change the code and add a global variable to hold the function’s result:

A global variable was added to hold the function’s result

The compiled main function will no longer be empty:

The main function is not empty anymore!

Where Can We Face It?

A common place where unwanted compiler optimization shows up is benchmark tests.

To prevent the unwanted optimization (which could compromise our function’s performance measurement), one can disable optimizations when running the tests, but in my opinion that does not reflect reality.

I would like to show you another possible solution:

(1) Assigning the return value to a local variable prevents the compiler from eliminating the sin function call.
(2) Storing that value in a package-level variable prevents the compiler from eliminating the body of the BenchmarkSin function itself.

Summary

We started by playing a little with the Go build tools to understand a simple optimization: function inlining. Then we dived a little deeper to understand the compiled code and the effect of the optimization flags, and finished with a real-world example we may face in our next benchmark tests.

I hope you now have a better understanding of compiler optimizations and some of the Go build tools, but most importantly the curiosity to explore other internals.

Read more:

[1] https://github.com/golang/go/wiki/CompilerOptimizations

[2] https://dave.cheney.net/high-performance-go-workshop/

[3] https://stackoverflow.com/a/36975497
