Sometimes, compilers treat certain function implementations specially. Put simply, they replace the default implementation with another, possibly optimized, one. Such functions are known as intrinsic functions in compiler theory.
In this article, we’ll walk through a few examples to see how intrinsic functions work in the HotSpot JVM.
A Tale of Two Logs
The Math.log() method in Java computes the natural logarithm of any given number. Typical high school stuff, nothing fancy! Here’s what the implementation of this method looks like in OpenJDK:
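For reference, here is the method as it appears in recent OpenJDK sources:

```java
@IntrinsicCandidate
public static double log(double a) {
    return StrictMath.log(a); // default impl. delegates to StrictMath
}
```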
As shown above, the Math.log() method itself calls another method, StrictMath.log(), under the hood. Despite this delegation, we usually tend to call Math.log() instead of the stricter and more direct one!
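Functionally, the two calls are interchangeable. The little program below (the class name is mine, not from the article) checks that both agree to within the small tolerance the Math.log() contract allows:

```java
public class LogComparison {
    public static void main(String[] args) {
        double x = 2.5;
        double viaMath = Math.log(x);         // may be intrinsified at runtime
        double viaStrict = StrictMath.log(x); // always the fdlibm algorithm
        // Math.log() only promises a result within 1 ulp of the exactly
        // rounded value, so compare with a tolerance instead of ==.
        System.out.println(Math.abs(viaMath - viaStrict) <= 2 * Math.ulp(viaStrict));
    }
}
```

This prints true on any compliant JVM, whether or not the intrinsic kicks in.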
Despite Donald Knuth’s best efforts to warn us about premature optimization, one might propose using the StrictMath implementation directly, mainly to avoid the unnecessary indirection and to be more sympathetic to the underlying mechanics!
Well, we all know that when the Math.log() method gets hot enough (i.e. is called frequently enough), the HotSpot JVM will inline this delegation. Therefore, it’s only natural to expect both method calls to exhibit similar performance characteristics, at least when performance matters!
To prove this hypothesis, let’s conduct a simple benchmark comparing the two implementations:
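A minimal JMH benchmark along these lines might look like the following sketch (class and method names are mine, not necessarily the article’s originals; it needs the JMH dependency on the classpath):

```java
import java.util.concurrent.ThreadLocalRandom;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class LogBenchmark {

    private double value;

    @Setup
    public void setup() {
        // a fresh random input, so the JIT can't constant-fold the log call
        value = ThreadLocalRandom.current().nextDouble();
    }

    @Benchmark
    public double indirectLog() {
        return Math.log(value);
    }

    @Benchmark
    public double directLog() {
        return StrictMath.log(value);
    }
}
```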
The result should be quite predictable, right?
The Observer Effect
If we package the benchmark and run the following command:
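With the standard JMH Maven setup, that would be something like the following (the benchmarks.jar uber-jar name is the JMH archetype’s default, an assumption on my part):

```shell
$ mvn clean package
$ java -jar target/benchmarks.jar
```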
After a while, JMH will print the benchmark results.
We didn’t see that coming, did we?! The indirect Math.log() implementation outperforms the direct and supposedly more performant implementation by almost 105% in terms of throughput!
Let’s take a closer look at the Math.log() implementation once again, just to make sure we didn’t miss something there.
The delegation exists, for sure. Quite interestingly, there is also a @IntrinsicCandidate annotation on the method. Before going any further, it’s worth mentioning that before Java 16, the same method looked like this:
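```java
@HotSpotIntrinsicCandidate
public static double log(double a) {
    return StrictMath.log(a); // default impl. delegates to StrictMath
}
```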
So basically, as of Java 16, jdk.internal.HotSpotIntrinsicCandidate was repackaged and renamed to jdk.internal.vm.annotation.IntrinsicCandidate.
Anyway, the @IntrinsicCandidate annotation may reveal the actual reason behind this shocking benchmark result. Let’s take a peek at the annotation’s Javadoc:
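Here’s the relevant part of that Javadoc (abridged):

```java
/**
 * The {@code @IntrinsicCandidate} annotation is specific to the
 * HotSpot Virtual Machine. It indicates that an annotated method
 * may be (but is not guaranteed to be) intrinsified by the HotSpot VM.
 * A method is intrinsified if the HotSpot VM replaces the annotated
 * method with hand-written assembly and/or hand-written compiler IR
 * -- a compiler intrinsic -- to improve performance.
 */
```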
Well, based on this, the HotSpot JVM may replace the Math.log() Java implementation with a possibly more efficient compiler intrinsic to improve the performance.
Down the Rabbit Hole
As it turns out, there actually is an intrinsic for the Math.log() method!
The HotSpot JVM defines all its intrinsics in the vmIntrinsics.hpp file¹. In HotSpot, there are two types of intrinsics:
Library intrinsics: These are typical compiler intrinsics, as the JVM replaces their method implementations entirely.
Bytecode intrinsics: These methods won’t be replaced, but they do receive special treatment.
The HotSpot JVM source code documents these two types in the same file. Right after that documentation, it lists all the possible VM intrinsics one after another. For instance:
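The Math entries look roughly like this (an abridged excerpt; the exact neighbors and alignment vary across JDK versions):

```cpp
do_intrinsic(_dsin,  java_lang_Math,  sin_name,  double_double_signature,  F_S)  \
do_intrinsic(_dcos,  java_lang_Math,  cos_name,  double_double_signature,  F_S)  \
do_intrinsic(_dlog,  java_lang_Math,  log_name,  double_double_signature,  F_S)  \
```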
As shown by the last line, there is indeed an intrinsic replacement for Math.log(). On x86-64 architectures, for instance, a hand-written stub routine replaces the Java implementation.
The vmIntrinsics.hpp file only declares that certain methods may have intrinsic implementations. The actual intrinsic routine is provided elsewhere and usually depends on the underlying architecture. In the above example, src/hotspot/cpu/x86/stubGenerator_x86_64.cpp is responsible for providing the actual intrinsic for the 64-bit x86 architecture.
In addition to being architecture-specific, intrinsics can be disabled. Therefore, the JVM compiler (C1 or C2) checks two conditions before applying an intrinsic. Basically, an intrinsic is available if:
The intrinsic is enabled, usually by using a tunable flag.
The underlying platform supports the intrinsic.
Let’s see more about those tunables.
Similar to many other aspects of the JVM, we can control the intrinsics to some extent using tunable flags.
For starters, the combination of -XX:+UnlockDiagnosticVMOptions and -XX:+PrintIntrinsics makes HotSpot print every intrinsic as it introduces it. For instance, if we run the same benchmark with these flags, we will see a lot of Math.log()-related log lines:
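Such a run could look like this (assuming the benchmark uber-jar from before; the PrintIntrinsics output itself is omitted here):

```shell
$ java -XX:+UnlockDiagnosticVMOptions -XX:+PrintIntrinsics -jar target/benchmarks.jar
```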
Also, we can disable all the Math-related intrinsics using the -XX:-InlineMathNatives tunable:
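Assuming the same benchmark jar, the run would look like:

```shell
$ java -XX:-InlineMathNatives -jar target/benchmarks.jar
```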
This time, since the JVM no longer applies the intrinsic for Math.log(), the two throughputs come out almost the same!
Using a simple grep, as always, we can see all the tunables related to a particular subject:
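One way to do that is to grep the final flag values (the flags here are real HotSpot flags; -XX:+UnlockDiagnosticVMOptions exposes the diagnostic ones):

```shell
$ java -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal -version | grep -i intrinsic
```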
And, one more thing:
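For even finer-grained control, the diagnostic DisableIntrinsic flag turns off a single intrinsic by its ID, e.g. the _dlog ID from the vmIntrinsics.hpp listing (the jar path is, again, an assumption):

```shell
$ java -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_dlog -jar target/benchmarks.jar
```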
In this article, we saw how the JVM may replace some critical Java methods with more efficient implementations at runtime.
Of course, the JVM compiler is a complex piece of software. Therefore, covering all the details related to intrinsics is both beyond the scope of this article and certainly beyond the writer’s knowledge. However, I hope this serves as a good starting point for the curious!
As always, the source code is available on GitHub!
1. Over the years, the file responsible for declaring the VM intrinsics has changed. For instance, before the vmIntrinsics.hpp, the vmSymbols.hpp was the home for all intrinsics.
2. The cover image is from lls-ceilap on Quantum Observer Effect.