
What Tens of Millions of VMs Reveal about the State of Java

12 Mar 2020 2:33pm, by Ben Evans

New Relic sponsored this post.

Ben Evans
Ben is a true Java guru who works from New Relic’s European development center in Barcelona. He is also a Java Champion, a three-time JavaOne Rock Star speaker and a prolific author of Java books, including “The Well-Grounded Java Developer,” and the lead author of “Optimizing Java” and the new edition of “Java in a Nutshell.” He spent six years as a member of the Java Community Process Executive Committee (aka the JCP EC), helping define standards for the Java ecosystem.

The modern software industry is vast, and there’s no shortage of programming languages to choose from. However, even within a single technology stack, such as the Java ecosystem, it can be difficult to draw useful conclusions about its market. Java is incredibly successful, and it is present in almost every major industry and economic sector — this, in part, is what makes it so difficult to come to a single declarative point of view about the state of the Java ecosystem.

But that doesn’t mean we can’t try to assess the state of the world.

Every day, tens of millions of Java virtual machines (JVMs) share their data with New Relic. To create this report, we anonymized and deliberately coarse-grained that data to give some broad overviews of the Java ecosystem as we see it. We also left out any detailed information that could help attackers and other malicious parties.

Our goal for these observations is to provide some new context and insights about the state of the Java ecosystem today. With that said, we looked at the following categories:

  • Which Java versions are used in production.
  • The most popular vendors.
  • The most used garbage collection algorithms.
  • The most common heap size configurations.

Java 8 Is Still the Standard — for Now

Let’s start with the one question Java developers are always curious about: Which versions are most used in production environments? Consider the following table:

Java version % in use
14 0.00
13 0.32
12 0.17
11 11.11
10 0.48
9 0.18
8 Current 42.02
8 Lagging 38.63
8 Vulnerable 3.83
7 2.54
pre-7 0.73
Non-LTS (combined) 1.14

Note: We split the Java 8 result into three parts:

  • Current: Recently updated and not vulnerable to any major recent CVEs.
  • Lagging: Has potential significant risks associated with the age of the Java versions.
  • Vulnerable: Likely to be a source of serious concern for teams running these versions.

As you can see, Java 11, a long-term support (LTS) release, is slowly increasing in popularity, but the market still seems hesitant to move on from Java 8 (also LTS): the three Java 8 rows together account for 84.48% of JVMs, against 11.11% for Java 11. Of note is the lack of adoption of non-LTS releases. Java 7 still shows over twice as much usage (2.54%) as all post-Java 8 non-LTS releases combined (1.14%).
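To see where a given runtime falls in the table above, the major version can be read from the standard java.version system property. The following is a minimal sketch, not how New Relic collects its data; the class name JavaMajorVersion and the method parseMajor are hypothetical names chosen for illustration. It assumes the two common version string shapes: the legacy "1.8.0_241" style and the newer "11.0.6" style.

```java
// Sketch: derive the major Java version from the "java.version" system property.
// JavaMajorVersion and parseMajor are illustrative names, not part of any library.
final class JavaMajorVersion {

    // Handles the legacy scheme ("1.8.0_241" -> 8) and the scheme used
    // since Java 9 ("11.0.6" -> 11, "14-ea" -> 14).
    static int parseMajor(String version) {
        String[] parts = version.split("[._+-]");
        int first = Integer.parseInt(parts[0]);
        // Legacy version strings start with "1." and carry the major number second.
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        System.out.println("java.version = " + version
                + " (major " + parseMajor(version) + ")");
    }
}
```

Telling a "Current" Java 8 from a "Lagging" or "Vulnerable" one would additionally require comparing the update number against a list of known-patched releases, which this sketch does not attempt.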

The Rise of Non-Oracle Vendors

Another major dynamic we’ve observed over the last year is an increasing acceptance of non-Oracle Java vendors in the community.

Vendor % in use
Oracle 74.78
AdoptOpenJDK 7.06
IcedTea 5.30
Azul 2.96
IBM 2.37
Amazon 2.18
Unknown 1.96
Pivotal 1.40
SAP 0.74
Sun 0.58
Debian 0.54
Other 0.10

Oracle now accounts for only about 75% of the Java market. The community-led AdoptOpenJDK is the second most popular vendor. Our historical trending data (which we’ve not released, as it’s based on a significantly smaller sample size than the main dataset) indicates that AdoptOpenJDK has been gaining significantly in popularity, month-over-month.

Of particular note is that within the population of AdoptOpenJDK VMs reporting to New Relic, almost one-third (33.19%) are Java 11. This represents a much higher rate of usage of Java 11 among AdoptOpenJDK users than in the general population.

Note: In the interests of full disclosure, New Relic is a sponsor of the AdoptOpenJDK project and is contributing engineering time to that project.

How ‘Garbage Collection’ Algorithms Fare

Because of the role it plays in memory management, Java garbage collection is a topic of endless interest in the community. According to our dataset, the various garbage collection algorithms have the following market share:

GC algorithm % in use
Parallel 57.77
G1 24.99
CMS 17.20
ZGC 0.04
Shenandoah <0.01

Broadly, these choices reflect the default collector in use on different Java versions. However, when we facet by JVM version, some interesting results emerge:

  • CMS is more popular than G1 on Java 8 (14.56% vs. 12.59%).
  • CMS is more popular than Parallel on Java 11 (3.96% vs. 0.20%).
  • CMS is more than 35x more popular than ZGC on Java 11.
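To find out which collector a particular JVM is actually running, one option is to look at the names of its garbage collector MXBeans. The sketch below maps commonly seen HotSpot bean names onto the algorithm families in the table above. The class name GcDetector and the method classify are hypothetical, and the name matching is an assumption based on typical HotSpot builds; other JVMs may report different names, in which case the result is "Unknown".

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Sketch: guess the GC algorithm family from collector MXBean names.
// GcDetector and classify are illustrative names, not part of any library.
final class GcDetector {

    static String classify(List<String> beanNames) {
        for (String name : beanNames) {
            if (name.startsWith("G1 ")) return "G1";                 // "G1 Young Generation", "G1 Old Generation"
            if (name.startsWith("PS ")) return "Parallel";           // "PS Scavenge", "PS MarkSweep"
            if (name.equals("ConcurrentMarkSweep")) return "CMS";    // usually paired with "ParNew"
            if (name.startsWith("ZGC")) return "ZGC";
            if (name.startsWith("Shenandoah")) return "Shenandoah";
            if (name.equals("MarkSweepCompact")) return "Serial";    // usually paired with "Copy"
        }
        // Names that match none of the patterns above are not guessed at.
        return "Unknown";
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        System.out.println(names + " -> " + classify(names));
    }
}
```

Running the class with different collector flags (for example -XX:+UseG1GC) is a quick way to check whether the collector in use is the one the startup flags were meant to select.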

Checking in on Heap Configs

No discussion of garbage collection and memory management in Java is complete without looking at heap size configs. Heap size configs are defined by a pair of values: the heap minimum and maximum (typically referred to as Xms and Xmx, after the -Xms and -Xmx flags that set them). The following table lists the top 30 most common heap sizes based on our data, which we’ve normalized to MB for ease of understanding.

Xms Xmx % set
2048MB 2048MB 8.84
512MB 512MB 8.74
1024MB 1024MB 5.76
4096MB 4096MB 2.83
(not set) 1024MB 2.60
819MB 819MB 2.59
8192MB 8192MB 2.55
(not set) 512MB 2.40
2340MB 2340MB 2.19
256MB 512MB 2.17
64MB 256MB 2.11
(not set) 2048MB 2.06
(not set) 3072MB 2.02
(not set) 4096MB 1.77
6144MB 6144MB 1.61
3072MB 3072MB 1.55
512MB 1024MB 1.54
1024MB 2048MB 1.50
256MB 1024MB 1.38
492MB 492MB 1.36
2028MB 2028MB 1.20
(not set) 256MB 1.14
96MB 1024MB 0.89
10240MB 10240MB 0.84
256MB 256MB 0.79
512MB 2048MB 0.78
120MB 256MB 0.77
768MB 768MB 0.63
16384MB 16384MB 0.63
5120MB 5120MB 0.63

Surprisingly, this indicates that JVM heap sizes remain relatively small — which seems to be in contrast with the drive to produce algorithms that cater to larger and larger heaps.

In particular, heaps that could ever grow to 16GB or more (i.e., set Xmx >= 16GB) account for only 3.3% of the overall total.

The continued appearance of the “pinned heap” flag combination — where Xms and Xmx have the same value — was another major surprise. Our data shows that 33.48% of JVMs still run with this combination.

In very early versions of the adaptive-sizing algorithms, there may have been some advantage to running with this combination, but for modern workloads, it’s almost always counterproductive. If you set this combination, the JVM is constrained in how it can resize and shape the heap, making it less responsive to sudden changes in traffic behavior or request rate.

If this combination is present in your runtimes, you may want to run some tests to see if you can remove it for better garbage collection performance.
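As a starting point for such a check, a JVM can inspect its own startup arguments and report whether the heap is pinned. The following is a rough sketch under stated assumptions: the class name HeapFlagCheck and its methods are hypothetical, only the -Xms and -Xmx spellings are handled (not the equivalent -XX:InitialHeapSize and -XX:MaxHeapSize options), and when a flag is repeated the last occurrence is assumed to be the one that takes effect.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Sketch: detect the "pinned heap" combination (-Xms equal to -Xmx).
// HeapFlagCheck and its methods are illustrative names, not part of any library.
final class HeapFlagCheck {

    // Parses a size such as "512m", "2g", "819M" or "1048576" into bytes.
    static long parseSize(String value) {
        char unit = Character.toLowerCase(value.charAt(value.length() - 1));
        long multiplier;
        switch (unit) {
            case 'k': multiplier = 1024L; break;
            case 'm': multiplier = 1024L * 1024; break;
            case 'g': multiplier = 1024L * 1024 * 1024; break;
            default: return Long.parseLong(value); // no suffix: plain bytes
        }
        return Long.parseLong(value.substring(0, value.length() - 1)) * multiplier;
    }

    // Size in bytes for the last argument starting with the prefix, or -1 if absent.
    // Assumption: the last occurrence of a repeated flag is the effective one.
    static long lastSize(List<String> jvmArgs, String prefix) {
        long size = -1;
        for (String arg : jvmArgs) {
            if (arg.startsWith(prefix)) {
                size = parseSize(arg.substring(prefix.length()));
            }
        }
        return size;
    }

    static boolean isPinned(List<String> jvmArgs) {
        long xms = lastSize(jvmArgs, "-Xms");
        long xmx = lastSize(jvmArgs, "-Xmx");
        return xms > 0 && xms == xmx;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("Pinned heap:   " + isPinned(jvmArgs));
    }
}
```

Note that the comparison is made in bytes, so -Xms2048m and -Xmx2g count as the same value even though they are spelled differently.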

Some Random, but Interesting, Stuff

To wrap up, here are five fun stats we observed:

  1. 7.35% of Java 8 JVMs run with deprecated flags (especially MaxPermSize).
  2. 6.78% of all JVMs run with experimental flags enabled.
  3. 8.07% of JVMs have repeated flags that appear more than once in the startup string.
  4. 2.54% of JVMs have “mismatched flags” that say contradictory things; for example, flags that select both the Parallel and G1 collectors.
  5. 2.59% of JVMs set a max heap size of 819MB. This is almost certainly a typo for 8192MB (i.e., 8GB). Check your configs carefully — cut-and-paste configs are dangerous.
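Several of the problems in the list above can be spotted mechanically from the JVM's own argument list. The sketch below looks for three of them: repeated flags, more than one collector explicitly enabled, and the PermGen sizing flags that Java 8 no longer honors. The class name FlagLint and its methods are hypothetical, the set of collector flags is limited to ones we are confident exist in HotSpot, and the flag name normalization is deliberately simple, so treat the output as a hint rather than a verdict.

```java
import java.lang.management.ManagementFactory;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Sketch: simple lint checks over JVM startup arguments.
// FlagLint and its methods are illustrative names, not part of any library.
final class FlagLint {

    private static final Set<String> COLLECTOR_FLAGS = new HashSet<>(Arrays.asList(
            "UseSerialGC", "UseParallelGC", "UseConcMarkSweepGC",
            "UseG1GC", "UseZGC", "UseShenandoahGC"));

    // Reduces an argument to its flag name, dropping the value or the +/- switch.
    static String flagName(String arg) {
        if (arg.startsWith("-XX:")) {
            String body = arg.substring(4);
            if (body.startsWith("+") || body.startsWith("-")) {
                return "-XX:" + body.substring(1);
            }
            int eq = body.indexOf('=');
            return "-XX:" + (eq >= 0 ? body.substring(0, eq) : body);
        }
        if (arg.startsWith("-Xms") || arg.startsWith("-Xmx") || arg.startsWith("-Xss")) {
            return arg.substring(0, 4);
        }
        return arg; // anything else is compared verbatim
    }

    // Flag names that appear more than once in the argument list.
    static Set<String> repeatedFlags(List<String> jvmArgs) {
        Set<String> seen = new HashSet<>();
        Set<String> repeated = new TreeSet<>();
        for (String arg : jvmArgs) {
            String name = flagName(arg);
            if (!seen.add(name)) {
                repeated.add(name);
            }
        }
        return repeated;
    }

    // Collectors explicitly switched on; more than one entry suggests a mismatch.
    static Set<String> enabledCollectors(List<String> jvmArgs) {
        Set<String> enabled = new TreeSet<>();
        for (String arg : jvmArgs) {
            if (arg.startsWith("-XX:+") && COLLECTOR_FLAGS.contains(arg.substring(5))) {
                enabled.add(arg.substring(5));
            }
        }
        return enabled;
    }

    // True if PermGen sizing flags, which Java 8 ignores, are still present.
    static boolean usesPermGenFlags(List<String> jvmArgs) {
        for (String arg : jvmArgs) {
            String name = flagName(arg);
            if (name.equals("-XX:MaxPermSize") || name.equals("-XX:PermSize")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Repeated flags:     " + repeatedFlags(jvmArgs));
        System.out.println("Enabled collectors: " + enabledCollectors(jvmArgs));
        System.out.println("PermGen flags:      " + usesPermGenFlags(jvmArgs));
    }
}
```

A check like this fits naturally into a startup log line or a deployment test, where a copied config with a duplicated or contradictory flag is cheapest to catch.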

Conclusion

The primary bias in this report is that we only saw data reported to New Relic. This is in no way a completely accurate representation of the Java market, and we recognize that there are selection and other less-obvious biases in our data. Additionally, we realize that localized trends can cause significant small-scale variation from what we’ve presented. For example, Java teams in specific industries (like healthcare or financial services) typically operate under strict regulations that prevent them from moving between versions in a timely manner.

However, we do see real-time data from millions of JVMs every day, and that ever-changing stream of data represents a proxy for the Java market as a whole.

We’re presenting these numbers to the Java community with the hope of contributing in a positive way to the fascinating ongoing conversation about the trajectory of Java as a whole. In no way is it our intent to claim that “we have all the answers” or to denigrate the work of others. This is emphatically a shared journey.

Every New Relic customer has access — at no additional cost — to this level of detail for their production environments. If you’re a New Relic customer (or if you’d like to be) and want to see these kinds of insights, get in touch with your New Relic contacts to see how to query this data or build your own dashboard to track it.

We’ll be more than happy to assist you to observe your systems on your terms, as you continue on the path of producing more perfect software.

Oracle is a sponsor of The New Stack.

Feature image by B. Cameron Gain.
