Every earnings season we realize how little is known about the competitive positions of the big cloud vendors. The vendors themselves and various third parties offer plenty of stats and data points, but how many are truly believable?
It is universally agreed that Amazon Web Services (AWS) is the leader, but by how much? Based on our research, Microsoft and Rackspace come in second and third, respectively, in terms of paying customers. Everyone else is far behind.
Microsoft doesn’t break out its cloud revenue between Azure and Office 365, nor does Alphabet (Google’s parent company) distinguish between Google Cloud Platform and other cloud apps (now collectively called G Suite). Until this happens, most news about cloud market share is speculation or cloud-washing.
We often rely on surveys to determine competitive positioning, but surveys of cloud providers are usually flawed because 1) the samples are skewed, 2) the definitions of cloud types (Infrastructure-as-a-Service, Software-as-a-Service, Platform-as-a-Service) are rarely clear or delineated, and 3) details about breadth and depth of use are limited. In the last few weeks, we spoke to sharp market forecasters like Ed Anderson at Gartner and Greg Zwakman of 451 Research. They are creating more reliable bottom-up models, but the research is shared privately with investors willing to pay dearly for it.
So, public reporting, surveys and market forecasts all have their problems. Another approach is to use machine learning and web scraping techniques. For example, HG Data has amassed a huge data set about installed technologies at companies around the world. It collects information from online sources such as case studies, blog posts and press releases, but what makes HG Data unique is that it also combs through offline documents, such as contracts, insurance on large purchases and other transactions that commonly occur when enterprises make larger purchases. Next, the unstructured data is curated with machine learning techniques. Finally, the information is compiled by real, live humans. We used a free version of HG Data’s Discovery tool to identify how many companies are using the leading IaaS cloud providers. Due to the nature of the data, many technologies are not indexed. That being said, we identified seven of the leading cloud offerings.
As with all data research, there are many reasons to be skeptical. The reader should remember that the methodology does not look at the extent of corporations’ cloud spending, and it is backward-looking rather than forward-looking. Also, although we use the term “cloud,” we mean infrastructure service providers. Furthermore, the algorithms that control the data collection and processing are still maturing and were not created for this particular purpose.
While AWS is the leader with 148,044 customers, Microsoft’s Azure is a strong second with 90,827 companies using it. This jibes with the findings from a Morgan Stanley CIO Survey because both look at the breadth, not the depth, of cloud use. In other words, HG Data and Morgan Stanley do not distinguish between $10/month web hosting deals and multi-million dollar contracts.
At first glance, the figure of 51,489 companies using Rackspace is impressive. Interestingly, much of the survey research we’ve seen recently has Rackspace doing much worse, possibly because its managed hosting services are not always considered “cloud.” This, plus the fact that it helps manage other companies’ cloud offerings, makes Rackspace unique. Looking back at this summer’s private equity deal to take Rackspace private, perhaps the financiers saw the number of customers as an undervalued asset. Alternatively, Rackspace may just be milking its previous position. Since the data is backward-looking, it is possible that much of the reported Rackspace use is associated with old and/or expiring contracts.
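To put the "by how much?" question in concrete terms, the arithmetic behind the three company counts quoted so far can be sketched as follows. This is only an illustration: the function name `leader_ratios` is our own, and the ratios describe breadth of use (number of companies), not spending or market share.

```python
# Company counts quoted in this article (breadth of use, not spending).
company_counts = {
    "AWS": 148_044,
    "Microsoft Azure": 90_827,
    "Rackspace": 51_489,
}


def leader_ratios(counts):
    """Return, for each provider, how many times larger the leader's count is."""
    leader = max(counts.values())
    return {name: leader / n for name, n in counts.items()}


ratios = leader_ratios(company_counts)

# Print providers from closest to the leader to furthest behind.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
    print(f"{name}: the leader has {ratio:.2f}x as many companies")
```

Because a $10/month hosting deal and a multi-million dollar contract each count as one company, these ratios say nothing about relative revenue.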
Based on the data, IBM SoftLayer is growing rapidly, with almost eleven thousand companies using it. SoftLayer was bought by IBM a few years ago and is gradually being integrated into the company’s overall cloud portfolio. Since IBM has several other cloud offerings, it is often difficult to identify how much each is used for cloud.
Salesforce.com’s Heroku, Google Cloud Platform and DigitalOcean round out our list. As with Rackspace, there is a decent chance that Heroku’s prominence is based on the fact that it had a substantial market presence a few years ago. DigitalOcean’s 163 percent growth in companies using it can be attributed to popularity among developers. Google Cloud Platform is also popular among developers, but conventional wisdom is that Google is hampered by an immature enterprise sales operation. It is possible that small, developer-initiated use cases are not being captured for these companies, but that same dynamic is also present for AWS.
Perspectives on the Methodology
- The data should not be used to make assumptions about market size.
- The documents mined by HG Data are all backward-looking. They tell a story about what happened in the past. Furthermore, many smaller purchases may fall through the cracks and not be tracked. This can lead to substantial mismeasurement. For example, the data says that almost a hundred thousand companies are still using Windows XP even though support for the product ended several years ago.
- HG Data is not the only company that uses artificial intelligence to create a data-driven picture of a market. This spring, Aman Naimat wrote The Big Data Market using a similar approach. Demandbase, the company that bought his startup, is similar to HG Data in that both believe their data is valuable to companies using account-based marketing.
- Traditionally, IaaS includes compute, storage and networking services. In recent years, most analysts exclude SaaS from their cloud market reports. Although PaaS and infrastructure software as a service are important categories, they are also excluded from this analysis.
- OpenStack providers, VMware and Oracle were notably excluded from this analysis. After this article was written, we double-checked and found that VMware Cloud use was identified at 3,624 companies. Oracle Cloud wasn’t even included in HG Data’s database.
- Looking forward, many observers believe that cloud services are where the market is headed, but that’s a topic for another day…
DigitalOcean and IBM are sponsors of The New Stack.
Feature image via Pixabay.