With the release of Windows Server 2016, Microsoft has prepared the operating system for tectonic changes coming in system architectures, namely around how memory works in servers.
A new type of memory is creeping into the market. Commonly known as persistent memory, it threatens to shake up computer architecture in a fundamental way, eliminating the need to keep heaps of RAM on servers.
“I consider this a very disruptive technology,” explained Neal Christiansen, principal software development lead at Microsoft, speaking at the Intel Developer Forum, held in San Francisco. Christiansen is in charge of NTFS, the file system for Windows.
This new type of memory may cause some disruptions to how applications deal with storage. But, oh, will the performance gains be sweet — approaching two orders of magnitude, by Microsoft’s calculations.
In short, persistent memory — also known in various quarters as storage class memory, nonvolatile memory or byte-addressable memory — is non-volatile storage that operates at RAM-speed.
The existence of such a technology, at commercial prices, could destroy the traditional model of computing, the dominant form of computing for the past 50 years. In this model, a computer system is made up of a compute node, permanent disk storage, and a set of quickly accessible working memory into which programs are loaded.
But with permanent storage being just as quick as the working memory, why would you need working memory at all? Why not just access the data directly? Persistent memory has very low latency and high bandwidth because it sits directly on the memory bus.
To fully take advantage of the speeds persistent memory could offer, however, some changes would be needed in applications.
For Windows Server, the file system team wanted to support “zero copy access to persistent memory,” Christiansen said. But at the same time, they recognized that the vast majority of users would still want to run their applications unmodified. Most customers are not going to rewrite their applications to take advantage of this technology.
To that end, the team created a new type of data volume, called DAX (Direct Access). In this setup, the application is given direct access to byte-addressable storage, with no intervention from the OS whatsoever. The guiding principle for this architecture is that “system software gets in the way of performance,” Christiansen said.
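The zero-copy idea can be sketched with an ordinary memory-mapped file. This is a minimal illustration, not the Windows DAX API itself, and the file name and sizes are ours: once the file is mapped, reads and writes become plain loads and stores, with no read() or write() system calls. On a DAX volume, such a mapping would point straight at persistent memory rather than at a page-cache copy.

```python
# Illustrative sketch: byte-addressable access through a memory-mapped file.
# On a DAX volume, this kind of mapping goes straight to persistent memory,
# bypassing the storage stack. On an ordinary volume (as here), the same
# API goes through the page cache instead.
import mmap
import os
import tempfile

path = os.path.join(tempfile.gettempdir(), "dax_demo.bin")
with open(path, "wb") as f:
    f.write(b"\x00" * 4096)          # reserve one page of backing storage

with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 4096) as m:
        m[0:5] = b"hello"            # a plain store -- no write() syscall
        data = bytes(m[0:5])         # a plain load  -- no read() syscall

print(data)                          # b'hello'
os.remove(path)
```

The point of the sketch is the access pattern: the application touches bytes directly, which is why OS features that intercept the I/O path have nothing to hook into.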
But, going forward, Windows will also offer a new “block” mode, which offers some of the performance gains of persistent memory while preserving backward compatibility.
The choice is made at format time: the volume is set up either for DAX mode or for the backward-compatible block mode.
So developers may have to think about this decision carefully. Windows itself has no contact with the data in DAX mode, which means some typical features offered by the OS may not be available for DAX use. For instance, NTFS has built-in encryption, which can’t be used. Compression can’t be used either. With DAX, the OS “has no point where it can convert the data,” Christiansen said.
There is no paging I/O in the system, though DAX can still work with existing caching software.
Microsoft modified the cache manager so that when it creates a cache map to a DAX volume, it maps directly to the underlying hardware. So when an app does a cache read or cache write, the cache manager goes directly to the memory location. “There is no going down the storage stack,” he said. No paging reads or writes are generated at all.
One challenge the design team had to confront with this model was how to deal with file system filters, which are layered on top of the file system to provide additional capabilities, such as encryption, snapshots, activity monitoring, quota monitoring and the like. These filters are made both by Microsoft and third party vendors.
In order not to disturb the operations of these drivers, Microsoft created a new volume class, allowing those who manage the filters to update their drivers at their own pace. This means that when a DAX volume is mounted, these drivers will not know about it unless their creators update them.
The extra work that will need to be done is more than offset by the resulting performance gains, which, if they pan out as predicted, are mind-boggling. Today’s speedy NVMe SSDs deliver 14,553 IOPS, for a throughput of 56.85 MB/sec, as measured on 4K random writes on a single thread running on a single core. DAX on NVDIMM produces 1,112,000 IOPS, for a throughput of 3,343 MB/sec!
That’s a jump of nearly two orders of magnitude. Imagine your apps running nearly 100 times as fast as they do today, using pretty much the same CPU. The extra work might even be worth it.
But even if you went with block mode, you would still get an order of magnitude improvement in performance, without adjusting your app at all. Tomorrow’s gains in performance may not be coming from the CPUs after all but from memory.
Intel is a sponsor of The New Stack.
Images from Neal Christiansen’s IDF presentation.