
Microsoft Prepares SQL Server 2017 for Linux and Containers

18 Jul 2017 2:00am

The first release candidate of Microsoft’s SQL Server 2017 is available this week, adding a handful of smaller updates to the major new features in this release, which comes a little more than a year after SQL Server 2016. The best-known new feature is support for Linux (RHEL, SUSE Linux Enterprise Server and Ubuntu) and for containers running on Windows, Linux and macOS. That includes Always On availability groups for high availability, integrated with native Linux clustering tools like Pacemaker.

RC1 adds support for Microsoft’s Active Directory authentication, so Windows or Linux clients can connect to SQL Server on Linux using domain credentials. It also adds support for Transport Layer Security (TLS 1.0, 1.1 or 1.2) to encrypt data transmitted from client applications to SQL Server on Linux.

Machine learning is also a focus, program manager Tony Petrossian told The New Stack. SQL Server 2017 can run in-database analytics using R or Python, without needing to extract and transform data to work with it.

“To add support for R AI and machine learning workloads in SQL Server 2017, we built an extensibility model,” he explained, “so we can execute the R runtime with SQL server on a fast path of data exchange between the R environment and SQL. That means you can execute R script as part of your code but with that extensibility enabled the additional work to enable Python was pretty small.”

Not only did that prove that the extensibility model is flexible, but it also means “we get out of the way of any argument between data scientists about the supremacy of R versus Python; we’ll enable both.” RC1 also adds native scoring and external library management to R Services on Windows Server.
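As a rough illustration of what running Python in the database looks like, the sketch below builds a T-SQL batch around the `sp_execute_external_script` stored procedure, which SQL Server uses to run external R or Python scripts. Inside the script, `InputDataSet` and `OutputDataSet` are the data frames SQL Server passes in and reads back. The helper name `build_external_script_call` and the `dbo.loans` table are hypothetical, invented for this example, and the sketch assumes external scripts have already been enabled on the server.

```python
import textwrap


def build_external_script_call(python_body: str, input_query: str) -> str:
    """Return a T-SQL batch that runs python_body inside SQL Server.

    Hypothetical helper for illustration only; it just formats text and
    does not connect to a database.
    """

    def quote(text: str) -> str:
        # T-SQL escapes a single quote inside a string literal by doubling it.
        return "N'" + text.replace("'", "''") + "'"

    # Python is indentation-sensitive, so the embedded script must start
    # at column 0 once it is placed inside the T-SQL string literal.
    script = textwrap.dedent(python_body).strip()

    return (
        "EXEC sp_execute_external_script\n"
        "    @language = N'Python',\n"
        f"    @script = {quote(script)},\n"
        f"    @input_data_1 = {quote(input_query)};"
    )


if __name__ == "__main__":
    # dbo.loans is a made-up table name used only for this example.
    batch = build_external_script_call(
        """
        OutputDataSet = InputDataSet.describe()
        """,
        "SELECT amount FROM dbo.loans",
    )
    print(batch)
```

The resulting text could be sent to the server with any SQL client. The point is that the query result reaches the Python code as a data frame without a separate export step.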

SQL Server 2017 is a good example of the way Microsoft builds features first in Azure and then brings them to its on-premises server products. It gets the same graph data features as Azure SQL Database, along with the Adaptive Query Processing improvements developed for the Azure database. These optimize how queries are run by monitoring how well previous queries have run, which can have a significant impact on query performance.

“This helps us be far more efficient in our use of resources within the execution of parallel queries and concurrent queries,” Petrossian explained. “The optimizer can adjust its behavior based on execution statistics that are coming through, as opposed to just trying to predict what things will be. As a result, customers will be able to run bigger queries and more concurrent queries.”

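Initially, adaptive query processing covers three features: two that apply in batch mode (memory grant feedback and adaptive joins), plus interleaved execution for multi-statement table-valued functions. Now that SQL Server has the infrastructure for adaptive optimization, future releases will extend that throughout the database engine.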

Production Ready

RC1 is close to a final version, Petrossian said. “We’re pretty much complete with the work and unless we find some serious bug, this is it.”

As usual, the new release won’t be fully supported until it’s generally available, but customers can use it in production if they want the new features. Several customers are already doing that, some with formal support from Microsoft to help test the new release as part of the Early Adoption Program. “We also have a couple of customers who just did it on their own and didn’t tell us,” Petrossian told us, although he noted those were “smaller workloads, dipping their toes in the waters.”

Financial analysis company dv01 originally created its reporting and analytics SaaS tools for bonds and loans on Python, Amazon RDS PostgreSQL, and Redshift data warehouse, but ran into performance and scale problems, with some queries taking longer than the 30-second timeout limit. Soon the engineers were spending more time tuning their database queries than building new features.

To get better performance and in-database analytics, they moved to SQL Server 2016, which meant using Windows Server on Azure. Query time went down to 1-2 seconds and better data compression reduced the amount of storage needed by a factor of two or three, plus the data is encrypted in memory and on disk. With their other systems running on Linux and most engineers using Macs, they tested for a couple of months and then migrated their 40 production databases to SQL Server 2017 CTP2 running in Docker on Linux.

That’s exactly the kind of scenario that motivated Microsoft to bring SQL Server to Linux, Petrossian explained. “Aside from the obvious reason, that people are using Linux, one of the big motivators for us was that a lot of the container and private cloud technologies are built on the Linux infrastructure and we wanted SQL Server to be part of that modern IT ecosystem, whether that’s in public or private cloud or wherever it happens to be. Now that we have SQL Server running in Docker, you can take SQL Server and deploy it in container services managed by Kubernetes and so on.”

“For production use, we are saying customers can use the Linux images of SQL Server in Docker containers with some caution. We’re not suggesting people take their 500TB database and use a container for it but there are a lot of smaller workloads that people run in containers,” he said.
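For readers who want to try a small workload this way, here is a minimal sketch that assembles a `docker run` command for a SQL Server on Linux container. It assumes the `microsoft/mssql-server-linux` image name used on Docker Hub around this release, the `ACCEPT_EULA` and `SA_PASSWORD` environment variables that image reads, and `/var/opt/mssql` as the data directory. The function name, the volume name `mssql-data` and the sample password are hypothetical. Mounting a named volume keeps the database files outside the container’s writable layer, so they survive if the container is replaced.

```python
import shlex


def mssql_docker_command(
    sa_password: str,
    data_volume: str = "mssql-data",
    host_port: int = 1433,
    image: str = "microsoft/mssql-server-linux",  # assumed image name
) -> list:
    """Build (but do not run) a docker command for SQL Server on Linux."""
    return [
        "docker", "run", "--detach",
        # The image refuses to start unless the license is accepted.
        "--env", "ACCEPT_EULA=Y",
        "--env", f"SA_PASSWORD={sa_password}",
        # SQL Server listens on TCP 1433 inside the container.
        "--publish", f"{host_port}:1433",
        # A named volume holds the data files across container restarts.
        "--volume", f"{data_volume}:/var/opt/mssql",
        image,
    ]


if __name__ == "__main__":
    # Placeholder password for illustration; never hard-code a real one.
    args = mssql_docker_command("Str0ng!Passw0rd")
    print(" ".join(shlex.quote(a) for a in args))
```

Returning the arguments as a list rather than a single string means they can be passed straight to `subprocess.run` without shell quoting problems, while the printed form can be pasted into a terminal.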

Windows support for containerizing SQL Server isn’t quite as advanced in RC1. “We have SQL Server in Windows containers as well; we’re working on that. On the Windows side, we recommend using it for devtest but not production yet; there are a few things where we still need to round the edges off and do some more work.”

Container support will interest even traditional Windows Server customers, Petrossian said. “I think when it comes to convenience and the way developers work, as IT moves forward folks look at their peers and say ‘that seems like a simple way of doing it.’ If you can do it on Linux, why not on Windows?”

He compared cautions about containers to the early days of virtualization. “I remember when people said no one is going to run a database in a virtual machine because of performance and so on. Of course, time has proven that wrong and people run databases in VMs all the time. You hear a lot of the same questions around containers: how is the performance, containers are ephemeral so what happens to the storage? All those things are fixed or being fixed or improving and we think containers will have a similar path to VMs in gaining adoption, but far more accelerated.”

And with this new release, SQL Server is no longer left out of this shift in IT infrastructure.

Feature image via Pixabay.
