
Microsoft Fabric Goes GA, Adds Multiple Integrations

The company's SaaS analytics platform is now generally available, and integrates with numerous Microsoft, third-party, and partner platforms.
Nov 15th, 2023 8:00am
Feature image via Pixabay.

At its Ignite conference in Seattle Wednesday, Microsoft launched into general availability (GA) Microsoft Fabric, the end-to-end analytics platform it released to public preview at its Build conference in May.

Along with the GA of the platform, Microsoft is announcing the public preview of some newer features and several new partnerships that allow third-party technologies to integrate into the Fabric environment. Microsoft is also making several AI-related announcements at Ignite, and Fabric plays a role in some of those as well.

I covered Microsoft Fabric when it began its public preview in May, so I won’t repeat all that detail in this piece. Instead, I’ll cover what’s new, go over some news on pricing and analyze how Microsoft seems to be imbuing Fabric with a most-favored platform status, relative to its other analytics offerings.

Shortcuts and Mirroring

Fabric’s GA announcement covers not just the main platform but also several types of “shortcuts” for OneLake, Fabric’s underlying data lake. OneLake shortcuts allow data that is physically stored in other systems to be virtualized and thereby become logically part of OneLake.

The shortcuts that are now GA include those for Amazon S3 storage buckets, Azure Data Lake Storage (ADLS Gen2) containers, Microsoft’s Dataverse data store as well as data stored in other OneLake implementations. There’s still no sign of shortcut support for Google Cloud Storage, but Microsoft reiterated that it’s coming.

Microsoft is serious about connecting OneLake to external data platforms, not only through virtualization using shortcuts but also via near real-time physical data replication through Mirroring, a new feature being released to public preview.

Mirroring is a bit reminiscent of the Synapse Link technology Microsoft introduced for Azure Synapse Analytics and, as with that replication bridge, Mirroring initially supports Azure SQL Database and Azure Cosmos DB as data sources. One way that Mirroring differs from Synapse Link, though, is that it works not just with Microsoft data offerings, but with third-party platforms too, initially including both MongoDB and Snowflake.

Workloads and Governance

Snowflake is a direct competitor to Fabric, so it’s interesting to see Microsoft accept that some of its customers will use the competing cloud data platform for various data analytics workloads but may still want to use Fabric’s business intelligence, data science, data engineering and other capabilities on that data.

Another workload customers may use is Fabric’s Data Activator monitoring/observability component, which Microsoft just put into public preview in October of this year.

Beyond analytics workloads per se, customers may wish to govern the data brought in through shortcuts and Mirroring using the new, tighter integration between Fabric and Microsoft Purview that Microsoft is also announcing today. For example, Purview auto-scan integration enables Fabric artifacts to flow into the Purview data catalog automatically.

Beyond the data catalog, users can apply Purview Information Protection sensitivity labels to sensitive Fabric data and integrate both Purview Sensitive Information Types (SIT) based data loss prevention (DLP) policies and Purview audit into Fabric.

Interoperability FTW

In all cases, Mirroring will replicate the data into the OneLake data lake, unlike Synapse Link for Azure SQL Database, which replicates data directly into data warehouse tables.

And because the data will be in the lake in Delta Lake format (an open table format that builds on top of the Apache Parquet columnar data file format), other Delta-compatible platforms, including Azure Databricks, will be able to make use of it directly as well. The same should go for Trino-based platforms like Starburst, and the Trino clusters on Microsoft’s revamped Azure HDInsight for AKS, the public preview of which I covered recently.

The interop goes beyond the OneLake shortcut/Mirroring cohort, though, as Microsoft is making a number of partner announcements around Fabric as well. Companies signing on as partners include ESRI, Teradata, Informatica and SAS, whose location intelligence (GIS), data warehouse, data management and data science platforms, respectively, will integrate into Fabric and its user interface.

Even the London Stock Exchange Group is getting in on the game, by offering data discovery and access, and providing accompanying digital rights management, around its financial markets intelligence data products, within Fabric.

M365 and Microsoft AI

Microsoft 365/Microsoft Graph data will now integrate into OneLake as well. That capability, which is in preview, provides the data in Fabric’s preferred Delta Lake format (which is itself a big deal since M365 data was previously offered only in JSON format).

And, in the world of AI, OneLake is now available in preview as a datastore in Azure Machine Learning, and as a data source in Azure AI Studio, what Microsoft calls its “unified AI platform,” itself launching in preview at Ignite.

Speaking of AI, Microsoft is also firing up the public preview of Copilot in Fabric. This Copilot implementation goes far beyond a natural language interface for querying data. While it can certainly handle that task, it will initially be available within Fabric’s Power BI, Data Factory (dataflows and pipelines), Data Engineering, and Data Science components, allowing it, according to Microsoft, “to create dataflows and pipelines, […] build reports, or even develop machine learning models.”

While the preview of Copilot in Fabric is public, it will roll out in stages, so Microsoft can make certain to deploy the AI-capable infrastructure necessary to support all customers with access. Microsoft says all customers with F64 or higher (or Power BI Premium) compute capacities should have access to Copilot in Fabric by the end of March 2024, at the latest.

For reference, the F64 is a mid-tier capacity — there are five capacities smaller than it and five more that are larger. Pay-as-you-go pricing for F64 is $11.52/hr in most US Azure regions. But as part of its GA announcement, Microsoft is also introducing reservation pricing that will allow customers to pre-commit Fabric Capacity Units in one-year increments, which Microsoft says will yield savings of up to 40.5%, relative to published pay-as-you-go pricing.
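To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the two numbers quoted above ($11.52/hr and "up to 40.5%"). Everything else is an assumption made for illustration: a 730-hour month, that F64 corresponds to 64 capacity units, that the full 40.5% discount applies, and that a pay-as-you-go capacity can be paused so it bills only for the hours it runs. The function names are hypothetical, and actual Azure billing may differ.

```python
# Figures quoted in the article.
F64_PAYG_PER_HOUR = 11.52        # USD per hour, pay-as-you-go, most US Azure regions
MAX_RESERVATION_SAVINGS = 0.405  # "up to 40.5%" versus pay-as-you-go

# Assumptions for illustration only (not from the article).
HOURS_PER_MONTH = 730            # average hours in a month
F64_CAPACITY_UNITS = 64          # assumes the SKU number equals its capacity units


def payg_monthly(hourly_rate: float, hours: float = HOURS_PER_MONTH) -> float:
    """Pay-as-you-go cost for the given number of running hours."""
    return round(hourly_rate * hours, 2)


def reserved_monthly(hourly_rate: float,
                     savings: float = MAX_RESERVATION_SAVINGS,
                     hours: float = HOURS_PER_MONTH) -> float:
    """Monthly cost of a reservation, assuming the maximum quoted discount."""
    return round(hourly_rate * hours * (1 - savings), 2)


def rate_per_capacity_unit(hourly_rate: float,
                           capacity_units: int = F64_CAPACITY_UNITS) -> float:
    """Implied pay-as-you-go price of one capacity unit for one hour."""
    return round(hourly_rate / capacity_units, 4)


def break_even_utilization(savings: float = MAX_RESERVATION_SAVINGS) -> float:
    """Fraction of the month a capacity must run before a reservation
    becomes cheaper than pausing a pay-as-you-go capacity when idle."""
    return round(1 - savings, 3)


if __name__ == "__main__":
    print("Pay-as-you-go, always on:", payg_monthly(F64_PAYG_PER_HOUR))
    print("Reserved, max discount:  ", reserved_monthly(F64_PAYG_PER_HOUR))
    print("Implied rate per CU-hour:", rate_per_capacity_unit(F64_PAYG_PER_HOUR))
    print("Break-even utilization:  ", break_even_utilization())
```

Under these assumptions, an always-on F64 works out to roughly $8,400 a month on pay-as-you-go versus roughly $5,000 reserved, and the reservation only pays off if the capacity would otherwise run more than about 60% of the time.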

Support and Migration

Microsoft claims over 25,000 organizations have been using Fabric (a number that specifically excludes organizations using only Power BI workloads within Fabric). That’s a big number. But Microsoft also has lots of customers using its other platforms, like Azure Synapse Analytics.

So what’s the roadmap here? And what, if any, risk do customers on other platforms run of finding they’re no longer supported? Microsoft is addressing this head-on, by explicitly stating that it has “no current plans to retire Azure Synapse Analytics” and that support, bug fixes and attention to the security of that platform will continue. Microsoft further states that it will provide advance notice if these plans change and will adhere to the commitments in its Modern Lifecycle Policy.

What about Azure Databricks customers? Consider that Databricks developed the Delta Lake technology that Microsoft has adopted as Fabric’s native data format. And take note also that the main Fabric session at Ignite is called “Make your data AI ready with Microsoft Fabric and Azure Databricks”. On the other hand, Microsoft has also published a guide on Fabric’s blog to help Synapse customers plan their migration strategies. All this language makes it pretty clear that new investments in Microsoft’s home-grown analytics technology will land in Fabric, making it likely that Azure Synapse Analytics will be robustly maintained, but likely not significantly enhanced.

How to Get Started, and Why

So if Fabric lines the analytics road forward, how can Microsoft customers get started? Although Fabric is GA and therefore now a paid service, Microsoft is still making 60-day Fabric trials available, with trial users receiving the mid-range F64 capacities discussed above, in the context of Copilot. Folks interested can hit the service’s get started page and either create a free account or use their existing Power BI account to onboard to the Fabric trial.

As I said in my coverage in May, I’ve been working with Fabric since before it was in public preview. It ties together a lot of good (but previously fragmented) technologies into one very complete platform that builds upon the business model and user experience of the very popular Power BI platform.

I’ve worked with Microsoft data technology for roughly thirty years now. I’ve seen a lot of it come and go, but Fabric is really the culmination, and the refinement, of decades of effort, paired with modern technologies that have become standard across the industry.

If you want to do AI, you need your data to be in top condition, and that’s exactly what Fabric and its team are focused on helping their customers achieve.
