The Power of Data Fabric in a Hybrid Multicloud World

How to build a data fabric for multicloud high-performance computing operations.
Aug 19th, 2019 10:58am by
NetApp sponsored this post.

Every company has become a software company and data is every business’s most valuable asset. According to Gartner, by 2022, 90% of corporate strategies will explicitly include information as a critical enterprise asset and analytics as an essential competency.

However, possessing valuable data and being able to use it effectively to support digital transformations, including the adoption of big data analytics and AI, are very different things. In the face of an increasingly complex and challenging IT landscape, many companies are struggling to modernize their IT systems, innovate in the cloud and disrupt their markets. Overcoming these challenges requires adopting a new approach to data-infrastructure management: a data fabric.

Barriers to Modernizing the Infrastructure

Anthony Lye
Anthony is senior vice president and general manager of the cloud data services business unit for NetApp. He is responsible for the strategy and execution to further NetApp’s cloud innovation across private cloud, hybrid cloud and hyperscaler models and to establish the company as the undisputed leader in managing data in a cloud-integrated world. Anthony brings more than 25 years of diverse leadership experience to NetApp, spanning roles in development, product management, marketing and sales and as a high-tech CEO twice.

If data is a business’s most valuable asset, then the more data the business collects, the more value the data should have. But this assumption is at odds with how many organizations actually collect and store data. Enterprises usually collect both structured and unstructured data in diverse systems on different platforms, and this siloed data is difficult to combine for analysis. Data warehouses and data lakes are valiant attempts to overcome this, but the advent of public and private cloud technologies has created new challenges that these approaches do not address.

One of the key benefits of public cloud platforms, such as Amazon Web Services (AWS), Microsoft Azure and Google Cloud Platform, is easy and scalable self-service access to data. However, to obtain all the performance and cost benefits of these cloud platforms, most companies want to take advantage of multiple clouds. Each public cloud offers varying benefits when it comes to compute, storage, security and pricing, and companies want to use the public cloud solution that best aligns with a specific application requirement or workload. This creates a multicloud environment. Further, many companies can’t or won’t put 100% of their workloads in a public cloud, so they are also looking to create on-premises private clouds. Now, these companies are dealing with a hybrid multicloud environment, which makes data silos more complex than they were before.

These new data silos are again preventing companies from gaining the business insight they seek across all their data. What they need is the same level of agility, scale and self-service access across the hybrid multicloud environment as each public cloud vendor already provides for its own cloud services.

The Data Fabric

Data fabric is a data-management strategy with which many companies are beginning to familiarize themselves as a remedy. It was created to simplify the orchestration of data services across a choice of hybrid multicloud environments. It includes storage, networking and compute technologies and ties the various cloud and on-premises locations together with high-bandwidth connections. Using APIs, data fabric delivers a consistent and integrated view of data to support big data analytics initiatives, while offering the reliable data access, control and security that companies need to efficiently and confidently take full advantage of the integrated environment.
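To make the idea of a consistent, API-driven view of data more concrete, here is a minimal sketch in Python. It is an illustration only, not how NetApp or any particular vendor implements a data fabric: the names `StorageBackend`, `InMemoryBackend` and `DataFabric`, and the `location/key` path convention, are all hypothetical, and the in-memory store merely stands in for a real cloud or on-premises system.

```python
from abc import ABC, abstractmethod
from typing import Dict, Tuple


class StorageBackend(ABC):
    """Hypothetical interface that every storage location implements."""

    @abstractmethod
    def read(self, key: str) -> bytes:
        ...

    @abstractmethod
    def write(self, key: str, data: bytes) -> None:
        ...


class InMemoryBackend(StorageBackend):
    """Stand-in for a real cloud or on-premises store (illustration only)."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def read(self, key: str) -> bytes:
        return self._objects[key]

    def write(self, key: str, data: bytes) -> None:
        self._objects[key] = data


class DataFabric:
    """Routes 'location/key' paths to whichever backend holds the data."""

    def __init__(self) -> None:
        self._backends: Dict[str, StorageBackend] = {}

    def register(self, location: str, backend: StorageBackend) -> None:
        self._backends[location] = backend

    def _resolve(self, path: str) -> Tuple[StorageBackend, str]:
        # Split "location/key" at the first slash (assumed convention).
        location, _, key = path.partition("/")
        return self._backends[location], key

    def read(self, path: str) -> bytes:
        backend, key = self._resolve(path)
        return backend.read(key)

    def write(self, path: str, data: bytes) -> None:
        backend, key = self._resolve(path)
        backend.write(key, data)

    def copy(self, src: str, dst: str) -> None:
        # Move data between locations through the same interface.
        self.write(dst, self.read(src))
```

In this sketch, a caller would register one backend per location (for example `fabric.register("onprem", InMemoryBackend())`) and then read, write or copy data by path without caring which environment holds it. A real fabric would also have to handle the security, networking and performance concerns described above, which this sketch leaves out.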

For example, WuXi NextCode built its data fabric to sequence one million genomes in parallel, obtaining results three times faster than was previously possible. The company uses a multicloud environment for storing genomic data, performing secondary processing and running clinical applications for its customers. The multicloud strategy enables a data architecture that supports powerful analytics with elastic scalability and pipeline flexibility. It also enables faster and more reliable integration into existing customer systems. The data fabric has allowed the company to set up and configure a highly scalable “HPC-like” storage infrastructure in the cloud and to move data easily from on-premises environments, across cloud vendors and even into another data center. All of this is done while processing and integrating orders of magnitude more data simultaneously and analyzing vast volumes of data in real time. In addition to enabling practitioners to make life-saving decisions more quickly, the technology supports more sophisticated and rapid pattern recognition to help protect patients from human error.

Shifting gears to entertainment, DreamWorks is using its data fabric to meet the studio’s rapidly expanding data storage and management needs. DreamWorks teams must have access to a staggering amount of data from multiple on-premises and cloud sources. The creation of an average animated feature film requires hundreds of artists and engineers, over 600TB of data, more than 100 million compute hours and half a billion digital files. And the studio can have as many as 10 animated films in production at the same time. Its data fabric enables teams to take advantage of the massive amount of content and data from multiple sources across the various productions. By simplifying and automating the infrastructure, data fabric ensures on-demand availability of data and maximum production uptime.

Three Best Practices for Data Fabric Implementation

As a foundational approach that spans an entire infrastructure, a data fabric strategy can be surprisingly easy to implement because of its focus on modularity and scalability. As such, more and more companies are beginning to rely on this emerging technology, and industry analysts are recognizing its potential, with Gartner naming data fabric one of the top 10 data technology trends of 2019. That said, there are some key considerations organizations will need to keep in mind as they get started:

  1. Act now: With the scale and speed of data increasing exponentially, data is only going to get more difficult to manage. Companies should prepare by identifying potential choke points in their current dataflow and utilization and by assessing how these could impair business growth. Implementing a data fabric strategy can help companies leapfrog these choke points, avoid business disruption and keep developers focused on creation, innovation and bottom-line results.
  2. Assess in-house limitations and invest accordingly: Many companies do not have sufficient expertise in data management, cloud, networking, integration and data mobility to build a data fabric immediately. However, new service models and automation technologies can offer an alternative to expensive new hires, pushing much of data collection, migration and consolidation behind the scenes. In 2019, widespread adoption of public clouds has created a new standard for IT experiences, and organizations increasingly expect, and often require, similar ease of use and accessibility in their own environments.
  3. Avoid lock-in: Hybrid multicloud has become the norm for enterprises for good reason: organizations are consuming IT resources in radically different ways, mixing and matching multiple public clouds, on-premises systems and hybrid environments to support applications and workloads wherever they are best suited. In 2019, anything that limits your choices in how to design and scale infrastructure represents an impediment to your business’s growth and a barrier to a data fabric strategy.

Businesses today are faced with many challenges that land squarely with IT, and a large reason for that is the importance placed on data. The more vital data is to success, the more important it is to create the right infrastructure to take advantage of it. Data fabric spans every data location in a hybrid multicloud and provides fast, agile access no matter where the data resides. This new standard of data management will enable all users to work with data at the speed and scale they require, even as the amount of data and the complexity of analysis continue to grow. In a multicloud world, building your data fabric is the key to innovation, modernization, and industry disruption.

Feature image via Pixabay.