4 Best Practices for Managing Data in a Hybrid Cloud

21 Apr 2022 10:00am, by
Randy Hopkins
Randy Hopkins is the vice president of global systems engineering and enablement at Komprise. He specializes in building and running systems engineering organizations that include presales, technical operations, channel and new product introductions.

Data management is hard enough when your data lives in a single data center or cloud. But when you opt for a hybrid cloud strategy — as about two-thirds of businesses do today — you face a whole new level of complexity when it comes to tracking, securing and governing your data.

The main reason is that a hybrid cloud model (meaning a setup that pairs on-prem infrastructure with resources hosted in a public cloud such as Amazon Web Services, Azure or GCP) involves many more data vendors, tools and protocols than a setup in which all of your data lives in a single environment.

You might, for example, have some data that lives in local file systems on on-prem Windows and Linux servers. Meanwhile, you also host some data in an NFS or SMB file share running on your corporate network. At the same time, you use a cloud-based object storage service like AWS S3 or Azure Blob Storage. You might have other storage solutions, such as NetApp, in the mix to boot.

Each storage vendor or protocol in a scenario like this not only involves a different storage location but also brings its own independent set of tools for identifying, managing, backing up and protecting data. Securing data on a Linux file system means using Unix tooling to set file permissions, for example, whereas on Windows you’d use a separate set of file system access controls. For cloud-based data, you’d use your cloud vendor’s access management framework, such as AWS IAM. And so on.
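
As a small illustration of how platform-specific this tooling is, the following Python sketch checks and tightens POSIX permission bits on a single file. It is only a sketch and assumes a Linux or other Unix-like host; the helper names `is_world_readable` and `restrict_to_owner` are hypothetical. The same calls would not express Windows access controls or AWS IAM policies, which need their own tools.

```python
import os
import stat


def is_world_readable(path: str) -> bool:
    """Return True if the 'other' read bit is set on a POSIX file."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IROTH)


def restrict_to_owner(path: str) -> None:
    """Allow only the owner to read and write (same effect as chmod 600)."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```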

The bottom line is that determining where your data lives — let alone managing it effectively — requires you to juggle a disparate set of tools when you have a hybrid cloud strategy. You have to navigate a variety of data silos and master numerous protocols and platforms to keep your data secure and enforce governance policies.

A Better Approach to Hybrid Cloud Data Management

You can’t erase the siloed nature of data in a hybrid cloud. It just comes with the territory.

What you can do, however, is take steps to simplify and streamline the way you work with data across the various silos that exist within a hybrid cloud. By being proactive and holistic about how you discover, protect and govern your data, you can make hybrid cloud management much more efficient and minimize your risk of inconsistencies and oversights, such as leaving sensitive data in an insecure location. There are four key practices to follow in this regard.

1. Achieve full data visibility. The first is simply to know which data you have by creating a global data index. After all, you can’t govern data very effectively if you don’t know where it exists or which protocols or platforms it depends on. Building a data index that identifies all your data across the various assets in your hybrid environment can ensure you know where your data resides at all times. Some storage vendors can index only their own storage platform. Such an index is proprietary and limited to that silo, so IT would need to integrate the vendor indexes manually, along with an index of any data stored in the cloud.

2. Build for accuracy. The second step toward better hybrid cloud data management is ensuring that your data index is continuously updated. It’s very likely that your data architecture changes constantly. You may move data from one location to another within your hybrid environment, for example, or introduce new types of data services. It’s critical that your data index remain flexible and scalable so that it can reflect these changes as they occur. Your index needs to support new data formats, storage locations, protocols and so on so that it can keep adapting to your business.

3. Operate by rules and policy. Third, strive to deploy an actionable data management strategy. An actionable strategy is one that allows you not just to see where your data exists, but also to manage it proactively using a declarative approach. In other words, you should be able to write policies that define how data should be managed based on attributes you define and then enforce those policies automatically across your hybrid environment.

To illustrate what this means in practice, consider an organization that needs to delete data of a certain type (such as ex-employee or ex-customer data) after a set period of time to meet compliance requirements. Instead of attempting to meet that rule imperatively — which would mean going out and finding the data and then deleting it manually — the organization can adopt a declarative approach wherein it writes a policy that says “when data is tagged with [insert attribute here], delete it after one year.” Then the rule would be continuously enforced across the environment. Regardless of where exactly the data is stored or which protocol is managing it, it would be disposed of based on governance rules defined by the organization.

4. Maintain excellent user experience. Finally, the best hybrid cloud data management practices should be invisible to the applications and services that host your data. In other words, they should be able to enforce data governance rules without disrupting user access or the way that your workloads operate. They shouldn’t slow down performance or cause application errors even as they move data around, modify access controls and so on.
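
The first two practices above, building a global index and keeping it current, can be sketched in a few lines of Python. This is a minimal illustration rather than a product design: it assumes every silo is reachable as a mounted directory, and the names `IndexEntry`, `build_index` and `diff_index` are hypothetical. A real index would need separate connectors for object stores and vendor platforms.

```python
import os
from dataclasses import dataclass
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (source label, path relative to that source's root)


@dataclass(frozen=True)
class IndexEntry:
    source: str   # label for the silo (illustrative, chosen by the caller)
    path: str     # path relative to the silo's root
    size: int     # size in bytes
    mtime: float  # last-modified time, seconds since the epoch


def build_index(sources: Dict[str, str]) -> Dict[Key, IndexEntry]:
    """Walk each source root and record one entry per file found."""
    index: Dict[Key, IndexEntry] = {}
    for label, root in sources.items():
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                full = os.path.join(dirpath, name)
                info = os.stat(full)
                rel = os.path.relpath(full, root)
                index[(label, rel)] = IndexEntry(
                    label, rel, info.st_size, info.st_mtime
                )
    return index


def diff_index(
    old: Dict[Key, IndexEntry], new: Dict[Key, IndexEntry]
) -> Tuple[List[Key], List[Key], List[Key]]:
    """Return the keys added, removed and changed between two snapshots."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(k for k in new.keys() & old.keys() if new[k] != old[k])
    return added, removed, changed
```

Rebuilding the index on a schedule and comparing snapshots with `diff_index` is one simple way to notice data that has moved or appeared. A full rescan is assumed here only for brevity; large environments would likely need incremental change tracking instead.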

Complex Clouds, Simple Data Governance

When you embrace these four principles, you get a data management and governance process that works seamlessly across the various boundaries within your hybrid cloud.

Your data “governors” — meaning the auditors, compliance officers, security engineers and other stakeholders responsible for managing data securely and responsibly — are able to discover and classify all of your data automatically and then manage it via consistent policies. They can also enforce whichever data retention and disposition policies your organization needs, even if those requirements vary between the different data stores, services and protocols that live within your cloud.
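
The retention rule described earlier (delete data carrying a given tag after one year) could be expressed declaratively along the following lines. This is a hedged sketch: `Record`, `RetentionPolicy` and `expired` are hypothetical names, and the function only selects records that are due for disposal rather than deleting anything.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Record:
    location: str         # where the data lives, in any silo
    tags: FrozenSet[str]  # attributes assigned by the organization
    created: datetime


@dataclass(frozen=True)
class RetentionPolicy:
    tag: str              # attribute the policy applies to
    max_age: timedelta    # how long tagged data may be kept


def expired(
    records: Iterable[Record],
    policies: Iterable[RetentionPolicy],
    now: datetime,
) -> List[Record]:
    """Return records that carry a policy's tag and exceed its maximum age."""
    rules = list(policies)
    return [
        r
        for r in records
        if any(p.tag in r.tags and now - r.created > p.max_age for p in rules)
    ]
```

The policy states what should happen to tagged data, and the same check runs over every record regardless of which store or protocol holds it. A policy such as `RetentionPolicy("ex-employee", timedelta(days=365))` uses an example tag only; the actual attributes are whatever the organization defines.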

Conclusion

There’s no denying that hybrid cloud architectures make data management inherently more complex. With the right approach, however, it’s possible to manage that complexity in a way that ensures both efficiency and consistency, no matter how many data silos, tools or protocols exist within your cloud environment.

So instead of letting your hybrid cloud constrain what you can do with your data, take charge of data management in a way that lets you build a hybrid cloud that is as complex as you want it to be without compromising your ability to govern data effectively.
