Integral Applies Automation to Healthcare Privacy Compliance
When an organization wants to work with healthcare data, it has to go through a privacy certification process to comply with regulations like HIPAA.
Through his work with clients at LiveRamp Health, Shubh Sinha found this a cumbersome, manual and expensive process.
“Imagine combining your geographic data, your demographic data with your prescriptions and new claims and your history of surgeries, in order to get a 360 footprint of a particular patient or customer and using that to have more informed treatment messaging or developing better treatments — knowing somebody that well — and so I helped companies do that,” he said of his work at LiveRamp Health, which he described as a smaller version of the experimental Google X.
The privacy certification process for those data sets, however, meant working with a consultant every time a company wanted to combine sensitive healthcare data for any sort of analysis, to make sure it adhered to the policies the U.S. government has laid out. That could mean 80 or more emails going back and forth over a period of weeks.
“You can’t wait eight to 10 weeks to run a query or make a business decision,” he said. He concluded that the process was crying out for automation, which led to the birth of San Francisco-based Integral. Sinha created the company with John Kuhn to apply automation and machine learning to this arduous process. Integral maintains it can cut the process down to two weeks.
Collaboration Plus Privacy Monitoring
The Integral technology is two-pronged: a collaboration platform and a privacy engine.
“Today, you contract a HIPAA consultant, you send them all of your data sets, and then you send them a bunch of emails on like, ‘Hey, this is what I want to do. Here’s the data set. Here’s what it all means.’ And then there’s a back and forth,” he explained.
“The consultant does their analysis on the data sets, and they say, ‘Hey, this is too high risk. I need you to take out this column, eliminate this data set, whatever the remediation is,’ and then they go back and forth on the fix itself, to see what’s palatable from a business use case perspective, and then what’s palatable from a compliance perspective.
“And then once the consultant and the client reach agreement, all the paperwork is spun up, the report is done. And then a data set is delivered back to the client in a very clean and remediated way.
“So one part of our software is a collaborative suite of tools, where you can pretty much do all your project management in one place. There’s no Zoom calls and email threads. It’s all just one seamless platform where you can add notes, communicate with your partners, your stakeholders, wherever you want.”
Much of that can’t be automated yet, he admitted.
The privacy engine continuously monitors the data sets for privacy violations, which users can remediate in the UI.
Users install a Docker image, then connect with data storage systems such as AWS (Amazon Web Services) or GCP (Google Cloud Platform) through standard APIs with read and write permissions to begin analysis.
“We have a variety of statistical models baked into our software to do the privacy analysis, such that once we read the data, it gets fed into all of our models, and we output a score. That is this compliance score,” Sinha explained.
“Oftentimes, it’s really high because people have not cleaned their data at all. … So then we’ll give the problematic columns. And so for example, if you have a Social Security number, which is the easiest, like no-go in healthcare…. Sometimes you have somebody’s address. Sometimes you can include a city, sometimes you can’t …
“And so that’s where it gets challenging. … And so how do you walk that line between privacy and business utility? We help walk that line because we score every individual column because of these statistical models.”
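To make the idea of a per-column score concrete, here is a minimal sketch in Python. It is not Integral's method, which the company does not detail here. It assumes a deliberately simple stand-in for the statistical models: the share of rows whose value in a column appears only once, since values that single out one person are the hardest to share safely. The function names and the sample rows are hypothetical.

```python
from collections import Counter


def column_risk(values):
    """Share of rows whose value in this column appears exactly once.

    Assumption: uniqueness is used here as a crude proxy for
    re-identification risk. A real privacy engine would combine
    several statistical models.
    """
    if not values:
        return 0.0
    counts = Counter(values)
    unique_rows = sum(1 for v in values if counts[v] == 1)
    return unique_rows / len(values)


def score_columns(rows):
    """Score every column and return {column: risk}, riskiest first."""
    if not rows:
        return {}
    columns = rows[0].keys()
    scores = {c: column_risk([r[c] for r in rows]) for c in columns}
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))


# Hypothetical sample data, invented for illustration only.
rows = [
    {"ssn": "111-11-1111", "city": "Austin", "zip3": "787"},
    {"ssn": "222-22-2222", "city": "Austin", "zip3": "787"},
    {"ssn": "333-33-3333", "city": "Marfa", "zip3": "798"},
    {"ssn": "444-44-4444", "city": "Austin", "zip3": "787"},
]

scores = score_columns(rows)
for column, risk in scores.items():
    print(f"{column}: {risk:.2f}")
```

In this toy data the Social Security number column scores highest because every value is unique, while the city column scores low because most patients share a city. That mirrors the trade-off Sinha describes: a column can be acceptable in one data set and too identifying in another.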
Healthcare data from different sources often isn’t clean, so the automation is designed to discern what different columns are. For example, an address column is sometimes labeled “add” or “origin.” The software doesn’t have to go ask someone what a column header like “origin” means.
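One way to picture this kind of column recognition is a lookup of known header aliases with a fallback that inspects the values themselves. The Python sketch below is illustrative only and assumes a hand-written alias table. The “add” and “origin” aliases come from the example above, while every other name, pattern and function is hypothetical. The article does not say how Integral's software actually makes this determination.

```python
import re

# Hypothetical alias table. "add" and "origin" come from the article's
# example; the remaining entries are assumptions for illustration.
HEADER_ALIASES = {
    "address": {"address", "add", "addr", "origin"},
    "ssn": {"ssn", "social_security_number"},
}

# U.S. Social Security numbers written as 123-45-6789.
SSN_PATTERN = re.compile(r"^\d{3}-\d{2}-\d{4}$")


def infer_column_type(header, sample_values):
    """Guess what a column holds from its header, then from its values."""
    key = header.strip().lower().replace(" ", "_")
    for canonical, aliases in HEADER_ALIASES.items():
        if key in aliases:
            return canonical

    # Fallback: the header is unhelpful, so look at the data itself.
    non_empty = [v for v in sample_values if v]
    if non_empty and all(SSN_PATTERN.match(v) for v in non_empty):
        return "ssn"
    return "unknown"


print(infer_column_type("Origin", ["12 Main St"]))
print(infer_column_type("id2", ["123-45-6789", "987-65-4321"]))
print(infer_column_type("notes", ["follow up in two weeks"]))
```

The value-based fallback matters because a sensitive column can hide behind an uninformative header. A fixed alias table would not scale to real-world data, which is presumably why a production system would lean on learned models rather than hand-written rules.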
“A user of this platform can see which columns are the most problematic and fix those. … The important thing here is the collaboration piece, because we’ve consolidated everything from collaboration to automation, we can generate all the paperwork because we own the end-to-end cycle,” Sinha said.
The HIPAA consultants are one vein of competition for Integral. Sinha also credits health data giant Datavant with providing the tailwinds on which Integral was built.
“Their main product is this extreme hashing engine that they drop it into different healthcare data silos. And then once the information is hashed, people feel comfortable sharing it,” he said. Datavant is releasing PrivacyHub, a product that will compete directly with Integral for connecting different data silos.
“The differentiation for us is that you come to us, and we basically facilitate the entire compliance process. They are focusing purely on the statistical privacy engine side of things and leaving the entire end-to-end process still to be manually managed by some project manager at a company. And so for us, one of the big things is that we are your one-stop shop.”
The company recently announced an undisclosed amount of pre-seed funding from Virtue Ventures, Caffeinated Capital, GreatPoint Ventures, Array Ventures, LiveRamp Ventures and several angel investors.
“By speeding up the compliance process for healthcare data, Integral makes it possible to create bespoke patient experiences, streamline drug development research and more while ensuring patient privacy. This is critical to creating a better healthcare system,” Sinha said.
Integral has its eyes on large health companies, insurers and digital healthcare firms. As it is now, if two companies want to share data, they each have to hire a HIPAA consultant to go through the process.
“Our end goal of automating compliance is to standardize it throughout the ecosystem. These companies are going to share data with or without us. We want to be the data infrastructure that powers safe exchange of data from one place to another.”