She’s the CEO and He’s Facebook’s Former Director of Engineering. Together They Have Launched Interana for Behavioral Analytics
Interana is a rare kind of startup. It’s an all-in-one, in-memory stack designed for data analysis. It is also a technology company led by a married couple, one with a background in hardware and the other a software maker with a social network pedigree.
Ann Johnson is the CEO and an electrical engineer who has worked in senior positions at Intel. Bobby Johnson is the CTO and was director of engineering at Facebook for six years, responsible for scaling the site from millions to billions of users. A third co-founder, Lior Abraham, while at Facebook invented SCUBA, a visual analytics tool adopted by over half of Facebook’s employees.
Today, following months of secrecy, the two are taking Interana, a 30-person company with $8.2 million in funding from Battery Ventures, out of beta and into general availability. What they are offering is the ability to analyze event data to do behavioral analysis. That event data might include click streams to a web page or the data that comes from people pressing the buttons on their remote control while watching television.
While Ann Johnson is focused on systems efficiency, Bobby Johnson is leveraging the work he did at Facebook, where he wrote Scribe, a system for collecting massive amounts of data from a large number of servers. According to a Facebook blog post, the company used Scribe for tasks ranging from tracking how much memory a database is using to delivering context to the news feed. At Facebook he also led teams that developed Cassandra, now an open source distributed NoSQL database at the Apache Software Foundation.
Interana is interested in things that happen at discrete times and are triggered by events, such as sessions on a page over an elapsed period of time, said Bobby Johnson in a phone interview. The product is a compressed column store spread across multiple nodes that holds data in memory and spills over to hard disk.
Time is treated as a first-order concern in Interana, which is not necessarily the case in an SQL database. With time as a priority, Interana has the capability to find patterns and sequences in data that might help SaaS companies, for example, optimize ways to get more subscribers. At Facebook, Bobby Johnson was known for working with interconnected data sets, understanding how they worked and then presenting that data in a graphical user interface. This design philosophy is central to Interana and is apparent in the dashboards it has developed.
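To make the idea of finding sequences in time-ordered data concrete, here is a minimal Python sketch. It is not Interana's implementation; the event names, the `events` list, and the `completed_sequence` function are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical event records: (timestamp_seconds, user_id, event_name).
events = [
    (100, "u1", "visit"),
    (105, "u2", "visit"),
    (130, "u1", "trial"),
    (160, "u2", "subscribe"),  # u2 subscribed without starting a trial
    (200, "u1", "subscribe"),
]


def completed_sequence(events, steps):
    """Return the set of users whose events contain `steps` in time order."""
    progress = defaultdict(int)  # user -> index of the next step to match
    for _, user, name in sorted(events):  # tuples sort by timestamp first
        i = progress[user]
        if i < len(steps) and name == steps[i]:
            progress[user] = i + 1
    return {user for user, i in progress.items() if i == len(steps)}


funnel = completed_sequence(events, ["visit", "trial", "subscribe"])
```

Because the events are processed in timestamp order, a single pass is enough to tell which users moved through every step, which is the kind of question a SaaS company might ask about its subscribers.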
The data Interana manages is often semi-structured, set over a period of time, organized according to attribute. The more attributes, the more enriched the analysis becomes. An attribute may identify an important real-world “actor” – such as a user, a device, or an IP address, according to an Interana white paper.
These actors matter because their behavior, recorded as events, is what we can ask questions about. Often, separate datasets exist with time-independent information about each actor, such as demographic information about users or metadata about products. Interana can load these datasets as “Lookup Tables” and reference them from a query over events.
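As a rough illustration of a query over events that references a lookup table, consider the Python sketch below. It does not use Interana's query interface; the field names, the `users` table, and the `count_by_attribute` function are assumptions made up for this example.

```python
# Hypothetical event data: each event has a time, an actor (user), and an action.
events = [
    {"ts": 1, "user": "u1", "action": "click"},
    {"ts": 2, "user": "u2", "action": "click"},
    {"ts": 3, "user": "u1", "action": "purchase"},
    {"ts": 4, "user": "u3", "action": "click"},
]

# Hypothetical lookup table: time-independent facts about each actor.
users = {
    "u1": {"country": "US"},
    "u2": {"country": "DE"},
    "u3": {"country": "US"},
}


def count_by_attribute(events, lookup, action, attribute):
    """Count events of one action, grouped by an attribute from the lookup table."""
    counts = {}
    for event in events:
        if event["action"] != action:
            continue
        # Resolve the actor's attribute; actors missing from the table are "unknown".
        value = lookup.get(event["user"], {}).get(attribute, "unknown")
        counts[value] = counts.get(value, 0) + 1
    return counts


clicks_by_country = count_by_attribute(events, users, "click", "country")
```

The events stay small because they carry only the actor's identifier; the demographic detail lives once in the lookup table and is joined in at query time.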
Some other aspects of the Interana technology:
- Interana is an OLAP technology. According to the company, its technology has “no indexes, no B-trees, and no block fragmentation. Instead it has long contiguous extents of memory whose only writes are appends. This makes writes fast, and it means reads are pipelined at all levels and never blocked on locks.”
- The technology’s column-oriented format is optimized for scans. Columns not referenced in a query don’t waste CPU cycles, and they take up no space in the CPU cache or in RAM. Many traditional SQL databases, by contrast, store and process data row by row.
- Interana scans data in-memory but also lets it spill over “from one layer to the next.”
- By taking advantage of its time and entity data model, Interana moves a minimal amount of data across the network in distributed queries.
- The behavior engine in the data store is written in C++ and has been optimized to scan data as it comes in, which Interana says differentiates it from traditional data warehouse offerings.
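Two of the points above, append-only writes and scans that touch only the referenced columns, can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Interana's C++ engine: it uses plain Python lists rather than compressed contiguous memory, and the `ColumnStore` class and its method names are hypothetical.

```python
class ColumnStore:
    """Toy append-only column store: one list per column, no indexes."""

    def __init__(self, columns):
        self.columns = {name: [] for name in columns}

    def append(self, row):
        # The only write is an append to the end of each column.
        for name, values in self.columns.items():
            values.append(row.get(name))

    def scan_sum(self, value_col, filter_col, wanted):
        # Only the two referenced columns are read; the others are never touched.
        values = self.columns[value_col]
        filters = self.columns[filter_col]
        return sum(v for v, f in zip(values, filters) if f == wanted)


store = ColumnStore(["ts", "user", "action", "seconds"])
store.append({"ts": 1, "user": "u1", "action": "play", "seconds": 30})
store.append({"ts": 2, "user": "u2", "action": "pause", "seconds": 5})
store.append({"ts": 3, "user": "u1", "action": "play", "seconds": 45})

total_play = store.scan_sum("seconds", "action", "play")
```

Here the query sums `seconds` for `play` events while the `ts` and `user` columns are never read, which is the property the column-oriented bullet describes.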
The in-memory database world has boomed over the past few years, but Interana is a bit different. The technology is designed with time as the primary consideration, making it potentially well suited to sensor data, which, like clicks on a web site, is all event based. There are lots of options out there, including somewhat similar offerings such as AWS Redshift. SAP and Oracle have their own technologies, but these have different use cases and are not designed to scale out in the manner that Interana does.