Build Machine Learning Apps in Your Notebook with Tecton
Tecton, a machine learning (ML) “feature platform” company founded by the creators of Uber’s Michelangelo ML platform, today announced version 0.6 of its product. The update allows users to “build production-ready features directly in their notebooks, and deploy them to production in a matter of minutes,” said Mike Del Balso, co-founder and CEO of Tecton.
I spoke to Del Balso in a Zoom call to find out what, exactly, an ML feature platform is — and what it’s typically used for inside enterprise companies. Also on the call was Gaetan Castelein, head of marketing at Tecton.
What Is a Feature and What Does It Do?
“If you think about a machine learning application, there are two parts to it,” said Del Balso. “There’s a model that’s ultimately making the predictions. But then that model […] needs to take in some data inputs — those data inputs are the features. And those features contain all the relevant information about the world that it needs to know at this time, so it can make the right prediction.”
An example of a feature would be data about how busy the roads are for an Uber trip, or whether it is currently rush hour. Both pieces of data would be “features” for an ML application.
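To make the Uber example concrete, here is a minimal Python sketch of how raw trip context might be turned into two model inputs. Everything in it is an illustrative assumption rather than anything Uber or Tecton has published: the function names, the rush-hour windows, and the congestion formula are all hypothetical.

```python
from datetime import datetime

# Hypothetical rush-hour windows (24-hour clock); a real system would tune these.
RUSH_HOURS = ((7, 10), (16, 19))


def is_rush_hour(ts: datetime) -> bool:
    """True when ts is a weekday time inside one of the assumed windows."""
    if ts.weekday() >= 5:  # Saturday or Sunday
        return False
    return any(start <= ts.hour < end for start, end in RUSH_HOURS)


def trip_features(requested_at: datetime, avg_speed_kmh: float,
                  free_flow_kmh: float) -> dict:
    """Turn raw trip context into the inputs a model would receive."""
    # Assumed congestion measure: 0.0 means free-flowing, 1.0 means standstill.
    congestion = 1.0 - min(avg_speed_kmh / free_flow_kmh, 1.0)
    return {
        "is_rush_hour": is_rush_hour(requested_at),
        "road_congestion": round(congestion, 2),
    }


features = trip_features(datetime(2023, 3, 15, 8, 30),
                         avg_speed_kmh=18.0, free_flow_kmh=45.0)
print(features)
```

The model never sees the raw timestamp or speed readings directly; it sees the derived values, which is why the quality of the features matters as much as the model itself.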
In fact, Del Balso and his Tecton co-founder Kevin Stumpf (CTO) came up with the idea for a “feature platform” while they were working at Uber. According to Tecton’s ‘About’ page, the pair built the Michelangelo ML platform at Uber, which “was instrumental in enabling Uber to scale to 1000s of [ML] models in production in just a few years, supporting a broad range of use cases from real-time pricing, to fraud detection, and ETA forecasting.”
They soon realized that a feature platform could be used for any ML workload that involves what Del Balso called “real-time production machine learning.” Prior to Uber, Del Balso worked at Google, “on machine learning that powers the ad systems at Google.” Other use cases for Tecton’s technology include recommendation systems, real-time dynamic pricing, and fraud detection for payments systems.
Defining and Running a Feature
The primary users of Tecton are data scientists and engineers, and using it requires defining features in code. According to the documentation, features in Tecton “are defined as views against a data source using Python, SQL, Snowpark, or PySpark.”
“This is not a no-code platform or something like that,” Del Balso confirmed. “When you use the feature platform, you’re defining the code, defining the transformations that take your business’s raw data and turn them into the data — the features — that the model uses to make its predictions.”
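As a rough illustration of what such a transformation can look like, the sketch below aggregates raw payment records into per-user features of the kind a fraud model might use. It is plain Python and does not use Tecton's API; the record layout, the function name, and the feature names are assumptions made for this example.

```python
from datetime import datetime, timedelta


def user_spend_features(transactions, user_id, as_of,
                        window=timedelta(hours=24)):
    """Aggregate one user's raw transactions in the window ending at as_of."""
    recent = [
        t["amount"]
        for t in transactions
        if t["user_id"] == user_id and as_of - window < t["ts"] <= as_of
    ]
    return {
        "txn_count_24h": len(recent),
        "txn_total_24h": sum(recent),
        "txn_max_24h": max(recent, default=0.0),
    }


now = datetime(2023, 3, 15, 12, 0)
# Hypothetical raw data: one dict per payment event.
raw = [
    {"user_id": "u1", "amount": 20.0, "ts": now - timedelta(hours=2)},
    {"user_id": "u1", "amount": 350.0, "ts": now - timedelta(hours=5)},
    {"user_id": "u1", "amount": 15.0, "ts": now - timedelta(days=3)},
    {"user_id": "u2", "amount": 99.0, "ts": now - timedelta(hours=1)},
]

features = user_spend_features(raw, "u1", as_of=now)
print(features)
```

Passing an explicit `as_of` time means the same function can compute features for a historical training example or for a live request, without ever looking at events from after that moment.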
After the features have been defined through code, the feature platform “manages all aspects of those data pipelines through all stages of the machine learning lifecycle,” he said.
This includes computing the feature values themselves and keeping them up to date at every stage of that lifecycle.
The feature platform is “continuously computing the latest values of all of these signals, such that the model always has the most relevant information [in order] to make the most accurate prediction,” he explained.
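One simple way to picture "continuously computing the latest values" is a rolling-window counter that is updated as each event arrives and read at prediction time. The sketch below is a toy, single-process illustration, not a description of how Tecton implements this; the `RollingCount` class is hypothetical, and it assumes events arrive in timestamp order.

```python
from collections import deque
from datetime import datetime, timedelta


class RollingCount:
    """Count of events seen within a trailing time window."""

    def __init__(self, window: timedelta):
        self.window = window
        self._events = deque()  # timestamps, oldest first

    def add(self, ts: datetime) -> None:
        """Record a new event (assumed to arrive in timestamp order)."""
        self._events.append(ts)

    def value(self, now: datetime) -> int:
        """Drop events that have aged out of the window, then count the rest."""
        cutoff = now - self.window
        while self._events and self._events[0] <= cutoff:
            self._events.popleft()
        return len(self._events)


t0 = datetime(2023, 3, 15, 12, 0)
logins_last_hour = RollingCount(timedelta(hours=1))
for minutes in (0, 10, 50):
    logins_last_hour.add(t0 + timedelta(minutes=minutes))

count_at_55 = logins_last_hour.value(t0 + timedelta(minutes=55))
count_at_65 = logins_last_hour.value(t0 + timedelta(minutes=65))
print(count_at_55, count_at_65)
```

The value changes as time passes even when no new events arrive, which is part of why keeping features fresh is an ongoing operational job and not a one-off calculation.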
Bridging Dev and Production
Because machine learning in applications is still relatively new in the enterprise, Tecton users often bring a mix of skill sets.
“We’re kind of in this interesting space in the industry, where […] machine learning teams look very different across companies,” said Del Balso. “So, our target is people who are building your machine learning application. That can be a data scientist who does not have production engineering skills, but very often in a company it’s an engineer who has the production engineering skills but maybe they’re not really an expert at data science.”
In the past, the trouble has been “the wall” between the development environment and the production one. Data scientists, in particular, do not generally have experience in moving an application to production. Tecton aims to solve that, said Del Balso.
“You have these two different worlds, the data scientists and the engineers didn’t know how to work with each other — at development time, let alone an ongoing operational time. And the value that the feature platform brings is that it breaks down that wall, making it easy. It gives a centralized way, a single way, for data scientists to define all of these feature pipelines in their development workflows, and have essentially no additional tasks to productionize them.”
With v0.6 of its platform, Tecton says it has integrated the feature workflow within a data scientist’s existing notebook tools. This, says Del Balso, removes the obstacles preventing data scientists from easily going to production.
“Now you don’t even have to leave your data science tools,” he said. “You get to use your same Jupyter Notebook. You get to use the same data science environment that you built, or that you’re used to using. So the experience is much closer to what they [data scientists] love and are comfortable with. And it allows us to bring the development and production environments and experience closer than they’ve ever been before.”
AI in the Enterprise
While generative AI continues to grab all the headlines (OpenAI released GPT-4 this week), it’s just as interesting to track how AI and machine learning are moving into the world of enterprise IT. Just as we saw a DevOps revolution after cloud computing emerged in the late 2000s and into the 2010s, we’re now seeing an “MLOps” (for want of a better term) revolution in the early 2020s, as AI takes hold.
Overall, Tecton is another example of the expanding range of AI tools that are becoming more and more essential in the business environment.