Freeplay: New LLM Dev Tool for Java Developers (and Others)
Freeplay is a new LLM development platform (a sentence I have repeated multiple times this year). It’s also been given the “Figma of…” label by one of its investors, who called it “the Figma of LLM development.” So we’re two for two so far on the dev tools hype index.
Cairns told me that he and his co-founder Eric Ryan (who also came from Twitter) built Freeplay “to help product development teams make use of LLMs in their products.” Currently Freeplay helps with testing, experimentation, monitoring, and prompt management.
“That cocktail works together to help people through the software development lifecycle,” said Cairns. “From when you’re at a prototyping stage, to when you’re testing, to make sure it’s ready to go to production, and then eventually — when you’re live — helping you know what’s happening at scale in your system, and then find ways to improve it.”
Similar to Gradient, the most recent LLM development platform I’ve profiled, Freeplay is targeting enterprise software developers.
“Most of our customers have been incumbent software companies,” said Cairns. “You know, they’re not AI-first startups. They’re companies that are trying to adopt LLMs, where they have an established business, an established customer base. They are great software developers, but they probably haven’t been working with LLMs before — and maybe not even ML [machine learning].”
Bringing Java to AI World
Freeplay offers developer SDKs for Python, Node and Java. Cairns claimed that the Java SDK is unique in the AI engineering space currently.
He said that many established software companies are, for example, a “Kotlin shop” — and Kotlin is compatible with the JVM. He added that when he worked at Twitter, they were a “Scala shop” (Scala being another JVM-compatible language).
“You might not want to adopt a new programming language just to make an API call to a language model,” he said. “And that’s what we found, is a lot of established companies are just integrating with the software that they’ve been writing for years — they’re not going out and pulling some new AI framework off the shelf.”
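To make that point concrete, here is a minimal sketch of what "just an API call" can look like from a JVM codebase, using only the JDK's built-in `java.net.http` classes (Java 11 or later). The endpoint URL, the JSON body shape and the class and method names are all hypothetical placeholders. This is neither Freeplay's SDK nor any particular provider's API.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

final class LlmRequestSketch {
    // Hypothetical endpoint: substitute your provider's real URL.
    static final String ENDPOINT = "https://llm.example.com/v1/completions";

    // Minimal escaping for a JSON string literal. A real application
    // would use a JSON library rather than hand-rolled escaping.
    static String jsonEscape(String s) {
        return s.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n");
    }

    // Build (but do not send) a POST request carrying the prompt.
    // The {"prompt": ...} body is an assumed schema for illustration.
    static HttpRequest buildRequest(String apiKey, String prompt) {
        String body = "{\"prompt\": \"" + jsonEscape(prompt) + "\"}";
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .timeout(Duration.ofSeconds(30))
                .header("Authorization", "Bearer " + apiKey)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest("test-key", "Summarize this ticket");
        System.out.println(request.method() + " " + request.uri());
        // To send it for real, pass the request to
        // java.net.http.HttpClient.newHttpClient().send(...).
    }
}
```

Because Kotlin and Scala both run on the JVM, the same classes are callable from those languages without adding a new framework.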
Data Flywheel: More than Observability
I mentioned that, at first glance, Freeplay reminded me of Humanloop, which describes itself as a “collaborative playground” where developers can test out and deploy prompts. I asked Cairns if that’s a fair comparison.
He replied that it’s “a solid comp,” but that Freeplay tends to be used by their customers “after they’re already live in production” — the implication being that it’s less of a “playground” than perhaps Humanloop is.
I observed that both Freeplay and Humanloop include testing and monitoring capabilities, so both resemble a traditional DevOps observability platform. Cairns somewhat agreed, but he noted that the needs of an LLM application differ from those of a traditional application.
“In traditional observability, a lot of the goal is just — hey, what’s happening,” he explained. “But with ML systems, and I think this is true with LLMs, that observability needs — what’s happening — is still there, but it’s actually part of this data flywheel, that helps you start to make the product better in a different way when you’re doing ML.”
The point is that because the quality of an LLM app relies on the underlying data (or how well that data is queried), anything from the software development cycle that helps optimize the application is fed back into the LLMs (or into the prompts/queries). As Cairns put it, Freeplay is “not just an observability platform — observability is a feature that helps with that bigger optimization loop.”
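One way to picture that optimization loop is as an aggregation over logged completions: record which prompt version produced each response and whether it was judged helpful, then use the per-version rates to decide which prompt to keep iterating on. The sketch below assumes a hypothetical `LoggedCompletion` record shape and a simple "helpful rate" metric. It illustrates the idea only, and says nothing about how Freeplay actually stores or scores data.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class FeedbackLoopSketch {
    // Hypothetical log entry: the prompt version that produced a
    // completion, plus a thumbs-up/thumbs-down style judgment.
    static final class LoggedCompletion {
        final String promptVersion;
        final boolean helpful;

        LoggedCompletion(String promptVersion, boolean helpful) {
            this.promptVersion = promptVersion;
            this.helpful = helpful;
        }
    }

    // Observability step: fraction of helpful completions per version.
    static Map<String, Double> helpfulRateByVersion(List<LoggedCompletion> log) {
        return log.stream().collect(Collectors.groupingBy(
                c -> c.promptVersion,
                TreeMap::new,
                Collectors.averagingDouble(c -> c.helpful ? 1.0 : 0.0)));
    }

    // Optimization step: pick the version with the highest helpful rate.
    static String bestVersion(List<LoggedCompletion> log) {
        return helpfulRateByVersion(log).entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElseThrow(() -> new IllegalArgumentException("log is empty"));
    }

    public static void main(String[] args) {
        List<LoggedCompletion> log = List.of(
                new LoggedCompletion("v1", true),
                new LoggedCompletion("v1", false),
                new LoggedCompletion("v2", true),
                new LoggedCompletion("v2", true));
        System.out.println(helpfulRateByVersion(log));
        System.out.println(bestVersion(log));
    }
}
```

The first method is the "what's happening" view. The second is what turns that view into a decision that feeds back into the prompts.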
The Figma Comparison
Similar to a low-code platform, Freeplay is designed to be used by both developers and product or business people. This is where the Figma comparison comes in. It’s a tool that enables professional developers to collaborate with stakeholders in the business, using a web frontend. I asked Cairns who is typically driving this process — the developers or product managers.
“The people that are doing the initial setup have been […] tech leadership,” he replied. “So whether it’s a CTO or an engineering director, you know, they are people who are saying, hey, we want to help give our teams better tools.”
Where developers come in is with the implementation of a project.
“Developers certainly start the work with Freeplay, to do an integration and continue to use it,” Cairns said. “But the way it works is you drop our SDK into an application that you’re building. And we start to manage prompts, like a server-side experiment.”
He compared it to tools like Amplitude or LaunchDarkly, which let you do A/B testing — “where a product manager can enable an experiment.” So once a developer has set up the system, product managers can do that kind of experimentation or testing.
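The "server-side experiment" idea can be sketched as deterministic bucketing: hash an experiment ID and a user ID, then use the result to choose among prompt templates, so a given user always sees the same variant while the experiment runs. Everything here is a hypothetical illustration, including the class name, the `{question}` placeholder and the hard-coded templates. In a hosted tool, the templates would presumably come from the server rather than from application code.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.zip.CRC32;

final class PromptExperimentSketch {
    // Map (experiment, user) to a stable bucket in [0, variantCount).
    static int bucket(String experimentId, String userId, int variantCount) {
        CRC32 crc = new CRC32();
        crc.update((experimentId + ":" + userId).getBytes(StandardCharsets.UTF_8));
        // getValue() is a non-negative long, so the remainder is too.
        return (int) (crc.getValue() % variantCount);
    }

    // Choose this user's prompt template for the experiment.
    static String pickTemplate(String experimentId, String userId, List<String> templates) {
        return templates.get(bucket(experimentId, userId, templates.size()));
    }

    // Fill the hypothetical {question} slot in a template.
    static String render(String template, String question) {
        return template.replace("{question}", question);
    }

    public static void main(String[] args) {
        List<String> templates = List.of(
                "Answer briefly: {question}",
                "Answer step by step: {question}");
        String template = pickTemplate("tone-test", "user-42", templates);
        System.out.println(render(template, "How do I reset my password?"));
    }
}
```

With this split, a developer wires in the selection logic once, and whoever edits the list of templates can change what runs in production without a code deploy.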
Comparing OpenAI’s Developer Platform to Twitter’s
Lastly, I asked Cairns whether he and his co-founder Eric Ryan had taken any lessons from their time working on the Twitter Developer Platform. Both arrived at Twitter in 2014, when it acquired Gnip, a social media API aggregation company.
Twitter, as some of you will recall, had the potential to be a massive app development platform — but they screwed it up in the early 2010s by kneecapping third-party developers. Cairns was careful not to talk too much about that, but he did draw an interesting analogy with the position OpenAI is in now.
“I think OpenAI actually did something really well that Twitter missed an opportunity [to do] back in 2010 before we were there. Which was, they did make a big change yesterday — where it seems like they’re competing with people who have been building other separate chatbots or agents, but they’re also giving them this app store and this opportunity to come [and] monetize on their platform. I thought they did a good job there.”