Fixie and its Agent Approach to Leveraging LLMs

The speed at which startups have grabbed the Large Language Model (LLM) ball and tried to run with it has been quite impressive. Fixie is one of those platforms, run by the type of people qualified to play this sport well.
Fixie’s main idea is to allow users to enhance an LLM with a specific skill or intelligent response, leaving the LLM to manage the query interaction. Access to the new functionality is written into an agent, which is what the user creates and controls. A typical user might have their own platform, whose customers could be given their own ChatGPT-like experience. For example, you might want to offer customers of your ferry company a nicer experience when finding out about sailing times.
First off, there is an obvious problem: GPT-4 is swallowing the world so fast that it might already have the functionality you think you are uniquely offering. While at the moment OpenAI only trains its LLMs on a snapshot of the internet, everyone knows that could change. You don’t want to be in the Lando Calrissian position of praying that the big scary guy doesn’t alter the deal any further. But then again, it is quite possible that OpenAI itself uses agents to enhance its own models, so this is just opening up a likely future path.
The site has a slightly haphazard feel. There is a Discord server and many examples. The SDK seems to be Python-only. There isn’t quite a ‘start here’, but that is the curse of the multisided platform. I looked at the Twitter example the company gives, but it didn’t really work; then I realised that was because Twitter was on fire! Fixie was just another innocent bystander to The Desecration of the Mad King.
The Fixie Workflow
For the moment, let’s say that an agent within Fixie is some way to convert a user’s natural language query into a call to some custom code that creates a response. How does it all hang together? Unusually, the architectural diagram offered is readable, and backs up what you logically think is happening. Once you make your agent, you register it. Fixie then probably uses some type of vector embedding to make it searchable amongst the other agents.
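If that guess is right, the lookup might look something like the sketch below. To be clear, this is my assumption, not Fixie’s code; the embedding model and the registry structure are stand-ins:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# A sketch of embedding-based agent routing, NOT Fixie's actual code:
# each agent registers a description, and queries go to the nearest one.
model = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in embedding model
registry: dict[str, np.ndarray] = {}  # agent name -> description embedding

def register_agent(name: str, description: str) -> None:
    registry[name] = model.encode(description)

def route_query(query: str) -> str:
    """Return the name of the agent whose description best matches the query."""
    q = model.encode(query)
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(registry, key=lambda name: cosine(q, registry[name]))

register_agent("dice", "I'm an agent that rolls virtual dice!")
register_agent("weather", "I answer questions about the weather forecast.")
print(route_query("Roll 3d8"))  # -> "dice", in all likelihood
```

The point is that the match would be semantic rather than keyword-based, which is why loosely worded queries could still land on the right agent.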
So from a user’s perspective, you can make a query against an agent. Whether you use the web frontend that hosts example agents or a curl request in the terminal, you will get a randomly named session whilst the router dispatches your request to the agent. Session as a first-class object certainly improves the experience in the sandbox. Once your query is intercepted, Fixie works internally with GPT-4 (or similar) to parse your natural language query. Later on we will see how the agent utilises an interesting method to do this.
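To make the session idea concrete, here is a rough sketch of the shape I think is in play; the names and structure are mine, not Fixie’s:

```python
import uuid

# My guess at the session-per-query shape described above, not Fixie's code.
class Session:
    def __init__(self, agent_name: str):
        # Every query gets a freshly (randomly) named session.
        self.name = f"session-{uuid.uuid4().hex[:8]}"
        self.agent = agent_name
        self.history: list[tuple[str, str]] = []  # (query, response) pairs

def open_session(query: str, agent_name: str) -> Session:
    # In Fixie the router picks the agent; here the caller supplies it.
    session = Session(agent_name)
    print(f"Opened {session.name} against agent '{agent_name}'")
    return session

session = open_session("Roll 3d8", "dice")
```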
We can go straight to the dice example sandbox to show how this works:
Here is another session with the same agent:
…and another session:
Note how the language variation is well handled, including indirect references — this is the clue that LLM power has been leveraged. Also note how the request is funnelled into a function call.
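The session screenshots don’t reproduce here, but reconstructed from the few-shot format in the agent code below, an exchange looks roughly like this (my illustration, not a captured session):

```
Q: Roll a couple of dice and blow on them first for good luck
Ask Func[roll]: 6 2
Func[roll] says: 3 6
A: You rolled a 3 and a 6, with a total of 9.
```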
We can see the dice agent listed in the GitHub repository:
```python
import random

import fixieai

BASE_PROMPT = """I'm an agent that rolls virtual dice!"""

FEW_SHOTS = """
Q: Roll a d20
Ask Func[roll]: 20 1
Func[roll] says: 12
A: You rolled a 12!

Q: Roll two dice and blow on them first for good luck
Ask Func[roll]: 6 2
Func[roll] says: 4 3
A: You rolled a 4 and a 3, with a total of 7.

Q: Roll 3d8
Ask Func[roll]: 8 3
Func[roll] says: 5 3 8
A: You rolled 5, 3, and 8, for a total of 16.
"""

agent = fixieai.CodeShotAgent(BASE_PROMPT, FEW_SHOTS)


@agent.register_func
def roll(query: fixieai.Message) -> str:
    dsize, num_dice = query.text.split()
    dice = [random.randint(1, int(dsize)) for _ in range(int(num_dice))]
    return " ".join([str(x) for x in dice])
```
Now let’s separate the “few shots” dynamic from its short registered function code. The few shots (or Code Shots) are how the LLM is made to speed-learn the new skill.
Code Shots
This is the special sauce for Fixie, because it uses a natural way to map your query into the heart of an LLM. We know pre-trained LLMs are good few-shot learners once given exemplars.
Look at those statements between the triple quotes above. These are literally examples of queries; both referencing a separate function, and showing the format of the response that we want the LLM to absorb.
With Code Shots, Fixie does the LLM part internally, and leaves the “Func” invocations to be executed in the customer’s infrastructure or in Fixie’s cloud.
When a Code Shot agent receives a query such as “Roll 3d6”, Fixie takes it along with the few-shot examples in the Code Shots manifest and passes them to an LLM for processing. Fixie “selects” the best LLM and prompt to handle the query.
The output of the LLM will be something like Ask Func[roll]: 6 3. This tells Fixie to invoke the function to perform the next step of processing, via a REST call.
Fixie then feeds the function’s response back into the LLM, again with the appropriate prompt and context, to continue processing the query. In this case, the LLM will generate a response like “A: You rolled a 2, a 5 and a 4, for a total of 11.” This is sent back to the client.
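Pieced together, the loop presumably looks something like the sketch below. To be clear, call_llm, the endpoint shape and the JSON payload are my stand-ins, not Fixie’s actual internals:

```python
import requests

# My sketch of the Code Shots loop described above. call_llm, the URL
# shape and the JSON payload are hypothetical stand-ins, not Fixie's API.
def call_llm(prompt: str) -> str:
    ...  # hand the prompt to whichever LLM Fixie selects; return its completion

def call_func(agent_url: str, func_name: str, args: str) -> str:
    # Func invocations are plain REST calls into the agent's own code.
    resp = requests.post(f"{agent_url}/{func_name}", json={"text": args})
    return resp.json()["text"]

def handle_query(base_prompt: str, few_shots: str, query: str, agent_url: str) -> str:
    prompt = f"{base_prompt}\n{few_shots}\nQ: {query}\n"
    while True:
        step = call_llm(prompt).strip()
        if step.startswith("Ask Func["):
            # e.g. "Ask Func[roll]: 6 3" -> function "roll", arguments "6 3"
            func_name = step[len("Ask Func["):step.index("]")]
            args = step.split(":", 1)[1].strip()
            result = call_func(agent_url, func_name, args)
            # Feed the function's answer back in and let the LLM carry on.
            prompt += f"{step}\nFunc[{func_name}] says: {result}\n"
        elif step.startswith("A:"):
            return step[2:].strip()  # the final answer, sent back to the client
        else:
            return step  # anything unexpected: give up gracefully
```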
The Functional Code
The Python code (see the last 5 or so lines where roll is defined) just rolls the dice. Reading the natural language example, we see that the first argument is the type of die (number of faces) and the second argument is intended to be the number of times thrown.
For those, like me, who don’t speak Python, let me pick the 4 lines of code apart. The first line defines the name of the function (it’s called roll) and labels the incoming parameter: a Message whose text should just be two numbers separated by a space.
The second line splits the parameter string into two values, dsize and num_dice, using the string’s split method. At no point have we said anything else, so both remain strings for now. (The text here is an attribute of the incoming Message object, not part of Python itself; either way, the intention is obvious.)
The third, slightly overpacked line is a loop that picks a random integer between 1 and the die size (dsize), inclusive, as many times as we asked (num_dice). The results are written into a list named dice. While not a thing in the languages I do use, the underscore is Python convention for a loop variable whose value is never actually needed.
And the last line returns the resultant throws from the dice list as a single string, separated by spaces. Note that Python doesn’t cast anything implicitly here: the code converts the strings to ints with int() where numbers are needed, and back to strings with str() for the return value. I don’t see any error detection.
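To see those middle lines working in isolation, here they are for a query of “6 3” (the random values are, of course, just one possible outcome):

```python
import random

# Equivalent of the function body, for a query whose text is "6 3".
dsize, num_dice = "6 3".split()        # -> "6", "3" (still strings)
dice = [random.randint(1, int(dsize)) for _ in range(int(num_dice))]
print(dice)                            # e.g. [2, 5, 4]
print(" ".join(str(x) for x in dice))  # e.g. "2 5 4"
```

And since the lack of error detection nagged at me, here is a more defensive version of roll. This is purely my own sketch, reusing the agent and imports from the listing above; it is not anything Fixie ships:

```python
@agent.register_func
def roll(query: fixieai.Message) -> str:
    # My own more defensive variant of the roll function; not Fixie's code.
    parts = query.text.split()
    if len(parts) != 2:
        return "Expected two numbers: die size and number of dice."
    try:
        dsize, num_dice = int(parts[0]), int(parts[1])
    except ValueError:
        return "Both arguments must be integers."
    if dsize < 1 or num_dice < 1:
        return "Die size and dice count must both be at least 1."
    dice = [random.randint(1, dsize) for _ in range(num_dice)]
    return " ".join(str(x) for x in dice)
```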
Fixie is Straightforward, With Complications
Because the functional aspect provided by the agent is separated from the language parsing provided by the LLM, the theory is that you can use the Code Shots to associate an example with a query and response. This clearly works, but there is a lot of “passing off” from one side to the other in this model, so debugging might be quite interesting. However, the very interactive sandbox should make chasing problems down much more tractable.
If I send “Roll a d20” to a different agent, it will not try to use the registry to find the correct agent — it will just shrug and say “sorry, don’t get this.” As I understand it, the first agent is responsible for breaking down your query, but sub-queries can then be handled within the registry.
I did mention that ChatGPT might eat Fixie’s lunch. As it happens, GPT-4 can already roll various dice:
Nevertheless, giving an LLM indirect access to your own private data is still something a lot of people will want in order to enhance their own products, and Fixie potentially gives the customer a lot of control. There are other products in this fast-moving space (for example, Cody) and I don’t think any business models have been tied down yet. So now is a good time to check out the LLM carnival, before it pitches up on your lawn.