All About ‘Bank Python,’ a Finance-Specific Language Fork

5 Dec 2021 6:00am, by

In November, London-based software engineer Cal Paterson published a fascinating “oral history of Bank Python,” describing it as “The strange world of Python, as used by big investment banks … effectively proprietary forks of the entire Python ecosystem.”

In an email interview with The New Stack, Paterson told me he’s worked at more than one institution using its own flavor of Bank Python — and the “strong similarities” suggest IT workers carried the banner of its basic ideology when they moved from one investment bank to another.

“If you know a certain important function existed at a previous bank in a certain file, it probably exists in the same file in your current bank,” he said.

His own work history includes writing Python in proprietary in-house environments at JP Morgan (Athena) and Bank of America Merrill Lynch (Quartz), as well as a four-month gig as a Python developer at Citibank. (Although Paterson also adds that even within the investment banking industry, “some banks have it and some do not.”)

To avoid revealing employer-specific details, his blog post just presented a fictional amalgamation of what he’s seen throughout his career. But it still offers a great chance to explore an exotic fork of an entire programming language, which is almost completely unknown to the general public.

Underscoring the intrigue in his essay, Paterson writes that “foreign systems, like foreign countries, can be mind-expanding when experienced firsthand.”

No Filesystems, Just a Big Database of Objects

The first big surprise: Bank Python doesn’t have access to a filesystem! But it also doesn’t seem to need one, since most of what it would operate on already exists in a massive database of objects, representing things like trade and market data. These objects represent everything from simple bonds to more complex derivative instruments.

“If some bad news about a company is published and a credit agency downgrades their credit rating, then someone in bonds will update the relevant Bond object,” Paterson’s essay explained.

And this instantly propagates to any other calculations that include the value of that bond object, just as changing a cell’s value in an Excel spreadsheet automatically causes any formulas that use that cell to update their own values. (This also allows the implementation of handy additional features like “time-traveling” through your data to see its values in the past or future.)
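The spreadsheet-style propagation described here can be illustrated with a minimal sketch. It is not taken from any bank's system: the Cell class, its methods, and the bond figures are all hypothetical, and the naive recursive update would recompute shared dependents more than once in a larger graph.

```python
class Cell:
    """A hypothetical spreadsheet-like value that notifies its dependents."""

    def __init__(self, value=None, formula=None, inputs=()):
        self._value = value
        self._formula = formula
        self._inputs = list(inputs)
        self._dependents = []
        # Register this cell with every cell it reads from.
        for cell in self._inputs:
            cell._dependents.append(self)
        if formula is not None:
            self._recompute()

    @property
    def value(self):
        return self._value

    def set(self, value):
        """Change a plain value and push the change downstream."""
        self._value = value
        self._notify()

    def _recompute(self):
        self._value = self._formula(*(c.value for c in self._inputs))

    def _notify(self):
        for dependent in self._dependents:
            dependent._recompute()
            dependent._notify()


# Illustrative numbers only.
bond_price = Cell(98.5)
holding = Cell(1_000)
position_value = Cell(formula=lambda p, q: p * q, inputs=[bond_price, holding])

bond_price.set(91.0)  # e.g. someone updates the bond after a downgrade
print(position_value.value)
```

Updating bond_price is enough: position_value is recalculated without the caller having to know that it depends on the bond.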

An anthropologist might argue this structure holds clues to the culture of investment banks. Interestingly, Paterson described this stack in his essay as “heavily influenced by the technological path dependency of the financial sector, which is another way of saying: there is a lot of MS Excel.”

Through the years, he acknowledged in his essay, many have had logical business reasons for a company-wide switch to Kubernetes, microservices and a service mesh. But weaning reluctant users off their familiar Microsoft spreadsheet “takes away that basic agency of those Excel users, who no longer understand the business process they run and now have to negotiate with ludicrous technology dweebs for each software change.”

He continued, “Using simple Python functions, in a source-controlled system, is a better middle ground than the modern-day equivalent of Java 2 Enterprise Edition.

“Financiers are able to learn Python, and while they may never be amazing at it they can contribute to a much higher level and even make their own changes and get them deployed.”

So Bank Python programs first just open a connection to that database of objects — and then away they go. But it’s even crazier than that, since even your application’s main source code gets stored in that database, where it’s run by a single jobs-processing “runner.”
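One way to picture source code living in a database rather than on a filesystem is a custom import hook. The sketch below uses Python's standard importlib machinery, with a plain dictionary standing in for the object database. The store, the module name, and the accrued function are all hypothetical, and nothing here is claimed to match how any bank actually implements it.

```python
import importlib.abc
import importlib.util
import sys

# Stand-in for the object database: module name -> source text (hypothetical).
CODE_STORE = {
    "bankpy_demo_bondmath": (
        "def accrued(coupon, days):\n"
        "    return coupon * days / 360\n"
    ),
}


class StoreFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Finds and loads modules whose source is held in CODE_STORE."""

    def find_spec(self, fullname, path=None, target=None):
        if fullname in CODE_STORE:
            return importlib.util.spec_from_loader(fullname, self)
        return None  # let the normal import system handle everything else

    def create_module(self, spec):
        return None  # use Python's default module object

    def exec_module(self, module):
        source = CODE_STORE[module.__name__]
        code = compile(source, f"<store:{module.__name__}>", "exec")
        exec(code, module.__dict__)


# Appended last, so ordinary modules are still found first.
sys.meta_path.append(StoreFinder())

import bankpy_demo_bondmath

print(bankpy_demo_bondmath.accrued(6.0, 90))
```

From the caller's point of view the import statement looks completely ordinary, which fits the essay's picture of all code being "just an import away."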

In his essay, Paterson compared it to something like Jenkins or systemd, and noted that this setup has several advantages. “It can restart your software if it crashes and sends out alerts if it keeps crashing. It stores logs. It understands dependencies between jobs (much like systemd does) so if the job that generates the data your job needs fails, your job doesn’t even try starting up but instead fires more alerts.”
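The dependency behavior Paterson describes can be sketched in a few lines. This is an illustration under stated assumptions, not the real runner: run_jobs, the job names, and the alert strings are all hypothetical, and restarts and log storage are left out. It relies on graphlib from the standard library (Python 3.9 or later) to order the jobs.

```python
from graphlib import TopologicalSorter


def run_jobs(jobs, deps):
    """Run jobs in dependency order; skip any job whose upstream did not succeed.

    jobs: dict mapping job name -> zero-argument callable
    deps: dict mapping job name -> set of job names it depends on
    Returns (status, alerts).
    """
    graph = {name: deps.get(name, set()) for name in jobs}
    status = {}
    alerts = []
    for name in TopologicalSorter(graph).static_order():
        blocked_by = sorted(d for d in graph[name] if status.get(d) != "ok")
        if blocked_by:
            # Do not even try to start; raise an alert instead.
            status[name] = "skipped"
            alerts.append(f"{name}: not started, upstream not ok: {blocked_by}")
            continue
        try:
            jobs[name]()
            status[name] = "ok"
        except Exception as exc:
            status[name] = "failed"
            alerts.append(f"{name}: crashed with {exc!r}")
    return status, alerts


def load_market_data():
    raise RuntimeError("feed unavailable")  # simulated failure


def price_book():
    return "priced"


status, alerts = run_jobs(
    jobs={"load_market_data": load_market_data, "price_book": price_book},
    deps={"price_book": {"load_market_data"}},
)
print(status)
print(alerts)
```

Because load_market_data fails, price_book is never started, and both events produce an alert.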

But most importantly, it lets you run code in a major security-conscious institution just by creating a very simple initialization file. “This is a big deal because negotiating anything in large bank is an exercise in frustration: lead times on hardware can be measured in months,” Paterson wrote. “Getting people to agree with you takes, of course, much longer than that.”

Customized Python, ‘Invented’ In-House

Bank Python implementations also seem to be using their own proprietary data structure for tables, offering faster access to medium-sized datasets (while storing them more efficiently in memory).

“Some implementations are lumps of C++ (not atypical of financial software) and some are thin veneers over sqlite3,” Paterson said. (His friend Salim Fadhley, a London-based developer, has even released an all-Python version of the table data structure called eztable.)
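To give a rough sense of what a "thin veneer over sqlite3" could look like, here is a minimal sketch using Python's standard sqlite3 module. The Table class, its append and where methods, and the trade data are hypothetical. This is not eztable's API, nor any bank's.

```python
import sqlite3


class Table:
    """Hypothetical in-memory table that delegates storage to sqlite3."""

    def __init__(self, name, columns):
        self.name = name
        self.columns = list(columns)
        self._db = sqlite3.connect(":memory:")
        cols = ", ".join(f'"{c}"' for c in self.columns)
        self._db.execute(f'CREATE TABLE "{name}" ({cols})')

    def append(self, row):
        """Insert one row given as a dict keyed by column name."""
        placeholders = ", ".join("?" for _ in self.columns)
        self._db.execute(
            f'INSERT INTO "{self.name}" VALUES ({placeholders})',
            [row[c] for c in self.columns],
        )

    def where(self, column, value):
        """Return rows (as dicts) whose column equals value, in insertion order."""
        if column not in self.columns:
            raise KeyError(column)
        cursor = self._db.execute(
            f'SELECT * FROM "{self.name}" WHERE "{column}" = ? ORDER BY rowid',
            (value,),
        )
        return [dict(zip(self.columns, r)) for r in cursor]


# Illustrative data only.
trades = Table("trades", ["id", "desk", "notional"])
trades.append({"id": 1, "desk": "rates", "notional": 5_000_000})
trades.append({"id": 2, "desk": "credit", "notional": 2_500_000})
trades.append({"id": 3, "desk": "rates", "notional": 1_000_000})

rates_trades = trades.where("desk", "rates")
print(sum(row["notional"] for row in rates_trades))
```

The wrapper gives Python code a small table-shaped interface while sqlite3 handles storage and filtering.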

Paterson concludes that while most programming has a code-first approach, Bank Python would be characterized as data-first. While it’s ostensibly object-oriented, “you group the data into tables and then the code lives separately.”

Needless to say, Bank Python inevitably ends up getting its own internal integrated development environment (IDE) to handle all of its unique configuration quirks, and it even has its own unique version-control system for code. Paterson acknowledged the uncharitable assessment that it’s all just a grand exercise in distrusting anything that originated outside the company (known in the programming field as the fearful “Not Invented Here” syndrome).

The biggest downside may be for the developers who work there, since every year you spend in your employer’s bespoke monoculture, “the skills you need to interact with normal software atrophy.” Just one example: “When everything is in the same repository and all code is just an import away, software packaging just does not come up.”

Of course, after years of working at banks, Paterson also experienced a more personal problem: “existential ennui arising from prolonged exposure to Windows 7 and MS Outlook 2010.”

Developers Share ‘Bank Python’ Tales

Paterson’s blog post captured the interest of developers on social media, garnering 546 upvotes in Reddit’s programming forum, and another 864 upvotes on Hacker News. “Having worked in an investment bank before this brought some flashbacks,” admitted one Hacker News commenter.

“Honestly, this sounds a lot less insane than a lot of ‘conventional’ stacks,” another reader commented.

But it’s a glimpse into an alternate universe where entire software systems are ultimately home-brewed in-house. One Hacker News comment came from Sean Hunter, a former vice president on the Goldman Sachs strategist (or “strats”) team, who posted that he’d been the one who’d introduced Python there around 2002 to 2003.

And Hunter shared some more memories with the eFinancialCareers site, remembering how over his next nine years at Goldman Sachs they’d even created an ahead-of-its-time DevOps environment deploying code updates with a CI/CD pipeline — written in Perl.

“At that time it was 20 million lines of C++ and 10 million lines of Java and we’d push it out to all our machines worldwide,” Hunter said. He believes that, at the time, there was nothing like it anywhere else in the world.

And there was also a cloud-like internal computing grid where calculations were spread across different servers. (In a later comment, Hunter recalled 667 different developers committing in just one two-week cycle.) “Everything we did was top secret and no one outside would ever know about it,” he told eFinancialCareers about his time at Goldman Sachs. “Banks build huge amounts of things for specific use cases that don’t exist yet and then the world catches up.”

Respect for ‘Bank Python’ Contributors

Paterson enjoyed Hunter’s comments on Hacker News, and said he also heard from people who work outside of the finance industry saying they liked the idea of Bank Python. “I think there are a number of ‘k8s dissidents’ who dislike the direction that making software has gone in.”

He also suggested that, today, the company Beacon is essentially marketing “a white-label version of Bank Python.”

But underneath his reaction seems to be a true appreciation for all the care and effort that went into building an entire Bank Python ecosystem.

Python’s core team has since grown and become more professionalized and less volunteer-driven. But Paterson estimates that, in 2010, a large bank that forked the Python ecosystem (including the interpreter) could have been deploying more man-hours than Python’s then-smaller, all-volunteer core team.

In our interview, Paterson described the system as “brilliant,” and told me he wished he could still use it.

