Guido van Rossum on Types, Speed and the Future of Python
Where will Python be in 100 years? It’s a question MIT-based AI researcher Lex Fridman posed to Python creator Guido van Rossum towards the end of a wide-ranging, three-hour interview.
It’s just one of the ways they explored how a popular programming language evolves — along with specific questions about Python’s own future. How will they improve Python’s standard library? Will Python ever see a static type checker? What are the prospects for improving Python’s performance-slowing global interpreter lock? And what steps would they take to avoid another painful breaking change like the transition to Python 3…?
But along the way, Fridman also asked van Rossum the more hypothetical question about the language’s role in a distant future a century down the line. “Do you ever imagine a future of human civilization — we’re living inside the metaverse, on Mars, humans and robots everywhere.
“What part does Python play in that?”
The Year 2122
Python has been around for more than 30 years, so van Rossum has already seen the ebb and flow of popular programming languages. So in response to Fridman’s question, van Rossum answered that given enough decades, the immensely popular Python will “eventually become a legacy language — that plays an important role, but that most people have never heard of and don’t need to know about.” While people may build on top of it, Python would be buried somewhere further down, in this hypothetical world with many layers of abstractions — continuing our ongoing progression. “I mean, most programmers nowadays rarely need to do binary arithmetic, right?”
The meta-message seems to be van Rossum remains confident in the language’s enduring appeal — and here in the present, van Rossum says he isn’t actively watching for gaps that need filling in Python’s standard library. “Usually when something’s missing from the standard library nowadays, it’s a relatively new idea, and there is a third-party implementation — or maybe possibly multiple third-party implementations. But they evolve at a much higher rate than they could when they’re in the standard library… I like that there is a lively package ecosystem.”
Now that he’s stepped down from his role as “Benevolent Dictator for Life” — transferring control of the language in 2019 to a Python Steering Council — van Rossum still sees a continuity in the clarity of Python’s vision for the future. The advantage of the old hierarchy was its predictability — with Guido charting “a clear, fairly straight path.”
“Fortunately, the successor structure with the steering council has found a similar way of leading the community in a fairly steady direction without stagnating.”
“And for me personally, it’s more fun, because there are things that I can just ignore! ‘Oh, there’s a bug in multiprocessing? Let someone else decide whether that’s important to solve or not!’ I’ll stick to typing and async IO and the faster interpreter.”
Van Rossum is now part of a team at Microsoft working to improve the language’s performance. And in October a Microsoft blog post reported that Python 3.11 had brought speedups of 10-60% to some parts of the language. When asked how that was achieved, van Rossum felt they’d fixed design decisions that date back to the origins of Python. In the search for ways to speed up Python, the “low-hanging fruit” was the interpreter, where they’d focused almost all of their efforts.
One example? Something as simple as a plus sign has multiple meanings, with entirely different “add” functions depending on the kind of data being added. Was it an ordinary integer or a floating point number (with additional digits after the decimal point) — or was it a string? Or was it a combination of different data types? “We know statistically that the outcome is almost always yes, they are both integers,” van Rossum says, “so we quickly make that check, and then we proceed with the Add Integer operation.
“And then there is a fallback mechanism where we say, ‘Oops, one of them wasn’t an integer….'”
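The check-then-fall-back idea can be illustrated in plain Python. This is only a sketch of the concept, not how CPython does it (the real work happens inside the interpreter itself), and the function names `specialized_add` and `generic_add` are hypothetical, chosen for this example.

```python
import operator


def generic_add(left, right):
    """Slow path: let Python's normal dispatch pick the right kind of 'add'."""
    return operator.add(left, right)


def specialized_add(left, right):
    """Guess that both operands are ints; fall back if the guess is wrong."""
    # Quick check for the statistically common case: two plain integers.
    if type(left) is int and type(right) is int:
        return int.__add__(left, right)  # the "Add Integer" operation
    # "Oops, one of them wasn't an integer": use the general mechanism.
    return generic_add(left, right)


print(specialized_add(2, 3))        # 5
print(specialized_add(1.5, 2))      # 3.5
print(specialized_add("ab", "cd"))  # abcd
```

The result is the same on either path; only the amount of work needed to reach it differs.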
It all comes down to something fundamental about the language: that Python doesn’t require its programmers to declare a variable’s type. This makes life easier for programmers, but it costs the interpreter efficiency, because types have to be checked while the program runs. “So we’re trying to make the interpretation more efficient without losing the super-dynamic nature of the language. That’s always the challenge.”
But what about the possibility of a Python that instead just offered static typing, with variable types declared in the code, to speed up performance? Fridman pointed van Rossum to mypy, Python’s experimental (and optional) static type checker, asking “where does mypy stand now, and what’s the future of static typing in Python?”
“It is still controversial,” van Rossum says. “But it is much more accepted than when mypy and Python Enhancement Proposal #484 were young….” Introduced in 2014, PEP 484 launched “a very productive year where many hundreds of messages were exchanged debating the merits of every aspect of that…” (Fridman cited estimates that 20 to 30% of Python 3 codebases now use type hints.)
And van Rossum noted that Google, Facebook and Microsoft have since each developed their own static type checker for Python — which speaks to the interest in it. “My assumption is that many, many people developing Python software professionally, for some kind of production situation, are using a static type checker. Especially anybody who has a continuous integration cycle — probably, one of the steps in their testing routine that happens for basically every commit is ‘Run a static type checker’…. So I think it’s a pretty popular topic.”
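For readers who haven’t seen them, type hints in the style of PEP 484 look roughly like the sketch below. The function `total_price` is a made-up example, not something from the interview. The key point is that Python itself does not enforce the annotations; a separate tool such as mypy reads them and reports mismatches before the code runs.

```python
def total_price(unit_price: float, quantity: int) -> float:
    """Annotated in the PEP 484 style; Python ignores the hints at run time."""
    return unit_price * quantity


# A correctly typed call.
print(total_price(2.5, 4))  # 10.0

# The hints are stored on the function, where tools can inspect them.
print(total_price.__annotations__["quantity"])  # <class 'int'>

# A static type checker would flag this call, because a str is passed where
# a float is declared. Python runs it anyway (str * int repeats the string).
print(total_price("ab", 2))  # abab
```

That last call is the kind of mistake a type-checking step in a continuous integration run is meant to catch on every commit.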
So is there going to be a future where a static type checker gets integrated into the language, Fridman asks. “Nobody is currently excited about doing any work towards that,” answers van Rossum, before adding “That doesn’t mean that five or 10 years from now, the situation isn’t different.”
But he also points out that since new releases of Python only happen once a year, type checkers “evolve at a much higher speed than Python and its annotation syntax.”
Van Rossum adds that once you introduce a new feature into a language, you have to assume people are using it. Which means, “Once we’ve all agreed that we are going to put some new syntax in, we can never take it back. At least deprecating an existing feature takes many releases…”
More Possible Futures
The conversation turned to Python’s infamous Global Interpreter Lock, which prevents “race conditions” by allowing only one thread into the interpreter at a time. Because threads can’t run Python code on several cores at once, the lock is often blamed for slowing Python’s performance. (It’s a design decision that van Rossum traces to the days when multicore CPUs were rare.)
When asked if there are ideas for removing it from Python (maybe replacing it with multiple subinterpreters), van Rossum answers “Yeah, there are a couple of possible futures there. The most likely future is that we’ll get multiple subinterpreters, which each run a completely independent Python program, but there’s still some benefit of faster communication between those programs…”
But that opens up the possibility of those dreaded concurrency bugs surprising programmers — which van Rossum calls “the downfall of the multithreaded programming model.”
And here van Rossum reiterated his position that “Concurrency bugs are just harder… It would be nice if Python had no global interpreter lock, and it had so-called free threading — but it would also cause a lot more software bugs.”
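To make the worry concrete, here is a minimal sketch of the classic shared-counter problem, using Python’s standard `threading` module. The function `count_with_threads` is a hypothetical example, not code discussed in the interview. Even with the Global Interpreter Lock in place, application code that shares data between threads still needs its own locking, and free threading would make that care matter more.

```python
import threading


def count_with_threads(num_threads=4, increments=10_000):
    """Increment one shared counter from several threads, guarded by a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            # "counter += 1" is a read, an add, and a write. Without the lock,
            # two threads can interleave those steps and lose an update.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter


print(count_with_threads())  # 40000
```

With the lock, the total is always the number of threads times the number of increments. Remove it, and the result depends on how the threads happen to be scheduled, which is what makes such bugs hard to reproduce.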
And yet “There is still a possible future where we are actually going to, or where we could experiment at least with that.” He notes Facebook’s fork of CPython with an improved interpreter that eliminates the lock entirely (along with other optimizations). “The single-threaded case doesn’t run too much slower, and multi-threaded cases will actually use all the cores that you have. So that would be an interesting possibility — if we would be willing as Python core developers to actually maintain that code indefinitely. And if we’re willing to put up with the additional complexity of the interpreter, and the additional overhead for the single-threaded case.”
But it also might not happen. “I’m personally not convinced that there are enough people needing the speed of multiple threads with their Python programs that it’s worth it to take that performance hit and that complexity hit. And I feel that the Global Interpreter Lock is a pretty nice, ‘Goldilocks’ point between no threads and all threads all the time. But not everybody agrees on that. So that is definitely a possible future.”
And meanwhile, “The sub-interpreters look like a fairly safe bet for Python 3.12, so say a year from now.”
When Python 2 Became Python 3
Looking back over Python’s history, Fridman remembered Python’s long transition to Python 3 (which wasn’t backward compatible with Python 2). Fridman asked if such a break would ever happen again — and immediately van Rossum stressed that even hypothetically, “If there is going to be one, we’ll plan the transition very differently — because clearly, we underestimated the pain the transition caused for our users in the Python 3 case. And had we known, we could have designed Python 3 somewhat differently, without making it any worse. We just thought that we had a good plan. But we underestimated where — what the users were capable of when it comes to that kind of transition….”
“If we’re going to have a Python 4, we’re going to have to have both a different reason for having that — and a different process for managing the transition!”