Meta Adds Cool, New Features to Python 3.12
Python 3.12 has been released with new features and performance improvements, some contributed by Meta, which is showing its continued support for open source.
“Open source at Meta is an important part of how we work and share our learnings with the community,” wrote Carl Meyer, a software engineer at Meta unit Instagram, in a blog post about Meta’s contributions to Python 3.12.
Python is one of the primary programming languages at Meta and is used for its AI/ML work, highlighted by the company’s development of PyTorch, a machine learning framework used for computer vision, natural language processing and more. Python is also an essential part of the infrastructure at Meta, as well as the Instagram server stack.
Meta has been working closely with the community to introduce new features and optimizations to improve Python performance and to allow easier third-party experimentation with Python runtime optimizations.
One of the new features that Meta added to Python 3.12 is something called Immortal Objects.
Meyer noted that Immortal Objects, defined in Python Enhancement Proposal (PEP) 683, make it possible to create Python objects that don’t participate in reference counting and that live until the Python interpreter shuts down. The original motivation for this feature was to reduce memory use in the forking Instagram web server workload by reducing copy-on-writes triggered by reference-count updates, he said.
Moreover, Immortal Objects are an important step toward truly immutable Python objects that can be shared between Python interpreters with no need for locking, for example via the global interpreter lock (GIL). This can enable improved Python single-process parallelism, whether via multiple sub-interpreters or GIL-free multithreading, Meyer added.
“Now, objects can bypass reference count checks and live throughout the entire execution of the runtime, unlocking exciting avenues for true parallelism,” wrote Eddie Elizondo, another Instagram software engineer, in a separate blog post about Immortal Objects. “At Meta, we use Python (Django) for our frontend server within Instagram. To handle parallelism, we rely on a multiprocess architecture along with asyncio for per-process concurrency. However, our scale, both in terms of business logic and the volume of handled requests, can cause an increase in memory pressure, leading to efficiency bottlenecks.”
The problem of state mutation of shared objects is at the heart of how the Python runtime works, Elizondo said. Given that it relies on reference counting and cycle detection, the runtime requires modifying the core memory structure of the object, which is one of the reasons the language requires a global interpreter lock (GIL).
“To get around this issue, we introduced Immortal Objects — PEP-683,” he wrote. “This creates an immortal object (an object for which the core object state will never change) by marking a special value in the object’s reference count field. It allows the runtime to know when it can and can’t mutate both the reference count fields and GC header.”
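The effect can be observed from ordinary Python code. The following is a minimal sketch, assuming CPython, where `sys.getrefcount` reports an object's reference count; the helper name `refcount_delta` is made up for this illustration. On Python 3.12 and later, `None` is immortal, so creating a thousand new references to it should leave the reported count unchanged, while the count of an ordinary object rises by a thousand.

```python
import sys


def refcount_delta(obj, copies=1000):
    """Return how much the reported refcount changes after making `copies` new references."""
    before = sys.getrefcount(obj)
    refs = [obj] * copies  # each list slot holds one new reference to obj
    after = sys.getrefcount(obj)
    del refs
    return after - before


# An ordinary (mortal) object: every new reference updates its refcount field.
mortal_delta = refcount_delta(object())

# None: immortal on Python 3.12+, so its refcount field is left alone there.
none_delta = refcount_delta(None)

print("ordinary object:", mortal_delta)
print("None:", none_delta)
```

On interpreters older than 3.12 both deltas are expected to be nonzero, which is the memory write that triggers copy-on-write in a forked worker process.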
Other New Features
Other new features Meta contributed to Python 3.12 include:
- Type system improvements
- Performance optimizations
- New benchmarks
- Cinder hooks
Regarding type system improvements, the engineering team behind Pyre, an open source Python type-checker, authored and implemented PEP 698 to add a @typing.override decorator, which helps avoid bugs when refactoring class inheritance hierarchies that use method overriding.
“Python developers can apply this new decorator to a subclass method that overrides a method from a base class,” Meyer said. “As a result, static type checkers will be able to warn developers if the base class is modified such that the overridden method no longer exists.”
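As a small illustration of the decorator, the sketch below uses two hypothetical classes, `Notifier` and `EmailNotifier`, which are not taken from Meta's code. The `try`/`except` fallback is an assumption added so the snippet also runs on interpreters older than 3.12, where `typing.override` does not exist.

```python
try:
    from typing import override  # available in Python 3.12+
except ImportError:
    def override(method):
        """No-op stand-in for older Pythons (assumption for portability)."""
        return method


class Notifier:
    def send(self, message: str) -> str:
        return f"base: {message}"


class EmailNotifier(Notifier):
    @override
    def send(self, message: str) -> str:
        # If Notifier.send is renamed or removed, a type checker that
        # supports PEP 698 should report this method as overriding nothing.
        return f"email: {message}"


result = EmailNotifier().send("hello")
print(result)
```

The decorator does not enforce anything at runtime; the check happens when a static type checker analyzes the code.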
On the performance side, Meyer explained that in previous Python versions, all comprehensions were compiled as nested functions, and every execution of a comprehension allocated and destroyed a single-use Python function object.
“We landed PEP 709 for Python 3.12, which inlines all list, dict, and set comprehensions for better performance, up to two times better in the best case,” he said. “The implementation and debugging of PEP 709 also uncovered a pre-existing bytecode compiler bug that could result in silently wrong code execution in Python 3.11, which we fixed.”
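One way to see the difference is to check whether a function's compiled constants contain a separate code object for the comprehension. This is a rough sketch, assuming CPython; the function names are invented for the example. On Python 3.12 and later the list comprehension should have no nested code object, while a generator expression, which PEP 709 does not inline, still does.

```python
import sys
import types


def squares_list(n):
    return [i * i for i in range(n)]  # list comprehension


def squares_sum(n):
    return sum(i * i for i in range(n))  # generator expression


def has_nested_code(func):
    """True if func's constants include a separate code object (a nested function body)."""
    return any(isinstance(c, types.CodeType) for c in func.__code__.co_consts)


list_comp_nested = has_nested_code(squares_list)
genexpr_nested = has_nested_code(squares_sum)

print("Python", sys.version_info[:2])
print("list comprehension compiled as nested function:", list_comp_nested)
print("generator expression compiled as nested function:", genexpr_nested)
```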
Meanwhile, for several years, Meta shared its work on Python and CPython through its open source Python runtime, Cinder. The company has also been working closely with the Python community to introduce new features and optimizations to improve Python’s performance and to allow third parties to experiment with Python runtime optimization more easily, Meyer said.
In addition, he said some parts of Cinder (its JIT compiler and Static Python) wouldn’t make sense as part of upstream CPython (because of limited platform support, C versus C++, semantic changes, and just the size of the code), so Meta’s goal is to package these as an independent extension module, CinderX.
This requires a number of new hooks in the core runtime, Meyer noted, adding that Meta landed many of these hooks in Python 3.12:
- An API to set the vectorcall entrypoint for a Python function. This gives the JIT an entry point to take over execution for a given function.
- Dictionary watchers, type watchers, function watchers, and code object watchers. All of these allow the Cinder JIT to be notified of dynamic changes that might invalidate its assumptions, so its fast path can remain as fast as possible.
- Extensibility in the code generator for CPython’s core interpreter that will allow Static Python to easily regenerate an interpreter with added Static Python opcodes, and a C API to visit all GC-tracked objects, which will allow the Cinder JIT to discover functions that were created before it was enabled.
- A thread-safe API for writing to perf-map files. Perf-map files allow the Linux perf profiler to give a human-readable name to dynamically generated sections of machine code, e.g. from a JIT compiler. This API will allow the Cinder JIT to safely write to perf-map files without colliding with other JITs or with the new Python 3.12 perf trampoline feature.
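The watcher hooks are part of CPython's C API and cannot be called from Python code, but the idea behind them can be sketched in pure Python. The example below is only an analogy under that assumption: `WatchedDict` and `CachedLookup` are hypothetical names, and the code does not use or reproduce the real API. It shows a cached fast path that stays valid until a watched dictionary reports a change.

```python
class WatchedDict(dict):
    """Toy dict that reports item writes and deletions to registered callbacks.

    Only __setitem__ and __delitem__ are hooked; update(), pop() and
    similar methods are not covered in this sketch.
    """

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._watchers = []

    def add_watcher(self, callback):
        self._watchers.append(callback)

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        for callback in self._watchers:
            callback("set", key)

    def __delitem__(self, key):
        super().__delitem__(key)
        for callback in self._watchers:
            callback("delete", key)


class CachedLookup:
    """Caches one lookup and drops the cache when the watched key changes."""

    def __init__(self, namespace, key):
        self.namespace = namespace
        self.key = key
        self.cached = None
        self.valid = False
        namespace.add_watcher(self._on_change)

    def _on_change(self, event, key):
        if key == self.key:
            self.valid = False  # assumption broken: next get() takes the slow path

    def get(self):
        if not self.valid:
            self.cached = self.namespace[self.key]  # slow path: real lookup
            self.valid = True
        return self.cached  # fast path: reuse the cached value


ns = WatchedDict(limit=10)
lookup = CachedLookup(ns, "limit")
first = lookup.get()   # slow path, fills the cache
ns["limit"] = 20       # watcher fires and invalidates the cache
second = lookup.get()  # slow path again, picks up the new value
print(first, second)
```

A JIT plays the role of `CachedLookup` here: it can keep optimized code in place and only discard it when a watcher reports that a dictionary, type, function or code object it depends on has changed.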
The Python Performance Benchmark suite is the standard set of benchmarks used in open source Python optimization work. During the 3.12 development cycle, Meta contributed several new benchmarks to it so that it more accurately represents the workload characteristics the company sees.
The company added:
- A set of async_tree benchmarks that better model an asyncio-heavy workload.
- A pair of benchmarks that exercise comprehensions and super() more thoroughly, which were blind spots of the existing benchmark suite.
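To give a sense of what an asyncio-heavy, tree-shaped workload looks like, here is a minimal sketch. It is not the benchmark code from the suite; the function `visit` and its `depth` and `branching` parameters are assumptions chosen for illustration. Each node awaits all of its children concurrently, and each leaf yields to the event loop as a stand-in for I/O.

```python
import asyncio


async def visit(depth, branching):
    """Recursively fan out into child coroutines; return the number of leaves reached."""
    if depth == 0:
        await asyncio.sleep(0)  # yield to the event loop, standing in for I/O
        return 1
    counts = await asyncio.gather(
        *(visit(depth - 1, branching) for _ in range(branching))
    )
    return sum(counts)


# A tree of depth 4 with 3 children per node has 3 ** 4 leaves.
leaf_count = asyncio.run(visit(depth=4, branching=3))
print(leaf_count)
```

Workloads of this shape stress task creation, scheduling and `gather` overhead rather than raw computation.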
“Meta’s work with the Python community doesn’t end with the 3.12 release,” Meyer said. “We are currently discussing a new proposal, PEP 703, with the Python Steering Council to remove the GIL and allow Python to run in multiple threads in parallel. This update could greatly help anyone using Python in a multi-threaded environment.”
Finally, Meta’s involvement with the Python community also goes beyond code. In 2023, the company continued supporting the Developer in Residence program for Python and sponsored events like PyCon US, Meyer said.