Cython Offers the Ease of Python, the Speed of C++
Python is the preferred programming language for working with massive data sets, making it the go-to choice for machine learning, artificial intelligence, and statistical analysis.
But it has flaws: speed is one of its main weaknesses, and its inability to interact directly with hardware is another. C, on the other hand, is faster and can interact with hardware, but has a steep learning curve.
Cython, a superset of Python, bridges the gap between Python and C or C++. Its aim is to make writing C extensions for Python as easy as Python itself. The rationale is that C extensions compiled as stand-alone modules run much faster than equivalent code run through the Python interpreter.
Cython developers released Cython 3.0 earlier this month with some noteworthy improvements.
A recent blog post by Mike James did a great job of covering the basics of the latest Cython release. In short, Cython expanded the use of pure Python mode, strengthened its NumPy compatibility, and made internal updates to enhance future compatibility with Python.
Cython is an optimizing static compiler for both the Python programming language and the extended Cython programming language (based on Pyrex, a Python-like language for rapidly and easily writing Python extension modules). It provides developers the ability to write Python code that calls to and from C or C++ natively.
By using Cython, developers can get plain C performance from readable Python code by adding static type declarations. These efficiencies help Python work faster with large data sets. Cython also integrates natively with existing code and data from legacy, low-level or high-performance applications and libraries.
The list below is a non-exhaustive set of highlights from Cython 3.0.
Expanded Pure Python Mode
Historically, Cython used its own syntax, a combination of Python and C-style type declarations. This limited developers’ ability to troubleshoot and debug with Python tooling, which doesn’t understand Cython’s syntax. As a solution, Cython developers created “pure Python mode.”
Pure Python mode is an alternative syntax that’s fully compatible with Python’s syntax. This means developers can use their existing linting and code analysis tools on Cython code. With the expanded pure Python mode, the vast majority of Cython’s functionality is now exposed in pure Python mode, including functions for calling external C libraries.
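To make the idea concrete, here is a minimal sketch of what a pure Python mode module might look like. The function name mean_of_squares is made up for illustration, and the try/except fallback is my own assumption so the file also runs where Cython isn't installed; it is not part of Cython and would likely be dropped in a real build.

```python
# Pure Python mode sketch: ordinary Python syntax, with Cython types as annotations.
try:
    import cython  # available in plain Python once Cython is installed
except ImportError:
    # Assumption for illustration only: a stand-in so the sketch still runs
    # as plain Python when Cython is not installed.
    from types import SimpleNamespace
    cython = SimpleNamespace(int=int, double=float)


def mean_of_squares(n: cython.int) -> cython.double:
    """Return the average of i*i for i in 0..n-1 (hypothetical example)."""
    total: cython.double = 0.0  # a C double once compiled, a float otherwise
    i: cython.int               # a C int loop counter once compiled
    for i in range(n):
        total += i * i
    return total / n if n > 0 else 0.0
```

Because the file is valid Python, linters and type checkers can read it as-is, and calling mean_of_squares(4) in a regular interpreter returns 3.5. The same source could then be handed to Cython, where the annotations act as static type declarations.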
Deeper NumPy Compatibility
NumPy is a widely used Python library that focuses on scientific computing. NumPy provides a multidimensional array object, various derived objects, and an assortment of routines centered around performing quick operations on arrays. Developers can now write NumPy ufuncs directly in Cython. A simple numerical function written in Cython can be quickly and easily applied to the entire contents of a NumPy data structure. Though Cython and NumPy were always compatible, this new feature makes development faster and easier.
Cython’s build is now more compatible with ongoing changes to Python’s internals. Python has a “limited API” that exposes a guaranteed-stable subset of Python’s C API, intended for exactly the kind of work Cython does when it hooks into the Python interpreter. Cython has initial, and growing, support for the limited API. As that support matures, Cython extensions built for one version of Python should also work in future versions of Python without needing to be recompiled.
As someone who doesn’t work with Python often, I find it interesting that this is the third article in as many months that I’ve written about tools that deepen the connection between Python and C. It’s a clear marker of where the industry is going. Data sets keep getting larger, Python remains the go-to, and tools like these are either popping up or getting better to support that growth.