Q&A with Peter Wang, Co-Founder and CEO of Anaconda

25 Nov 2021 6:00am

The Python programming language has been around since 1991 and has continued its rise in popularity thanks to big data, machine learning, the continued proliferation of websites, automation, and so much more. Python is also one of the easiest languages to learn, so it’s very beginner-friendly and grows along with a developer’s skill.

But what makes Python still so relevant? I had a Q&A with Peter Wang, co-founder and CEO of Anaconda, which offers a widely used Python distribution of the same name, to find out what keeps Python so important to modern computing. Here is what he had to say.

What makes Python so relevant today?

A simple answer would be that Python is relevant today because it was relevant yesterday; a network effect plays a significant role in its popularity. When you reach a certain point of market maturity, success begets success. But what are the foundations that led to Python becoming such an essential language? Python is an excellent language for data science and machine learning workflows, which have become crucial to many businesses today. However, its usefulness isn’t limited to that sphere; Python is also a great connective tissue because it is used for various tasks, from scripting to modeling to building APIs. In many ways, Python is like super glue for different computing workloads, holding everything together.

In a cloud native world, what’s Python’s superpower?

Python has two main superpowers. First, it’s much more accessible to people due to its simplified syntax, making it easier to learn, write, and execute than other programming languages. At the same time, it’s a powerful language, so once you’ve learned it, there are a wide variety of tasks you can accomplish; in that sense, it’s a very democratizing technology.

While it’s a dominant language for data science work, Python’s second superpower is that it’s extremely versatile, being good for many other jobs as well. Very few languages can successfully do so many different things, which is a big part of why Python is used as a glue technology in so many places.

What is in store for Python in the near future?

I see Python as continuing to grow in popularity, especially as businesses become increasingly data-driven. The growth of the community also means that the next few years will see Python evolving at an accelerated pace. I fully expect Python to gain on Excel as a popular business computing tool because of how many students will enter the workforce already knowing how to use Python for data analysis.

In your opinion, what’s the best use-case for Python?

Python has such a rich ecosystem of tools and capabilities that there is no single “best use case,” although it is the dominant language in AI, machine learning, and data science. However, there is a best “user” of Python, and that is someone who wants to spend more time focused on what they want to do with a computer, rather than getting mired in the minutiae of how to do it; Python is the language for them.

What makes Python one of the easiest languages to learn?

Its creator and the early community decided that the language should be accessible, easy-to-read, and unsurprising. Very few languages have that pedigree as a teaching language — and most teaching languages aren’t also designed for real production use cases. Additionally, the numerical and data science libraries were made by real scientists and researchers whose natural design ethos was to make the interfaces something pleasant that they themselves would want to use.

How does Python fit in with containerization?

This is use-case-dependent, but containerization is generally intended for portability between platforms and cloud hosts. Python is excellent for containerization because it’s simple to use in a container as you install and copy code. With tools like Conda [Python package manager], you can easily copy your code inside the container and run conda install, which allows you to have a fully working environment. Having a container image that includes Conda preinstalled can simplify things such as Python-based deployment or cross-platform test automation.

Other tools, like the Binder Project, make it easy to spin up a container and are used to create and share containerized Jupyter notebooks. Since data science involves a lot of exploratory workloads that have widely varying hardware needs, containers also help move software environments from development on a subset of data to execution on large-scale production data.
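
A containerized environment like the one described above can be checked with a small smoke test that runs when the container starts. The following is only a sketch under stated assumptions: the helper name `missing_packages` and the package list are hypothetical and are not part of Conda or Binder. It relies solely on the standard library's `importlib.util.find_spec`.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical list; in practice it would mirror the environment definition.
REQUIRED = ["json", "csv"]

missing = missing_packages(REQUIRED)
if missing:
    print("Environment is incomplete, missing:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

Running a check like this as the first step of a container's entry point may surface a broken environment before any real workload starts.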

In a world obsessed with speed, what is Python doing to avoid the stigma of being too slow?

This is 90% a marketing problem. Sadly, a large number of professional software developers have typecast Python as a “slow, interpreted, scripting language.” Since many of them also do not write software for numerical computing (ironically, some of the most performance-intensive workloads), these traditional software engineers simply do not know that Python’s libraries are all powered by exceptionally efficient, fast low-level code.
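
The gap between interpreted code and compiled low-level code can be seen without any third-party library. The sketch below is an illustration, not a benchmark from the interview: it compares a hand-written Python loop with the built-in `sum()`, which CPython implements in C. The function name `python_loop_sum` is made up for this example, and the timings will vary by machine and interpreter.

```python
import timeit


def python_loop_sum(values):
    """Add the values one at a time in interpreted Python bytecode."""
    total = 0
    for value in values:
        total += value
    return total


numbers = list(range(1_000_000))

# Both approaches compute the same result.
loop_result = python_loop_sum(numbers)
builtin_result = sum(numbers)  # in CPython, sum() runs as compiled C code

# Time each approach; the absolute numbers depend on the machine.
loop_seconds = timeit.timeit(lambda: python_loop_sum(numbers), number=5)
builtin_seconds = timeit.timeit(lambda: sum(numbers), number=5)

print(f"pure-Python loop: {loop_seconds:.3f} s")
print(f"built-in sum():   {builtin_seconds:.3f} s")
```

Numerical libraries apply the same idea at a larger scale: the Python layer describes the work, and compiled routines carry it out.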

There are a few legitimate complaints about the runtime speed and multithreaded performance of the CPython interpreter, and there are several different efforts underway to help optimize Python in these areas. In August this year, we at Anaconda announced our support for Pyston, a compiler that significantly improves the execution performance of most Python programs. Additionally, Microsoft has announced measures in this area.

How will Python remain so popular in the coming years?

Python will continue to grow in popularity and become even more ubiquitous in the coming years because it’s the connective tissue for so many things. For non-data-science workloads, Python is often the second-best language of choice. This has become a great strength for the language and has allowed it to connect multiple tasks in any given workflow.

You can take existing libraries written in languages such as C++ and Fortran and connect them using Python; with such a central role, it is difficult to displace. And of course, it’s also the dominant programming language for data science workflows, which will continue to be important for businesses as they seek to be data-driven in their decision-making.
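
As a rough illustration of this glue role, the standard library's `ctypes` module can call a function from a compiled library directly. The sketch below uses the C standard library's `strlen` as a stand-in for any existing compiled code. It assumes a POSIX system such as Linux or macOS, and the wrapper name `c_string_length` is invented for this example.

```python
import ctypes
import ctypes.util

# Locate the C standard library. On POSIX systems, if the lookup returns None,
# CDLL(None) exposes the symbols already loaded into the running process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_string_length(text):
    """Return the length in bytes of text, as measured by C's strlen."""
    return libc.strlen(text.encode("utf-8"))


print(c_string_length("glue"))
```

Real projects often use higher-level binding tools for C++ or Fortran code, but the principle is the same: Python supplies the readable interface, and the compiled library does the work.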

What can Python do to address the rise of low-code/no-code?

From a data science perspective, many practitioners welcome the rise of low-code or no-code tools. In our 2021 State of Data Science report, 55% of respondents stated they hope to see more automation and AutoML in data science. As always, low-code or no-code solutions should not displace the work of a data scientist, but help provide support for easily repeatable tasks.

Automation helps make the data science workflow more flexible and efficient, freeing practitioners to spend more time on complex activities like data exploration or developing model output analysis.

At Anaconda, the HoloViz group is busily developing low-code/no-code solutions for data analysis and visualization that allow all users to harness the power of these tools. With the potential and extensibility of Python, paired with low-code/no-code tools, practitioners can streamline their workflows.

It’s easy to be lured into the trap of thinking that “coding is too hard for most people,” but the data simply does not bear this out. Hundreds of millions of non-programmers are extremely comfortable with complex modeling in Excel, and this outnumbers all existing “visual programming” environments. The allure of “low-code” is a perennial desire in the IT space, but history shows that programmatic environments generally win.

The fundamental issue is one of complexity: the possibility space of algorithms, data types and schemas, and business problems massively exceeds what can be simply cast into drag-and-drop environments. While visual, low-code/no-code tools will have their place for easy tasks, the value of being able to complete tasks quickly means that data analysts who can code will be massively more productive and valuable to their organizations.

Explain what Python is to a child who’s never written a single line of code.

Python is an easy language to learn. You can take it one step at a time, and you can do anything with it — from writing games to making music and programming robots. It’s a skill that you can master over time, and some of the most exciting (and best-paying) jobs in technology all depend on Python.

Python was used to discover black holes and launch SpaceX rockets, and it powers the AI behind Siri, Alexa, and Netflix. One day when we cure cancer or Alzheimer’s, Python will undoubtedly power the genomic tools that the scientists use. The only limit is your imagination! Python makes it easy to solve problems that are too complex for Excel or that need to be handled on a large scale.

To make it child-friendly, coding with Python is like playing with Legos. When you look at a single Lego, it doesn’t look like much. However, when you put a bunch of those blocks together, you can create complicated structures — like a replica of the Taj Mahal or Millennium Falcon.

Python code is similar in that it allows you to connect metaphorical building blocks to create complicated models from simple code.

Since Python is based on software code that is open to everyone, you can access the best solutions, and if you can’t find a solution that solves your problem, many people are willing to help you. Think of the Python community as many individuals who have access to the best Lego blocks and are eager to share them so you can make the best possible building, car, or airplane.