Python Delights Excel Data Nerds Plus Data Lake Enthusiasts
The new Microsoft Excel add-in brings the AI-powered Anaconda Assistant, curated data catalogs, and cloud features to Python in Excel users.
Anaconda Toolbox is a new suite of tools built to enhance the experience and capabilities of Python in Excel. The Toolbox will be accessible to current Python in Excel beta users through the Microsoft Marketplace.
Launched last month, Python in Excel now boasts new features added by Anaconda Toolbox that will enable developers to use Python in Excel even if they don’t know Python. Included in Toolbox is Anaconda Assistant, the recently released AI assistant designed specifically for Python users and data scientists, which can guide you through your first steps or supercharge your work if you already have advanced experience.
Python in Excel beta users can sign up to experience Anaconda Toolbox today.
Anaconda Toolbox enables anyone, regardless of experience, to quickly generate code and visualizations while learning Python along the way, the company said. Because the code runs in Excel, you know how it will work when you share the file with others, even if they don’t have Toolbox.
“The AI revolution has triggered an explosion in creativity and productivity. The Anaconda Toolbox fits neatly in that same area as it provides the perfect on-ramp for advanced data science and AI with Python,” said Timothy Hewitt, Senior Product Manager for Python in Excel at Anaconda. “We understand that many Excel users have never used Python; that’s why we included our AI-powered Anaconda Assistant. This AI assistant helps users accomplish what they need using natural language without needing to know all of the underlying Python code. Whether you need to visualize a data set, develop a script, or quickly generate insights, the Anaconda Assistant makes that possible, and it’s now just one click away.”
Ask the Assistant
Know what you want to do, but don’t know how to do it in Python? Just ask Anaconda Assistant, the company says. When it gives you the code, you can push it to the Excel grid, where you can edit and run it like any other Python code. If you start with one of the provided prompts, the Assistant will analyze your tables and recommend different ways of working with your data.
Microsoft has released Python in Excel as a Public Preview to its Insiders Beta Channel, so it is still early days for the technology. The company will continue to roll out updates, including improved editing experiences (such as autocomplete and syntax highlighting), default repairs, enhanced error behaviors, help and documentation, and more, said Stefan Kinnestrand, a general manager of product marketing/management at Microsoft, in a blog post.
With Python in Excel, users can integrate Python and Excel analytics within the same Excel grid for an uninterrupted workflow.
“Python in Excel combines Python’s powerful data analysis and visualization libraries with Excel’s features you know and love,” Kinnestrand said. “You can manipulate and explore data in Excel using Python plots and libraries, and then use Excel’s formulas, charts and PivotTables to further refine your insights.”
To help with this integration, Microsoft has partnered with Anaconda, a leading enterprise-grade Python repository used by tens of millions of data practitioners worldwide. Microsoft said Python in Excel leverages Anaconda Distribution for Python running in Azure, which includes the most popular Python libraries such as pandas for data manipulation, statsmodels for advanced statistical modeling, and Matplotlib and seaborn for data visualization.
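As a rough illustration of the kind of pandas code this setup is meant to run, the sketch below totals revenue per region. The sample data and column names are hypothetical. Inside Python in Excel, the DataFrame would normally come from a worksheet range through the `xl()` function rather than being typed in by hand; it is built directly here so the example runs anywhere pandas is installed.

```python
import pandas as pd

# Hypothetical sample data. In Python in Excel, a DataFrame like this would
# typically be read from the grid, e.g. xl("A1:C7", headers=True); it is
# constructed by hand here so the sketch also runs outside Excel.
sales = pd.DataFrame(
    {
        "region": ["East", "West", "East", "South", "West", "South"],
        "units": [12, 7, 5, 9, 11, 4],
        "unit_price": [2.5, 4.0, 2.5, 3.0, 4.0, 3.0],
    }
)

# Derive a revenue column, then total it per region, largest first.
sales["revenue"] = sales["units"] * sales["unit_price"]
summary = (
    sales.groupby("region", as_index=False)["revenue"]
    .sum()
    .sort_values("revenue", ascending=False)
    .reset_index(drop=True)
)

print(summary)
```

A result like `summary` is an ordinary DataFrame, which is the kind of output that can then be refined further with Excel formulas, charts, and PivotTables as described above.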
“Python has become the lingua franca and Swiss Army knife of working with data, and it’s the de facto language of data science and machine learning,” said Andrew Brust, CEO of Blue Badge Insights, a data consultancy. “It’s present in Microsoft Fabric, Azure Synapse Analytics, Azure Machine Learning, Azure Databricks, Visual Studio, VS Code, SQL Server and Power BI. And since Microsoft and Anaconda have collaborated around many of these integrations, doing so in the Excel case was almost a foregone conclusion.”
In 2022 Anaconda launched PyScript, a web-based tool for coding in the browser and deploying apps with the click of a button. The company also launched Anaconda Learning to help people build foundational skills in Python, data visualization, machine learning, and more.
Python education is part of Anaconda’s mission. Every day, more people start learning Python, and for most of them, Anaconda is the first stop on that journey.
“We want to see the Python community continue to grow, so we’ve developed an extensive library of free educational content and certificates that have helped thousands of new users break into a whole new world of data science and AI,” Hewitt told The New Stack. “The Anaconda Toolbox for Python in Excel absolutely extends our mission of Python education. In the toolbox, users can find a curated selection of open-source data sets to test new data science skills and the built-in Anaconda Assistant can be used to guide users in self-learning, evaluate code, and explain the code it develops.”
Ibis and PyStarburst
Meanwhile, Starburst, the data lake analytics platform, recently announced extended support for Python with PyStarburst, along with a new integration with the open source Python library Ibis (built in collaboration with Voltron Data), to reinforce its commitment to openness.
For developers and data engineers used to working with PySpark and Snowpark, PyStarburst provides a familiar syntax that makes it easy to not only build new data pipelines but also migrate existing pipelines to Starburst without rewriting lots of code. Meanwhile, the new Ibis integration provides a uniform Python API and an open backend that connects to any cloud data source so that data and software engineers can build and scale data applications from development to production without rewriting code.
“Many data engineers prefer writing code over SQL for transformations, and many software engineers are used to building data applications in Python. With PyStarburst, we’re giving them the freedom to do so with the increased productivity and performance of Starburst’s enterprise-grade Trino,” said Martin Traverso, CTO of Starburst, in a statement.
For developers and data engineers looking to build scalable data applications, the new Ibis integration provides a uniform Python API that can execute queries on more than 18 different engines — including DuckDB, Pandas, PostgreSQL, and now Starburst Galaxy. This means you can scale from development on a laptop to production in Galaxy without rewriting a single line of code.
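The idea of describing a query once and running it on different engines can be sketched with nothing but the Python standard library. This toy is not Ibis's API: the `SumBy` class and the two `run_*` functions are hypothetical names invented for illustration, and SQLite merely stands in for a remote engine such as Starburst Galaxy. It only shows the pattern of one backend-neutral query object evaluated by two interchangeable backends.

```python
import sqlite3
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SumBy:
    """Backend-neutral description of 'sum one column per group' (hypothetical)."""
    table: str
    key: str
    value: str


def run_in_memory(query, tables):
    """Evaluate the query over plain Python rows (the 'laptop' backend)."""
    totals = defaultdict(float)
    for row in tables[query.table]:
        totals[row[query.key]] += row[query.value]
    return sorted(totals.items())


def run_on_sqlite(query, conn):
    """Compile the same query to SQL and run it on SQLite (a stand-in engine)."""
    # Identifiers come from a trusted query object in this sketch; real code
    # would validate or quote them before building SQL.
    sql = (
        f"SELECT {query.key}, SUM({query.value}) FROM {query.table} "
        f"GROUP BY {query.key} ORDER BY {query.key}"
    )
    return [(key, float(total)) for key, total in conn.execute(sql)]


# Hypothetical sample rows, loaded into both backends.
rows = [
    {"region": "East", "amount": 30.0},
    {"region": "West", "amount": 28.0},
    {"region": "East", "amount": 12.5},
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (:region, :amount)", rows)

# One query definition, two execution backends.
query = SumBy(table="orders", key="region", value="amount")
local_result = run_in_memory(query, {"orders": rows})
engine_result = run_on_sqlite(query, conn)

print(local_result == engine_result)
```

The query definition never changes; only the function that executes it does. A library like Ibis applies the same separation with a far richer expression language and many more backends.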
“There’s a lot of tooling going into the ecosystem, the analytic data transformation and data engineering base built around Python, and there are libraries for doing machine learning and data science,” Traverso told The New Stack. “So Python tends to be like glue for everything. And that’s the language that all the data scientists use on a day-to-day basis. They’re building AI models, they’re interacting with data, engine data, permission engines to massage their data to provide to their AI modeling systems. And Python happens to be their tool of choice, so yeah, we see a lot of people rely on that.”
“If you look at Spark, Spark started as built in Scala, and originally the APIs were built around Scala, which was a hard language to deal with,” he noted. “And for the regular programmers, Python is a little more flexible, a lot easier to pick up. So there’s a whole Python ecosystem that’s built around that. And eventually it became the language of choice to interact with Spark. And therefore, anyone that’s dealing with data processing at large scale with Spark will be familiar with that. So we’re kind of capitalizing on that investment, that expertise, and trying to bring that to Starburst.”
“At Starburst everything is built with openness in mind, and we are interoperable with nearly any data environment, so we’re extending that commitment to our programming languages. The partnership with Voltron Data and Ibis was a natural fit,” said Harrison Johnson, Head of Technology Partnerships at Starburst.
Together, Ibis and Starburst Galaxy empower users to write portable Python code that executes on Starburst’s high-performance data lake analytics engine, operating on data from more than 50 supported sources. Users will now be able to build analytic expressions across multiple data sources with reusable scripts that execute at any scale.
“Python users struggle to bridge the gap between prototypes on their laptops and production apps running on platforms like Starburst Galaxy. Ibis makes it much easier to bridge this gap,” said Josh Patterson, CEO of Voltron Data. “With Ibis, you can write Python code once and run it anywhere, with any supported backend execution engine. You can move seamlessly from crunching gigabyte-scale test data on your laptop to crunching petabyte-scale data in production using Starburst Galaxy.”