Starburst Introduces Python DataFrame Support
September 07, 2023

Starburst extended support for Python with PyStarburst, and announced a new integration with the open source Python library Ibis, built in collaboration with composable data systems builder and Ibis maintainer Voltron Data.

For Starburst and Trino developers and data engineers, this announcement means that they no longer need to offload data to frameworks like PySpark and Snowpark to handle complex transformation workloads. Instead, teams can leverage a single, powerful MPP engine for both their analytical and transformation workloads – reducing the cost and complexity of their stack.

PyStarburst provides a familiar syntax to PySpark and Snowpark for writing and running production-grade ETL pipelines and data transformations, making it easy to not only build new pipelines with PyStarburst but also to migrate existing PySpark and Snowpark pipelines to Starburst without rewriting code.

"Many data engineers prefer writing code over SQL for transformations, and many software engineers are used to building data applications in Python. With PyStarburst, we're giving them the freedom to do so with the increased productivity and performance of Starburst's enterprise-grade Trino," said Martin Traverso, CTO of Starburst.

For developers and data engineers looking to build scalable data applications, the new Ibis integration provides a uniform Python API that can execute queries on more than 18 different engines – including DuckDB, pandas, PostgreSQL, and now Starburst Galaxy. This means you can scale from development on a laptop to production in Galaxy without rewriting a single line of code.

"At Starburst everything is built with openness in mind, and we are interoperable with nearly any data environment, so we're extending that commitment to our programming languages. The partnership with Voltron Data and Ibis was a natural fit," said Harrison Johnson, Head of Technology Partnerships at Starburst.

Together, Ibis and Starburst Galaxy empower users to write portable Python code that executes on Starburst's high-performance data lake analytics engine, operating on data from more than 50 supported sources. Users will now be able to build analytic expressions across multiple data sources with reusable scripts that execute at any scale.

"Python users struggle to bridge the gap between prototypes on their laptops and production apps running on platforms like Starburst Galaxy. Ibis makes it much easier to bridge this gap," said Josh Patterson, CEO of Voltron Data. "With Ibis, you can write Python code once and run it anywhere, with any supported backend execution engine. You can move seamlessly from crunching gigabyte-scale test data on your laptop to crunching petabyte-scale data in production using Starburst Galaxy."

Share this

Industry News

April 25, 2024

JFrog announced a new machine learning (ML) lifecycle integration between JFrog Artifactory and MLflow, an open source software platform originally developed by Databricks.

April 25, 2024

Copado announced the general availability of Test Copilot, the AI-powered test creation assistant.

April 25, 2024

SmartBear has added no-code test automation powered by GenAI to its Zephyr Scale, the solution that delivers scalable, performant test management inside Jira.

April 24, 2024

Opsera announced that two new patents have been issued for its Unified DevOps Platform, now totaling nine patents issued for the cloud-native DevOps Platform.

April 23, 2024

mabl announced the addition of mobile application testing to its platform.

April 23, 2024

Spectro Cloud announced the achievement of a new Amazon Web Services (AWS) Competency designation.

April 22, 2024

GitLab announced the general availability of GitLab Duo Chat.

April 18, 2024

SmartBear announced a new version of its API design and documentation tool, SwaggerHub, integrating Stoplight’s API open source tools.

April 18, 2024

Red Hat announced updates to Red Hat Trusted Software Supply Chain.

April 18, 2024

Tricentis announced the latest update to the company’s AI offerings with the launch of Tricentis Copilot, a suite of solutions leveraging generative AI to enhance productivity throughout the entire testing lifecycle.

April 17, 2024

CIQ launched fully supported, upstream stable kernels for Rocky Linux via the CIQ Enterprise Linux Platform, providing enhanced performance, hardware compatibility and security.

April 17, 2024

Redgate launched an enterprise version of its database monitoring tool, providing a range of new features to address the challenges of scale and complexity faced by larger organizations.

April 17, 2024

Snyk announced the expansion of its current partnership with Google Cloud to advance secure code generated by Google Cloud’s generative-AI-powered collaborator service, Gemini Code Assist.

April 16, 2024

Kong announced the commercial availability of Kong Konnect Dedicated Cloud Gateways on Amazon Web Services (AWS).

April 16, 2024

Pegasystems announced the general availability of Pega Infinity ’24.1™.