Starburst Introduces Python DataFrame Support
September 07, 2023

Starburst extended support for Python with PyStarburst, and announced a new integration with the open source Python library Ibis, built in collaboration with composable data systems builder and Ibis maintainer Voltron Data.

For Starburst and Trino developers and data engineers, this announcement means that they no longer need to offload data to frameworks like PySpark and Snowpark to handle complex transformation workloads. Instead, teams can leverage a single, powerful MPP engine for both their analytical and transformation workloads – reducing the cost and complexity of their stack.

PyStarburst provides a familiar syntax to PySpark and Snowpark for writing and running production-grade ETL pipelines and data transformations, making it easy to not only build new pipelines with PyStarburst but also to migrate existing PySpark and Snowpark pipelines to Starburst without rewriting code.

"Many data engineers prefer writing code over SQL for transformations, and many software engineers are used to building data applications in Python. With PyStarburst, we're giving them the freedom to do so with the increased productivity and performance of Starburst's enterprise-grade Trino," said Martin Traverso, CTO of Starburst.

For developers and data engineers looking to build scalable data applications, the new Ibis integration provides a uniform Python API that can execute queries on more than 18 different engines – including DuckDB, pandas, PostgreSQL, and now Starburst Galaxy. This means you can scale from development on a laptop to production in Galaxy without rewriting a single line of code.

"At Starburst everything is built with openness in mind, and we are interoperable with nearly any data environment, so we're extending that commitment to our programming languages. The partnership with Voltron Data and Ibis was a natural fit," said Harrison Johnson, Head of Technology Partnerships at Starburst.

Together, Ibis and Starburst Galaxy empower users to write portable Python code that executes on Starburst's high-performance data lake analytics engine, operating on data from more than 50 supported sources. Users will now be able to build analytic expressions across multiple data sources with reusable scripts that execute at any scale.

"Python users struggle to bridge the gap between prototypes on their laptops and production apps running on platforms like Starburst Galaxy. Ibis makes it much easier to bridge this gap," said Josh Patterson, CEO of Voltron Data. "With Ibis, you can write Python code once and run it anywhere, with any supported backend execution engine. You can move seamlessly from crunching gigabyte-scale test data on your laptop to crunching petabyte-scale data in production using Starburst Galaxy."

Share this

Industry News

February 27, 2024

MacStadium announced the launch of its online community to deepen the connections of application developers through knowledge sharing and collaboration.

February 27, 2024

Octopus Deploy announced the acquisition of Codefresh Inc.

February 26, 2024

Intel announced its new Edge Platform, a modular, open software platform enabling enterprises to develop, deploy, run, secure, and manage edge and AI applications at scale with cloud-like simplicity.

February 26, 2024

Tray.io announced AI-augmented API Management, a new Tray Universal Automation Cloud capability that turns any new or existing workflow into a reusable API, significantly decreasing the technical debt associated with the operational effort and costs of traditional API management (APIM).

February 26, 2024

Bitwarden Secrets Manager is now integrated with Ansible Playbook.

February 22, 2024

Check Point® Software Technologies Ltd. introduces Check Point Quantum Force series: an innovative lineup of ten high-performance firewalls designed to meet and exceed the stringent security demands of enterprise data centers, network perimeters, campuses, and businesses of all dimensions.

February 22, 2024

Tabnine announced that Tabnine Chat — the enterprise-grade, code-centric chat application that allows developers to interact with Tabnine AI models using natural language — is now available to all users.

February 22, 2024

Avaamo released Avaamo LLaMB™, a new low-code framework for building generative AI applications in the enterprise safely, securely, and fast.

February 21, 2024

CAST announced the winter release of CAST Imaging, an imaging system for software applications, with significant user experience (UX) enhancements and new features designed to simplify and accelerate processes for engineers who develop, maintain, modernize, complex software applications.

February 21, 2024

Pulumi now offers native ways to manage Pinecone indexes, including its latest serverless indexes.

February 21, 2024

Orkes, whose platform offers the fastest way to scale distributed systems, has raised $20 million in new funding.

February 20, 2024

JFrog and Carahsoft Technology announced a partnership that empowers U.S. Government organizations to safeguard their software supply chains with automated DevSecOps workflows to secure software services consumed by citizens.

February 20, 2024

Multiplayer, a collaborative tool for teams that work on system design and distributed software, announced its public beta.

February 20, 2024

DataStax announced its out-of-the-box retrieval augmented generation (RAG) solution, RAGStack, is now generally available powered by LlamaIndex as an open source framework, in addition to LangChain.

February 20, 2024

UiPath announced new features in its platform designed to enable developers to build, test, and accelerate implementation of automations.