Blog - Data Science Current

Performance Improvements for Stateful Pipelines in Apache Spark Structured Streaming

databricks

FEBRUARY 27, 2024

Introduction: Apache Spark™ Structured Streaming is a popular open-source stream processing platform that provides scalability and fault tolerance, built on top of the S.

Data Engineering, Data Engineer

A Deep Dive into the Latest Performance Improvements of Stateful Pipelines in Apache Spark Structured Streaming

databricks

FEBRUARY 28, 2024

This post is the second part of our two-part series on the latest performance improvements of stateful pipelines. The first part of this.

Data Engineering, Data Engineer

Webinars

Generative AI Deep Dive: Advancing from Proof of Concept to Production

Understanding User Needs and Satisfying Them

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

The Project Clinic: Assessing Project Health, Planning, and Execution

Leading the Development of Profitable and Sustainable Products

Trending Sources

Mitigating Redundant UDF Computations in Spark Plans

Towards AI

FEBRUARY 12, 2024

Originally published on my blog. It’s not uncommon to be caught up in long debugging cycles when working with Spark. I was recently caught in such a debugging train when one of my pipelines was taking longer than expected. When processing big data, efficiency is key.

Big Data, AI

How to Build Machine Learning Systems With a Feature Store

The MLOps Blog

JANUARY 26, 2024

To generate value from your model, it should make many predictions, and these predictions should improve a product or lead to better decisions. In this article, I’ll introduce you to a unified architecture for ML systems built around the idea of FTI pipelines and a feature store as the central component. But what is an ML pipeline?

Machine Learning, ML
