Data pipelines are essential in our increasingly data-driven world, enabling organizations to automate the flow of information from diverse sources to analytical platforms. What are data pipelines? Purpose of a data pipeline: Data pipelines serve various essential functions within an organization.
Artificial intelligence (AI) and natural language processing (NLP) technologies are evolving rapidly to manage live data streams. They power everything from chatbots and predictive analytics to dynamic content creation and personalized recommendations.
Complex Event Processing (CEP) is at the forefront of modern analytics, enabling organizations to extract valuable insights from vast streams of real-time data. As industries evolve, the ability to process and respond to events in the moment becomes mission-critical. What is Complex Event Processing (CEP)?
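To make the idea concrete, here is a minimal sketch of a CEP-style pattern match in Python: it flags a user who produces three failed-login events within a 60-second window. The event shape, threshold, and window length are illustrative assumptions, not a reference to any particular CEP engine.

```python
from collections import deque

# Illustrative CEP pattern: three "login_failed" events from the same
# user within a 60-second window trigger an alert. All values assumed.
WINDOW_SECONDS = 60
THRESHOLD = 3

failed = {}  # user_id -> deque of failed-login timestamps

def process(event):
    """Return an alert dict when the pattern completes, else None."""
    if event["type"] != "login_failed":
        return None
    q = failed.setdefault(event["user"], deque())
    q.append(event["ts"])
    # Evict timestamps that have fallen out of the window.
    while q and event["ts"] - q[0] > WINDOW_SECONDS:
        q.popleft()
    if len(q) >= THRESHOLD:
        return {"alert": "possible_brute_force", "user": event["user"]}
    return None

for e in [{"type": "login_failed", "user": "a", "ts": t} for t in (0, 20, 50)]:
    alert = process(e)
    if alert:
        print(alert)
```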
While customers can perform some basic analysis within their operational or transactional databases, many still need to build custom data pipelines that use batch or streaming jobs to extract, transform, and load (ETL) data into their data warehouse for more comprehensive analysis.
Microsoft Fabric aims to reduce unnecessary data replication, centralize storage, and create a unified environment with its unique data fabric method. Microsoft Fabric is a cutting-edge analytics platform that helps data experts and companies work together on data projects. What is Microsoft Fabric?
The concept of streaming data was born of necessity. More than ever, advanced analytics, ML, and AI are providing the foundation for innovation, efficiency, and profitability. But insights derived from day-old data don’t cut it. Business success is based on how we use continuously changing data.
An ELT pipeline is a data pipeline that extracts (E) data from a source, loads (L) the data into a destination, and then transforms (T) the data after it has been stored in the destination. If you can’t import all your data, you may only have a partial picture of your business.
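As a rough illustration of that E-L-T ordering, the sketch below uses Python's built-in sqlite3 as a stand-in warehouse: raw records are loaded untouched first, and the transform runs as SQL inside the destination afterward. The table names and fields are made up, and it assumes a SQLite build with JSON functions (standard in recent Python releases).

```python
import json
import sqlite3

# Mocked source data standing in for the extract (E) step.
raw_records = [
    {"order_id": 1, "amount": "19.99", "country": "US"},
    {"order_id": 2, "amount": "5.00", "country": "DE"},
]

conn = sqlite3.connect(":memory:")

# L: load raw JSON into a landing table before any transformation.
conn.execute("CREATE TABLE raw_orders (payload TEXT)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?)",
    [(json.dumps(r),) for r in raw_records],
)

# T: transform inside the destination, reading from the raw table --
# the defining trait of ELT as opposed to ETL.
conn.execute("""
    CREATE TABLE orders AS
    SELECT json_extract(payload, '$.order_id') AS order_id,
           CAST(json_extract(payload, '$.amount') AS REAL) AS amount,
           json_extract(payload, '$.country') AS country
    FROM raw_orders
""")
print(conn.execute("SELECT * FROM orders").fetchall())
```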
Through simple conversations, business teams can use the chat agent to extract valuable insights from both structured and unstructured data sources without writing code or managing complex data pipelines. This will provision the backend infrastructure and services that the sales analytics application will rely on.
Though you may encounter the terms “data science” and “data analytics” being used interchangeably in conversations or online, they refer to two distinctly different concepts. Meanwhile, data analytics is the act of examining datasets to extract value and find answers to specific questions.
Hosted at one of Mindspace’s coworking locations, the event was a convergence of insightful talks and professional networking. Mindspace, a global coworking and flexible office provider with over 45 locations worldwide, including 13 in Germany, offered a conducive environment for this knowledge-sharing event.
At the heart of this transformation is the OMRON Data & Analytics Platform (ODAP), an innovative initiative designed to revolutionize how the company harnesses its data assets. The robust security features provided by Amazon S3, including encryption and durability, were used to provide data protection.
Six core principles of a real-time streaming pipeline. Drawing on Matus Tomlein's step-by-step Implementation Guide: Building an AI-Ready Data Pipeline Architecture, you can anchor any streaming stack around six non-negotiables: explicit data requirements, schema-first design, robust ingestion, dual-layer storage.
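As one way to picture the first two principles, here is a minimal Python sketch of schema-first validation: events must satisfy an explicitly declared schema before they are ingested. The schema fields are illustrative assumptions, not the guide's actual definitions.

```python
# Illustrative explicit schema: field names and required types are
# declared up front, and ingestion rejects anything that deviates.
SCHEMA = {
    "event_id": str,
    "event_type": str,
    "timestamp": float,
}

def validate(event: dict) -> bool:
    """Reject events with missing fields or wrong types."""
    return all(
        field in event and isinstance(event[field], expected)
        for field, expected in SCHEMA.items()
    )

good = {"event_id": "e1", "event_type": "click", "timestamp": 1714.0}
bad = {"event_id": "e2", "timestamp": "not-a-number"}
print(validate(good), validate(bad))  # True False
```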
Data Observability and Monitoring: Data observability is the ability to monitor and troubleshoot data pipelines. Data monitoring is the process of collecting and analyzing data about data pipelines to identify and resolve problems. Interested in attending an ODSC event?
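A bare-bones illustration of the monitoring side might look like the following Python sketch, which records row counts and data freshness per batch and flags anomalies. The metric names and thresholds are assumptions for illustration.

```python
import time

# Illustrative pipeline monitoring: after each batch, capture row count
# and freshness lag, and alert on assumed thresholds.
metrics = []

def record_batch(row_count: int, max_event_ts: float) -> None:
    lag_seconds = time.time() - max_event_ts
    metrics.append({"rows": row_count, "lag_s": lag_seconds})
    if row_count == 0:
        print("ALERT: batch produced zero rows")
    if lag_seconds > 3600:  # assumed one-hour freshness budget
        print(f"ALERT: data is {lag_seconds / 3600:.1f}h stale")

# Simulate an empty, two-hour-stale batch: both alerts fire.
record_batch(row_count=0, max_event_ts=time.time() - 7200)
```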
Solution overview In brief, the solution involved building three pipelines: Data pipeline – Extracts the metadata of the images Machine learning pipeline – Classifies and labels images Human-in-the-loop review pipeline – Uses a human team to review results The following diagram illustrates the solution architecture.
The following diagram illustrates the data pipeline for indexing and query in the foundational search architecture. The listing writer microservice publishes listing change events to an Amazon Simple Notification Service (Amazon SNS) topic, which an Amazon Simple Queue Service (Amazon SQS) queue subscribes to.
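A hedged boto3 sketch of that fan-out pattern follows: the writer publishes a listing-change event to SNS, and an indexing worker long-polls the subscribed SQS queue. The topic ARN, queue URL, and event fields are placeholders, not values from the Amazon post.

```python
import json
import boto3

sns = boto3.client("sns")
sqs = boto3.client("sqs")

# Placeholder identifiers -- substitute your own resources.
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:listing-changes"
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/indexing-queue"

# Writer side: publish a listing change event to the SNS topic.
sns.publish(
    TopicArn=TOPIC_ARN,
    Message=json.dumps({"listing_id": "L42", "op": "update"}),
)

# Indexer side: long-poll the subscribed queue, process, then delete.
resp = sqs.receive_message(
    QueueUrl=QUEUE_URL, MaxNumberOfMessages=10, WaitTimeSeconds=20
)
for msg in resp.get("Messages", []):
    body = json.loads(msg["Body"])  # SNS wraps the payload in an envelope
    print("indexing", body.get("Message", body))
    sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```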
Google Analytics 4 (GA4) is a powerful tool for collecting and analyzing website and app data that many businesses rely heavily on to make informed business decisions. However, there might be instances where you need to migrate the raw event data from GA4 to Snowflake for more in-depth analysis and business intelligence purposes.
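One plausible migration path, sketched below under the assumption that the raw GA4 export already lands in BigQuery (GA4's native raw-event destination), reads a day's events with the BigQuery client and loads them into Snowflake with the connector's write_pandas helper. All project, account, and table names are placeholders.

```python
from google.cloud import bigquery
import snowflake.connector
from snowflake.connector.pandas_tools import write_pandas

# Read one day's raw GA4 events from the BigQuery export.
# The project/dataset/table names below are placeholders.
bq = bigquery.Client()
df = bq.query(
    "SELECT event_date, event_name, user_pseudo_id "
    "FROM `my-project.analytics_123456.events_20240101`"
).to_dataframe()

# Load the frame into Snowflake; account and schema are placeholders,
# and credentials are assumed to be configured for both clients.
sf = snowflake.connector.connect(
    account="my_account", user="my_user", password="...",
    warehouse="ANALYTICS_WH", database="RAW", schema="GA4",
)
write_pandas(sf, df, table_name="EVENTS", auto_create_table=True)
```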
If the question was “What’s the schedule for AWS events in December?”, AWS usually announces the dates for its upcoming re:Invent event around 6-9 months in advance. Rajesh Nedunuri is a Senior Data Engineer within the Amazon Worldwide Returns and ReCommerce Data Services team.
Retailers are also dealing with online shopping surges that add new complexities to existing data strategies due to an influx of raw, unprepped, and largely underutilized data. Analytics agility is a competitive advantage in retail today—and it will be table stakes for retail success tomorrow. What is a modern data stack?
billion on financial analytics by 2030. And Big Data is one such excellent opportunity! Big Data is the collection and processing of huge volumes of different data types, which financial institutions use to gain insights into their business processes and make key company decisions. Improving Risk Assessment.
This post is co-written with Suhyoung Kim, General Manager at KakaoGames Data Analytics Lab. The result of these events can be evaluated afterwards so that they make better decisions in the future. With this proactive approach, Kakao Games can launch the right events at the right time. However, this approach is reactive.
In this post, we highlight how the AWS Generative AI Innovation Center collaborated with AWS Professional Services and the PGA TOUR to develop a prototype virtual assistant using Amazon Bedrock that could enable fans to extract information about any event, player, hole, or shot-level detail in a seamless, interactive manner.
Leveraging real-time analytics to make informed decisions is the golden standard for virtually every business that collects data. If you have the Snowflake Data Cloud (or are considering migrating to Snowflake ), you’re a blog away from taking a step closer to real-time analytics.
Whereas AIOps is a comprehensive discipline that includes a variety of analytics and AI initiatives that are aimed at optimizing IT operations, MLOps is specifically concerned with the operational aspects of ML models, promoting efficient deployment, monitoring and maintenance.
Apache Kafka and Apache Flink working together: Anyone who is familiar with the stream processing ecosystem is familiar with Apache Kafka: the de facto enterprise standard for open-source event streaming. Apache Kafka streams get data to where it needs to go, but these capabilities are not maximized when Apache Kafka is deployed in isolation.
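A minimal PyFlink sketch of that pairing is shown below: Kafka transports the events, and Flink runs continuous SQL over them, which is the processing layer Kafka alone does not provide. It assumes the Flink Kafka SQL connector jar is available, and the broker, topic, and columns are placeholders.

```python
from pyflink.table import EnvironmentSettings, TableEnvironment

# Streaming-mode table environment; assumes the Kafka SQL connector jar
# is on the Flink classpath.
t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# Declare a table backed by a Kafka topic (all names are placeholders).
t_env.execute_sql("""
    CREATE TABLE clicks (
        user_id STRING,
        url STRING
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'clicks',
        'properties.bootstrap.servers' = 'localhost:9092',
        'scan.startup.mode' = 'earliest-offset',
        'format' = 'json'
    )
""")

# Continuous aggregation over the stream -- the processing step that
# Kafka alone does not perform. This prints results as they update.
t_env.execute_sql(
    "SELECT user_id, COUNT(*) AS clicks FROM clicks GROUP BY user_id"
).print()
```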
Features: In machine learning, a feature is data that is used as the input for ML models to make predictions. (Source: Advancing Analytics.) Data scientists and data engineers often spend a large amount of their time crafting features, as they are the basic building blocks of datasets. (Spark, Flink, etc.)
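For instance, a small pandas sketch of feature crafting might aggregate raw purchase events into per-user model inputs; the column names and aggregations here are illustrative assumptions.

```python
import pandas as pd

# Raw event rows standing in for a source table.
events = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 2],
    "amount": [10.0, 25.0, 5.0, 7.5, 12.5],
})

# Crafted features: one row per user, ready to feed an ML model.
features = events.groupby("user_id").agg(
    purchase_count=("amount", "size"),
    total_spend=("amount", "sum"),
    avg_spend=("amount", "mean"),
).reset_index()
print(features)
```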
Data Engineering: Building and maintaining data pipelines, ETL (Extract, Transform, Load) processes, and data warehousing. Career Support: Some bootcamps include job placement services like resume assistance, mock interviews, networking events, and partnerships with employers to aid in job placement.
In this post, you will learn about the 10 best data pipeline tools, their pros, cons, and pricing. A typical data pipeline involves the following steps or processes through which the data passes before being consumed by a downstream process, such as an ML model training process.
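As a schematic of those stages, the following Python sketch chains extract, transform, and load steps into a structure a training job could consume. Every function body is a placeholder assumption, since the post's actual steps are tool-specific.

```python
# Placeholder stages a record passes through before training consumes it.
def extract():
    """Pull raw records from a source (mocked here)."""
    return [{"id": 1, "value": " 42 "}, {"id": 2, "value": "x"}]

def transform(rows):
    """Clean and type-cast records, dropping malformed ones."""
    out = []
    for r in rows:
        try:
            out.append({"id": r["id"], "value": float(r["value"])})
        except ValueError:
            continue  # drop records that fail casting
    return out

def load(rows):
    """Write to a store the downstream training job reads from."""
    return {r["id"]: r["value"] for r in rows}

training_input = load(transform(extract()))
print(training_input)  # {1: 42.0}
```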
Whether logs are coming from Amazon Web Services (AWS), other cloud providers, on-premises, or edge devices, customers need to centralize and standardize security data. After the security log data is stored in Amazon Security Lake, the question becomes how to analyze it. Deploy the trained ML model to a SageMaker inference endpoint.
Kafka and ETL Processing: You might be using Apache Kafka for high-performance data pipelines, streaming various analytics data, or running company-critical assets using Kafka, but did you know that you can also use Kafka clusters to move data between multiple systems?
That means feeding them streams of high-quality information about user actions, events, and context in real time. So, what exactly is AI-ready data? Simply put, AI-ready data is structured, high-quality information that can be easily used to train machine learning models and run AI applications with minimal engineering effort.
Data Analytics in the Age of AI, When to Use RAG, Examples of Data Visualization with D3 and Vega, and ODSC East Selling Out Soon. Data Analytics in the Age of AI: Let’s explore the multifaceted ways in which AI is revolutionizing data analytics, making it more accessible, efficient, and insightful than ever before.
Apache Kafka is an open-source , distributed streaming platform that allows developers to build real-time, event-driven applications. With Apache Kafka, developers can build applications that continuously use streaming data records and deliver real-time experiences to users.
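A minimal producer/consumer sketch using the kafka-python package is shown below; it assumes a broker at localhost:9092 and a topic named events, both of which are placeholders.

```python
import json
from kafka import KafkaProducer, KafkaConsumer

# Producer: write a JSON-encoded event to the (assumed) "events" topic.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send("events", {"user": "a", "action": "click"})
producer.flush()

# Consumer: continuously read the same stream of records.
consumer = KafkaConsumer(
    "events",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)
for record in consumer:
    print(record.value)
    break  # stop after one record for this sketch
```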
Event-driven businesses across all industries thrive on real-time data, enabling companies to act on events as they happen rather than after the fact. Flink jobs, designed to process continuous data streams, are key to making this possible. These businesses are able to adapt to changing demands quickly and seize new opportunities.
Later in this article, we will discuss its importance and how we can use machine learning for streaming data analysis with the help of a hands-on example. What is streaming data? A streaming data pipeline is an enhanced version that can handle millions of events in real time at scale.
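One simple flavor of ML on a stream is per-event anomaly scoring. The sketch below maintains running statistics with Welford's algorithm and flags values more than three standard deviations from the mean; the threshold and sample values are illustrative assumptions, not the article's hands-on example.

```python
import math

# Running statistics maintained across the stream.
n, mean, m2 = 0, 0.0, 0.0

def score(x: float) -> bool:
    """Score x against history, then fold it into the running stats."""
    global n, mean, m2
    is_outlier = False
    if n >= 10:  # require some history before scoring
        std = math.sqrt(m2 / (n - 1))
        is_outlier = std > 0 and abs(x - mean) > 3 * std
    # Welford update with the new observation.
    n += 1
    delta = x - mean
    mean += delta / n
    m2 += delta * (x - mean)
    return is_outlier

for value in [1.0, 1.1, 0.9, 1.0, 1.2, 0.8, 1.0, 1.1, 0.9, 1.0, 9.5]:
    if score(value):
        print("outlier:", value)  # flags 9.5
```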
Not only does it involve the process of collecting, storing, and processing data so that it can be used for analysis and decision-making, but these professionals are responsible for building and maintaining the infrastructure that makes this possible; and so much more. Think of data engineers as the architects of the data ecosystem.
AWS Lambda is an event-driven compute service that lets you run code for virtually any type of application or backend service without provisioning or managing servers. Rushikesh Jagtap is a Solutions Architect with 5+ years of experience in AWS Analytics services.
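A minimal sketch of such a handler in Python follows; the SQS-style "Records" batch shape is an assumption, since Lambda passes whatever structure the triggering service emits.

```python
import json

def lambda_handler(event, context):
    """Process a batch of records from the triggering event source."""
    processed = 0
    for record in event.get("Records", []):  # assumed SQS-style batch
        payload = json.loads(record["body"])
        print("processing", payload)
        processed += 1
    return {"statusCode": 200, "body": json.dumps({"processed": processed})}

# Local smoke test; in production, Lambda invokes the handler for you.
if __name__ == "__main__":
    fake = {"Records": [{"body": json.dumps({"id": 1})}]}
    print(lambda_handler(fake, None))
```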
As the name suggests, real-time operating systems (RTOS) handle real-time applications that undertake data and event processing under a strict deadline. For big data to harness these sources, it is important to have the ability to integrate data and make them interoperable across different sources.
In our previous blog, Top 5 Fivetran Connectors for Financial Services, we explored Fivetran’s capabilities that address the data integration needs of the finance industry. Now, let’s cover the healthcare industry, which also has a surging demand for data and analytics, along with the underlying processes to make it happen.
Institute of Analytics: The Institute of Analytics is a non-profit organization that provides data science and analytics courses, workshops, certifications, research, and development. The courses and workshops cover a wide range of topics, from basic data science concepts to advanced machine learning techniques.
Introduction The Formula 1 Prediction Challenge: 2024 Mexican Grand Prix brought together data scientists to tackle one of the most dynamic aspects of racing — pit stop strategies. With every second on the track critical, the challenge showcased how data can shape decisions that define race outcomes.
Data Engineering is designing, constructing, and managing systems that enable data collection, storage, and analysis. It involves developing data pipelines that efficiently transport data from various sources to storage solutions and analytical tools. ETL is vital for ensuring data quality and integrity.
Apache Kafka: For data engineers dealing with real-time data, Apache Kafka is a game-changer. This open-source streaming platform enables the handling of high-throughput data feeds, ensuring that data pipelines are efficient, reliable, and capable of handling massive volumes of data in real time.
As a proud member of the Connect with Confluent program, we help organizations going through digital transformation and IT infrastructure modernization break down data silos and power their streaming data pipelines with trusted data. Let’s cover some additional information to know before attending.