Aspiring and experienced Data Engineers alike can benefit from a curated list of books covering essential concepts and practical techniques. These 10 Best Data Engineering Books for beginners encompass a range of topics, from foundational principles to advanced data processing methods. What is Data Engineering?
ODSC West is now a part of our history books, and we couldn’t be happier with how everything turned out. We had our first-ever Halloween party, more book signings, exciting keynotes, and plenty of sessions to fit everyone’s needs.
Its diverse content includes academic papers, web data, books, and code. EleutherAI created the Pile to democratise AI research with high-quality, accessible data. Diversity of Sources: The Pile integrates 22 distinct datasets, including scientific articles, web content, books, and programming code.
The solution contains the following processing layers: Data pipeline: The various data sources, such as sales transactional data, unstructured QRT reports, social media reviews in JSON format, and vehicle metadata, are processed, transformed, and stored in the respective databases.
The entire generative AI pipeline hinges on the data pipelines that empower it, making it imperative to take the correct precautions. 4 key components to ensure reliable data ingestion. Data quality and governance: Data quality means ensuring the security of data sources, maintaining holistic data, and providing clear metadata.
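The data quality component described above can be sketched as a simple ingestion-time gate. This is a minimal illustration only; the record shape and field names (`source`, `ingested_at`, `payload`) are assumptions, not part of any specific pipeline mentioned in the articles.

```python
# Minimal sketch of an ingestion-time data quality gate.
# Field names below are illustrative assumptions.

REQUIRED_FIELDS = {"source", "ingested_at", "payload"}

def validate_record(record: dict) -> list[str]:
    """Return a list of data quality issues; an empty list means the record passes."""
    issues = []
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        issues.append(f"missing fields: {sorted(missing)}")
    # An empty source breaks lineage tracking downstream.
    if "source" in record and not record["source"]:
        issues.append("empty source (cannot trace lineage)")
    return issues

record = {"source": "sales_db", "ingested_at": "2023-10-01T12:00:00Z"}
print(validate_record(record))  # flags the missing 'payload' field
```

Gating records like this before they land in the warehouse keeps metadata complete, which is what makes the governance checks further down the pipeline possible.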
Before a bank can start the process of certifying a risk model, it first needs to understand what data is being used and how that data changes as it moves from a database to a model.
You have a specific book in mind, but you have no idea where to find it. You enter the title of the book into the computer and the library’s digital inventory system tells you the exact section and aisle where the book is located. Ensuring data quality is made easier as a result.
Data quality is the responsibility of the consuming applications or data producers. Governance: The two key areas of governance are model and data. Model governance: Monitor models for performance, robustness, and fairness. For model security, custom model weights should be encrypted and isolated for different tenants.
As a proud member of the Connect with Confluent program, we help organizations going through digital transformation and IT infrastructure modernization break down data silos and power their streaming data pipelines with trusted data. Book your meeting with us at Confluent’s Current 2023. See you in San Jose!
In this blog, I’ll address some of the questions we did not have time to answer live, pulling from both Dr. Reichental’s book as well as my own experience as a data governance leader for 30+ years. Can you have proper data management without establishing a formal data governance program?
Data quality strongly impacts the quality and usefulness of content produced by an AI model, underscoring the significance of addressing data challenges. Data lakehouses improve the efficiency of deploying AI and the generation of data pipelines.
It is really well done, but as someone who spends all my time working on data governance and privacy, that top left section of “contextual data → data pipelines” is missing something: data governance. The post Why data governance is essential for enterprise AI appeared first on IBM Blog.
American Family Insurance: Governance by Design – Not as an Afterthought Who: Anil Kumar Kunden, Information Standards, Governance and Quality Specialist at AmFam Group When: Wednesday, June 7, at 2:45 PM Why attend: Learn how to automate and accelerate data pipeline creation and maintenance with data governance, a.k.a. metadata normalization.
To read more about LLMOps and MLOps, check out the O’Reilly book “Implementing MLOps in the Enterprise”, authored by Iguazio’s CTO and co-founder Yaron Haviv and by Noah Gift. Continuous monitoring of resources, data, and metrics. Data Pipeline: Manages and processes various data sources. What is LLMOps?
Activity Schema Modeling: Capturing the Customer Journey in Action Now that we’ve got our Lego blocks of customer data, let’s talk about another game-changing approach that’s shaking up the world of customer data modeling: Activity Schema Modeling. Your customer data game will never be the same.
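The core idea behind activity schema modeling is that every customer event lands as a row in one long, append-only activity stream rather than being scattered across entity tables. A minimal sketch of that shape and a typical query over it, with illustrative field names and sample data that are assumptions, not taken from the article:

```python
# Sketch of an activity schema: one table of (customer, ts, activity) rows.
# Field names follow the common activity-stream convention; the data is made up.
from datetime import datetime

activity_stream = [
    {"customer": "c1", "ts": datetime(2023, 5, 1), "activity": "visited_site"},
    {"customer": "c1", "ts": datetime(2023, 5, 2), "activity": "started_trial"},
    {"customer": "c2", "ts": datetime(2023, 5, 3), "activity": "visited_site"},
    {"customer": "c1", "ts": datetime(2023, 5, 9), "activity": "purchased"},
]

def first_activity(stream, activity):
    """Earliest occurrence of an activity per customer (a typical
    'first ever' temporal query against an activity stream)."""
    firsts = {}
    for row in sorted(stream, key=lambda r: r["ts"]):
        if row["activity"] == activity and row["customer"] not in firsts:
            firsts[row["customer"]] = row["ts"]
    return firsts

print(first_activity(activity_stream, "visited_site"))
```

Because every event shares one schema, journey questions ("first visit before first purchase?") become simple temporal joins on this single table instead of bespoke joins across many entity tables.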