Data Governance, Data Pipeline and Database

CI/CD for Data Pipelines: A Game-Changer with AnalyticsCreator

Data Science Blog

MAY 20, 2024

Continuous Integration and Continuous Delivery (CI/CD) for Data Pipelines: It is a Game-Changer with AnalyticsCreator! The need for efficient and reliable data pipelines is paramount in data science and data engineering. They transform data into a consistent format for users to consume.

Data Pipeline

Data Pipeline Data Warehouse Azure Data Lakes

Testing and Monitoring Data Pipelines: Part Two

Dataversity

JUNE 19, 2023

In part one of this article, we discussed how data testing can specifically test a data object (e.g., table, column, metadata) at one particular point in the data pipeline.
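A data test of this kind, aimed at a single column at one point in the pipeline, might look like the minimal sketch below. The function name `check_column`, the list-of-dicts row format, and the choice of metrics are illustrative assumptions, not taken from the article.

```python
def check_column(rows, column):
    """Profile one column of an in-memory table (a list of dict rows).

    Hypothetical helper: counts rows, null values, and duplicated
    non-null values so a pipeline step can assert on them.
    """
    values = [row.get(column) for row in rows]
    non_null = [v for v in values if v is not None]
    return {
        "row_count": len(rows),
        "null_count": len(values) - len(non_null),
        # Number of non-null values that repeat an earlier value.
        "duplicate_count": len(non_null) - len(set(non_null)),
    }


# Example rows as they might look after an extract step (made-up data).
sample_rows = [{"id": 1}, {"id": 1}, {"id": None}]
report = check_column(sample_rows, "id")
```

A pipeline step could then fail fast when `report["null_count"]` or `report["duplicate_count"]` exceeds whatever threshold the team agrees on.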

Data Pipeline

Data Pipeline Database Data Models Data Modeling

Data Fabric and Address Verification Interface

IBM Data Science in Practice

NOVEMBER 28, 2022

Implementing a data fabric architecture is the answer. What is a data fabric? Data fabric is defined by IBM as “an architecture that facilitates the end-to-end integration of various data pipelines and cloud environments through the use of intelligent and automated systems.”

Data Pipeline

Data Pipeline Data Quality Data Preparation Data Governance

Future trends in ETL

Dataconomy

FEBRUARY 12, 2024

The acronym ETL—Extract, Transform, Load—has long been the linchpin of modern data management, orchestrating the movement and manipulation of data across systems and databases. This methodology has been pivotal in data warehousing, setting the stage for analysis and informed decision-making.

ETL

ETL Data Governance Machine Learning Machine Learning

An IBM Z Data Integration Success Story

Precisely

MARCH 28, 2025

Some departments used IBM Db2, while others relied on VSAM files or IMS databases creating complex data governance processes and costly data pipeline maintenance. They realized they needed a more automated, streamlined way to access the data.

Data Pipeline

Data Pipeline Database Data Governance Analytics

Architect a mature generative AI foundation on AWS

Flipboard

MAY 30, 2025

A generative AI foundation can provide primitives such as models, vector databases, and guardrails as a service and higher-level services for defining AI workflows, agents and multi-agents, tools, and also a catalog to encourage reuse. Considerations here are choice of vector database, optimizing indexing pipelines, and retrieval strategies.

AWS

AWS AI AI Database

Build trust in banking with data lineage

IBM Journey to AI blog

APRIL 20, 2023

This trust depends on an understanding of the data that inform risk models: where does it come from, where is it being used, and what are the ripple effects of a change? Moreover, banks must stay in compliance with industry regulations like BCBS 239, which focus on improving banks’ risk data aggregation and risk reporting capabilities.

Database

Database Data Engineering Data Engineer Data Engineering

Discover the Most Important Fundamentals of Data Engineering

Pickl AI

NOVEMBER 4, 2024

Key components include data modelling, warehousing, pipelines, and integration. Effective data governance enhances quality and security throughout the data lifecycle. What is Data Engineering? This section explores essential aspects of Data Engineering. from 2025 to 2030.

Data Engineer

Data Engineer Data Engineering Data Engineering Data Engineering

Self-Service Analytics for Google Cloud, now with Looker and Tableau

Tableau

OCTOBER 8, 2021

Connecting directly to this semantic layer will help give customers access to critical business data in a safe, governed manner. This partnership makes data more accessible and trusted. Your data in the cloud. Tableau Prep allows you to combine, reshape, and clean data using an easy-to-use, visual, and direct interface.

Tableau

Tableau Analytics Analytics Machine Learning

How data engineers tame Big Data?

Dataconomy

FEBRUARY 23, 2023

Collecting, storing, and processing large datasets: Data engineers are also responsible for collecting, storing, and processing large volumes of data. This involves working with various data storage technologies, such as databases and data warehouses, and ensuring that the data is easily accessible and can be analyzed efficiently.

Big Data

Big Data Big Data Data Engineering Data Engineer

Cataloging MicroStrategy

Alation

FEBRUARY 20, 2020

Alation’s deep integration with tools like MicroStrategy and Tableau provides visibility into the complete data pipeline: from storage through visualization. In creating a single source of truth, MicroStrategy has reduced the risk of error or misinterpretation.

Data Governance

Data Governance Tableau Hadoop Data Pipeline

MLOps Landscape in 2023: Top Tools and Platforms

The MLOps Blog

JUNE 27, 2023

It integrates with Git and provides a Git-like interface for data versioning, allowing you to track changes, manage branches, and collaborate with data teams effectively. Dolt: Dolt is an open-source relational database system built on Git. It could help you detect and prevent data pipeline failures, data drift, and anomalies.

Machine Learning

Machine Learning Machine Learning ML ML

Announcing Alation Tableau Edition

Alation

FEBRUARY 20, 2020

We believe that this offering, Alation Tableau Edition, realizes the full promise of self-service analytics by allowing analysts to self-serve without making any of the errors of omission or commission that traditionally accompany an ungoverned data environment. We characterize this offering as Governance for Insight.

Tableau

Tableau Data Governance Data Pipeline Analytics

5 Data Quality Best Practices

Precisely

SEPTEMBER 30, 2024

Data enrichment adds context to existing information, enabling business leaders to draw valuable new insights that would otherwise not have been possible. Managing an increasingly complex array of data sources requires a disciplined approach to integration, API management, and data security.

Data Quality

Data Quality Data Governance Machine Learning Machine Learning

What Is Data Modernization? 5 Benefits Worth Knowing

Alation

APRIL 19, 2022

Data producers and consumers alike are working from home and hybrid locations more often. And in an increasingly remote workforce, people need to access data systems easily to do their jobs. This might mean that they’re accessing a database from a smartphone, computer, or tablet. Today, data dwells everywhere.

Data Governance

Data Governance Cloud Data Database Data Silos

How Fastweb fine-tuned the Mistral model using Amazon SageMaker HyperPod as a first step to build an Italian large language model

AWS Machine Learning Blog

DECEMBER 18, 2024

Meta’s FAISS library, renowned for its efficiency in similarity search and clustering of dense vectors, was used as the underlying vector database due to its ability to handle large-scale datasets effectively. She specializes in AI operations, data governance, and cloud architecture on AWS.

Clustering

Clustering AWS AI AI

How Do You Call Snowflake Stored Procedures Using dbt Hooks?

phData

AUGUST 2, 2024

Snowflake AI Data Cloud is one of the most powerful platforms, including storage services supporting complex data. Integrating Snowflake with dbt adds another layer of automation and control to the data pipeline. Snowflake stored procedures and dbt Hooks are essential to modern data engineering and analytics workflows.

Data Pipeline

Data Pipeline Python Database SQL

The Data Dilemma: Exploring the Key Differences Between Data Science and Data Engineering

Pickl AI

JULY 25, 2023

Data engineers are essential professionals responsible for designing, constructing, and maintaining an organization’s data infrastructure. They create data pipelines, ETL processes, and databases to facilitate smooth data flow and storage. Data Warehousing: Amazon Redshift, Google BigQuery, etc.

Data Engineering

Data Engineering Data Engineer Data Engineering Data Engineering

Using Agile Data Stacks To Enable Flexible Decision Making In Uncertain Economic Times

Precisely

FEBRUARY 2, 2023

Pipelines must have robust data integration capabilities that integrate data from multiple data silos, including the extensive list of applications used throughout the organization, databases and even mainframes. Changes to one database must also be reflected in any other database in real time.

Data Silos

Data Silos Data Pipeline Database Data Observability

Top 5 Fivetran Connectors for Healthcare

phData

APRIL 29, 2024

Understanding Fivetran: Fivetran is a popular Software-as-a-Service platform that enables users to automate the movement of data and ETL processes across diverse sources to a target destination. The phData team achieved a major milestone by successfully setting up a secure end-to-end data pipeline for a substantial healthcare enterprise.

SQL

SQL Data Warehouse Azure Cloud Data

Maximising Efficiency with ETL Data: Future Trends and Best Practices

Pickl AI

OCTOBER 17, 2024

Introduction: ETL plays a crucial role in Data Management. This process enables organisations to gather data from various sources, transform it into a usable format, and load it into data warehouses or databases for analysis. The goal is to retrieve the required data efficiently without overwhelming the source systems.

ETL

ETL Data Warehouse Data Quality Data Governance

What is Snowflake Horizon?

phData

AUGUST 5, 2024

Who should have access to sensitive data? How can my analysts discover where data is located? All of these questions describe a concept known as data governance. The Snowflake AI Data Cloud has built an entire blanket of features called Horizon, which tackles all of these questions and more.

Data Governance

Data Governance Data Quality Data Lakes ML

How Does Fivetran Drive Business Value?

phData

APRIL 23, 2024

From structured data sources like ERPs, CRM, and relational data stores to unstructured data such as PDFs, images, and videos, enterprises are confronted with the daunting challenge of keeping up with their ever-expanding data ecosystem.

Data Governance

Data Governance Data Pipeline Data Warehouse Cloud Data

What is Snowflake’s Data Quality Monitoring Feature and How is it Used?

phData

OCTOBER 25, 2024

It’s common to have terabytes of data in most data warehouses, so data quality monitoring is often challenging and cost-intensive due to dependencies on multiple tools, and it is eventually ignored. This results in poor credibility and data consistency after some time, leading businesses to mistrust the data pipelines and processes.

Data Quality

Data Quality Data Pipeline Data Governance Database

The Cloud Connection: How Governance Supports Security

Alation

APRIL 14, 2022

Semantics, context, and how data is tracked and used mean even more as you stretch to reach post-migration goals. This is why, when data moves, it’s imperative for organizations to prioritize data discovery. Data discovery is also critical for data governance, which, when ineffective, can actually hinder organizational growth.

Data Governance

Data Governance ML ML Cloud Data

The Audience for Data Catalogs and Data Intelligence

Alation

JUNE 21, 2022

The audience grew to include data scientists (who were even more scarce and expensive) and their supporting resources (e.g., After that came data governance, privacy, and compliance staff. Power business users and other non-purely-analytic data citizens came after that. Data engineers want to catalog data pipelines.

DataOps

DataOps Data Scientist Data Quality Data Pipeline

Data democratization: How data architecture can drive business decisions and AI initiatives

IBM Journey to AI blog

AUGUST 4, 2023

Data democratization instead refers to the simplification of all processes related to data, from storage architecture to data management to data security. It also requires an organization-wide data governance approach, from adopting new types of employee training to creating new policies for data storage.

Data Lakes

Data Lakes AI AI Data Governance

Five benefits of a data catalog

IBM Journey to AI blog

DECEMBER 16, 2022

Because Alex can use a data catalog to search all data assets across the company, she has access to the most relevant and up-to-date information. She can search structured or unstructured data, visualizations and dashboards, machine learning models, and database connections. Protected and compliant data.

Data Quality

Data Quality Data Governance Data Wrangling Data Scientist

Turnkey Cloud DataOps: Solution from Alation and Accenture

Alation

MARCH 22, 2022

They created each capability as modules, which can either be used independently or together to build automated data pipelines. Alation’s governance capabilities include automated classification, profiling, data quality, lineage, stewardship, and deep policy integration with leading cloud-native databases like Snowflake.

DataOps

DataOps Data Pipeline Data Engineering Data Engineer

How Fivetran + dbt provides Enterprise Scale to ELT Pipelines

phData

OCTOBER 12, 2023

When the data or pipeline configuration needs to be changed, tools like Fivetran and dbt reduce the time required to make the change, and increase the confidence your team can have around the change. These allow you to scale your pipelines quickly. Governance doesn’t have to be scary or preventative to your cloud data warehouse.

Data Warehouse

Data Warehouse Database Cloud Data Data Pipeline

The Data Integration Solution Checklist: Top 10 Considerations

Precisely

MAY 13, 2024

Whether you’re bringing a new system online or connecting an existing database with your analytics platform, the process should be simple and straightforward. It synthesizes all the metadata around your organization’s data assets and arranges the information into a simple, easy-to-understand format.

Data Governance

Data Governance Data Pipeline Cloud Data Data Quality

Why You Need Data Observability to Improve Data Quality

Precisely

MAY 4, 2023

A broken data pipeline might bring operational systems to a halt, or it could cause executive dashboards to fail, reporting inaccurate KPIs to top management. Is your data governance structure up to the task? Read What Is Data Observability? Complexity leads to risk.

Data Observability

Data Observability Data Quality Data Pipeline Machine Learning

Building a Data Culture with Snowflake: A Guide for CIOs

phData

JUNE 20, 2024

This oftentimes leads to shadow IT processes and duplicated data pipelines. Data is siloed, and there is no singular source of truth but fragmented data spread across the organization. Establishing a data culture changes this paradigm. The business will find other means to answer their questions.

Data Governance

Data Governance Analytics Analytics Power BI

phData Toolkit December 2022 Update

phData

DECEMBER 29, 2022

The phData Toolkit continues to have additions made to it as we work with customers to accelerate their migrations, build a data governance practice, and ensure quality data products are built. This includes things like creating and modifying databases, schemas, and permissions. But what does this actually mean?

SQL

SQL Database Database Administration Data Profiling

Data architecture strategy for data quality

IBM Journey to AI blog

JANUARY 5, 2023

What does a modern data architecture do for your business? A modern data architecture like Data Mesh and Data Fabric aims to easily connect new data sources and accelerate development of use case specific data pipelines across on-premises, hybrid and multicloud environments.

Data Quality

Data Quality Data Lakes Data Warehouse Big Data

How to Manage Unstructured Data in AI and Machine Learning Projects

DagsHub

OCTOBER 23, 2024

With proper unstructured data management, you can write validation checks to detect multiple entries of the same data. Continuous learning: In a properly managed unstructured data pipeline, you can use new entries to train a production ML model, keeping the model up-to-date. mp4,webm, etc.), and audio files (.wav,mp3,acc,
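One way to write a validation check for multiple entries of the same data is to group files by a content hash. The sketch below is only an illustration under that assumption; the function names and the in-memory `items` mapping are hypothetical, and a hash comparison only catches byte-identical copies, not re-encoded or resized versions of the same media.

```python
import hashlib
from collections import defaultdict


def content_digest(data: bytes) -> str:
    """Return a SHA-256 hex digest identifying the exact byte content."""
    return hashlib.sha256(data).hexdigest()


def find_duplicates(items: dict[str, bytes]) -> list[list[str]]:
    """Group entry names whose byte content is identical.

    `items` maps a hypothetical entry name (for example a file path)
    to its raw bytes. Only groups with more than one member are returned.
    """
    groups = defaultdict(list)
    for name, data in items.items():
        groups[content_digest(data)].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Made-up entries: two identical audio payloads and one distinct video.
duplicates = find_duplicates(
    {"a.wav": b"same-bytes", "b.wav": b"same-bytes", "c.mp4": b"other-bytes"}
)
```

For large files, the digest would normally be computed by streaming the file in chunks rather than holding all bytes in memory.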

Machine Learning

Machine Learning Machine Learning Data Lakes AI

Who is a BI Developer: Role, Responsibilities & Skills

Pickl AI

JULY 3, 2023

Here are steps you can follow to pursue a career as a BI Developer: Acquire a solid foundation in data and analytics: Start by building a strong understanding of data concepts, relational databases, SQL (Structured Query Language), and data modeling. Knowledge of tools like D3.js

Business Intelligence

Business Intelligence Business Intelligence SQL Data Visualization

What is the Snowflake Data Cloud and How Much Does it Cost?

phData

NOVEMBER 9, 2023

The main goal of a data mesh structure is to drive domain-driven ownership, data as a product, self-service infrastructure, and federated governance. One of the primary challenges that organizations face is data governance. This is “lift-and-shift”; while it works, it doesn’t take full advantage of the cloud.

Data Warehouse

Data Warehouse Data Lakes Clustering Cloud Data

Implementing Gen AI for Financial Services

Iguazio

FEBRUARY 20, 2024

The required architecture includes a data pipeline, ML pipeline, application pipeline and a multi-stage pipeline. Let’s dive into the data management pipeline. What are the Key Elements of Data Management in Gen AI? The third is substituting names and SSNs with masked data.
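The masking step mentioned above could be sketched as follows. This is a simplified illustration, not the article's implementation: it assumes SSNs appear in the common `NNN-NN-NNNN` form and that the names to mask are supplied as a list (`known_names` is a hypothetical parameter; real pipelines typically rely on entity recognition instead).

```python
import re

# Matches SSNs written as three digits, two digits, four digits.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def mask_text(text: str, known_names: list[str]) -> str:
    """Replace SSNs and the supplied names with fixed placeholders."""
    masked = SSN_PATTERN.sub("***-**-****", text)
    for name in known_names:
        # re.escape guards against names containing regex metacharacters.
        masked = re.sub(re.escape(name), "[NAME]", masked)
    return masked


# Made-up record for illustration only.
masked_record = mask_text("Jane Doe, SSN 123-45-6789", ["Jane Doe"])
```

Using fixed placeholders keeps the masked text readable for downstream prompts while removing the sensitive values themselves.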

AI

AI AI Data Pipeline Analytics

Introduction to Apache NiFi and Its Architecture

Pickl AI

JULY 30, 2024

Flow-Based Programming: NiFi employs a flow-based programming model, allowing users to create complex data flows using simple drag-and-drop operations. This visual representation simplifies the design and management of data pipelines. Its visual interface allows users to design complex ETL workflows with ease.

ETL

ETL Data Lakes Big Data Big Data

Maximize the Power of dbt and Snowflake to Achieve Efficient and Scalable Data Vault Solutions

phData

AUGUST 10, 2023

In data vault implementations, critical components encompass the storage layer, ELT technology, integration platforms, data observability tools, Business Intelligence and Analytics tools, Data Governance, and Metadata Management solutions. The most important reason for using dbt in Data Vault 2.0

SQL

SQL Data Observability Data Quality Data Pipeline

An Overview of Security and Compliance Features in Snowflake

phData

JANUARY 15, 2024

Access Controls and User Authentication: Access control regulates who can interact with various database objects, such as tables, views, and functions. In Snowflake, securable objects (representing database resources) are controlled through roles. The process computes costs based on data volume.

Data Governance

Data Governance Database Data Warehouse Cloud Computing

Performance Benefits of Snowpark for ML Workloads

phData

MARCH 22, 2023

Top Use Cases of Snowpark: With Snowpark, bringing business logic to data in the cloud couldn’t be easier. Transitioning work to Snowpark allows for faster ML deployment, easier scaling, and robust data pipeline development. ML Applications: For data scientists, models can be developed in Python with common machine learning tools.

ML

ML ML Python Machine Learning
