Summary: Hydra simplifies process configuration in machine learning by dynamically managing parameters, organising configurations hierarchically, and enabling runtime overrides. As the global machine learning market, valued at USD 35.80 … These issues can hinder experimentation, reproducibility, and workflow efficiency.
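The hierarchical composition and runtime overrides mentioned above can be illustrated with a plain-Python sketch. The `merge` helper and the config values below are hypothetical stand-ins, not Hydra's API; Hydra itself composes YAML files and accepts `key=value` overrides on the command line.

```python
def merge(base: dict, override: dict) -> dict:
    """Recursively merge `override` into `base`, returning a new dict."""
    out = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(out.get(key), dict):
            out[key] = merge(out[key], value)
        else:
            out[key] = value
    return out

# Hierarchical defaults, plus an override a user might pass at runtime
# (roughly what "model.lr=0.01" on a Hydra command line would express).
defaults = {"model": {"name": "resnet", "lr": 0.1}, "data": {"batch_size": 32}}
config = merge(defaults, {"model": {"lr": 0.01}})

print(config["model"]["lr"])        # 0.01 (overridden)
print(config["data"]["batch_size"]) # 32 (inherited from defaults)
```

The point of the pattern is that experiments change only the override, never the shared defaults.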
Data engineering tools are software applications or frameworks specifically designed to facilitate the process of managing, processing, and transforming large volumes of data. It integrates well with other Google Cloud services and supports advanced analytics and machine learning features.
The following points illustrate some of the main reasons why data versioning is crucial to the success of any data science and machine learning project: Storage space. One reason to version data is to keep track of multiple versions of the same dataset, each of which must also be stored.
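The storage-space concern above is commonly addressed with content-addressed storage, where byte-identical versions are stored only once. A minimal sketch, assuming an in-memory store; the `store`/`versions` structures are illustrative, not any particular versioning tool's design:

```python
import hashlib

def content_hash(data: bytes) -> str:
    """Hash the file contents; identical versions share one hash."""
    return hashlib.sha256(data).hexdigest()

store = {}     # hash -> bytes, each unique blob stored once
versions = []  # ordered history of version hashes

for snapshot in [b"a,b\n1,2\n", b"a,b\n1,2\n", b"a,b\n1,3\n"]:
    h = content_hash(snapshot)
    store.setdefault(h, snapshot)  # dedupe: only stored if new
    versions.append(h)

print(len(versions), len(store))  # 3 2  -> three versions, two unique blobs
```

Tools like DVC and Git follow the same basic idea at much larger scale.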
Data quality is owned by the consuming applications or data producers. Governance: The two key areas of governance are model and data. Model governance: monitor models for performance, robustness, and fairness. Model versions should be managed centrally in a model registry.
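A central model registry can be sketched as a versioned mapping from model name to its history. The `register` helper below is a hypothetical illustration of the idea, not the API of any real registry such as MLflow:

```python
registry = {}  # model name -> ordered list of versioned entries

def register(name: str, model, metrics: dict) -> int:
    """Append a new version of `name` and return its version number."""
    versions = registry.setdefault(name, [])
    versions.append({"version": len(versions) + 1,
                     "model": model,
                     "metrics": metrics})
    return versions[-1]["version"]

v1 = register("churn", {"coef": [0.1]}, {"auc": 0.81})
v2 = register("churn", {"coef": [0.2]}, {"auc": 0.84})

latest = registry["churn"][-1]
print(v2, latest["metrics"]["auc"])  # 2 0.84
```

Centralizing versions this way is what makes rollback and auditability possible.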
How to evaluate MLOps tools and platforms Like every software solution, evaluating MLOps (Machine Learning Operations) tools and platforms can be a complex task as it requires consideration of varying factors. An integrated model factory to develop, deploy, and monitor models in one place using your preferred tools and languages.
Join us for a month-long celebration of open-source contributions to Machine Learning and AI projects. Gain hands-on experience building datasets, models, pipelines, and more! How to participate in DagsHub's Machine Learning Challenge during Hacktoberfest? Image by Hacktoberfest What is DagsHub?
In addition to its groundbreaking AI innovations, Zeta Global has harnessed Amazon Elastic Container Service (Amazon ECS) with AWS Fargate to deploy a multitude of smaller models efficiently. Zeta’s AI innovation is powered by a proprietary machine learning operations (MLOps) system, developed in-house.
Building machine learning models is a highly iterative process. After building a simple MVP for our project, we will most likely carry out a series of experiments in which we try out different models (along with their hyperparameters), create or add various features, or utilize data preprocessing techniques.
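The iterative experimentation described above can be sketched as a parameter sweep that records each run. `run_experiment` here is a hypothetical stand-in that returns a deterministic fake metric rather than actually training a model:

```python
import itertools

def run_experiment(model_name: str, lr: float) -> dict:
    """Stand-in for a train/evaluate step; returns a fake score."""
    score = lr * (2.0 if model_name == "tree" else 1.0)
    return {"model": model_name, "lr": lr, "score": score}

# Sweep every model/learning-rate combination and keep all results.
results = [run_experiment(m, lr)
           for m, lr in itertools.product(["linear", "tree"], [0.01, 0.1])]

best = max(results, key=lambda r: r["score"])
print(best["model"], best["lr"])  # tree 0.1
```

In practice the results list would be persisted by an experiment tracker so runs remain comparable across iterations.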
Summary: Data engineering tools streamline data collection, storage, and processing. Learning these tools is crucial for building scalable data pipelines. … offers Data Science courses covering these tools with a job guarantee for career growth. Below are 20 essential tools every data engineer should know.
Unstructured data makes up 80% of the world's data and is growing. Managing unstructured data is essential for the success of machine learning (ML) projects. Without structure, data is difficult to analyze and extracting meaningful insights and patterns is challenging.
Overview: Data science vs data analytics Think of data science as the overarching umbrella that covers a wide range of tasks performed to find patterns in large datasets, structure data for use, train machine learning models and develop artificial intelligence (AI) applications.
Data engineers are essential professionals responsible for designing, constructing, and maintaining an organization’s data infrastructure. They create data pipelines, ETL processes, and databases to facilitate smooth data flow and storage.
Leveraging Looker’s semantic layer will provide Tableau customers with trusted, governed data at every stage of their analytics journey. With its LookML modeling language, Looker provides a unique, modern approach to define governed and reusable data models to build a trusted foundation for analytics.
Institute of Analytics The Institute of Analytics is a non-profit organization that provides data science and analytics courses, workshops, certifications, research, and development. The courses and workshops cover a wide range of topics, from basic data science concepts to advanced machine learning techniques.
Summary: The fundamentals of Data Engineering encompass essential practices like data modelling, warehousing, pipelines, and integration. Understanding these concepts enables professionals to build robust systems that facilitate effective data management and insightful analysis. What is Data Engineering?
In today’s landscape, AI is becoming a major focus in developing and deploying machine learning models. It isn’t just about writing code or creating algorithms — it requires robust pipelines that handle data, model training, deployment, and maintenance.
If you ask data professionals what the most challenging part of their day-to-day work is, you will likely discover their concerns around managing different aspects of data before they graduate to the data modeling stage. This ensures that the data is accurate, consistent, and reliable.
Source: [link] Similarly, while building any machine learning-based product or service, training and evaluating the model on a few real-world samples does not necessarily mean the end of your responsibilities. You need to make that model available to the end users, monitor it, and retrain it for better performance if needed.
In order to fully leverage this vast quantity of collected data, companies need a robust and scalable data infrastructure to manage it. This is where Fivetran and the Modern Data Stack come in: data modeling, data cleanup, and more. What is Fivetran?
It includes processes that trace and document the origin of data, models and associated metadata and pipelines for audits. An AI governance framework ensures the ethical, responsible and transparent use of AI and machine learning (ML).
Managing data pipelines efficiently is paramount for any organization. The Snowflake Data Cloud has introduced a groundbreaking feature that promises to simplify and supercharge this process: Snowflake Dynamic Tables. If you’re looking to leverage the full power of the Snowflake Data Cloud, let phData be your guide.
Building an ML team Following the surge in ML use cases with the potential to transform business, leaders are making significant investments in ML collaboration, building teams that can deliver on the promise of machine learning. Team composition: The team comprises domain experts, data engineers, data scientists, and ML engineers.
For example, a data scientist would be a good fit for a team that is in charge of handling large swaths of data and creating actionable insights from them. In another industry what matters is being able to predict behaviors in the medium and short terms, and this is where a machine learning engineer might come into play.
It is the process of converting raw data into relevant and practical knowledge to help evaluate the performance of businesses, discover trends, and make well-informed choices. Data gathering, data integration, data modelling, information analysis, and data visualization are all part of business intelligence.
The future of business depends on artificial intelligence and machine learning. According to IDC, 83% of CEOs want their organizations to be more data-driven. Data scientists could be your key to unlocking the potential of the Information Revolution—but what do data scientists do? What Do Data Scientists Do?
By maintaining historical data from disparate locations, a data warehouse creates a foundation for trend analysis and strategic decision-making. Integrating seamlessly with other Google Cloud services, BigQuery is a powerful solution for organizations seeking efficient and cost-effective large-scale data analysis.
Three experts from Capital One’s data science team spoke as a panel at our Future of Data-Centric AI conference in 2022. Please welcome to the stage: Senior Director of Applied ML and Research, Bayan Bruss; Director of Data Science, Erin Babinski; and Head of Data and Machine Learning, Kishore Mosaliganti.
LLMOps (Large Language Model Operations) is a specialized domain within the broader field of machine learning operations (MLOps). LLMOps focuses specifically on the operational aspects of large language models (LLMs). Data Pipeline: manages and processes various data sources. What is LLMOps?
Generative AI can be used to automate the data modeling process by generating entity-relationship diagrams or other types of data models, and to assist in the UI design process by generating wireframes or high-fidelity mockups. GPT-4 Data Pipelines: Transform JSON to SQL Schema Instantly Blockstream’s public Bitcoin API.
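The JSON-to-SQL-schema transformation mentioned above can also be approximated with simple rules rather than GPT-4. `json_to_schema` below is a hypothetical helper that maps the Python types of one sample record to SQL column types; real schema inference would need to handle nulls, nesting, and multiple samples:

```python
import json

# Assumed type mapping for illustration; dialects differ in practice.
SQL_TYPES = {bool: "BOOLEAN", int: "INTEGER",
             float: "DOUBLE PRECISION", str: "TEXT"}

def json_to_schema(table: str, record: dict) -> str:
    """Build a CREATE TABLE statement from one flat JSON record."""
    cols = ", ".join(f"{k} {SQL_TYPES.get(type(v), 'TEXT')}"
                     for k, v in record.items())
    return f"CREATE TABLE {table} ({cols});"

row = json.loads('{"id": 1, "amount": 9.5, "confirmed": true, "memo": "ok"}')
print(json_to_schema("payments", row))
# CREATE TABLE payments (id INTEGER, amount DOUBLE PRECISION, confirmed BOOLEAN, memo TEXT);
```

An LLM-based version earns its keep on messier inputs, where rules like these break down.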
Security is Paramount Implement robust security measures to protect sensitive time series data. Integration with Data Pipelines and Analytics TSDBs often work in tandem with other data tools to create a comprehensive data ecosystem for analysis and insights generation.
By leveraging version control, testing, and documentation features, dbt Core enables teams to ensure data quality and consistency across their pipelines while integrating seamlessly with modern data warehouses. Aside from migrations, Data Source is also great for data quality checks and can generate data pipelines.
Thus, the solution allows for scaling data workloads independently from one another and seamlessly handling data warehousing, data lakes, data sharing, and engineering. Machine Learning Integration Opportunities Organizations harness machine learning (ML) algorithms to make forecasts on the data.
Many mistakenly equate tabular data with business intelligence rather than AI, leading to a dismissive attitude toward its sophistication. Standard data science practices could also be contributing to this issue. One might say that tabular data modeling is the original data-centric AI!
How can we build up toward our vision in terms of solvable data problems and specific data products? What are the components (data sources or simpler data models) of the data products we want to build? Data science: the discipline within the organization that makes data products. What are we working towards?
IBM Infosphere DataStage IBM Infosphere DataStage is an enterprise-level ETL tool that enables users to design, develop, and run data pipelines. Key Features: Graphical Framework: Allows users to design data pipelines with ease using a graphical user interface. Read Further: Azure Data Engineer Jobs.
Source: Author Introduction Machine learning (ML) models, like other software, are constantly changing and evolving. Version control systems (VCS) play a key role in this area by offering a structured method to track changes made to models and handle versions of data and code used in these ML projects.
Advanced Analytics: Snowflake’s platform is purposefully engineered to cater to the demands of machine learning and AI-driven data science applications in a cost-effective manner. Efficient Sharing: Sigma provides several easy but secure ways to share data and analytics internally and externally.
Introduction: The Customer Data Modeling Dilemma You know, that thing we’ve been doing for years, trying to capture the essence of our customers in neat little profile boxes? For years, we’ve been obsessed with creating these grand, top-down customer data models. Yeah, that one.
They run scripts manually to preprocess their training data, rerun the deployment scripts, manually tune their models, and spend their working hours keeping previously developed models up to date. Building end-to-end machine learning pipelines lets ML engineers build once, rerun, and reuse many times.
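The "build once, rerun, reuse" idea can be sketched as a pipeline of ordered step functions operating on shared state. The step names and the trivial "model" below are illustrative stand-ins for real preprocessing, training, and evaluation code:

```python
def preprocess(state: dict) -> dict:
    """Stand-in preprocessing: scale the raw inputs."""
    state["data"] = [x * 2 for x in state["raw"]]
    return state

def train(state: dict) -> dict:
    """Stand-in training: the 'model' is just the data mean."""
    state["model"] = sum(state["data"]) / len(state["data"])
    return state

def evaluate(state: dict) -> dict:
    """Stand-in evaluation: score the trivial model."""
    state["score"] = abs(state["model"])
    return state

PIPELINE = [preprocess, train, evaluate]  # defined once

def run(raw):
    state = {"raw": raw}
    for step in PIPELINE:  # rerun the same steps on any new data
        state = step(state)
    return state

print(run([1, 2, 3])["score"])  # 4.0
```

Because the steps are data, not scripts, the same pipeline can be rerun on fresh data or extended with a retraining trigger without manual intervention.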
Simply put, focusing solely on data analysis, coding, or modeling no longer cuts it for most corporate jobs. These two languages cover most data science workflows. Additionally, languages like DAX can be helpful for specific use cases involving data models and dashboards.