2020, Data Pipeline and Data Science

Five Interesting Data Engineering Projects

KDnuggets

MARCH 17, 2020

As the role of the data engineer continues to grow in the field of data science, so does the number of tools being developed to support wrangling all that data. This post reviews five of these tools (along with a few bonus tools) that you should pay attention to for your data pipeline work.

Data Engineering, Data Engineer

The 2021 Executive Guide To Data Science and AI

Applied Data Science

AUGUST 2, 2021

This post is a bitesize walk-through of the 2021 Executive Guide to Data Science and AI, a white paper packed with up-to-date advice for any CIO or CDO looking to deliver real value through data. ... Automation: automating data pipelines and models. 6. Team: building the right data science team is complex.

Data Science, Data Scientist, ML

Feature Platforms - A New Paradigm in Machine Learning Operations (MLOps)

IBM Data Science in Practice

MARCH 8, 2023

VC investment in AI firms rose from USD 3 billion in 2012 to close to USD 75 billion in 2020. This trend led to the proliferation of companies developing tools to address different pain points in the machine learning lifecycle. A feature platform should automatically process the data pipelines to calculate that feature.

Machine Learning, ML

How The Explosive Growth Of Data Access Affects Your Engineer’s Team Efficiency

Smart Data Collective

OCTOBER 17, 2022

In fact, you may have even heard about IDC’s new Global DataSphere Forecast, 2021-2025, which projects that global data production and replication will expand at a compound annual growth rate of 23% during the projection period, reaching 181 zettabytes in 2025. ... zettabytes of data in 2020, a tenfold increase from 6.5 ...

Big Data, Data Engineering

AIOps vs. MLOps: Harnessing big data for “smarter” ITOPs

IBM Journey to AI blog

AUGUST 12, 2024

Wearable devices (such as fitness trackers, smart watches and smart rings) alone generated roughly 28 petabytes (28 billion megabytes) of data daily in 2020. And in 2024, global daily data generation surpassed 402 million terabytes (or 402 quintillion bytes). Massive, in fact.

Big Data, ML

Pioneering computer vision: Aleksandr Timashov, ML developer

Dataconomy

AUGUST 22, 2024

In this interview, Aleksandr shares his unique experiences of leading groundbreaking projects in Computer Vision and Data Science at the Petronas global energy group (Malaysia). Hello Aleksandr. Please tell our readers about your background and how you got into Data Science and Machine Learning.

ML, Machine Learning

Best 8 Data Version Control Tools for Machine Learning 2024

DagsHub

DECEMBER 11, 2023

The following points illustrate some of the main reasons why data versioning is crucial to the success of any data science and machine learning project. Storage space: One of the reasons for versioning data is to keep track of multiple versions of the same data, which obviously need to be stored as well.

Machine Learning, Data Lakes, Data Science

Why We Started the Data Intelligence Project

Alation

JULY 7, 2022

Starting in the summer of 2020, students began using Alation to learn how to work with data and communicate around it effectively. Universities were only just beginning to plan formal academic data science programs, and the skills to be taught in those programs were still being identified. We’ve made incredible progress.

Data Scientist, Data Analyst, Analytics

Santa Reins in his Data to Deliver the Holidays

Alation

DECEMBER 23, 2021

The elf teams used data engineering to improve gift matching and deployed big data to scale the naughty and nice list long ago, before either approach was even considered within our warmer climes. And Santa was hoping to make 2021 his most data-driven year yet.

Data Governance, Data Pipeline, Tableau, Big Data

What is the Snowflake Data Cloud and How Much Does it Cost?

phData

NOVEMBER 9, 2023

In this blog, we’ll explain what makes up the Snowflake Data Cloud, how some of the key components work, and finally some estimates on how much it will cost your business to utilize Snowflake. What is the Snowflake Data Cloud?

Data Warehouse, Data Lakes, Clustering, Cloud Data

ML Collaboration: Best Practices From 4 ML Teams

The MLOps Blog

DECEMBER 28, 2022

Union of business and data teams: The success of ML projects lies in strong collaboration between the data team and the business team. Such a continuous alliance with the business team helps the data science team create ML models that have the potential to add significant business value.

ML, Data Scientist, Machine Learning

How to become an AI Architect?

Pickl AI

JULY 18, 2023

Solution Design: Creating a high-level architectural design that encompasses data pipelines, model training, deployment strategies, and integration with existing systems. There are several online platforms offering courses in artificial intelligence, data science, machine learning, and other subjects. ... billion in 2020.

AI, Machine Learning

Introduction to LangChain for Including AI from Large Language Models (LLMs) Inside Data…

Heartbeat

JANUARY 5, 2024

Introduction to LangChain for Including AI from Large Language Models (LLMs) Inside Data Applications and Data Pipelines. This article will provide an overview of LangChain, the problems it addresses, its use cases, and some of its limitations. Python: Great for including AI in Python-based software or data pipelines.

AI, Data Pipeline, Deep Learning

How SnapLogic built a text-to-pipeline application with Amazon Bedrock to translate business intent into action

Flipboard

NOVEMBER 24, 2023

Iris was designed to use machine learning (ML) algorithms to predict the next steps in building a data pipeline. He holds PhD and MS degrees in Electrical Engineering from the University of Texas at Austin and an MS in Computer Science from Georgia Institute of Technology.

Database, AWS, ETL, SQL

Snowflake Snowpark: cloud SQL and Python ML pipelines

Snorkel AI

MAY 26, 2023

What’s really important in the before part is having production-grade machine learning data pipelines that can feed your model training and inference processes. And that’s really key for taking data science experiments into production. And so that’s where we got started as a cloud data warehouse.

SQL, ML, Python

When his hobbies went on hiatus, this Kaggler made fighting COVID-19 with data his mission | A…

Kaggle

JULY 29, 2020

David: My technical background is in ETL, data extraction, data engineering and data analytics. I spent over a decade of my career developing large-scale data pipelines to transform both structured and unstructured data into formats that can be utilized in downstream systems.

ETL, Data Scientist, Data Science, Machine Learning
