Cloud Data, ETL and ML - Data Science Current

Cloud Data

ETL

AWS re:Invent 2023 Amazon Redshift Sessions Recap

Flipboard

DECEMBER 18, 2023

Amazon Redshift powers data-driven decisions for tens of thousands of customers every day with a fully managed, AI-powered cloud data warehouse, delivering the best price-performance for your analytics workloads. Learn more about the AWS zero-ETL future with newly launched AWS databases integrations with Amazon Redshift.

AWS

AWS Data Warehouse ETL SQL

How to Build ETL Data Pipeline in ML

The MLOps Blog

MAY 17, 2023

From data processing to quick insights, robust pipelines are a must for any ML system. Often the Data Team, comprising Data and ML Engineers, needs to build this infrastructure, and this experience can be painful. However, efficient use of ETL pipelines in ML can help make their lives much easier.

ETL

ETL Data Pipeline ML ML

The power of remote engine execution for ETL/ELT data pipelines

IBM Journey to AI blog

MAY 15, 2024

Two of the more popular methods, extract, transform, load (ETL) and extract, load, transform (ELT), are both highly performant and scalable. Data engineers build data pipelines, which are called data integration tasks or jobs, as incremental steps to perform data operations and orchestrate these data pipelines in an overall workflow.
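The difference between the two methods comes down to where the transform step runs. The following is a minimal sketch, not taken from the linked article: it uses Python's built-in sqlite3 module as a stand-in for a real warehouse, and the table names, the RAW_ORDERS sample rows, and the helper functions are all hypothetical.

```python
import sqlite3

# Hypothetical source rows: (order_id, amount as text, currency code)
RAW_ORDERS = [(1, "10.50", "usd"), (2, "3.25", "usd"), (3, "7.00", "eur")]


def run_etl(conn):
    """ETL: transform in application code, then load the cleaned rows."""
    cleaned = [(oid, float(amount), cur.upper()) for oid, amount, cur in RAW_ORDERS]
    conn.execute(
        "CREATE TABLE orders_etl (order_id INTEGER, amount REAL, currency TEXT)"
    )
    conn.executemany("INSERT INTO orders_etl VALUES (?, ?, ?)", cleaned)


def run_elt(conn):
    """ELT: load the raw rows first, then transform with SQL in the database."""
    conn.execute(
        "CREATE TABLE orders_raw (order_id INTEGER, amount TEXT, currency TEXT)"
    )
    conn.executemany("INSERT INTO orders_raw VALUES (?, ?, ?)", RAW_ORDERS)
    conn.execute(
        "CREATE TABLE orders_elt AS "
        "SELECT order_id, CAST(amount AS REAL) AS amount, "
        "UPPER(currency) AS currency FROM orders_raw"
    )


def fetch(conn, table):
    """Return all rows of one of the sketch's own tables, ordered by id."""
    query = f"SELECT order_id, amount, currency FROM {table} ORDER BY order_id"
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
run_etl(conn)
run_elt(conn)
```

Both paths end with the same cleaned table. In the ELT path the raw rows also remain in the database (here in orders_raw), so the transformation can be rerun or changed later without re-extracting from the source.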

Data Pipeline

Data Pipeline ETL SQL Database

Bring your own AI using Amazon SageMaker with Salesforce Data Cloud

AWS Machine Learning Blog

AUGUST 4, 2023

As a result, businesses can accelerate time to market while maintaining data integrity and security, and reduce the operational burden of moving data from one location to another. With Einstein Studio, a gateway to AI tools on the data platform, admins and data scientists can effortlessly create models with a few clicks or using code.

AWS

AWS ML ML Data Scientist

Data Science Career Paths: Analyst, Scientist, Engineer – What’s Right for You?

How to Learn Machine Learning

APRIL 26, 2025

Data Cleaning and Preparation: The tasks of cleaning and preparing the data take place before the analysis. This includes duplicate removal, missing value treatment, variable transformation, and normalization of data. Data Architect: Designs complex databases and blueprints for data management systems.
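The four cleaning steps named in this excerpt can be illustrated with a short standard-library sketch. It is not from the linked article: the sample records, the choice of median imputation, the log10 transform, and min-max scaling are all assumptions made for illustration, and other choices are equally common.

```python
import math
import statistics

# Hypothetical records: (customer_id, income); None marks a missing value.
records = [(1, 100.0), (2, None), (1, 100.0), (3, 10000.0), (4, 1000.0)]


def clean(rows):
    # 1. Duplicate removal, keeping the first occurrence of each row.
    unique = list(dict.fromkeys(rows))

    # 2. Missing value treatment: impute with the median of observed values.
    median = statistics.median(v for _, v in unique if v is not None)
    filled = [(k, median if v is None else v) for k, v in unique]

    # 3. Variable transformation: log10 to compress a skewed scale.
    logged = [(k, math.log10(v)) for k, v in filled]

    # 4. Normalization: min-max scaling into the range [0, 1].
    lo = min(v for _, v in logged)
    hi = max(v for _, v in logged)
    span = (hi - lo) or 1.0  # avoid division by zero when all values match
    return [(k, (v - lo) / span) for k, v in logged]


cleaned = clean(records)
```

The order of the steps matters in this sketch: duplicates are dropped before the median is computed so that repeated rows do not bias the imputed value.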

Data Science

Data Science Data Analyst Data Scientist Machine Learning

How to Maximize Time to Value with Fivetran and dbt

phData

OCTOBER 17, 2023

The story is all too common – a business user requests some data, the data team creates/prioritizes a ticket, and said ticket is completed after some number of months (or weeks if you’re lucky) – just to have the data be wrong, and the whole process starts again. Those are scary for data teams to change.

ETL

ETL Data Pipeline Data Engineering Data Engineer

How Does Snowpark Work?

phData

FEBRUARY 7, 2024

The Snowflake Data Cloud is a leading cloud data platform that provides various features and services for data storage, processing, and analysis. A new feature that Snowflake offers is called Snowpark, which provides an intuitive library for querying and processing data at scale in Snowflake.

Python

Python ML ML SQL

Ensure Success with Trusted Data When Moving To The Cloud

Precisely

JUNE 2, 2023

As companies strive to leverage AI/ML, location intelligence, and cloud analytics into their portfolio of tools, siloed mainframe data often stands in the way of forward momentum. Many organizations are using CDC technology to push data to cloud data platforms like Snowflake, Redshift, Databricks, and Kafka.

Data Silos

Data Silos ETL Data Quality Data Pipeline

Exploring the AI and data capabilities of watsonx

IBM Journey to AI blog

JULY 17, 2023

IBM watsonx.ai is our enterprise-ready next-generation studio for AI builders, bringing together traditional machine learning (ML) and new generative AI capabilities powered by foundation models. Watsonx.data allows customers to augment data warehouses such as Db2 Warehouse and Netezza and optimize workloads for performance and cost.

AI AI Machine Learning Machine Learning

What is ThoughtSpot? Everything You Need to Know

phData

SEPTEMBER 4, 2024

ThoughtSpot was designed to be low-code and easy for anyone to use across a business to generate insights and explore data. ThoughtSpot can easily connect to top cloud data platforms such as Snowflake AI Data Cloud, Oracle, SAP HANA, and Google BigQuery.

Analytics

Analytics Analytics SQL ETL

The Ultimate Modern Data Stack Migration Guide

phData

JULY 18, 2023

Why Migrate to a Modern Data Stack? With the birth of cloud data warehouses, data applications, and generative AI, processing large volumes of data faster and cheaper is more approachable and desired than ever. Legacy tools force users to manually build out processes that can be automated by the Modern Data Stack.

Data Warehouse

Data Warehouse Analytics Analytics SQL

Alation 2023.1: Easing Self-Service for the Modern Data Stack with Databricks and dbt Labs

Alation

APRIL 4, 2023

The people navigating these increasingly chaotic landscapes need a single place to find, understand, and use data with total confidence. Expanded Integration with Databricks Unity Catalog: Unity Catalog is Databricks’ governance and admin layer for all lakehouse data and AI assets, including files, tables, ML models, and dashboards.

DataOps

DataOps Data Engineering Data Engineer Data Engineering

What Is a Data Fabric and How Does a Data Catalog Support It?

Alation

JANUARY 25, 2022

This “analysis” is made possible in large part through machine learning (ML); the patterns and connections ML detects are then served to the data catalog (and other tools), which these tools leverage to make people- and machine-facing recommendations about data management and data integrations.

DataOps

DataOps SQL ML ML

Beginner’s Guide To GCP BigQuery (Part 1)

Mlearning.ai

JULY 10, 2023

In my seven-year Data Science journey, I’ve been exposed to a number of different databases including but not limited to Oracle Database, MS SQL, MySQL, EDW, and Apache Hadoop. You can use stored procedures to handle complex ETL processes, make API calls, and perform data validation.

SQL

SQL Database Apache Hadoop Data Science

Fivetran Modern Data Stack Conference 2023: Key Takeaways

Alation

APRIL 14, 2023

Last week, the Alation team had the privilege of joining IT professionals, business leaders, and data analysts and scientists for the Modern Data Stack Conference in San Francisco. In “The modern data stack is dead, long live the modern data stack!” Cloud costs are growing prohibitive.

Data Pipeline

Data Pipeline Data Warehouse Cloud Data ETL

The Evolution of Customer Data Modeling: From Static Profiles to Dynamic Customer 360

phData

SEPTEMBER 27, 2024

If the event log is your customer’s diary, think of persistent staging as their scrapbook – a place where raw customer data is collected, organized, and kept for future reference. In traditional ETL (Extract, Transform, Load) processes in CDPs, staging areas were often temporary holding pens for data.

Data Modeling

Data Modeling Data Models Apache Kafka Data Lakes
