In this article, we will discuss use cases and methods for using ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform) processes along with SQL to integrate data from various sources.
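To make the contrast concrete, here is a minimal ELT sketch in Python, using the built-in sqlite3 module as a stand-in for a real warehouse; the table and column names are illustrative. The raw records are loaded first, and the transformation happens afterwards in SQL, inside the database:

```python
# A minimal ELT sketch: sqlite3 stands in for a real warehouse,
# and the table/column names are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")

# Extract + Load: land the raw records first, untransformed.
conn.execute("CREATE TABLE raw_orders (id INTEGER, amount TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [(1, "19.99", "us"), (2, "5.00", "US"), (3, "n/a", "de")],
)

# Transform: clean and reshape inside the database with SQL,
# which is the defining step of ELT (vs. transforming before load in ETL).
conn.execute("""
    CREATE TABLE orders AS
    SELECT id,
           CAST(amount AS REAL) AS amount,
           UPPER(country) AS country
    FROM raw_orders
    WHERE amount GLOB '[0-9]*'
""")

print(conn.execute("SELECT * FROM orders").fetchall())
# [(1, 19.99, 'US'), (2, 5.0, 'US')]
```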
This article was published as a part of the Data Science Blogathon. Introduction: Azure Data Factory (ADF) is a cloud-based ETL (Extract, Transform, Load) tool and data integration service that allows you to create data-driven workflows. In this article, I’ll show […].
Go vs. Python for Modern Data Workflows: Need Help Deciding?
AI Functions in SQL: Now Faster and Multi-Modal. AI Functions enable users to easily access the power of generative AI directly from within SQL. Figure 3: Document intelligence arrives at Databricks with the introduction of ai_parse in SQL.
What Is a Lakebase? In this blog, we propose a new architecture for OLTP databases called a lakebase. It eliminates fragile ETL pipelines and complex infrastructure, enabling teams to move faster and deliver intelligent applications on a unified data platform.
This article was published as a part of the Data Science Blogathon. This requires developing a lot of ETL jobs and transforming the data to guarantee a consistent structure, so that it is available at each subsequent step in the […].
5 Error Handling Patterns in Python (Beyond Try-Except): Stop letting errors crash your app.
Structured query language (SQL) is one of the most popular programming languages, with nearly 52% of programmers using it in their work. SQL has outlasted many other programming languages due to its stability and reliability.
Summary: The ETL process, which consists of data extraction, transformation, and loading, is vital for effective data management. Introduction The ETL process is crucial in modern data management. What is ETL? ETL stands for Extract, Transform, Load.
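For contrast with the ELT sketch above, here is a minimal classic-ETL sketch in which the transformation happens in pandas before the load; the file path, data, and table name are hypothetical:

```python
# A minimal ETL sketch: transform in pandas *before* loading, in contrast
# to the ELT example above. Data, paths, and table names are illustrative.
import sqlite3
import pandas as pd

# Extract: in practice this might be pd.read_csv("exports/customers.csv");
# inline data keeps the sketch self-contained.
df = pd.DataFrame({"name": [" Alice ", "Bob"], "signup": ["2024-01-05", "2024-02-10"]})

# Transform: clean in memory before the data ever reaches the target.
df["name"] = df["name"].str.strip()
df["signup"] = pd.to_datetime(df["signup"])

# Load: write the cleaned frame to the destination database.
conn = sqlite3.connect("warehouse.db")
df.to_sql("customers", conn, if_exists="replace", index=False)
```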
Writing data to an AWS data lake and retrieving it to populate an AWS RDS MS SQL database involves several AWS services and a sequence of steps for data transfer and transformation. This process leverages AWS S3 for the data lake storage, AWS Glue for ETL operations, and AWS Lambda for orchestration.
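A sketch of the orchestration piece might look like the Lambda handler below, which lands a file in the data lake and then starts a Glue job run. The bucket, key, and job name are hypothetical; the boto3 calls (put_object, start_job_run) are standard, and the RDS SQL Server target is assumed to be configured inside the Glue job itself:

```python
# A hedged sketch of Lambda-driven orchestration: land data in S3, then
# trigger the Glue ETL job. Bucket, key, and job names are hypothetical.
import boto3

s3 = boto3.client("s3")
glue = boto3.client("glue")

def handler(event, context):
    # Write an extracted file into the data lake.
    s3.put_object(
        Bucket="my-data-lake",
        Key="landing/orders.csv",
        Body=b"id,amount\n1,19.99\n",
    )
    # Start the Glue job that transforms the file and loads it into the
    # RDS target configured in the job definition.
    run = glue.start_job_run(
        JobName="orders-to-rds",
        Arguments={"--source_path": "s3://my-data-lake/landing/orders.csv"},
    )
    return {"JobRunId": run["JobRunId"]}
```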
Summary: This article explores the significance of ETL Data in Data Management. It highlights key components of the ETL process, best practices for efficiency, and future trends like AI integration and real-time processing, ensuring organisations can leverage their data effectively for strategic decision-making.
Data engineers can create and manage extract, transform, and load (ETL) pipelines directly within Unified Studio using Visual ETL. It also supports unified access across different compute runtimes such as Amazon Redshift and Athena for SQL, Amazon EMR Serverless, Amazon EMR on EC2, and AWS Glue for Spark.
Unlike traditional methods that rely on complex SQL queries for orchestration, Matillion Jobs provides a more streamlined approach. By converting SQL scripts into Matillion Jobs, users can take advantage of the platform’s advanced features for job orchestration, scheduling, and sharing. What is Matillion ETL?
Unlike traditional methods that rely on complex SQL queries for orchestration, Matillion Jobs provide a more streamlined approach. By converting SQL scripts into Matillion Jobs, users can take advantage of the platform’s advanced features for job orchestration, scheduling, and sharing. If they are not, the query can be stopped.
Summary: This article highlights the primary differences between JDBC and ODBC and their unique applications and use cases. This article clarifies the key distinctions between these two database connectivity options, helping readers choose the most suitable one for their projects. In 2022, the global ODBC market was valued at $1.2
SQL, Python scripts, and web scraping libraries such as BeautifulSoup or Scrapy are used to carry out data collection. Tools like Python (with pandas and NumPy), R, and ETL platforms like Apache NiFi or Talend are used for data preparation before analysis. How to Choose the Right Data Science Career Path?
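As a small illustration of that collection-then-preparation flow, the sketch below scrapes an HTML table with BeautifulSoup and moves it into pandas for cleaning; the URL and the two-column table layout are assumptions, not a real source:

```python
# A sketch of collection (BeautifulSoup) followed by preparation (pandas).
# The URL and table structure are placeholders for a real page.
import requests
import pandas as pd
from bs4 import BeautifulSoup

resp = requests.get("https://example.com/prices")  # hypothetical page
soup = BeautifulSoup(resp.text, "html.parser")

# Collect: pull raw rows out of an HTML table.
rows = []
for tr in soup.find_all("tr"):
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    if len(cells) == 2:  # skip header and malformed rows
        rows.append(cells)

# Prepare: move into pandas and coerce types before analysis.
df = pd.DataFrame(rows, columns=["item", "price"])
df["price"] = pd.to_numeric(df["price"], errors="coerce")
```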
So if you are familiar with standard SQL queries, you are good to go. The sample data used in this article can be downloaded from the link below: Fruit and Vegetable Prices. How much do fruits and vegetables cost? Create a Glue job to perform ETL operations on your data. For this article, we will run the job on demand.
A Matillion pipeline is a collection of jobs that extract, load, and transform (ETL/ELT) data from various sources into a target system, such as a cloud data warehouse like Snowflake. Intuitive Workflow Design Workflows should be easy to follow and visually organized, much like clean, well-structured SQL or Python code.
This article is an excerpt from the book Expert Data Modeling with Power BI, Third Edition by Soheil Bakhshi, a completely updated and revised edition of the bestselling guide to Power BI and data modeling. Then we have some other ETL processes to constantly land the past 5 years of data into the Datamarts.
Data science work spans exploratory analysis code, experimentation code for modeling, ETLs for creating training datasets, Airflow (or similar) code to generate DAGs, REST APIs, streaming jobs, monitoring jobs, and more. Implementing these practices can enhance the efficiency and consistency of ETL workflows.
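For the DAG piece specifically, a minimal Airflow sketch might look like the following; the dag_id and task bodies are illustrative, and the `schedule` parameter assumes Airflow 2.4 or later:

```python
# A minimal Airflow DAG sketch for an ETL workflow; task logic is
# illustrative, the DAG/PythonOperator API is standard Airflow 2.x.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull rows from the source system")

def transform():
    print("clean and reshape the extracted rows")

def load():
    print("write the result to the warehouse")

with DAG(
    dag_id="example_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="transform", python_callable=transform)
    t3 = PythonOperator(task_id="load", python_callable=load)
    t1 >> t2 >> t3  # run extract, then transform, then load
```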
In this article, we will delve into the concept of data lakes, explore their differences from data warehouses and relational databases, and discuss the significance of data version control in the context of large-scale data management. Schema Enforcement: Data warehouses use a “schema-on-write” approach.
Putting the T for Transformation in ELT (ETL) is essential to any data pipeline. Views let you create virtual tables from the results of a SQL query. Stored Procedures: In any data warehousing solution, stored procedures encapsulate SQL logic into repeatable routines, but Snowflake has some tricks up its sleeve.
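To ground both constructs, here is a sketch issued from Python via the Snowflake connector; the connection parameters, table names, and procedure body are all placeholders, not a real schema:

```python
# A hedged sketch of a view and a stored procedure in Snowflake, issued
# through the Python connector. All names and credentials are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    user="<user>", password="<password>", account="<account>",
    warehouse="<wh>", database="<db>", schema="<schema>",
)
cur = conn.cursor()

# A view: a virtual table defined by a query, re-evaluated on each read.
cur.execute("""
    CREATE OR REPLACE VIEW recent_orders AS
    SELECT id, amount
    FROM orders
    WHERE order_date >= DATEADD(day, -30, CURRENT_DATE)
""")

# A stored procedure: reusable SQL logic invoked with CALL.
cur.execute("""
    CREATE OR REPLACE PROCEDURE purge_stale_orders()
    RETURNS STRING
    LANGUAGE SQL
    AS
    $$
    BEGIN
        DELETE FROM orders WHERE order_date < DATEADD(year, -2, CURRENT_DATE);
        RETURN 'done';
    END;
    $$
""")
cur.execute("CALL purge_stale_orders()")
```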
They create data pipelines, ETL processes, and databases to facilitate smooth data flow and storage. With expertise in programming languages like Python , Java , SQL, and knowledge of big data technologies like Hadoop and Spark, data engineers optimize pipelines for data scientists and analysts to access valuable insights efficiently.
Data Wrangling: Data Quality, ETL, Databases, Big Data. The modern data analyst is expected to be able to source and retrieve their own data for analysis. Competence in data quality, databases, and ETL (Extract, Transform, Load) is essential. SQL excels with big data and statistics, making it important for querying databases.
Some of the databases supported by Fivetran are: Snowflake Data Cloud (BETA), MySQL, PostgreSQL, SAP ERP, SQL Server, and Oracle. In this blog, we will review how to pull data from on-premise systems using Fivetran to a specific target or destination. HVA also allows the capture of changes directly from various DBMS logs.
These areas may include SQL, database design, data warehousing, distributed systems, cloud platforms (AWS, Azure, GCP), and data pipelines. ETL (Extract, Transform, Load): This is a core data engineering process for moving data from one or more sources to a destination, typically a data warehouse or data lake.
In my 7 years of Data Science journey, I’ve been exposed to a number of different databases, including but not limited to Oracle Database, MS SQL, MySQL, EDW, and Apache Hadoop. Now let’s get into the main topic of the article. For part 1 of this article, I wanted to cover Tables, Views, Stored Procedures, and Materialized Views.
In recent years, data engineering teams working with the Snowflake Data Cloud platform have embraced the continuous integration/continuous delivery (CI/CD) software development process to develop data products and manage ETL/ELT workloads more efficiently.
And that’s what we’re going to focus on in this article, which is the second in my series on Software Patterns for Data Science & ML Engineering. In this article, we’ll talk about Jupyter notebooks specifically from a business and product point of view. There are several ways to use SQL with Jupyter notebooks.
In this article, I will explain the modern data stack in detail, list some benefits, and discuss what the future holds. Reverse ETL tools. The modern data stack is also the consequence of a shift in analysis workflow, from extract, transform, load (ETL) to extract, load, transform (ELT). A Note on the Shift from ETL to ELT.
Summary: This article compares Spark vs Hadoop, highlighting Spark’s fast, in-memory processing and Hadoop’s disk-based, batch processing model, and explores their strengths, weaknesses, and use cases. Spark SQL is a module that works with structured and semi-structured data.
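A minimal PySpark sketch of the Spark SQL module mentioned above: register a DataFrame as a temporary view, then query it with SQL (the data is illustrative):

```python
# A minimal Spark SQL sketch: SQL and DataFrame code over the same data.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("spark-sql-demo").getOrCreate()

df = spark.createDataFrame([("alice", 34), ("bob", 29)], ["name", "age"])
df.createOrReplaceTempView("people")

# Query the registered view with plain SQL.
spark.sql("SELECT name FROM people WHERE age > 30").show()
```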
Enables users to trigger their custom transformations via SQL and dbt. Talend Overview: While Talend’s Open Studio for Data Integration is free-to-download software to start a basic data integration or an ETL project, it also comes with more advanced features that carry a price tag. Uses secure protocols for data security.
It covers essential topics such as SQL queries, data visualization, statistical analysis, machine learning concepts, and data manipulation techniques. This article aims to guide you through the intricacies of Data Analyst interviews, offering valuable insights with a comprehensive list of top questions. How do you join tables in SQL?
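As a sketch of how that join question might be answered, the snippet below uses sqlite3 so it runs anywhere; the tables and rows are illustrative:

```python
# Joining tables in SQL: an INNER JOIN keeps only matching rows; a
# LEFT JOIN would also keep Bob with a NULL amount. Data is illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO orders VALUES (1, 19.99), (1, 5.00);
""")

rows = conn.execute("""
    SELECT c.name, o.amount
    FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.id
""").fetchall()
print(rows)  # [('Alice', 19.99), ('Alice', 5.0)]
```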
This article explores the key fundamentals of Data Engineering, highlighting its significance and providing a roadmap for professionals seeking to excel in this vital field. Key components of data warehousing include: ETL Processes: ETL stands for Extract, Transform, Load. ETL is vital for ensuring data quality and integrity.
In this article, I’ll introduce you to a unified architecture for ML systems built around the idea of FTI pipelines and a feature store as the central component. Once you have built an ML system, you have to operate, maintain, and update it.
This article was co-written by Lynda Chao & Tess Newkold. With the growing interest in AI-powered analytics, ThoughtSpot stands out as a leader among legacy BI solutions, known for its self-service, search-driven analytics capabilities. Suppose your business requires more robust capabilities across your technology stack. Why Use ThoughtSpot?
In this Importing Data in Python Cheat Sheet article, we will explore the essential techniques and libraries that will make data import a breeze. Importing from SQL databases Python has excellent support for interacting with databases. In pandas, we can import data from various file formats like JSON, SQL, Microsoft Excel, etc.
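As a minimal sketch of the SQL-import path, the snippet below builds an in-memory SQLite table and reads it back with pd.read_sql; the table is a stand-in for any DB-API-accessible database:

```python
# Importing from a SQL database with pandas; sqlite3 stands in for any
# DB-API connection, and the table contents are illustrative.
import sqlite3
import pandas as pd

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (day TEXT, value REAL)")
conn.execute("INSERT INTO metrics VALUES ('2024-01-01', 3.14)")

# pd.read_sql runs the query and returns the result as a DataFrame.
df = pd.read_sql("SELECT * FROM metrics", conn)
print(df)
```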
In this article, we will explore the importance of BI in today’s business landscape, the skills and qualifications needed for a career in BI, and the opportunities available in this growing field. A career path in BI can be a lucrative and rewarding choice for those with interest in data analysis and problem-solving.
Over the past decade, we’ve seen Apache Spark evolve from a powerful general-purpose compute engine into a critical layer of the Open Lakehouse Architecture, with Spark SQL, Structured Streaming, open table formats, and unified governance serving as pillars for modern data platforms.