Cross-cloud data governance with Unity Catalog supports accessing S3 data from Azure Databricks. This enables organizations to enforce consistent security, auditing, and data lineage across cloud boundaries. Mirrored Azure Databricks Catalog is now Generally Available.
Data science is now one of the most sought-after and lucrative careers in the data field: businesses' growing dependence on data for decision-making has pushed demand for data science hires to a peak.
Rahul Ghosh is a seasoned Data & AI Engineer with deep expertise in cloud-based data architectures, large-scale data processing, and modern AI technologies, including generative AI, LLMs, Retrieval Augmented Generation (RAG), and agent-based systems.
Modern low-code/no-code ETL tools allow data engineers and analysts to build pipelines seamlessly using a drag-and-drop, configuration-based approach with minimal coding. One such option is the availability of Python Components in Matillion ETL, which allows us to run Python code inside the Matillion instance.
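As a rough illustration of what such a component might contain, here is a minimal, self-contained sketch of a transformation script; the inlined input rows and the `normalize` helper are hypothetical stand-ins, since a real component would typically read from a staging table or job variables.

```python
# Hypothetical sketch of a script that could run inside an ETL tool's
# Python component: read rows, apply a simple transformation, total them.
# The input data is inlined for illustration only.

rows = [
    {"customer": "acme", "amount": "120.50"},
    {"customer": "globex", "amount": "80.00"},
]

def normalize(row):
    """Uppercase the customer name and cast the amount to a float."""
    return {"customer": row["customer"].upper(), "amount": float(row["amount"])}

transformed = [normalize(r) for r in rows]
total = sum(r["amount"] for r in transformed)
```

The point of the pattern is that even inside a low-code tool, arbitrary row-level logic stays expressible as ordinary Python.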
This article was published as a part of the Data Science Blogathon. Snowflake is a cloud data platform that comes with a lot of unique features compared to traditional on-premises RDBMS systems. The post 5 Features Of Snowflake That Data Engineers Must Know appeared first on Analytics Vidhya.
By automating the provisioning and management of cloud resources through code, IaC brings a host of advantages to the development and maintenance of Data Warehouse Systems in the cloud. So why use IaC for cloud data infrastructures? Because infrastructure defined in code can be parameterized and repeated programmatically (e.g., using for loops in Python).
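To make the "for loops" point concrete, here is an illustrative sketch not tied to any specific IaC framework: generating per-environment storage resource definitions in a loop, the kind of repetition tools like the AWS CDK or Pulumi express in ordinary Python. The resource names and tags are invented for the example.

```python
# Generate one hypothetical storage-bucket definition per environment.
environments = ["dev", "staging", "prod"]

def bucket_definition(env):
    """Build an illustrative resource definition for one environment."""
    return {
        "name": f"analytics-data-{env}",
        "versioning": env == "prod",  # only keep object history in prod
        "tags": {"environment": env, "team": "data-platform"},
    }

resources = [bucket_definition(env) for env in environments]
```

Changing a convention (a tag, a naming scheme) in one function then updates every environment consistently, which is the core maintainability argument for IaC.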
This article was published as a part of the Data Science Blogathon. Snowflake is a cloud data platform solution with unique features. The post Getting Started With Snowflake Data Platform appeared first on Analytics Vidhya.
Python is the top programming language used by data engineers in almost every industry. Python has proven effective for setting up pipelines, maintaining data flows, and transforming data, thanks to its simple syntax and automation capabilities. Why Connect Snowflake to Python?
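The Snowflake connector for Python follows the standard DB-API 2.0 pattern (connect, cursor, execute, fetch). As a runnable stand-in, the sketch below uses the stdlib sqlite3 module, which implements the same interface; with snowflake-connector-python installed you would call `snowflake.connector.connect(...)` with your account credentials instead, and note that placeholder styles differ between drivers.

```python
import sqlite3

# DB-API 2.0 sketch, using sqlite3 as a stand-in for the Snowflake
# connector: connect -> cursor -> execute -> fetch.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
cur.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, 20.0)])
cur.execute("SELECT SUM(amount) FROM orders")
total = cur.fetchone()[0]
conn.close()
```

Because both drivers expose this shared interface, pipeline code written against a cursor stays largely portable across backends.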
Data science bootcamps are intensive short-term educational programs designed to equip individuals with the skills needed to enter or advance in the field of data science. They cover a wide range of topics, from Python, R, and statistics to machine learning and data visualization.
To start, get to know some key terms from the demo:
Snowflake: The centralized source of truth for our initial data
Magic ETL: Domo’s tool for combining and preparing data tables
ERP: A supplemental data source from Salesforce
Geographic: A supplemental data source (i.e., Instagram) used in the demo
Why Snowflake?
Founded in 2016 by the creator of Apache Zeppelin, Zepl provides a self-service data science notebook solution for advanced data scientists to do exploratory, code-centric work in Python, R, and Scala. Data Exploration, Visualization, and First-Class Integration. And Even More to Come in 2021.
[link] Ahmad Khan, head of artificial intelligence and machine learning strategy at Snowflake, gave a presentation entitled “Scalable SQL + Python ML Pipelines in the Cloud” about his company’s Snowpark service at Snorkel AI’s Future of Data-Centric AI virtual conference in August 2022. Welcome everybody.
Organizations must ensure their data pipelines are well designed and implemented to achieve this, especially as their engagement with cloud data platforms such as the Snowflake Data Cloud grows. For Snowflake customers, Snowpark is a powerful tool for building these effective and scalable data pipelines.
JuMa is a service of BMW Group’s AI platform for its data analysts, ML engineers, and data scientists that provides a user-friendly workspace with an integrated development environment (IDE). It is powered by Amazon SageMaker Studio and provides JupyterLab for Python and Posit Workbench for R.
The Snowflake Data Cloud is a leading cloud data platform that provides various features and services for data storage, processing, and analysis. A new feature that Snowflake offers is called Snowpark, which provides an intuitive library for querying and processing data at scale in Snowflake.
Proper data preparation leads to better model performance and more accurate predictions. SageMaker Canvas allows interactive data exploration, transformation, and preparation without writing any SQL or Python code. SageMaker Canvas recently added a Chat with data option. On the Create menu, choose Document.
Best practices are a pivotal part of any software development, and data engineering is no exception. This ensures the data pipelines we create are robust, durable, and secure, providing the desired data to the organization effectively and consistently. Below are the best practices. Schedules cannot be edited or executed.
Deployment with the AWS CDK The Step Functions state machine and associated infrastructure (including Lambda functions, CodeBuild projects, and Systems Manager parameters) are deployed with the AWS CDK using Python. He is passionate about helping customers to build scalable and modern data analytics solutions to gain insights from the data.
The DataRobot team has been working hard on new integrations that make data scientists more agile and meet the needs of enterprise IT, starting with Snowflake. We’ve tightened the loop between ML data prep, experimentation, and testing all the way through to putting models into production.
Through Impact Analysis, users can determine if a problem occurred with data upstream, and locate the impacted data downstream. With robust data lineage, data engineers can find and fix issues fast and prevent them from recurring. Similarly, analysts gain a clear view of how data is created.
However, many analysts and other data professionals run into two common problems: they are not given direct access to their database, and they lack the SQL skills to write the queries themselves. The traditional solution to these problems is to rely on IT and data engineering teams. What can be done?
Within watsonx.ai, users can take advantage of open-source frameworks like PyTorch, TensorFlow and scikit-learn alongside IBM’s entire machine learning and data science toolkit and its ecosystem tools for code-based and visual data science capabilities. Savings may vary depending on configurations, workloads and vendor.
These tools are used to manage big data, which is defined as data that is too large or complex to be processed by traditional means. How Did the Modern Data Stack Get Started? The rise of cloud computing and cloud data warehousing has catalyzed the growth of the modern data stack. Data scientists.
Matillion is a complete ETL tool that integrates with an extensive list of pre-built data source connectors, loads data into cloud data environments such as Snowflake, and then performs transformations to make data consumable by analytics tools such as Tableau and Power BI.
This article explores the importance of ETL pipelines in machine learning, a hands-on example of building ETL pipelines with a popular tool, and suggests the best ways for data engineers to enhance and sustain their pipelines. Create a Python function file that consists of all the ETL tasks – etl_functions.py.
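A minimal sketch of what such an etl_functions.py-style module might look like, assuming one small, testable function per stage; the source records and the list-backed "warehouse" are hypothetical stand-ins for real source and target systems.

```python
# Hypothetical contents of an etl_functions.py-style module:
# one function per ETL stage, operating on plain Python structures.

def extract():
    """Stand-in for reading raw records from a source system."""
    return [{"name": " Ada ", "score": "91"}, {"name": "Linus", "score": "88"}]

def transform(records):
    """Clean whitespace and cast string fields to proper types."""
    return [{"name": r["name"].strip(), "score": int(r["score"])} for r in records]

def load(records, target):
    """Stand-in for writing to a warehouse table: append to a list."""
    target.extend(records)
    return len(target)

warehouse = []
loaded = load(transform(extract()), warehouse)
```

Keeping each stage a pure function like this makes the pipeline easy to unit-test and to re-wire under an orchestrator.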
ThoughtSpot is a cloud-based AI-powered analytics platform that uses natural language processing (NLP) or natural language query (NLQ) to quickly query results and generate visualizations without the user needing to know any SQL or table relations. Suppose your business requires more robust capabilities across your technology stack.
However, if there’s one thing we’ve learned from years of successful cloud data implementations here at phData, it’s the importance of: Defining and implementing processes Building automation, and Performing configuration …even before you create the first user account. You can use whatever works best for your technology.
Celonis aims to offer machine learning within its platform from a single source and has also developed its own Python libraries for this purpose. Reduced personnel costs are often achievable when internal data engineers are available to develop the data models in-house. So far, this has revolved much more around, e.g.