AI/BI Genie is now generally available, empowering business users to ask data questions in natural language and receive accurate, explainable answers. Powered by Data Intelligence, Genie learns from organizational usage patterns and metadata to generate SQL, charts, and summaries grounded in trusted data.
These agents combine RAG and text-to-SQL approaches to provide precise, context-aware responses while maintaining data governance. The implementation details of the text-to-SQL workflow are described in the following diagram. Access to these agents is governed by user roles and job functions.
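The retrieval step of such a text-to-SQL agent can be sketched in a few lines: pull the table schemas most relevant to the question, then assemble a grounded prompt for the SQL-generating model. This is a minimal illustration, not the workflow from the diagram; the table names, the keyword-based retrieval, and the prompt wording are all hypothetical (production systems typically use vector search over schema metadata).

```python
# Hypothetical schema catalog; real systems pull this from governed metadata.
SCHEMAS = {
    "orders": "orders(order_id INT, customer_id INT, total NUMERIC, ordered_at DATE)",
    "customers": "customers(customer_id INT, region TEXT, segment TEXT)",
}

def retrieve_schemas(question: str) -> list[str]:
    """Naive keyword retrieval standing in for embedding-based search."""
    q = question.lower()
    return [ddl for name, ddl in SCHEMAS.items() if name.rstrip("s") in q]

def build_prompt(question: str) -> str:
    """Ground the LLM in only the retrieved, governed schemas."""
    context = "\n".join(retrieve_schemas(question))
    return (
        "You are a SQL assistant. Use only these tables:\n"
        f"{context}\n"
        f"Question: {question}\nSQL:"
    )
```

Restricting the prompt to retrieved schemas is what keeps the generated SQL inside the user's governed scope.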
Data can be generated from databases, sensors, social media platforms, APIs, logs, and web scraping, and it can be structured (like tables in databases), semi-structured (like XML or JSON), or unstructured (like text, audio, and images). Deployment and Monitoring: Once a model is built, it is moved to production.
“Vector databases are completely different from your cloud data warehouse.” You might have heard that statement if you are involved in creating vector embeddings for your RAG-based Gen AI applications. This process is repeated until the entire text is divided into coherent segments. Return the chunks as an ARRAY.
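A minimal sketch of that chunking step, under the assumption that "coherent segments" means sentence-aligned chunks capped at a character budget (a common baseline before embedding); the returned Python list plays the role of the ARRAY:

```python
import re

def chunk_text(text: str, max_chars: int = 200) -> list[str]:
    """Split text into sentence-aligned chunks of at most max_chars
    characters each; repeat until the whole text is consumed."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for s in sentences:
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and len(current) + len(s) + 1 > max_chars:
            chunks.append(current)
            current = s
        else:
            current = f"{current} {s}".strip()
    if current:
        chunks.append(current)
    return chunks
```

Real pipelines often add overlap between chunks or split on semantic boundaries instead of raw character counts.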
A Matillion pipeline is a collection of jobs that extract, load, and transform (ETL/ELT) data from various sources into a target system, such as a cloud data warehouse like Snowflake. Intuitive Workflow Design: Workflows should be easy to follow and visually organized, much like clean, well-structured SQL or Python code.
One such option is the availability of Python Components in Matillion ETL, which allows us to run Python code inside the Matillion instance. In this blog, we will describe 10 such Python Scripts that can provide a blueprint for using the Python component efficiently in Matillion ETL for the Snowflake AI Data Cloud.
FeatureByte, an AI startup formed by a team of data science experts, announced the release of its open-source FeatureByte SDK. The SDK allows data scientists to use Python to create state-of-the-art features and deploy feature pipelines in minutes – all with just a few lines of code.
By automating the provisioning and management of cloud resources through code, IaC brings a host of advantages to the development and maintenance of data warehouse systems in the cloud. So why use IaC for cloud data infrastructures? Among other things, it lets you generate repetitive resources programmatically (e.g., using for loops in Python).
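That for-loop idea can be sketched as follows: generating one warehouse definition per environment in code rather than hand-writing each one. The resource names, attributes, and the Terraform-flavored JSON output are all illustrative, not a real provider's schema:

```python
import json

ENVIRONMENTS = ["dev", "test", "prod"]

def warehouse_config(env: str) -> dict:
    """One hypothetical warehouse definition per environment."""
    return {
        "name": f"analytics_{env}",
        "size": "MEDIUM" if env == "prod" else "XSMALL",
        "auto_suspend_seconds": 60,
    }

# The loop replaces three hand-maintained, nearly identical config files.
configs = [warehouse_config(env) for env in ENVIRONMENTS]
print(json.dumps(configs, indent=2))
```

Because the definitions are code, a change like a new default auto-suspend value is made once and applied everywhere.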
Microsoft just held one of its largest conferences of the year, and a few major announcements were made which pertain to the cloud data science world. Azure Synapse Analytics can be seen as a merge of Azure SQL Data Warehouse and Azure Data Lake. Python support has been available for a while.
[link] Ahmad Khan, head of artificial intelligence and machine learning strategy at Snowflake gave a presentation entitled “Scalable SQL + Python ML Pipelines in the Cloud” about his company’s Snowpark service at Snorkel AI’s Future of Data-Centric AI virtual conference in August 2022. Welcome everybody.
Data science bootcamps are intensive short-term educational programs designed to equip individuals with the skills needed to enter or advance in the field of data science. They cover a wide range of topics, from Python, R, and statistics to machine learning and data visualization.
Formerly known as Periscope, Sisense is a business intelligence tool ideal for cloud data teams. With this tool, analysts are able to visualize complex data models in Python, SQL, and R. This highly flexible and modern SQL editor comes bundled with an easy-to-use, attractive interface.
The common skills required within each are listed as follows: Computer Science Programming Skills : Proficiency in various programming languages such as Python, Java, and C++ is essential. Algorithms and Data Structures : Deep understanding of algorithms and data structures to develop efficient and effective software solutions.
Usually the term refers to the practices, techniques and tools that allow access and delivery through different fields and data structures in an organisation. Data management approaches are varied and may be categorised as follows: cloud data management, master data management.
Amazon Redshift is the most popular cloud data warehouse that is used by tens of thousands of customers to analyze exabytes of data every day. If you are prompted to choose a kernel, choose Data Science as the image and Python 3 as the kernel, then choose Select.
Matillion Jobs are an important part of the modern data stack because they let us create lightweight, low-code ETL/ELT processes using a GUI, perform reverse ETL (loading data back into application databases), use LLM features, and store and transform data in multiple cloud data warehouses. Below are the best practices.
The Snowflake Data Cloud was built natively for the cloud. When we think about cloud data transformations, one crucial building block is User Defined Functions (UDFs). Python: Enabling a development team to use third-party packages can significantly reduce the need to reinvent the wheel.
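A useful property of Snowflake's Python UDFs is that the handler is an ordinary Python function, so it can be unit-tested locally before registration. The following is a hedged sketch: the function name and logic are invented for illustration, and the stdlib call stands in for whatever approved third-party package a team might import; the DDL in the trailing comment shows the general shape of registration, not a verbatim deployment script.

```python
from urllib.parse import urlparse

def extract_domain(url):
    """Handler for a hypothetical scalar UDF that pulls the hostname
    out of a URL, returning None for NULL inputs."""
    if url is None:
        return None
    return urlparse(url).hostname or ""

# Registered in Snowflake with DDL along these lines (illustrative):
# CREATE OR REPLACE FUNCTION extract_domain(url STRING) RETURNS STRING
#   LANGUAGE PYTHON RUNTIME_VERSION = '3.10'
#   HANDLER = 'extract_domain'
#   AS $$ ...this module... $$;
```

Keeping the handler free of Snowflake-specific imports is what makes this local-test-then-register workflow possible.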
The Snowflake Data Cloud is a leading cloud data platform that provides various features and services for data storage, processing, and analysis. A new feature that Snowflake offers is called Snowpark, which provides an intuitive library for querying and processing data at scale in Snowflake.
It was my first job as a data analyst. It helped me to become familiar with popular tools such as Excel and SQL and to develop my analytical thinking. The time I spent at Renault helped me realize that data analytics is something I would be interested in pursuing as a full-time career.
Services such as the Snowflake Data Cloud can house massive amounts of data and allow users to write queries to rapidly transform raw data into reports and further analyses. For somebody who cannot access their database directly or who lacks expert-level skills in SQL, this provides a significant advantage.
Organizations must ensure their data pipelines are well designed and implemented to achieve this, especially as their engagement with cloud data platforms such as the Snowflake Data Cloud grows. For customers in Snowflake, Snowpark is a powerful tool for building these effective and scalable data pipelines.
Proper data preparation leads to better model performance and more accurate predictions. SageMaker Canvas allows interactive data exploration, transformation, and preparation without writing any SQL or Python code. SageMaker Canvas recently added a Chat with data option. On the Create menu, choose Document.
Open source big data tools like Hadoop were experimented with – these could land data into a repository first before transformation. Thus, the early data lakes began following more of the EL-style flow. But then, in the 2010s, cloud data warehouses, particularly ones like Snowflake, came along and really changed the game.
However, if there’s one thing we’ve learned from years of successful cloud data implementations here at phData, it’s the importance of defining and implementing processes, building automation, and performing configuration, even before you create the first user account. And once again, for loading data, do not use SQL Inserts.
There are many frameworks for testing software, but the right way to test data and the SQL scripts that change it is less obvious. This is because databases and the data therein are constantly changing. To truly test the effects of a deployment, you need an environment with the exact data that is in production.
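One common pattern for such tests can be sketched as follows: run the SQL script under test against a throwaway copy of production-like rows and assert invariants before and after. The table, the migration statement, and the invariants are all hypothetical, and sqlite3 stands in for the real warehouse:

```python
import sqlite3

def run_migration_test():
    """Apply a data-changing script to a disposable copy and check invariants."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE orders (id INTEGER, status TEXT)")
    con.executemany("INSERT INTO orders VALUES (?, ?)",
                    [(1, "new"), (2, "shipped"), (3, "new")])
    before = con.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

    # The script under test: backfill a default status.
    con.execute("UPDATE orders SET status = 'pending' WHERE status = 'new'")

    after = con.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    assert after == before, "migration must not add or drop rows"
    remaining = con.execute(
        "SELECT COUNT(*) FROM orders WHERE status = 'new'").fetchone()[0]
    assert remaining == 0, "no rows left in the old state"
    return before, after

run_migration_test()
```

The hard part the snippet glosses over is the fixture: cloning a faithful slice of production data is usually what makes or breaks these tests.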
Fivetran: Fivetran is an automated data integration platform that offers a convenient solution for businesses to consolidate and sync data from disparate data sources. With over 160 data connectors available, Fivetran makes it easy to move supply chain data across any cloud data platform in the market.
These tools are used to manage big data, which is defined as data that is too large or complex to be processed by traditional means. How Did the Modern Data Stack Get Started? The rise of cloud computing and cloud data warehousing has catalyzed the growth of the modern data stack.
Within watsonx.ai, users can take advantage of open-source frameworks like PyTorch, TensorFlow and scikit-learn alongside IBM’s entire machine learning and data science toolkit and its ecosystem tools for code-based and visual data science capabilities.
The Snowflake AI Data Cloud has become a premier cloud data warehousing solution. Maybe you’re just getting started looking into a cloud solution for your organization, or maybe you’ve already got Snowflake and are wondering what features you’re missing out on. Snowflake has you covered with Cortex.
ThoughtSpot is a cloud-based AI-powered analytics platform that uses natural language processing (NLP) or natural language query (NLQ) to quickly query results and generate visualizations without the user needing to know any SQL or table relations. Why Use ThoughtSpot?
Matillion: Matillion is a complete ETL tool that integrates with an extensive list of pre-built data source connectors, loads data into cloud data environments such as Snowflake, and then performs transformations to make data consumable by analytics tools such as Tableau and Power BI.
Celonis also differs from most other tools in that it tries to provide the entire process mining chain in a single, exclusively cloud-based application as one suite. Perhaps we have Celonis to thank for that to some extent. But also other processes for other business areas, e.g.