As the world becomes more interconnected and data-driven, the demand for real-time applications has never been higher. Artificial intelligence (AI) and natural language processing (NLP) technologies are evolving rapidly to manage live data streams.
While many ETL tools exist, dbt (data build tool) is emerging as a game-changer. This article dives into the core functionalities of dbt, exploring its unique strengths and how […] The post Transforming Your Data Pipeline with dbt (data build tool) appeared first on Analytics Vidhya.
More than 170 tech teams used the latest cloud, machine learning, and artificial intelligence technologies to build 33 solutions. The fundamental objective is to build a manufacturer-agnostic database, leveraging generative AI’s ability to standardize sensor outputs, synchronize data, and facilitate precise corrections.
Table of Contents: Adversarial Learning with Keras and TensorFlow (Part 2): Implementing the Neural Structured Learning (NSL) Framework and Building a Data Pipeline; Adversarial Learning with NSL; CIFAR-10 Dataset; Configuring Your Development Environment; Need Help Configuring Your Development Environment?
Google Unveils Its Latest AI Model Gemini: Google has just introduced Gemini, its anticipated AI model that promises to reshape the landscape of artificial intelligence. 7 Data Science & AI Trends That Will Define 2024: 2023 was a huge year for artificial intelligence, and 2024 will be even bigger.
Data science bootcamps are intensive short-term educational programs designed to equip individuals with the skills needed to enter or advance in the field of data science. They cover a wide range of topics, from Python, R, and statistics to machine learning and data visualization.
Building a deployment pipeline for generative artificial intelligence (AI) applications at scale is a formidable challenge because of the complexities and unique requirements of these systems. Generative AI applications require continuous ingestion, preprocessing, and formatting of vast amounts of data from various sources.
To train a model using data stored outside of the three supported storage services, the data first needs to be ingested into one of these services (typically Amazon S3). This requires building a data pipeline (using tools such as Amazon SageMaker Data Wrangler) to move data into Amazon S3.
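For context, such an ingestion step might look like the following Boto3 sketch; the bucket, key, and file names are placeholders, not from the original article:

```python
# A minimal sketch, assuming boto3 is configured with AWS credentials.
import boto3

s3 = boto3.client("s3")
s3.upload_file(
    Filename="local_training_data.csv",   # local file to ingest (placeholder)
    Bucket="my-training-bucket",          # destination S3 bucket (placeholder)
    Key="datasets/training_data.csv",     # object key in the bucket
)
```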
Cloud Computing, APIs, and Data Engineering: NLP experts don’t go straight into conducting sentiment analysis on their personal laptops. Data Engineering Platforms: Spark is still the leader for data pipelines, but other platforms are gaining ground. Knowing some SQL is also essential.
You can easily store and process data using S3 and Redshift, create data pipelines with AWS Glue, deploy models through API Gateway, monitor performance with CloudWatch, and manage access control with IAM. This integrated ecosystem makes it easier to build end-to-end machine learning solutions.
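As a taste of how two of these services are driven from Python, here is a hedged boto3 sketch; the Glue job name, metric namespace, and metric values are hypothetical:

```python
import boto3

# Kick off a (hypothetical) Glue ETL job by name.
glue = boto3.client("glue")
glue.start_job_run(JobName="etl-to-redshift")

# Publish a custom CloudWatch metric for pipeline monitoring.
cloudwatch = boto3.client("cloudwatch")
cloudwatch.put_metric_data(
    Namespace="MLPipeline",
    MetricData=[{"MetricName": "RowsProcessed", "Value": 12500.0}],
)
```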
On Wednesday, Peter Norvig, PhD, Engineering Director at Google and Education Fellow at the Stanford Institute for Human-Centered Artificial Intelligence (HAI), spoke about the human side of AI and how we can focus on using AI for the greater good, improving all stakeholders’ lives and meeting the needs of all users.
Automation: Automating data pipelines and models. The most common data science languages are Python and R; SQL is also a must-have skill for acquiring and manipulating data. The Data Engineer: Not everyone working on a data science project is a data scientist.
Project Structure Creating Our Configuration File Creating Our Data Pipeline Preprocessing Faces: Detection and Cropping Summary Citation Information Building a Dataset for Triplet Loss with Keras and TensorFlow In today’s tutorial, we will take the first step toward building our real-time face recognition application. The dataset.py
With the explosion of big data and advancements in computing power, organizations can now collect, store, and analyze massive amounts of data to gain valuable insights. Machine learning, a subset of artificial intelligence, enables systems to learn and improve from data without being explicitly programmed.
These tools will help make your initial data exploration process easy. ydata-profiling (GitHub | Website): The primary goal of ydata-profiling is to provide a one-line Exploratory Data Analysis (EDA) experience in a consistent and fast solution. The output is a fully self-contained HTML application.
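The one-line flow looks roughly like this; the CSV path and report title are placeholders:

```python
import pandas as pd
from ydata_profiling import ProfileReport

df = pd.read_csv("data.csv")                 # any DataFrame works here
profile = ProfileReport(df, title="EDA Report")
profile.to_file("report.html")               # self-contained HTML application
```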
Training and Making Predictions with Siamese Networks and Triplet Loss: In the second part of this series, we developed the modules required to build the data pipeline for our face recognition application. Figure 1: Overview of our Face Recognition Pipeline (source: image by the author).
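For reference, the standard triplet loss can be sketched in TensorFlow as follows; this shows the formulation generically, not necessarily the tutorial’s exact code:

```python
import tensorflow as tf

def triplet_loss(anchor, positive, negative, margin=0.2):
    # Squared L2 distances between the embeddings.
    pos_dist = tf.reduce_sum(tf.square(anchor - positive), axis=-1)
    neg_dist = tf.reduce_sum(tf.square(anchor - negative), axis=-1)
    # Hinge: push positives closer than negatives by at least `margin`.
    return tf.reduce_mean(tf.maximum(pos_dist - neg_dist + margin, 0.0))
```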
The field of artificial intelligence is growing rapidly, and with it the demand for professionals who have tangible experience in AI and AI-powered tools. Data Engineer: Data engineers are responsible for the end-to-end process of collecting, storing, and processing data. billion in 2021 to $331.2 billion by 2026.
Data engineers are essential professionals responsible for designing, constructing, and maintaining an organization’s data infrastructure. They create data pipelines, ETL processes, and databases to facilitate smooth data flow and storage.
With AI credits, teams can streamline the annotation process using intelligent suggestions and quality control mechanisms. Confluent: Confluent provides a robust data streaming platform built around Apache Kafka. Modal: Modal offers serverless compute tailored for data-intensive workloads.
This doesn’t mean anything too complicated, but could range from basic Excel work to more advanced reporting to be used for data visualization later on. Computer Science and Computer Engineering: Similar to knowing statistics and math, a data scientist should know the fundamentals of computer science as well.
[link] Ahmad Khan, head of artificial intelligence and machine learning strategy at Snowflake, gave a presentation entitled “Scalable SQL + Python ML Pipelines in the Cloud” about his company’s Snowpark service at Snorkel AI’s Future of Data-Centric AI virtual conference in August 2022. Welcome everybody.
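To ground the talk’s topic, here is a minimal Snowpark sketch; the connection parameters and table names are placeholders, not Snowflake’s or the speaker’s actual code:

```python
# DataFrame-style Python that pushes work down into Snowflake via Snowpark.
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col

session = Session.builder.configs({
    "account": "<account>", "user": "<user>", "password": "<password>",
    "warehouse": "<warehouse>", "database": "<db>", "schema": "<schema>",
}).create()

# Filter a (hypothetical) raw events table and materialize per-user counts.
purchases = session.table("RAW_EVENTS").filter(col("EVENT_TYPE") == "purchase")
(purchases.group_by("USER_ID")
          .count()
          .write.save_as_table("PURCHASE_COUNTS", mode="overwrite"))
```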
Purina used artificial intelligence (AI) and machine learning (ML) to automate animal breed detection at scale. Deployment with the AWS CDK: The Step Functions state machine and associated infrastructure (including Lambda functions, CodeBuild projects, and Systems Manager parameters) are deployed with the AWS CDK using Python.
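As a flavor of what CDK deployment in Python looks like, here is a minimal hypothetical stack with a single Lambda function; the stack name, resource names, and asset path are invented for this sketch, not Purina’s actual code:

```python
# Minimal AWS CDK v2 sketch in Python.
from aws_cdk import App, Stack, aws_lambda as _lambda
from constructs import Construct

class BreedDetectionStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)
        # A Lambda function whose handler code lives in ./lambda (placeholder path).
        _lambda.Function(
            self, "InferenceHandler",
            runtime=_lambda.Runtime.PYTHON_3_11,
            handler="index.handler",
            code=_lambda.Code.from_asset("lambda"),
        )

app = App()
BreedDetectionStack(app, "BreedDetectionStack")
app.synth()
```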
This setup uses the AWS SDK for Python (Boto3) to interact with AWS services. Rajesh Nedunuri is a Senior Data Engineer within the Amazon Worldwide Returns and ReCommerce Data Services team. He specializes in designing, building, and optimizing large-scale data solutions.
The ZMP analyzes billions of structured and unstructured data points to predict consumer intent by using sophisticated artificial intelligence (AI) to personalize experiences at scale. Airflow for workflow orchestration: Airflow schedules and manages complex workflows, defining tasks and dependencies in Python code.
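To illustrate how tasks and dependencies are expressed in Python, here is a minimal hypothetical DAG using Airflow 2.x’s TaskFlow API; the pipeline and task names are invented for this sketch:

```python
from datetime import datetime
from airflow.decorators import dag, task

# `schedule=` assumes Airflow 2.4+; older versions use `schedule_interval=`.
@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def consumer_intent_pipeline():
    @task
    def extract() -> list:
        # Placeholder extraction step.
        return ["event-a", "event-b"]

    @task
    def score(events: list) -> None:
        # Placeholder scoring step; runs only after extract() completes.
        print(f"scoring {len(events)} events")

    score(extract())  # dependency: extract >> score

consumer_intent_pipeline()
```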
Prime_otter_86438 is working on a Python library that aims to make training ML models and running them on any microcontroller in real time for classification easy for beginners. They are seeking assistance from an expert to improve the model and make the Python package easier for the end user. If this sounds fun, connect with them in the thread!
This field is often referred to as explainable artificial intelligence (XAI). Amazon SageMaker Clarify is a feature of Amazon SageMaker that enables data scientists and ML engineers to explain the predictions of their ML models. Solution overview: SageMaker algorithms have fixed input and output data formats.
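To make the XAI idea concrete, here is a generic post-hoc explanation sketch using the open-source shap library; it illustrates the concept of per-feature attributions, and is not the SageMaker Clarify API itself:

```python
import shap
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_breast_cancer

# Fit a model on a small public dataset (illustrative choice).
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = RandomForestClassifier(n_estimators=50).fit(X, y)

explainer = shap.Explainer(model.predict, X)   # model-agnostic explainer
shap_values = explainer(X.iloc[:100])          # attributions per feature
print(shap_values.values.shape)                # (rows, features)
```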
Overview: Data science vs. data analytics. Think of data science as the overarching umbrella that covers a wide range of tasks performed to find patterns in large datasets, structure data for use, train machine learning models, and develop artificial intelligence (AI) applications.
Phase 1: Data pipeline. The Landsat 8 satellite captures detailed imagery of the area of interest every 15 days at 11:30 AM, providing a comprehensive view of the city’s landscape and environment. Data acquisition and preprocessing: To implement the modules, Gramener used the SageMaker geospatial notebook within Amazon SageMaker Studio.
This use case highlights how large language models (LLMs) can act as translators between human languages (English, Spanish, Arabic, and more) and machine-interpretable languages (Python, Java, Scala, SQL, and so on), along with sophisticated internal reasoning. Sandeep holds an MSc.
This blog will cover creating customized nodes in Coalesce, what new advanced features can already be used as nodes, and how to create them as part of your data pipeline. They’re essentially an entire data pipeline in themselves. Snowflake even handles the orchestration and scheduling of the refresh.
In the previous tutorial of this series, we built the dataset and data pipeline for our Siamese Network based Face Recognition application. Specifically, we looked at an overview of triplet loss and discussed what kind of data samples are required to train our model with the triplet loss. And that’s exactly what I do.
Data engineering involves not only collecting, storing, and processing data so that it can be used for analysis and decision-making; these professionals are also responsible for building and maintaining the infrastructure that makes this possible, and much more. Think of data engineers as the architects of the data ecosystem.
Python: When it comes to a powerful and versatile programming language, Python takes the lead. Python, with libraries such as NumPy, Pandas, and SciPy, is increasingly used for statistical analysis. Its versatility allows integration with web applications and data pipelines, making it a favourite among data scientists.
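A small illustration of that stack in action, with invented data; the variant names and distributions below are hypothetical:

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical example: compare conversion times for two site variants.
rng = np.random.default_rng(42)
df = pd.DataFrame({
    "variant_a": rng.normal(loc=12.0, scale=2.0, size=500),
    "variant_b": rng.normal(loc=11.5, scale=2.0, size=500),
})

print(df.describe())                          # pandas summary statistics
t_stat, p_value = stats.ttest_ind(df["variant_a"], df["variant_b"])
print(f"t={t_stat:.3f}, p={p_value:.4f}")     # two-sample t-test via SciPy
```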
Going Beyond with Keras Core: The Power of Keras Core: Expanding Your Deep Learning Horizons; Show Me Some Code; JAX; Harnessing model.fit(); Imports and Setup; Data Pipeline; Build a Custom Model; Build the Image Classification Model; Train the Model; Evaluation; Summary; References; Citation Information; What Is Keras Core?
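A minimal sketch of Keras Core’s multi-backend idea, assuming the keras-core preview package is installed; the tiny model is illustrative only:

```python
import os
os.environ["KERAS_BACKEND"] = "jax"  # could also be "tensorflow" or "torch"

import keras_core as keras  # backend is picked up at import time

model = keras.Sequential([
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```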
Who is an AI Architect? An AI Architect is a skilled professional responsible for designing and implementing artificial intelligence solutions within an organization. The salary of an Artificial Intelligence Architect in India ranges between ₹18.0 Lakhs and ₹56.7 Lakhs, and their average annual salary is ₹31.8 Lakhs.
SageMaker Pipelines: SageMaker Pipelines offers a user-friendly Python SDK to create integrated machine learning (ML) workflows. Our endpoint provides a single-step forecast for the provided time series data, presented as percentiles and the median, as shown in the following figure and table.
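For flavor, here is a minimal hypothetical pipeline with one processing step using the SageMaker Python SDK; the role ARN, script name, and pipeline name are placeholders:

```python
from sagemaker.sklearn.processing import SKLearnProcessor
from sagemaker.workflow.steps import ProcessingStep
from sagemaker.workflow.pipeline import Pipeline

role = "arn:aws:iam::123456789012:role/SageMakerRole"  # placeholder role ARN

processor = SKLearnProcessor(
    framework_version="1.2-1",
    role=role,
    instance_type="ml.m5.xlarge",
    instance_count=1,
)

step = ProcessingStep(
    name="PreprocessData",
    processor=processor,
    code="preprocess.py",  # hypothetical preprocessing script
)

pipeline = Pipeline(name="forecast-pipeline", steps=[step])
# pipeline.upsert(role_arn=role); pipeline.start()  # register and run
```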
Celebrating ODSC’s 10-year milestone, McGovern delved into industry trends, in-demand skills, and emerging roles shaping the field of artificial intelligence as we approach 2025. LLM Engineers: With job postings far exceeding the current talent pool, this role has become one of the hottest in AI.
Although data scientists rightfully capture the spotlight, future-focused teams also include engineers building data pipelines, visualization experts, and project managers who integrate efforts across groups. Selecting Technologies: The technology landscape for advanced analytics and artificial intelligence evolves quickly.
Machine learning (ML), a subset of artificial intelligence (AI), is an important piece of data-driven innovation. Machine learning engineers take massive datasets and use statistical methods to create algorithms that are trained to find patterns and uncover key insights in data mining projects.
Implementing Precision and Recall Calculations in Python: Now that we have defined and segregated our samples into True Positives, True Negatives, False Positives, and False Negatives, let us try to use them to compute specific metrics to evaluate our model. ( label_pred != 1 ), and their ground-truth label was positive ( label_gt = 1 ).
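A self-contained sketch of those calculations with NumPy; the binary labels below are invented for illustration, not the tutorial’s data:

```python
import numpy as np

def precision_recall(label_gt: np.ndarray, label_pred: np.ndarray):
    # True positives: predicted positive AND actually positive.
    tp = np.sum((label_pred == 1) & (label_gt == 1))
    # False positives: predicted positive but actually negative.
    fp = np.sum((label_pred == 1) & (label_gt == 0))
    # False negatives: predicted negative but actually positive.
    fn = np.sum((label_pred == 0) & (label_gt == 1))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Example usage with hypothetical labels.
gt = np.array([1, 0, 1, 1, 0, 1])
pred = np.array([1, 1, 1, 0, 0, 1])
print(precision_recall(gt, pred))  # (0.75, 0.75)
```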
Monday, May 12th: AI Bootcamp Day (Virtual Only). The sessions, conducted entirely online, will focus on core data science topics, including Python programming, machine learning basics, statistical analysis, AI Agents, and everything needed to excel as an AI engineer.
JuMa is a service of BMW Group’s AI platform for its data analysts, ML engineers, and data scientists that provides a user-friendly workspace with an integrated development environment (IDE). It is powered by Amazon SageMaker Studio and provides JupyterLab for Python and Posit Workbench for R.
The effect is that you get to use your favorite pandas API, but your data pipelines run on one of the most battle-tested and heavily-optimized data infrastructures today: databases. You can start running your Python data workflows in your data warehouse today by signing up here!
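As one concrete, hedged example of this pattern, the open-source Ibis library exposes a dataframe API that compiles to database queries; the product in the post may differ, and the backend and file name here are placeholders:

```python
import ibis

con = ibis.duckdb.connect()        # swap for your warehouse's backend
t = con.read_csv("events.csv")     # hypothetical source table

# Dataframe-style expression; the work executes inside the database.
daily = t.group_by("user_id").aggregate(n_events=t.count())
print(daily.execute().head())      # returns a pandas DataFrame
```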