Data Science Current

Generative AI: A Self-Study Roadmap

KDnuggets

JULY 11, 2025

What started with curiosity about GPT-3 has evolved into a business necessity, with companies across industries racing to integrate text generation, image creation, and code synthesis into their products and workflows. Dynamic Prompt Systems: Production applications rarely use static prompts.
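The point about dynamic prompts can be illustrated with a minimal sketch. It assumes a hypothetical `build_prompt` helper that assembles the prompt at request time from the user's question and whatever context snippets are available; the template wording, parameter names, and sample strings are illustrative and not taken from the article.

```python
from string import Template

# Hypothetical template; the wording is an assumption for illustration only.
PROMPT_TEMPLATE = Template(
    "You are a $role assistant.\n"
    "Context:\n$context\n"
    "Question: $question\n"
    "Answer in a $tone tone."
)


def build_prompt(question, snippets, role="support", tone="concise"):
    """Assemble a prompt at request time instead of hard-coding one string."""
    # Number each snippet so the model can refer back to it.
    context = "\n".join(f"[{i}] {s}" for i, s in enumerate(snippets, start=1))
    if not context:
        context = "(no context available)"
    return PROMPT_TEMPLATE.substitute(
        role=role, context=context, question=question, tone=tone
    )


prompt = build_prompt(
    "How do I reset my password?",
    ["Passwords are reset from the account page."],
)
print(prompt)
```

The same function yields a different prompt for every request, which is the contrast with a static prompt string.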

AI

AI Machine Learning

Mosaic AI Announcements at Data + AI Summit 2025

databricks

JUNE 11, 2025

Since then, we’ve had thousands of customers bring AI into production.

AI

AI SQL Data Science

Muvera: Making multi-vector retrieval as fast as single-vector search

Hacker News

JUNE 26, 2025

Lack of efficient sublinear search methods: Single-vector retrieval benefits from highly optimized algorithms (e.g.,

Algorithm

Algorithm Natural Language Processing Data Mining

Webinars

Precision in Motion: Why Process Optimization Is the Future of Manufacturing

Airflow Best Practices for ETL/ELT Pipelines

Leveraging Data Beyond Text: Multimodal AI at Scale

Data Science Connect

JULY 27, 2025

Companies like Spotify, Perplexity, and Vinted rely on Vespa to power search, recommendations, and RAG at global scale. Video search use cases—like content licensing and ad analytics—showcase the need for token-level and patch-level retrieval. Why are tensors essential for multimodal search? Why go beyond text in AI systems?

AI

AI ML

Combine keyword and semantic search for text and images using Amazon Bedrock and Amazon OpenSearch Service

Flipboard

APRIL 24, 2025

Customers today expect to find products quickly and efficiently through intuitive search functionality. A seamless search journey not only enhances the overall user experience, but also directly impacts key business metrics such as conversion rates, average order value, and customer loyalty.

AWS

AWS Database Machine Learning

Build a reverse image search engine with Amazon Titan Multimodal Embeddings in Amazon Bedrock and AWS managed services

AWS Machine Learning Blog

NOVEMBER 13, 2024

In ecommerce, visual search technology revolutionizes how customers find products by enabling them to search for products using images instead of text. Shoppers often have a clear visual idea of what they want but struggle to describe it in words, leading to inefficient and broad text-based search results.

AWS

AWS Database K-nearest Neighbors AI

Graph RAG vs RAG: Which One Is Truly Smarter for AI Retrieval?

Data Science Dojo

AUGUST 7, 2025

In this comprehensive guide, we’ll explore the technical foundations, architectures, use cases, and best practices of graph rag versus traditional RAG, helping you understand which approach is best for your enterprise AI, research, or product development needs. Generation: The LLM produces a grounded, context-aware response.

AI

AI Database Data Science

Introducing Agent Bricks: Auto-Optimized Agents Using Your Data

databricks

JUNE 11, 2025

Analytics

Analytics Data Science AI

Why You Need RAG to Stay Relevant as a Data Scientist

KDnuggets

JUNE 11, 2025

If you are searching for data science jobs, you have probably heard the term RAG. You want to search call_date for user_id = 10234. Thanks to this retriever, instead of looking at the entire document, RAG will only search the relevant part. If you search the entire document, you will spend a lot of tokens.
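The retriever idea in this teaser can be sketched in a few lines. This is a toy illustration, not the article's implementation: it assumes the document is already split into chunks (here, one made-up record per chunk) and ranks them by simple word overlap with the query, where a real system would typically use embeddings. The record values and the `retrieve` helper are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into word-like tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def retrieve(query, chunks, top_k=1):
    """Return the top_k chunks sharing the most tokens with the query."""
    query_tokens = set(tokenize(query))
    ranked = sorted(
        chunks,
        key=lambda chunk: len(query_tokens & set(tokenize(chunk))),
        reverse=True,
    )
    return ranked[:top_k]


# Made-up records standing in for a larger document.
chunks = [
    "user_id 10231 call_date 2025-01-03 duration 120",
    "user_id 10234 call_date 2025-01-07 duration 45",
    "user_id 10240 call_date 2025-01-09 duration 300",
]

relevant = retrieve("call_date for user_id 10234", chunks)
full_size = sum(len(c.split()) for c in chunks)
context_size = sum(len(c.split()) for c in relevant)
print(relevant[0])
print(f"words sent to the model: {context_size} instead of {full_size}")
```

Only the selected chunk is passed to the model as context, which is where the token savings come from.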

Data Scientist

Data Scientist Natural Language Processing Data Science Machine Learning

Comparing the Llama Models: Llama 3 vs Llama 3.1 vs Llama 3.2

Data Science Dojo

NOVEMBER 8, 2024

For instance, a chatbot powered by the Llama 3 model can provide accurate product recommendations and answer detailed questions. Businesses can deploy these chatbots to provide instant responses to common questions, guide users through troubleshooting procedures, and offer detailed information about products and services.

AI

AI

Build generative AI applications quickly with Amazon Bedrock IDE in Amazon SageMaker Unified Studio

AWS Machine Learning Blog

DECEMBER 4, 2024

When users pose questions through the natural language interface, the chat agent determines whether to query the structured data in Amazon Athena through the Amazon Bedrock IDE function, search the Amazon Bedrock knowledge base, or combine both sources for comprehensive insights. Use Amazon Athena SQL queries to provide insights.

AWS

AWS AI SQL

10 FREE AI Tools That’ll Save You 10+ Hours a Week

KDnuggets

JUNE 25, 2025

By Kanwal Mehreen, KDnuggets Technical Editor & Content Specialist, on June 25, 2025 in Artificial Intelligence. Trust me, this isn’t one of those clickbait articles with shady affiliate links or forced product placements. She co-authored the ebook "Maximizing Productivity with ChatGPT". We all know them.

Natural Language Processing

Natural Language Processing Data Science AI

Implementing Approximate Nearest Neighbor Search with KD-Trees

PyImageSearch

DECEMBER 23, 2024

This is where Approximate Nearest Neighbor (ANN) search algorithms come into play. Recommendation systems on platforms like Netflix and Spotify use ANN to suggest movies and music based on user preferences, ensuring a seamless and personalized experience without the computational burden of exact searches.
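As a rough illustration of the technique named in the title, the sketch below builds a small KD-tree in pure Python and searches it with a relaxed pruning rule. It is a minimal sketch under stated assumptions, not the article's code: the `eps` parameter, the function names, and the sample points are hypothetical. With `eps=0` the search is exact; with `eps>0` it skips more branches and returns a point whose distance is at most `(1 + eps)` times the true nearest distance.

```python
import math
from collections import namedtuple

Node = namedtuple("Node", "point axis left right")


def build_kdtree(points, depth=0):
    """Recursively split points on alternating axes at the median."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return Node(
        points[mid],
        axis,
        build_kdtree(points[:mid], depth + 1),
        build_kdtree(points[mid + 1:], depth + 1),
    )


def nearest(node, query, eps=0.0, best=None):
    """Return (distance, point) for the (approximate) nearest neighbor."""
    if node is None:
        return best
    dist = math.dist(node.point, query)
    if best is None or dist < best[0]:
        best = (dist, node.point)
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, eps, best)
    # Visit the far side only if it could hold a point closer than
    # best / (1 + eps); a larger eps prunes more and searches less.
    if abs(diff) * (1 + eps) < best[0]:
        best = nearest(far, query, eps, best)
    return best


points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
query = (9, 2)
exact = nearest(tree, query)
approx = nearest(tree, query, eps=0.5)
print(exact, approx)
```

The trade-off is the one the teaser describes: accepting a slightly worse neighbor in exchange for visiting fewer branches.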

K-nearest Neighbors

K-nearest Neighbors Algorithm Deep Learning

Cohere Embed multimodal embeddings model is now available on Amazon SageMaker JumpStart

AWS Machine Learning Blog

NOVEMBER 15, 2024

Overview of multimodal embeddings and multimodal RAG architectures: Multimodal embeddings are mathematical representations that integrate information not only from text but from multiple data modalities, such as product images, graphs, and charts, into a unified vector space.

AWS

AWS Computer Science Database

OpenSearch Vector Engine is now disk-optimized for low cost, accurate vector search

Flipboard

JANUARY 24, 2025

OpenSearch Vector Engine can now run vector search at a third of the cost on OpenSearch 2.17+ domains. You can now configure k-NN (vector) indexes to run in disk mode, optimizing them for memory-constrained environments and enabling low-cost, accurate vector search that responds in low hundreds of milliseconds.
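A hedged sketch of what such an index definition might look like, written as a Python dictionary that could be sent as the body of an index-creation request. It assumes the `mode` parameter on `knn_vector` fields described in the OpenSearch documentation for version 2.17 and later; the field name `embedding`, the dimension, and the space type are placeholders, so check the documentation for your version before relying on it.

```python
import json

# Hypothetical index body; field name, dimension, and space type are
# placeholders chosen for illustration.
index_body = {
    "settings": {"index": {"knn": True}},
    "mappings": {
        "properties": {
            "embedding": {
                "type": "knn_vector",
                "dimension": 768,
                "space_type": "innerproduct",
                # Assumed setting that requests disk-based vector search.
                "mode": "on_disk",
            }
        }
    },
}

print(json.dumps(index_body, indent=2))
```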

K-nearest Neighbors

K-nearest Neighbors ML Algorithm

Perplexity acquires Carbon, a Seattle startup that helps developers connect data sources to LLMs

Flipboard

DECEMBER 18, 2024

The company’s four employees will join San Francisco-based Perplexity, which offers AI search products and has seen its valuation skyrocket this year. Tu was previously a tech leader and early employee at Los Angeles e-commerce company Italic, and held product roles at Wayfair, Flywire, and 6sense.

Computer Science

Computer Science Database AI

Announcing managed MCP servers with Unity Catalog and Mosaic AI Integration

databricks

JUNE 18, 2025

AI

AI Data Science Artificial Intelligence

This AI lab wants to automate scientific discovery

Dataconomy

JULY 8, 2025

The goal is to address a well-documented problem: scientific productivity is declining. Automating science to reverse declining productivity: Over the last few decades, researchers have observed that scientific discovery is becoming slower and more resource-intensive. In 2022, with the launch of ChatGPT 3.5

AI

AI Data Analysis

What’s New: Lakeflow Jobs Provides More Efficient Data Orchestration

databricks

JULY 24, 2025

Data Pipeline

Data Pipeline Data Engineering Data Engineer

Web-LLM Assistant: Bridging Local AI Models With Real-Time Web Intelligence

Towards AI

NOVEMBER 16, 2024

Enter Web-LLM Assistant, an innovative open-source project designed to overcome this limitation by integrating local LLMs with real-time web searching capabilities. Web-LLM Assistant is a sophisticated web search assistant that leverages…

AI

AI Artificial Intelligence

Principal Financial Group uses QnABot on AWS and Amazon Q Business to enhance workforce productivity with generative AI

AWS Machine Learning Blog

NOVEMBER 15, 2024

The chatbot improved access to enterprise data and increased productivity across the organization. It empowers employees to be more creative, data-driven, efficient, prepared, and productive. The extensive amount of data employees must search to find appropriate answers for customers made it difficult and time-consuming to navigate.

AWS

AWS AI Machine Learning

From RAG to fabric: Lessons learned from building real-world RAGs at GenAIIC – Part 2

AWS Machine Learning Blog

NOVEMBER 15, 2024

Use cases we have worked on include: Technical assistance for field engineers – We built a system that aggregates information about a company’s specific products and field expertise. Ecommerce product search – We built several solutions to enhance the search capabilities on ecommerce websites to improve the shopping experience for customers.

Database

Database SQL Data Analysis

How AI platforms rank on data privacy in 2025

Dataconomy

JULY 9, 2025

Why privacy in Gen AI is a growing concern: While Gen AI platforms offer clear productivity benefits, they often expose users to complex data privacy risks that are hard to detect. Meta and Microsoft, on the other hand, required users to search through unrelated documentation. Can users find information about model training?

AI

AI

Introducing Databricks One

databricks

JUNE 12, 2025

That’s what led us to create Databricks One. What is Databricks One?

Data Engineering

Data Engineering Data Engineer

GenAI Demo Day Q2: Spotlight on GenAI Innovations That Deliver

Data Science Connect

JULY 26, 2025

Demo Highlights: Who Presented What. Snowflake – Cortex Agents with Cortex Analyst & Search. Speaker: James Chawarly, Sr. Cortex Analyst generates SQL from semantic models; Cortex Search pulls relevant info from documents. Director, Product @ Precisely GraphQL interface unifies parcel, building, flood, foot‑traffic, POI datasets.

SQL

SQL Clustering EDA Python

Unlocking the power of Model Context Protocol (MCP) on AWS

Flipboard

JUNE 3, 2025

The top-performing products were Product A, Product B, and Product C. million, representing a 12% growth compared to the previous quarter.

AWS

AWS AI Database

Manage access controls in generative AI-powered search applications using Amazon OpenSearch Service and Amazon Cognito

Flipboard

NOVEMBER 19, 2024

Organizations of all sizes and types are using generative AI to create products and solutions. A common adoption pattern is to introduce document search tools to internal teams, especially advanced document searches based on semantic search. The following diagram depicts the solution architecture.

AWS

AWS AI Big Data

What is an LLM Bootcamp? What Does Data Science Dojo Offer for Your Success?

Data Science Dojo

NOVEMBER 5, 2024

Whether you are a data professional looking to elevate your skills or a product leader aiming to leverage LLMs for business enhancement, this bootcamp offers a comprehensive curriculum tailored to meet diverse learning needs.

Data Science

Data Science Azure Natural Language Processing Database

How Do LLMs Work? Discover the Hidden Mechanics Behind ChatGPT

Data Science Dojo

JULY 23, 2025

From writing assistants and chatbots to code generators and search engines, large language models (LLMs) are transforming the way machines interact with human language. Tool-using agents: Agentic AI systems use LLMs to decide when to call tools like search engines or APIs. How do LLMs work? Explore top LLM use cases across industries.

Supervised Learning

Supervised Learning AI Data Scientist

Analyzing Your Excel Spreadsheets with NotebookLM

KDnuggets

JULY 29, 2025

Step 1: Preparing and Exporting Excel Spreadsheets. Let's consider a quarterly business report with data on sales, expenses, profit, and customer satisfaction scores across different regions and product categories. What product showed the greatest profitability increase? Therefore, the average customer satisfaction score is 86.25%.

Natural Language Processing

Natural Language Processing Data Science Machine Learning

Cohere Rerank 3.5 is now available in Amazon Bedrock through Rerank API

AWS Machine Learning Blog

DECEMBER 1, 2024

This powerful reranking model enables AWS customers to significantly improve their search relevance and content ranking capabilities. in Amazon Bedrock, we’re making enterprise-grade search technology more accessible and empowering organizations to enhance their information retrieval systems with minimal infrastructure management.

AWS

AWS ML AI

How I Automated My Machine Learning Workflow with Just 10 Lines of Python

Flipboard

JUNE 6, 2025

Use LazyPredict and PyCaret to skip the grunt work and jump straight to performance.

Machine Learning

Machine Learning Python Data Science

Agent Learning from Human Feedback (ALHF): A Databricks Knowledge Assistant Case Study

databricks

AUGUST 4, 2025

ALHF powers the Databricks Agent Bricks product.

SQL

SQL Data Science Artificial Intelligence

Llama 3.3 70B now available in Amazon SageMaker JumpStart

AWS Machine Learning Blog

DECEMBER 16, 2024

According to Meta, this efficiency gain translates to nearly five times more cost-effective inference operations, making it an attractive option for production deployments. These models are fully customizable for your use case with your data, and you can deploy them into production using either the UI or SDK. Search for Meta Llama 3.3

AWS

AWS ML Python

AI browser

Dataconomy

JULY 29, 2025

AI browsers leverage advanced technologies like natural language processing and web automation to deliver tailored search results and assist users in navigating vast amounts of information efficiently. Shopping comparisons: Consumers can make informed purchasing decisions by comparing products efficiently. What is an AI browser?

Natural Language Processing

Natural Language Processing AI Artificial Intelligence

Use language embeddings for zero-shot classification and semantic search with Amazon Bedrock

AWS Machine Learning Blog

FEBRUARY 13, 2025

In this post, we discuss what embeddings are, show how to practically use language embeddings, and explore how to use them to add functionality such as zero-shot classification and semantic search. Semantic search Users can search their articles using semantic search, as shown in the following screenshot.
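Both uses mentioned here reduce to comparing vectors. The sketch below is a toy illustration under stated assumptions: the three-dimensional vectors are hand-written stand-ins for the output of a real embedding model, and the `classify` and `search` helpers are hypothetical names, not part of any AWS API. Zero-shot classification picks the label whose embedding is closest to the text's embedding; semantic search ranks documents by similarity to the query embedding.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify(text_embedding, label_embeddings):
    """Zero-shot classification: return the label with the closest embedding."""
    return max(
        label_embeddings,
        key=lambda label: cosine(text_embedding, label_embeddings[label]),
    )


def search(query_embedding, doc_embeddings, top_k=2):
    """Semantic search: return doc ids ranked by similarity to the query."""
    ranked = sorted(
        doc_embeddings,
        key=lambda doc_id: cosine(query_embedding, doc_embeddings[doc_id]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy vectors standing in for real embedding-model output (assumption).
LABELS = {
    "sports": [0.9, 0.1, 0.0],
    "finance": [0.1, 0.9, 0.1],
    "technology": [0.0, 0.2, 0.9],
}
DOCS = {
    "match-report": [0.8, 0.2, 0.1],
    "earnings-call": [0.1, 0.8, 0.3],
    "chip-launch": [0.1, 0.1, 0.9],
}

article_embedding = [0.2, 0.8, 0.2]
label = classify(article_embedding, LABELS)
hits = search(article_embedding, DOCS, top_k=1)
print(label, hits)
```

No labeled training data is involved: adding a new class only requires embedding its name or description.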

AWS

AWS K-nearest Neighbors Clustering Algorithm

Building enterprise-scale RAG applications with Amazon S3 Vectors and DeepSeek R1 on Amazon SageMaker AI

Flipboard

JULY 17, 2025

Retrieval Augmented Generation (RAG) addresses these gaps by combining semantic search with generative AI, enabling models to retrieve relevant information from enterprise knowledge bases before responding. To monitor performance, you can track latency and throughput alongside output quality to verify production-readiness.

AI

AI Database AWS

Vector database

Dataconomy

JULY 7, 2025

By allowing for semantic similarity searches, vector databases are enhancing applications across various domains, from personalized content recommendations to advanced natural language processing. Vector databases are specialized systems designed to store, manage, and facilitate the search of vector embeddings.

Database

Database K-nearest Neighbors Natural Language Processing Algorithm

Your guide to generative AI and ML at AWS re:Invent 2024

AWS Machine Learning Blog

NOVEMBER 19, 2024

These sessions, featuring Amazon Q Business, Amazon Q Developer, Amazon Q in QuickSight, and Amazon Q Connect, span the AI/ML, DevOps and Developer Productivity, Analytics, and Business Applications topics. Learn how Amazon Q Business goes beyond search to enable AI-powered actions.

AWS

AWS ML AI

How Indeed builds and deploys fine-tuned LLMs on Amazon SageMaker

AWS Machine Learning Blog

OCTOBER 11, 2024

Since our founding nearly two decades ago, machine learning (ML) and artificial intelligence (AI) have been at the heart of building data-driven products that better match job seekers with the right roles and get people hired. How can we provide production LLM inference at Indeed’s scale with favorable latency and costs?

AWS

AWS ML Artificial Intelligence

Optimize RAG in production environments using Amazon SageMaker JumpStart and Amazon OpenSearch Service

Flipboard

JULY 2, 2025

The process begins with a user query, which is used to search a comprehensive knowledge corpus. In the previous post, we showed how to build a RAG application on SageMaker JumpStart using Facebook AI Similarity Search (Faiss). This enriched input allows the model to generate more accurate and contextually appropriate responses.

AWS

AWS Clustering K-nearest Neighbors Algorithm

Automate actions across enterprise applications using Amazon Q Business plugins

AWS Machine Learning Blog

DECEMBER 10, 2024

Amazon Q Business is a generative AI-powered assistant that enhances employee productivity by solving problems, generating content, and providing insights across enterprise data sources. Sarah searches Amazon Q Business for guidance on troubleshooting the issue.

AWS

AWS AI

Achieving 10,000x training data reduction with high-fidelity labels

Hacker News

AUGUST 7, 2025

Note that this initial data set is typically highly imbalanced, since in production traffic only very few (<1%) ads are actually clickbait.

Clustering

Clustering Natural Language Processing Data Mining

Google and Salesforce just signed a huge $2.5 billion AI deal

Dataconomy

FEBRUARY 25, 2025

As part of the agreement, Agentforce will leverage Google's Gemini models, enabling it to process complex tasks involving images, audio, and video, as well as use real-time insights grounded in Google Search with Vertex AI.

AI

AI Artificial Intelligence
