It simplifies the often complex and time-consuming tasks involved in setting up and managing an MLflow environment, allowing ML administrators to quickly establish secure and scalable MLflow environments on AWS. The solution also uses AWS CodeArtifact, which provides a private PyPI repository from which SageMaker can download the necessary packages.
Prerequisites – Before you dive into the integration process, make sure you have the following prerequisites in place: AWS account – You’ll need an AWS account to access and use Amazon Bedrock. You can interact with Amazon Bedrock using AWS SDKs available in Python, Java, Node.js, and more.
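As a minimal sketch of the Python route (assuming boto3 is installed and AWS credentials are configured; the model ID below is an illustrative example, not one prescribed by the post):

```python
import boto3

# Create a Bedrock runtime client; assumes credentials and Region are configured.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Send a single-turn conversation; the model ID is an illustrative example.
response = client.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    messages=[{"role": "user", "content": [{"text": "Summarize what Amazon Bedrock does."}]}],
)
print(response["output"]["message"]["content"][0]["text"])
```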
We guide you through deploying the necessary infrastructure using AWS CloudFormation, creating an internal labeling workforce, and setting up your first labeling job. We demonstrate how to use Wavesurfer.js, an open source audio waveform visualization library, in the labeling interface; this precision helps models learn the fine details that separate natural from artificial-sounding speech.
The Hadoop environment was hosted on Amazon Elastic Compute Cloud (Amazon EC2) servers, managed in-house by Rocket's technology team, while the data science experience infrastructure was hosted on premises. Communication between the two systems was established through Kerberized Apache Livy (HTTPS) connections over AWS PrivateLink.
First, we define Pydantic data models to structure the FM output:

```python
from pydantic import BaseModel, Field

class QTopicQuestionPair(BaseModel):
    """A question related to a Q topic."""
    topic_id: str = Field(...)  # required field; its description is truncated in the source excerpt
```
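As a quick illustration of how such a model validates output (the JSON payload below is hypothetical, standing in for the real FM response):

```python
import json

# Hypothetical FM output; the real schema and prompt come from the original post.
raw_output = '{"topic_id": "Q42"}'
pair = QTopicQuestionPair(**json.loads(raw_output))
print(pair.topic_id)  # -> Q42
```

Validation errors raised here are a convenient signal that the FM's output drifted from the expected schema.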
Using this approach, you can focus on developing and refining the model while using the fully managed training infrastructure provided by SageMaker Training. Implementation details – We spin up the cluster by calling the SageMaker control plane through its APIs, the AWS Command Line Interface (AWS CLI), or the SageMaker AWS SDK.
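As a minimal sketch of the SDK route (the image URI, role, and instance settings below are placeholders, not values from this post):

```python
from sagemaker.estimator import Estimator

# Placeholder values; substitute your own container image, role, and instances.
estimator = Estimator(
    image_uri="<training-image-uri>",
    role="<execution-role-arn>",
    instance_count=2,
    instance_type="ml.p4d.24xlarge",
)

# Launches the managed training cluster and runs the job against the S3 data.
estimator.fit({"train": "s3://<bucket>/train/"})
```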
Because SageMaker Model Cards and SageMaker Model Registry were built on separate APIs, it was challenging to associate the model information and gain a comprehensive view of the model development lifecycle. Integrating model information and then sharing it across different stages became increasingly difficult.
In this post, we delve into the essential security best practices that organizations should consider when fine-tuning generative AI models. Security in Amazon Bedrock – Cloud security at AWS is the highest priority. Amazon Bedrock prioritizes security through a comprehensive approach to protect customer data and AI workloads.
As Indian companies across industries increasingly embrace data-driven decision-making, artificial intelligence (AI), and automation, the demand for skilled data scientists continues to surge. Validation techniques ensure models perform well on unseen data. Big Data: Apache Hadoop, Apache Spark.
For this post, we use the us-east-1 AWS Region. You'll need access to a POSIX-based (Mac/Linux) system or SageMaker notebooks. Both MMCV and Prithvi are third-party models that have not undergone AWS security reviews, so please review these models yourself or use them at your own risk.
Secure model access – Secure, private model access using AWS PrivateLink enables controlled data transfer for inference without traversing the public internet, maintaining data privacy and helping to adhere to compliance requirements.
Scaling and load balancing – The gateway can handle load balancing across different servers, model instances, or AWS Regions so that applications remain responsive. The AWS Solutions Library offers solution guidance to set up a multi-provider generative AI gateway. Model versions should be managed centrally in a model registry.
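To make the load-balancing idea concrete, here is a minimal, hypothetical round-robin sketch in Python; a production gateway (such as the AWS Solutions Library guidance describes) would add health checks, retries, and authentication:

```python
import itertools

# Hypothetical endpoint pool spanning Regions; the URLs are illustrative only.
ENDPOINTS = [
    "https://gw-us-east-1.example.com/invoke",
    "https://gw-us-west-2.example.com/invoke",
    "https://gw-eu-west-1.example.com/invoke",
]
_rotation = itertools.cycle(ENDPOINTS)

def next_endpoint() -> str:
    """Return the next endpoint in round-robin order."""
    return next(_rotation)
```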
Summary: This blog highlights the top 10 highest-paying jobs in India, covering roles in tech, finance, and healthcare. Accessible and beginner-friendly learning paths can help you build a rewarding career in data science. Certifications matter – Google, AWS, CFA, PMP, or domain-specific certifications can give you an edge.
Our goal is to enable you to set up automated, optimal routing between large language models (LLMs) through Amazon Bedrock Intelligent Prompt Routing, which draws on a deep understanding of model behaviors within each model family and incorporates state-of-the-art methods for training routers for different sets of models, tasks, and prompts.
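In practice, a prompt router is invoked like any other Bedrock resource: you pass the router's ARN where a model ID would normally go, and the router chooses the underlying model per request. A hedged sketch (the ARN is a placeholder; look up the real resource in your account):

```python
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Placeholder ARN; look up or create an actual prompt router in your account.
router_arn = "arn:aws:bedrock:us-east-1:<account-id>:default-prompt-router/<router-name>"

response = client.converse(
    modelId=router_arn,  # the router selects the underlying model per request
    messages=[{"role": "user", "content": [{"text": "What is the capital of Sweden?"}]}],
)
print(response["output"]["message"]["content"][0]["text"])
```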
I've created Docker containers from scratch and set up AWS Fargate and all the related services to run them and connect them to a public IP address. Proficient in Python, Java, React, AWS, Snowflake, and distributed systems. Résumé/CV: https://www.dropbox.com/scl/fi/5j9r1z2uaaq7hz50v1kfl/Resume.
Text-to-SQL empowers people to explore data and draw insights using natural language, without requiring specialized database knowledge. Amazon Web Services (AWS) has helped many customers connect this text-to-SQL capability with their own data, which means more employees can generate insights.
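A common pattern, sketched here with a made-up schema and an example model ID, is to give the LLM the table definitions and the user's question and ask for SQL only:

```python
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Made-up schema and question for illustration.
schema = "CREATE TABLE orders (order_id INT, customer TEXT, total NUMERIC, order_date DATE);"
question = "What was total revenue last month?"

prompt = (
    f"Given this schema:\n{schema}\n"
    f"Write a single SQL query that answers: {question}\n"
    "Return only the SQL."
)
response = client.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # example model ID
    messages=[{"role": "user", "content": [{"text": prompt}]}],
)
print(response["output"]["message"]["content"][0]["text"])
```

Generated SQL should still be validated before execution, for example by running EXPLAIN first or restricting the query to read-only credentials.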
To evaluate the model's accuracy and track the mechanism, we store every user input and output in Amazon Simple Storage Service (Amazon S3). Prerequisites – To create this solution, sign up for an AWS account if you don't already have one.
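A minimal sketch of that logging step (the bucket name is a placeholder):

```python
import json
import uuid

import boto3

s3 = boto3.client("s3")

def log_interaction(user_input: str, model_output: str) -> None:
    """Persist one input/output pair to S3 for later accuracy evaluation."""
    record = {"input": user_input, "output": model_output}
    s3.put_object(
        Bucket="<evaluation-bucket>",  # placeholder bucket name
        Key=f"interactions/{uuid.uuid4()}.json",
        Body=json.dumps(record).encode("utf-8"),
    )
```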
The security measures are inherently integrated into the AWS services employed in this architecture. We used a dataset that consisted of 30 labeled data points and 100,000 unlabeled test data points. If you're interested in working with the AWS Generative AI Innovation Center, please reach out.
Data engineering is all about collecting, organising, and moving data so businesses can make better decisions. Handling massive amounts of data would be a nightmare without the right tools. In this blog, we'll explore the best data engineering tools that make data work easier, faster, and more reliable.
Today, Amazon Web Services (AWS) announced the general availability of Amazon Bedrock Knowledge Bases GraphRAG (GraphRAG), a capability in Amazon Bedrock Knowledge Bases that enhances Retrieval-Augmented Generation (RAG) with graph data in Amazon Neptune Analytics. For Data source details, select Amazon S3 as your data source.
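Once the knowledge base is in place, querying it follows the standard Knowledge Bases API; a hedged sketch with placeholder identifiers:

```python
import boto3

client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

# Placeholder knowledge base ID and model ARN.
response = client.retrieve_and_generate(
    input={"text": "Which suppliers are connected to delayed shipments?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "<knowledge-base-id>",
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",
        },
    },
)
print(response["output"]["text"])
```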
It seems like that's not the main focus of your org, but I was pleased to see a reference to RCV in your blog: [0] [0]: https://goodparty.org/blog/article/final-five-voting-explain. On the backend we're using 100% Go with AWS primitives. Profitable, 15+ yrs stable, 100% employee-owned.
This post was written in collaboration with Bhajandeep Singh and Ajay Vishwakarma from Wipro’s AWS AI/ML Practice. Many organizations have been using a combination of on-premises and open source data science solutions to create and manage machine learning (ML) models.
New big data architectures and, above all, data sharing concepts such as Data Mesh are ideal for creating a common database for many data products and applications. The Event Log Data Model for Process Mining – Process mining as an analytical system can very well be thought of as an iceberg.
As a customer, you rely on Amazon Web Services (AWS) expertise to be available and understand your specific environment and operations. Amazon Q Business is a fully managed, secure, generative AI-powered enterprise chat assistant that enables natural language interactions with your organization’s data.
In 2024, however, organizations are using large language models (LLMs), which require relatively little focus on NLP, shifting research and development from modeling to the infrastructure needed to support LLM workflows. This often means that simply calling a third-party LLM API won't do, for reasons of security, control, and scale.
In addition to its groundbreaking AI innovations, Zeta Global has harnessed Amazon Elastic Container Service (Amazon ECS) with AWS Fargate to deploy a multitude of smaller models efficiently. Hosted on Amazon ECS with tasks run on Fargate, this platform streamlines the end-to-end ML workflow, from data ingestion to model deployment.
Data Mesh on Azure Cloud with Databricks and Delta Lake for Applications of Business Intelligence, Data Science and Process Mining. However, this concept on the Azure cloud is just an example and can easily be implemented on Google Cloud (GCP), Amazon Web Services (AWS), and now even on the SAP cloud (Datasphere) using Databricks.
In this post, we’ll summarize the training procedure of GPT NeoX on AWS Trainium, a purpose-built machine learning (ML) accelerator optimized for deep learning training. We’ll outline how we cost-effectively (3.2M tokens/$) trained such models with AWS Trainium without losing any model quality.
The solution framework is scalable as more equipment is installed and can be reused for a variety of downstream modeling tasks. In this post, we show how the Carrier and AWS teams applied ML to predict faults across large fleets of equipment using a single model. The effective precision of the trained model is 91.6%.
This ensures that the data models and queries developed by data professionals are consistent with the underlying infrastructure. Enhanced security and compliance – Data warehouses often store sensitive information, making security a paramount concern.
It is the process of collecting, storing, managing, and analyzing large amounts of data, and data engineers are responsible for designing and implementing the systems and infrastructure that make this possible. Learn about data modeling: data modeling is the process of creating a conceptual representation of data.
Working with AWS, Light & Wonder recently developed an industry-first secure solution, Light & Wonder Connect (LnW Connect), which, when it reaches its full potential, will stream telemetry and machine health data from roughly half a million electronic gaming machines distributed across its casino customer base globally.
You can streamline the process of feature engineering and data preparation with SageMaker Data Wrangler and finish each stage of the data preparation workflow (including data selection, cleansing, exploration, visualization, and processing at scale) within a single visual interface.
Customers of every size and industry are innovating on AWS by infusing machine learning (ML) into their products and services. Recent developments in generative AI models have further accelerated the need for ML adoption across industries. The architecture maps the different capabilities of the ML platform to AWS accounts.
With the Amazon Bedrock serverless experience, you can get started quickly, privately customize FMs with your own data, and integrate and deploy them into your applications using Amazon Web Services (AWS) tools without having to manage infrastructure. You can refer to the FAIR blog and the 5 Actionable Steps to GDPR Compliance.
The AWS Well-Architected Framework provides a systematic way for organizations to learn operational and architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable workloads in the cloud. These resources introduce common AWS services for IDP workloads and suggested workflows.
Forecast uses ML to learn not only the best algorithm for each item, but also the best ensemble of algorithms for each item, automatically creating the best model for your data. The console and AWS CLI methods are best suited for quick experimentation to check the feasibility of time series forecasting using your data.
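For the SDK route, creating an auto predictor looks roughly like the following (the dataset group ARN and names are placeholders; confirm against the current Amazon Forecast API):

```python
import boto3

forecast = boto3.client("forecast", region_name="us-east-1")

# Placeholder names and ARN; Forecast then selects and ensembles algorithms per item.
response = forecast.create_auto_predictor(
    PredictorName="demo-auto-predictor",
    ForecastHorizon=14,     # predict 14 future periods
    ForecastFrequency="D",  # daily frequency
    DataConfig={"DatasetGroupArn": "arn:aws:forecast:us-east-1:<account-id>:dataset-group/<name>"},
)
print(response["PredictorArn"])
```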
However, to fully harness the potential of a data lake, effective data modeling methodologies and processes are crucial. Data modeling plays a pivotal role in defining the structure, relationships, and semantics of data within a data lake, and in maintaining consistency of data throughout the data lake.
In this post, we share how Axfood, a large Swedish food retailer, improved operations and scalability of their existing artificial intelligence (AI) and machine learning (ML) operations by prototyping in close collaboration with AWS experts and using Amazon SageMaker.
We provide a comprehensive guide on how to deploy speaker segmentation and clustering solutions using SageMaker on the AWS Cloud. Solution overview – Amazon Transcribe is the go-to service for speaker diarization in AWS. Hugging Face is a popular open source hub for machine learning (ML) models.
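For reference, enabling diarization in Amazon Transcribe amounts to turning on speaker labels when starting a job; a minimal sketch with placeholder names:

```python
import boto3

transcribe = boto3.client("transcribe", region_name="us-east-1")

transcribe.start_transcription_job(
    TranscriptionJobName="demo-diarization-job",                # placeholder name
    Media={"MediaFileUri": "s3://<bucket>/audio/meeting.wav"},  # placeholder path
    MediaFormat="wav",
    LanguageCode="en-US",
    Settings={
        "ShowSpeakerLabels": True,  # enable speaker diarization
        "MaxSpeakerLabels": 4,      # expected maximum number of speakers
    },
)
```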
AWS Inferentia accelerators are custom-built machine learning inference chips designed by Amazon Web Services (AWS) to optimize inference workloads on the AWS platform. The AWS Inferentia chips are designed with a focus on delivering high performance, low latency, and cost efficiency for inference workloads.
With the rapid growth of generative artificial intelligence (AI), many AWS customers are looking to take advantage of publicly available foundation models (FMs) and technologies. This includes Meta Llama 3, Meta’s publicly available large language model (LLM).
In this post, AWS collaborates with Meta’s PyTorch team to showcase how you can use Meta’s torchtune library to fine-tune Meta Llama-like architectures while using a fully managed environment provided by Amazon SageMaker Training.

```
cat config_l3.1_8b_lora.yaml
# Model Arguments
model:
  _component_: torchtune.models.llama3_1.lora_llama3_1_8b
```