By setting up automated policy enforcement and checks, you can achieve cost optimization across your machine learning (ML) environment. The following table provides examples of a tagging dictionary used for tagging ML resources. A reference architecture for the ML platform with various AWS services is shown in the following diagram.
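The referenced table isn't reproduced here, but as a rough illustration of how such a tagging dictionary can be enforced programmatically, the sketch below attaches cost-allocation tags to a SageMaker resource with the boto3 add_tags call; the tag keys, values, and ARN are assumptions, not taken from the article.

```python
import boto3

# Illustrative tagging dictionary -- keys and values here are assumptions,
# not the exact dictionary from the original table.
ML_RESOURCE_TAGS = [
    {"Key": "cost-center", "Value": "ml-platform"},
    {"Key": "team", "Value": "data-science"},
    {"Key": "environment", "Value": "dev"},
]

sagemaker = boto3.client("sagemaker")

def tag_ml_resource(resource_arn: str) -> None:
    """Attach the standard cost-allocation tags to a SageMaker resource."""
    sagemaker.add_tags(ResourceArn=resource_arn, Tags=ML_RESOURCE_TAGS)

# Example: tag a training job so its cost can be attributed to a team (placeholder ARN).
# tag_ml_resource("arn:aws:sagemaker:us-east-1:123456789012:training-job/example-job")
```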
Be sure to check out his talk, “Apache Kafka for Real-Time Machine Learning Without a Data Lake,” there! The combination of data streaming and machine learning (ML) lets you build a single scalable, reliable, and simple infrastructure for all machine learning tasks using the Apache Kafka ecosystem.
The agency wanted to use AI [artificial intelligence] and ML to automate document digitization, and it also needed help understanding each document it digitizes, says Duan. The demand for modernization is growing, and Precise can help government agencies adopt AI/ML technologies.
After decades of digitizing everything in your enterprise, you may have an enormous amount of data whose value lies dormant. However, with the help of AI and machine learning (ML), new software tools are now available to unearth the value of unstructured data. These services write the output to a data lake.
With that, the need for data scientists and machine learning (ML) engineers has grown significantly. Data scientists and ML engineers require capable tooling and sufficient compute for their work.
By Carolyn Saplicki, IBM Data Scientist. Industries are constantly seeking innovative solutions to maximize efficiency, minimize downtime, and reduce costs. One groundbreaking technology that has emerged as a game-changer is asset performance management (APM) artificial intelligence (AI).
Now all you need is some guidance on generative AI and machine learning (ML) sessions to attend at this twelfth edition of re:Invent. In addition to several exciting announcements during keynotes, most of the sessions in our track will feature generative AI in one form or another, so we can truly call our track “Generative AI and ML.”
Their information is split between two types of data: unstructured data (such as PDFs, HTML pages, and documents) and structured data (such as databases, data lakes, and real-time reports). Traditionally, businesses have faced a challenge: different types of data typically require different tools to access them.
Starting today, you can interactively prepare large datasets, create end-to-end data flows, and invoke automated machine learning (AutoML) experiments on petabytes of data—a substantial leap from the previous 5 GB limit. Organizations often struggle to extract meaningful insights and value from their ever-growing volume of data.
Enterprises migrating on-prem data environments to the cloud in pursuit of more robust, flexible, and integrated analytics and AI/ML capabilities are fueling a surge in cloud data lake implementations. The post How to Ensure Your New Cloud Data Lake Is Secure appeared first on DATAVERSITY.
LLM companies are businesses that specialize in developing and deploying Large Language Models (LLMs) and advanced machine learning (ML) models. OpenAI: In the rapidly evolving field of artificial intelligence, OpenAI stands out as a leading force in the LLM world. What are LLM Companies?
Companies are faced with the daunting task of ingesting all this data, cleansing it, and using it to provide outstanding customer experience. Typically, companies ingest data from multiple sources into their data lake to derive valuable insights from the data. Run the AWS Glue ML transform job.
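The excerpt ends at "Run the AWS Glue ML transform job"; a minimal sketch of kicking off such a run with boto3, assuming a FindMatches ML transform already exists (the transform ID is a placeholder), could look like this:

```python
import boto3

glue = boto3.client("glue")

TRANSFORM_ID = "tfm-0123456789abcdef"  # placeholder, not a real resource

# Kick off an evaluation run of an existing Glue ML (FindMatches) transform.
response = glue.start_ml_evaluation_task_run(TransformId=TRANSFORM_ID)
task_run_id = response["TaskRunId"]

# Check the task run status (simplified; production code would poll with backoff).
status = glue.get_ml_task_run(TransformId=TRANSFORM_ID, TaskRunId=task_run_id)["Status"]
print(f"ML transform task run {task_run_id} is {status}")
```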
Real-Time ML with Spark and SBERT, AI Coding Assistants, Data Lake Vendors, and ODSC East Highlights. Getting Up to Speed on Real-Time Machine Learning with Spark and SBERT: Learn more about real-time machine learning with this approach that uses Apache Spark and SBERT. Well, these libraries will give you a solid start.
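As a rough sketch of the Spark-plus-SBERT pattern the piece refers to, the snippet below computes SBERT embeddings inside a PySpark pandas UDF; the model name and column names are assumptions rather than details from the article.

```python
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql.functions import pandas_udf
from pyspark.sql.types import ArrayType, FloatType
from sentence_transformers import SentenceTransformer

spark = SparkSession.builder.appName("sbert-embeddings").getOrCreate()

@pandas_udf(ArrayType(FloatType()))
def embed(texts: pd.Series) -> pd.Series:
    # Loaded per batch for simplicity; a production job would cache the model per executor.
    model = SentenceTransformer("all-MiniLM-L6-v2")
    vectors = model.encode(texts.tolist())
    return pd.Series([v.tolist() for v in vectors])

# Toy example: embed a single event; in practice this column would come from a stream.
df = spark.createDataFrame([("stream event text",)], ["text"])
df.withColumn("embedding", embed("text")).show(truncate=False)
```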
Amazon DataZone is a data management service that makes it quick and convenient to catalog, discover, share, and govern data stored in AWS, on-premises, and third-party sources. Enterprises can use no-code ML solutions to streamline their operations and optimize their decision-making without extensive administrative overhead.
As one of the largest AWS customers, Twilio engages with data, artificial intelligence (AI), and machine learning (ML) services to run their daily workloads. Data is the foundational layer for all generative AI and ML applications. The following diagram illustrates the solution architecture.
Amazon AppFlow was used to facilitate the smooth and secure transfer of data from various sources into ODAP. Additionally, Amazon Simple Storage Service (Amazon S3) served as the central data lake, providing a scalable and cost-effective storage solution for the diverse data types collected from different systems.
To make your data management processes easier, here’s a primer on data lakes, and our picks for a few data lake vendors worth considering. What is a data lake? First, a data lake is a centralized repository that allows users or an organization to store and analyze large volumes of data.
SageMaker endpoints can be registered to the Salesforce Data Cloud to activate predictions in Salesforce. SageMaker Canvas provides a no-code experience to access data from Salesforce Data Cloud and build, test, and deploy models using just a few clicks. On the Create menu, choose Tabular to create a tabular dataset.
Amazon SageMaker enables enterprises to build, train, and deploy machine learning (ML) models. Amazon SageMaker JumpStart provides pre-trained models and data to help you get started with ML. MongoDB vector data store MongoDB Atlas Vector Search is a new feature that allows you to store and search vector data in MongoDB.
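As an illustration of the MongoDB Atlas Vector Search feature mentioned above, a similarity query can be expressed as a $vectorSearch aggregation stage via PyMongo; the connection string, index name, and field names below are placeholders.

```python
from pymongo import MongoClient

# Connection string, database, and collection names are placeholders.
client = MongoClient("mongodb+srv://user:pass@cluster.example.mongodb.net")
collection = client["demo"]["documents"]

query_vector = [0.02, -0.13, 0.44]  # embedding produced elsewhere, e.g. by a SageMaker endpoint

results = collection.aggregate([
    {
        "$vectorSearch": {
            "index": "vector_index",   # Atlas Vector Search index name
            "path": "embedding",       # field holding the stored vectors
            "queryVector": query_vector,
            "numCandidates": 100,
            "limit": 5,
        }
    },
    # Return the title plus the similarity score for each match.
    {"$project": {"title": 1, "score": {"$meta": "vectorSearchScore"}}},
])
for doc in results:
    print(doc)
```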
Since then, TR has achieved many more milestones as its AI products and services continue to grow in number and variety, supporting legal, tax, accounting, compliance, and news service professionals worldwide, with billions of machine learning (ML) insights generated every year.
Specifically, we cover the computer vision and artificial intelligence (AI) techniques used to combine datasets into a list of prioritized tasks for field teams to investigate and mitigate. Data preparation: SageMaker Ground Truth employs a human workforce made up of Northpower volunteers to annotate a set of 10,000 images.
It combines data warehousing and data lakes into a single query interface for a simple and fast analytics service. Data Science Announcements from Microsoft Ignite: Many other services were announced, such as Azure Quantum, Project Silica, R support in Azure ML, and Visual Studio Online. Amazon Web Services.
Data exploration and model development were conducted using well-known machine learning (ML) tools such as Jupyter or Apache Zeppelin notebooks. Apache Hive was used to provide a tabular interface to data stored in HDFS, and to integrate with Apache Spark SQL. This also led to a backlog of data that needed to be ingested.
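To make the Hive-over-HDFS setup concrete, a Spark session with Hive support enabled can query those Hive tables through Spark SQL; the database, table, and column names below are illustrative.

```python
from pyspark.sql import SparkSession

# enableHiveSupport() lets Spark SQL read tables registered in the Hive metastore.
spark = (
    SparkSession.builder
    .appName("hive-over-hdfs")
    .enableHiveSupport()
    .getOrCreate()
)

# The data itself lives as files in HDFS; Hive supplies the tabular schema
# that Spark SQL queries against.
daily_events = spark.sql("""
    SELECT event_date, COUNT(*) AS events
    FROM analytics.raw_events
    GROUP BY event_date
    ORDER BY event_date
""")
daily_events.show()
```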
Customers use Amazon Redshift as a key component of their data architecture to drive use cases from typical dashboarding to self-service analytics, real-time analytics, machine learning (ML), data sharing and monetization, and more. Discover how you can use Amazon Redshift to build a data mesh architecture to analyze your data.
This combination of great models and continuous adaptation is what will lead to a successful machine learning (ML) strategy. MLOps focuses on the intersection of data science and data engineering in combination with existing DevOps practices to streamline model delivery across the ML development lifecycle.
By running reports on historical data, a data warehouse can clarify what systems and processes are working and what methods need improvement. A data warehouse is also the base architecture for artificial intelligence and machine learning (AI/ML) solutions.
Amazon SageMaker Feature Store is a fully managed, purpose-built repository to store, share, and manage features for machine learning (ML) models. Features are inputs to ML models used during training and inference. SageMaker Feature Store now makes it effortless to share, discover, and access feature groups across AWS accounts.
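As a minimal sketch of registering features in SageMaker Feature Store with the SageMaker Python SDK (feature names, role ARN, and S3 location are assumptions):

```python
import time
import pandas as pd
import sagemaker
from sagemaker.feature_store.feature_group import FeatureGroup

session = sagemaker.Session()
role_arn = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder

# Toy feature data; column names are illustrative.
features = pd.DataFrame({
    "customer_id": ["c-1", "c-2"],
    "total_orders": [12, 3],
    "event_time": [time.time(), time.time()],
})

feature_group = FeatureGroup(name="customer-features", sagemaker_session=session)
feature_group.load_feature_definitions(data_frame=features)  # infer feature types
feature_group.create(
    s3_uri=f"s3://{session.default_bucket()}/feature-store",  # offline store location
    record_identifier_name="customer_id",
    event_time_feature_name="event_time",
    role_arn=role_arn,
    enable_online_store=True,  # serve features at low latency for inference
)
```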
More than 170 tech teams used the latest cloud, machine learning, and artificial intelligence technologies to build 33 solutions. The attempt is disadvantaged by the current focus on data cleaning, diverting valuable skills away from building ML models for sensor calibration.
Azure Machine Learning is Microsoft’s enterprise-grade service that provides a comprehensive environment for data scientists and ML engineers to build, train, deploy, and manage machine learning models at scale. You can explore its capabilities through the official Azure ML Studio documentation. Awesome, right?
Cloud-Based IoT Platforms Cloud-based IoT platforms offer scalable storage and computing resources for handling the massive influx of IoT data. These platforms provide data engineers with the flexibility to develop and deploy IoT applications efficiently.
If you are a returning SageMaker Studio user, upgrade to the latest Jupyter and SageMaker Data Wrangler kernels to ensure Salesforce Data Cloud is enabled. This completes the setup to enable data access from Salesforce Data Cloud to SageMaker Studio to build AI and machine learning (ML) models.
Flywheel creates a data lake (in Amazon S3) in your account where all the training and test data for all versions of the model are managed and stored. Periodically, the new labeled data (to retrain the model) can be made available to the flywheel by creating datasets. One for the data lake for the Comprehend flywheel.
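Making new labeled data available to the flywheel can be done with the Amazon Comprehend CreateDataset API; the sketch below assumes a document-classification flywheel, and the ARNs, S3 paths, and exact input-config shape are placeholders rather than details from the article.

```python
import boto3

comprehend = boto3.client("comprehend")

# Register newly labeled data with the flywheel as a TRAIN dataset so the next
# flywheel iteration can retrain the model. ARNs and S3 paths are placeholders.
response = comprehend.create_dataset(
    FlywheelArn="arn:aws:comprehend:us-east-1:123456789012:flywheel/demo-flywheel",
    DatasetName="labeled-batch-2024-06",
    DatasetType="TRAIN",
    InputDataConfig={
        "DataFormat": "COMPREHEND_CSV",
        "DocumentClassifierInputDataConfig": {
            "S3Uri": "s3://example-bucket/new-labeled-data/"
        },
    },
)
print(response["DatasetArn"])
```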
Article on Azure ML by Bethany Jepchumba and Josh Ndemenge of Microsoft. In this article, I will cover how you can train a model using Notebooks in Azure Machine Learning Studio. Using Azure ML, you can train your model in three ways; the first is Automated ML, where you upload your data and have the workspace automatically train a model on your behalf.
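As a hedged illustration of the Automated ML option described above, the Azure ML Python SDK (v2) can submit an AutoML classification job like this; the workspace identifiers, data asset, and compute name are placeholders, not details from the article.

```python
from azure.ai.ml import MLClient, Input, automl
from azure.identity import DefaultAzureCredential

# Workspace identifiers are placeholders.
ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace>",
)

# Define an Automated ML classification job: point it at training data and a
# target column, and the service searches models and hyperparameters for you.
classification_job = automl.classification(
    experiment_name="automl-demo",
    compute="cpu-cluster",                                       # placeholder compute cluster
    training_data=Input(type="mltable", path="azureml:training-data:1"),  # placeholder data asset
    target_column_name="label",
    primary_metric="accuracy",
)
classification_job.set_limits(timeout_minutes=60, max_trials=10)

returned_job = ml_client.jobs.create_or_update(classification_job)
print(returned_job.studio_url)  # follow progress in Azure ML Studio
```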
Artificial intelligence (AI) is now at the forefront of how enterprises work with data to help reinvent operations, improve customer experiences, and maintain a competitive advantage. It’s no longer a nice-to-have, but an integral part of a successful data strategy.
Staying ahead of key technology trends By now, it’s abundantly clear that technologies like artificial intelligence (AI) and machine learning (ML) will revolutionize how customer-centric organizations interact and deliver value to all stakeholders, especially their customers.
5 Concerns for ML Safety in the Era of LLMs and Generative AI The growth of large language models and generative AI has spurred new concerns for ML safety and cybersecurity. 5 Data Engineering and Data Science Cloud Options for 2023 AI development is incredibly resource intensive.
In the age of generative artificial intelligence (AI), data isn’t just king, it’s the entire kingdom. This is complemented by Amazon DynamoDB, which provides millisecond response times for data retrieval and automatic scaling to handle varying workloads.
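To illustrate the DynamoDB point made above, a single-item lookup by primary key is what delivers those millisecond response times; the table and key names below are assumptions.

```python
import boto3

# Table and key names are illustrative.
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("customer-profile")

# Single-digit-millisecond point lookup by primary key.
response = table.get_item(Key={"customer_id": "c-1"})
item = response.get("Item")  # None if the key does not exist
print(item)
```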
However, even in a decentralized model, LOBs must often align with central governance controls and obtain approvals from the CCoE team for production deployment, adhering to global enterprise standards for areas such as access policies, model risk management, data privacy, and compliance posture. This can introduce governance complexities.
Amazon SageMaker Data Wrangler reduces the time it takes to collect and prepare data for machine learning (ML) from weeks to minutes. Data is frequently kept in data lakes that can be managed by AWS Lake Formation, giving you the ability to implement fine-grained access control using a straightforward grant or revoke procedure.
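As a sketch of the Lake Formation grant/revoke model mentioned above, granting a role SELECT access to one governed table can be done with boto3; the principal ARN, database, and table names are placeholders.

```python
import boto3

lakeformation = boto3.client("lakeformation")

# Grant a Data Wrangler user role SELECT on a single governed table rather than
# the whole lake. Principal ARN, database, and table names are placeholders.
lakeformation.grant_permissions(
    Principal={
        "DataLakePrincipalIdentifier": "arn:aws:iam::123456789012:role/DataWranglerUserRole"
    },
    Resource={
        "Table": {
            "DatabaseName": "sales_lake",
            "Name": "orders",
        }
    },
    Permissions=["SELECT"],
    PermissionsWithGrantOption=[],  # do not allow the grantee to re-grant access
)
```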
Using Azure ML to Train a Serengeti Data Model, Fast Option Pricing with DL, and How To Connect a GPU to a Container. Using Azure ML to Train a Serengeti Data Model for Animal Identification: In this article, we will cover how you can train a model using Notebooks in Azure Machine Learning Studio.
Data storage databases. Your SaaS company can store and protect any amount of data using Amazon Simple Storage Service (S3), which is ideal for data lakes, cloud-native applications, and mobile apps. Artificial intelligence (AI). Well, let’s find out.
SageMaker endpoints can be registered with Salesforce Data Cloud to activate predictions in Salesforce. He has over 10 years of experience in planning, building, launching, and managing world-class solutions for enterprise customers, including AI/ML and cloud solutions. You can connect with him on LinkedIn.
Previously, he was a Data & Machine Learning Engineer at AWS, where he worked closely with customers to develop enterprise-scale data infrastructure, including data lakes, analytics dashboards, and ETL pipelines.
To accomplish this, eSentire built AI Investigator, a natural language query tool for their customers to access security platform data by using AWS generative artificial intelligence (AI) capabilities. eSentire has over 2 TB of signal data stored in their Amazon Simple Storage Service (Amazon S3) data lake.