article thumbnail

Techniques and approaches for monitoring large language models on AWS

AWS Machine Learning Blog

Large Language Models (LLMs) have revolutionized the field of natural language processing (NLP), improving tasks such as language translation, text summarization, and sentiment analysis. Through AWS Step Functions orchestration, the function calls Amazon Comprehend to detect the sentiment and toxicity.

AWS 109
article thumbnail

Large language model inference over confidential data using AWS Nitro Enclaves

AWS Machine Learning Blog

In this post, we discuss how Leidos worked with AWS to develop an approach to privacy-preserving large language model (LLM) inference using AWS Nitro Enclaves. LLMs are designed to understand and generate human-like language, and are used in many industries, including government, healthcare, financial, and intellectual property.

AWS 88
professionals

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Fast and cost-effective LLaMA 2 fine-tuning with AWS Trainium

AWS Machine Learning Blog

In this post, we walk through how to fine-tune Llama 2 on AWS Trainium , a purpose-built accelerator for LLM training, to reduce training times and costs. We review the fine-tuning scripts provided by the AWS Neuron SDK (using NeMo Megatron-LM), the various configurations we used, and the throughput results we saw.

AWS 95
article thumbnail

Reduce Amazon SageMaker inference cost with AWS Graviton

AWS Machine Learning Blog

In this post, we focus on how you can take advantage of the AWS Graviton3 -based Amazon Elastic Compute Cloud (EC2) C7g instances to help reduce inference costs by up to 50% relative to comparable EC2 instances for real-time inference on Amazon SageMaker. 4xlarge (AWS Graviton3) is about 50% of the c5.4xlarge and 40% of c6i.4xlarge;

AWS 76
article thumbnail

Video?—?Transformer training shootout: AWS Trainium vs. NVIDIA A10G

Julien Simon

Video — Transformer training shootout: AWS Trainium vs. NVIDIA A10G In this video, I compare the cost-performance of AWS Trainium, a new custom chip designed by AWS, with NVIDIA A10G GPUs. Then, I run a natural language processing job, fine-tuning the BERT Large model on the full Yelp review datatset.

AWS 40
article thumbnail

Accenture creates a Knowledge Assist solution using generative AI services on AWS

AWS Machine Learning Blog

To help tackle this challenge, Accenture collaborated with AWS to build an innovative generative AI solution called Knowledge Assist. By using AWS generative AI services, the team has developed a system that can ingest and comprehend massive amounts of unstructured enterprise content.

AWS 94
article thumbnail

Reduce energy consumption of your machine learning workloads by up to 90% with AWS purpose-built accelerators

Flipboard

There are several ways AWS is enabling ML practitioners to lower the environmental impact of their workloads. Inferentia and Trainium are AWS’s recent addition to its portfolio of purpose-built accelerators specifically designed by Amazon’s Annapurna Labs for ML inference and training workloads. times higher inference throughput.

AWS 94