Top 6 Kubernetes use cases

IBM Journey to AI blog

But Docker lacked an automated “orchestration” tool, which made it time-consuming and complex for data science teams to scale applications. Nodes run the pods and are usually grouped in a Kubernetes cluster, abstracting the underlying physical hardware resources.

How Meesho built a generalized feed ranker using Amazon SageMaker inference

AWS Machine Learning Blog

Meesho was founded in 2015 and today focuses on buyers and sellers across India. Model training: Meesho used Amazon EMR with Apache Spark to process hundreds of millions of data points, depending on the model’s complexity. One of the major challenges was to run distributed training at scale.

Fast and cost-effective LLaMA 2 fine-tuning with AWS Trainium

AWS Machine Learning Blog

Our high-level training procedure is as follows: for our training environment, we use a multi-instance cluster managed by the SLURM system for distributed training and scheduling under the NeMo framework. Dr. Huan works on AI and Data Science. He focuses on developing scalable machine learning algorithms. Youngsuk Park is a Sr.

Exploring Google’s AI Tools: A Deep Dive into the Future of Data Science

ODSC - Open Data Science

During a recent episode of ODSC’s Ai X Podcast with Paige Bailey, Engineering Lead for Gen AI Development Experience at Google, we delved into the groundbreaking AI tools and platforms that are shaping the future of data science. Check out her talk, “Data Science in the Age of Generative AI,” there!

Demand forecasting at Getir built with Amazon Forecast

AWS Machine Learning Blog

Getir was founded in 2015 and operates in Turkey, the UK, the Netherlands, Germany, France, Spain, Italy, Portugal, and the United States. Solution overview: Six people from Getir’s data science team and infrastructure team worked together on this project. Getir is the pioneer of ultrafast grocery delivery.

23 Best Free NLP Datasets for Machine Learning

Iguazio

The data file format comprises the Tweet’s polarity, ID, date, query, user and text. Twitter US Airline Sentiment: polarized Tweets from February 2015 about the large US airlines. Data is provided in a CSV file and SQLite database. Get the dataset here.

Your guide to generative AI and ML at AWS re:Invent 2024

AWS Machine Learning Blog

Explore the model pre-training workflow from start to finish, including setting up clusters, troubleshooting convergence issues, and running distributed training to improve model performance. Gain hands-on experience in data management, model training, monitoring, and seamless deployment to production environments.
