Command-line Tools can be 235x Faster than your Hadoop Cluster (2014)

Hacker News

Adam Drake is an advisor to scale-up tech companies. He writes about ML/AI/crypto/data, leadership, and building tech teams.

Hadoop 115

Implement smart document search index with Amazon Textract and Amazon OpenSearch

AWS Machine Learning Blog

You need permissions to deploy AWS CloudFormation templates, push to Amazon Elastic Container Registry (Amazon ECR), and create AWS Identity and Access Management (IAM) roles, AWS Lambda functions, Amazon S3 buckets, AWS Step Functions state machines, an Amazon OpenSearch cluster, and an Amazon Cognito user pool.

AWS 104

Deep Learning for NLP: Word2Vec, Doc2Vec, and Top2Vec Demystified

Mlearning.ai

Doc2Vec was introduced in 2014 by a team of researchers led by Tomas Mikolov. (Image taken from “Efficient Estimation of Word Representations in Vector Space”.) Top2Vec: Top2Vec is an unsupervised machine-learning model designed for topic modelling and document clustering. To achieve this, Top2Vec uses the doc2vec model.

Top 5 Use Cases of phData’s Advisor Tool

phData

Founded in 2014 by three leading cloud engineers, phData focuses on solving real-world data engineering, operations, and advanced analytics problems with the best cloud platforms and products. Over the years, one of our primary focuses became Snowflake and migrating customers to this leading cloud data platform.

From Rulesets to Transformers: A Journey Through the Evolution of SOTA in NLP

Mlearning.ai

2014) Significant people: Geoffrey Hinton, Yoshua Bengio, Ilya Sutskever. 5. “Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM” by Deepak Narayanan et al. Use cases: Language Modeling, Question Answering, Text Generation. Significant papers: “Attention Is All You Need” by Vaswani et al.

Visualizing the Tour de France in the year I tackle the route

Cambridge Intelligence

It’s a busy chart, but I’m drawn to the cluster of larger team nodes in the top left. In 2014, London also hosted the finish of a stage that started in my hometown, Cambridge. Visualizing the Tour de France: the early years. Hmmmm. Those “TDF 190#” don’t look right – they’re clearly not teams – but I know what’s happened.

Understanding and predicting urban heat islands at Gramener using Amazon SageMaker geospatial capabilities

AWS Machine Learning Blog

Among these models, the spatial fixed effect model yielded the highest mean R-squared value, particularly for the timeframe spanning 2014 to 2020. SageMaker Processing enables the flexible scaling of compute clusters to accommodate tasks of varying sizes, from processing a single city block to managing planetary-scale workloads.