2018, Data Scientist and ML - Data Science Current

Building Generative AI and ML solutions faster with AI apps from AWS partners using Amazon SageMaker

AWS Machine Learning Blog

DECEMBER 4, 2024

This increases the time it takes for customers to go from data to insights. Our customers want a simple and secure way to find the best applications, integrate the selected applications into their machine learning (ML) and generative AI development environment, and manage and scale their AI projects.

AWS, ML, AI

Bridging the Gap: New Datasets Push Recommender Research Toward Real-World Scale

KDnuggets

JUNE 11, 2025

Spotify Million Playlist: Released for RecSys 2018, this dataset helps analyze short-term and sequential listening behavior. Read the original article at Turing Post, the newsletter for over 90 000 professionals who are serious about AI and ML. Yelp Open Dataset: Contains 8.6M reviews, but coverage is sparse and city-specific.

Natural Language Processing, Data Science, Machine Learning

The most important unanswered questions of 2018 in Artificial Intelligence (AI) and Machine Learning (ML)

Dataconomy

JULY 16, 2018

Here is what a recent whitepaper by Dataiku reveals about artificial intelligence and machine learning, emphasising the role of data scientists. (This is the first part of an article series based on a whitepaper by Dataiku.) The year 2018 was supposed to be the one. Let’s find out.

Artificial Intelligence, Machine Learning

Improving ML Datasets with Cleanlab, a Standard Framework for Data-Centric AI

ODSC - Open Data Science

MARCH 22, 2023

Be sure to check out his session, “Improving ML Datasets with Cleanlab, a Standard Framework for Data-Centric AI,” there! Anybody who has worked on a real-world ML project knows how messy data can be. Our goal is to enable all developers to find and fix data issues as effectively as today’s best data scientists.

ML, Data Scientist, AI

Tensor Processing Units (TPUs)

Dataconomy

MARCH 19, 2025

They are essential for processing large amounts of data efficiently, particularly in deep learning applications. By 2018, these powerful tools were made available for third-party use, marking a significant milestone in the accessibility of high-performance computing resources.

Machine Learning, Natural Language Processing, Deep Learning

Deploy large language models for a healthtech use case on Amazon SageMaker

AWS Machine Learning Blog

FEBRUARY 6, 2024

To support overarching pharmacovigilance activities, our pharmaceutical customers want to use the power of machine learning (ML) to automate the adverse event detection from various data sources, such as social media feeds, phone calls, emails, and handwritten notes, and trigger appropriate actions.

AWS, ML, Data Preparation

How to optimize your LinkedIn as a Data Scientist?

Pickl AI

MAY 16, 2023

Whether you are a Data Scientist or a college student, the LinkedIn platform can give you a plethora of options to explore and grow. In this blog, we will uncover how you can optimize your Data Scientist LinkedIn profile for the Indian market, as well as approach a global audience.

Data Scientist, Data Science, SQL, Python

Using LLMs to fortify cyber defenses: Sophos’s insight on strategies for using LLMs with Amazon Bedrock and Amazon SageMaker

AWS Machine Learning Blog

NOVEMBER 26, 2024

By harnessing the power of threat intelligence, machine learning (ML), and artificial intelligence (AI), Sophos delivers a comprehensive range of advanced products and services. The Sophos Artificial Intelligence (AI) group (SophosAI) oversees the development and maintenance of Sophos’s major ML security technology.

Machine Learning, SQL, ML

How Sportradar used the Deep Java Library to build production-scale ML platforms for increased performance and efficiency

AWS Machine Learning Blog

APRIL 19, 2023

Since 2018, our team has been developing a variety of ML models to enable betting products for NFL and NCAA football. Our data scientists train the model in Python using tools like PyTorch and save the model as PyTorch scripts. Business requirements: We are the US squad of the Sportradar AI department.

ML, Deep Learning

Predict football punt and kickoff return yards with fat-tailed distribution using GluonTS

Flipboard

FEBRUARY 2, 2023

With advanced analytics derived from machine learning (ML), the NFL is creating new ways to quantify football, and to provide fans with the tools needed to increase their knowledge of the games within the game of football. Next, we present the data preprocessing and other transformation methods applied to the dataset.

Cross Validation, ML, Machine Learning

Llama 4 family of models from Meta are now available in SageMaker JumpStart

AWS Machine Learning Blog

APRIL 7, 2025

This approach allows for greater flexibility and integration with existing AI and machine learning (AI/ML) workflows and pipelines. By providing multiple access points, SageMaker JumpStart helps you seamlessly incorporate pre-trained models into your AI/ML development efforts, regardless of your preferred interface or workflow.

AWS, Machine Learning, ML

Machine Learning Engineering in the Real World

ODSC - Open Data Science

SEPTEMBER 21, 2023

Secondly, to be a successful ML engineer in the real world, you cannot just understand the technology; you must understand the business. Some typical examples are given in the following table, along with some discussion as to whether or not ML would be an appropriate tool for solving the problem: Figure 1.1:

Machine Learning, ML

NLP-Powered Data Extraction for SLRs and Meta-Analyses

Towards AI

JULY 20, 2023

Natural Language Processing Getting desirable data out of published reports and clinical trials and into systematic literature reviews (SLRs) — a process known as data extraction — is just one of a series of incredibly time-consuming, repetitive, and potentially error-prone steps involved in creating SLRs and meta-analyses.

Natural Language Processing, ML, Support Vector Machines

Predicting new and existing product sales in semiconductors using Amazon Forecast

AWS Machine Learning Blog

APRIL 6, 2023

& AWS Machine Learning Solutions Lab (MLSL) Machine learning (ML) is being used across a wide range of industries to extract actionable insights from data to streamline processes and improve revenue generation. We calculated the WAPE value of a model by splitting the data into test and validation sets.
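The WAPE (weighted absolute percentage error) metric mentioned in this excerpt can be illustrated with a minimal sketch. It assumes the common definition, the sum of absolute errors divided by the sum of absolute actual values; the function name `wape` and the sample numbers are hypothetical and are not taken from the original post.

```python
from typing import Sequence


def wape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Weighted absolute percentage error: sum(|a - f|) / sum(|a|).

    Assumes the common definition of WAPE; the original post may use
    a variant (for example, per-item or per-horizon aggregation).
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    total_actual = sum(abs(a) for a in actual)
    if total_actual == 0:
        raise ValueError("WAPE is undefined when all actual values are zero")
    total_error = sum(abs(a - f) for a, f in zip(actual, forecast))
    return total_error / total_actual


# Hypothetical held-out demand values and model forecasts.
held_out_actual = [100.0, 200.0, 300.0]
held_out_forecast = [110.0, 190.0, 330.0]
score = wape(held_out_actual, held_out_forecast)
print(f"WAPE: {score:.4f}")
```

Lower values indicate a closer fit; because the errors are weighted by total actual volume, high-volume products dominate the score.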

Machine Learning, ML

Identifying defense coverage schemes in NFL’s Next Gen Stats

AWS Machine Learning Blog

FEBRUARY 10, 2023

Through a collaboration between the Next Gen Stats team and the Amazon ML Solutions Lab, we have developed the machine learning (ML)-powered stat of coverage classification that accurately identifies the defense coverage scheme based on the player tracking data. Each season consists of around 17,000 plays.

ML, Machine Learning

Train self-supervised vision transformers on overhead imagery with Amazon SageMaker

AWS Machine Learning Blog

AUGUST 16, 2023

Training machine learning (ML) models to interpret this data, however, is bottlenecked by costly and time-consuming human annotation efforts. The images document the land cover, or physical surface features, of ten European countries between June 2017 and May 2018. The following are a few example RGB images and their labels.

ML, Data Scientist, AWS

How Marubeni is optimizing market decisions using AWS machine learning and analytics

AWS Machine Learning Blog

MARCH 8, 2023

MPII is using a machine learning (ML) bid optimization engine to inform upstream decision-making processes in power asset management and trading. This solution helps market analysts design and perform data-driven bidding strategies optimized for power asset profitability. Data comes from disparate sources in a number of formats.

AWS, Machine Learning, Analytics

3 Takeaways from Gartner’s 2018 Data and Analytics Summit

DataRobot Blog

APRIL 1, 2018

Today’s data management and analytics products have infused artificial intelligence (AI) and machine learning (ML) algorithms into their core capabilities. These modern tools will auto-profile the data, detect joins and overlaps, and offer recommendations. 2) Line of business is taking a more active role in data projects.

Analytics, Data Preparation, Augmented Analytics

What a data scientist should know about machine learning kernels?

Mlearning.ai

APRIL 13, 2023

In general, a data scientist should have a basic understanding of the following concepts related to kernels in machine learning: 1. Overall, understanding kernels and how to select and tune them is an important aspect of being a data scientist. What are kernels? Types of kernels.

Machine Learning, Data Scientist, Support Vector Machines

AWS performs fine-tuning on a Large Language Model (LLM) to classify toxic speech for a large gaming company

AWS Machine Learning Blog

AUGUST 7, 2023

AWS ProServe solved this use case through a joint effort between the Generative AI Innovation Center (GAIIC) and the ProServe ML Delivery Team (MLDT). However, LLMs are not a new technology in the ML space. The new ML workflow now starts with a pre-trained model dubbed a foundation model.

AWS, ML, Data Science

Data Catalogs: A Category of Their Own

Alation

FEBRUARY 20, 2020

While this requires technology – AI, machine learning, log parsing, natural language processing, metadata management – this technology must be surfaced in a form accessible to business users – the data catalog. The Forrester Wave: Machine Learning Data Catalogs, Q2 2018.

Machine Learning, Natural Language Processing, Analytics

Generate a counterfactual analysis of corn response to nitrogen with Amazon SageMaker JumpStart solutions

AWS Machine Learning Blog

APRIL 3, 2023

Causal inference: Causality is all about understanding change, but how to formalize this in statistics and machine learning (ML) is not a trivial exercise. Note that this solution is currently available in the US West (Oregon) Region only. In this crop yield study, the nitrogen added as fertilizer and the yield outcomes might be confounded.

Database, AWS, Machine Learning

Federated Learning on AWS with FedML: Health analytics without sharing sensitive data – Part 2

AWS Machine Learning Blog

JANUARY 13, 2023

To mitigate these challenges, we propose a federated learning (FL) framework, based on open-source FedML on AWS, which enables analyzing sensitive HCLS data. It involves training a global machine learning (ML) model from distributed health data held locally at different sites. Import the data loader into the training script.

AWS, Analytics, Machine Learning

23 Best Free NLP Datasets for Machine Learning

Iguazio

SEPTEMBER 20, 2023

To help with these efforts, we’ve compiled a list of the top NLP datasets for NLP projects that data scientists and data professionals can use for training their models. million articles from 20,000 news sources across a seven day period in 2017 and 2018. This list is a starting point for training your NLP models.

Machine Learning, Database, Data Scientist

First Step to Object Detection Algorithms

Heartbeat

FEBRUARY 6, 2023

When you’re working on an enterprise scale, managing your ML models can be tricky. YOLOv3 is a newer version of YOLO and was released in 2018. During training, the model is presented with an image with its own real labels and learns to predict the class and position of each object in the image, as well as the corresponding mask.

Algorithm, Deep Learning, ML

Defined AI closes $11.8 million Series A Funding Round

Defined.ai blog

JANUARY 30, 2023

In January, we publicly unveiled our SaaS platform, which helps data scientists collect, enrich, and structure data to train AI and ML models. We also have big plans to grow and qualify our crowd on Neevo and ensure data security through GDPR compliance and ISO certifications. It’s been a big year for us so far.

AI, Data Scientist, ML

Fine-tune and deploy Llama 2 models cost-effectively in Amazon SageMaker JumpStart with AWS Inferentia and AWS Trainium

AWS Machine Learning Blog

JANUARY 17, 2024

SageMaker Studio is an integrated development environment (IDE) that provides a single web-based visual interface where you can access purpose-built tools to perform all machine learning (ML) development steps, from preparing data to building, training, and deploying your ML models. He retired from EPFL in December 2016.

AWS, Python, Machine Learning

Create high-quality datasets with Amazon SageMaker Ground Truth and FiftyOne

AWS Machine Learning Blog

MAY 5, 2023

Solution overview: Ground Truth is a fully self-served and managed data labeling service that empowers data scientists, machine learning (ML) engineers, and researchers to build high-quality datasets. For our example use case, we work with the Fashion200K dataset, released at ICCV 2017. Then import the relevant modules.

Machine Learning, AWS, ML

A Vision for the Future: How Computer Vision is Transforming Robotics

Heartbeat

MARCH 14, 2023

Some recent examples: Robotic systems that learned to grab and manipulate things with human-like dexterity were demonstrated by Google Brain researchers in 2018 using deep reinforcement learning. A combination of simulated and real-world data was used to train the system, enabling it to generalize to new objects and tasks.

Deep Learning, Algorithm, Machine Learning

Best Colleges for Data Science Course Online in India

Pickl AI

APRIL 10, 2023

As per the recent report by Nasscom and Zynga, the number of data science jobs in India is set to grow from 2,720 in 2018 to 16,500 by 2025. Top 5 Colleges to Learn Data Science (Online Platforms) 1. also offers free classes on Machine Learning that cover the core concepts of ML. In addition, Pickl.AI

Data Science, Machine Learning, Python

FM Summit shows Foundation Model hurdles and potential

Snorkel AI

JANUARY 18, 2023

Trends in Enterprise ML and the Potential Impact of Foundation Models: Carlo Giovine, a partner at McKinsey QuantumBlack, together with David Harvey, a staff expert at the same firm, told the online audience that companies are not moving fast enough to capture the value potential of AI/ML.

ML, AI

Capital One’s data-centric solutions to banking business challenges

Snorkel AI

MAY 12, 2023

Piyush Puri: Please join me in welcoming to the stage our next speakers who are here to talk about data-centric AI at Capital One, the amazing team who may or may not have coined the term, “what’s in your wallet.” We’re here to talk to you all about data-centric AI. That’s data. It’s wonderful to be here.

Machine Learning, ML

Best practices for building robust generative AI applications with Amazon Bedrock Agents – Part 1

AWS Machine Learning Blog

OCTOBER 2, 2024

About the Authors: Maira Ladeira Tanke is a Senior Generative AI Data Scientist at AWS. Mark’s work covers a wide range of use cases, with a primary interest in generative AI, agents, and scaling ML across the enterprise. Mark holds six AWS certifications, including the ML Specialty Certification.

AI, AWS, Machine Learning

McKinsey QuantumBlack experts: exciting foundation model future

Snorkel AI

MARCH 21, 2023

Together with David Harvey, an engagement manager focused on scaling deployments and applied R&D at that same firm, they presented the session “Trends in Enterprise ML and the potential impact of Foundation Models” at Snorkel AI’s 2023 Foundation Model Virtual Summit. Our ML protocols need updating in several ways.

ML, AI

Unleashing the Power of Deep Learning: Revolutionizing Recommender Systems

Heartbeat

OCTOBER 18, 2023

Editorially independent, Heartbeat is sponsored and published by Comet, an MLOps platform that enables data scientists & ML teams to track, compare, explain, & optimize their experiments. We’re committed to supporting and inspiring developers and engineers from all walks of life.

Deep Learning, Machine Learning

RoBERTa: A Modified BERT Model for NLP

Heartbeat

MARCH 15, 2023

Google developed BERT, an open-source machine learning model for NLP, in 2018. Because BERT had some limitations, a team at Facebook developed a modified version called RoBERTa (Robustly Optimized BERT Pre-Training Approach) in 2019. We pay our contributors, and we don’t sell ads.

Deep Learning, Machine Learning

Memory Integration in LangChain Agents

Heartbeat

DECEMBER 14, 2023

LeCun received the 2018 Turing Award (often referred to as the "Nobel Prize of Computing"), together with Yoshua Bengio and Geoffrey Hinton, for their work on deep learning. He is also one of the main creators of the DjVu image compression technology (together with Léon Bottou and Patrick Haffner).

Deep Learning, Machine Learning

Explainability in AI and Machine Learning Systems: An Overview

Heartbeat

SEPTEMBER 13, 2023

Audience and Context. Interpretability: Interpretability primarily targets researchers, data scientists, or experts interested in understanding the model's behavior and improving its performance. Why We Will Never Open Deep Learning's Black Box || Towards Data Science Brent, M., Russell, C. & Watcher, S.

Machine Learning, AI

Foundation models: a guide

Snorkel AI

MARCH 1, 2023

Data scientists can build upon generalized FMs and fine-tune custom versions with domain-specific or task-specific training data. This model debuted in June 2020, but remained a tool for researchers and ML practitioners until its creator, OpenAI, debuted a consumer-friendly chat interface in November 2022.

Natural Language Processing, Supervised Learning, Machine Learning

Implementing Agents in LangChain

Heartbeat

DECEMBER 8, 2023

LeCun received the 2018 Turing Award (often referred to as the "Nobel Prize of Computing"), together with Yoshua Bengio and Geoffrey Hinton, for their work on deep learning. He is also one of the main creators of the DjVu image compression technology (together with Léon Bottou and Patrick Haffner).

Deep Learning, AI

Meta-Learning: Learning to Learn in Machine Learning

Heartbeat

JANUARY 29, 2024

Let's run this command with the following code:

    # Training loop
    for epoch in range(num_epochs):
        for batch_idx, (support_set, query_set) in enumerate(train_loader):
            optimizer.zero_grad()
            # Move data to device (e.g., GPU)
            support_set = support_set.to(device)

2018) Reptile: A scalable meta-learning algorithm || OpenAI.com Xavier L.

Machine Learning, Natural Language Processing, Algorithm

NLP in Legal Discovery: Unleashing Language Processing for Faster Case Analysis

Heartbeat

AUGUST 23, 2023

Editorially independent, Heartbeat is sponsored and published by Comet, an MLOps platform that enables data scientists & ML teams to track, compare, explain, & optimize their experiments. Natural language processing and machine learning as practical toolsets for archival processing. Law and word order: NLP in legal tech.

Natural Language Processing, Algorithm, Artificial Intelligence

Harvard professor: DataPerf and AI’s need for data benchmarks

Snorkel AI

APRIL 25, 2023

With that said, I’m actually a faculty member at Harvard, and one of my key goals is to help, both academically as well as from an industry perspective, work with MLCommons, which is a nonprofit organization focusing on accelerating benchmarks, datasets, and best practices for ML (machine learning). Learn more, live!

Machine Learning, ML
