
Improving ML Datasets with Cleanlab, a Standard Framework for Data-Centric AI

ODSC - Open Data Science

This process is entirely automated, and when the same XGBoost model was re-trained on the cleaned data, it achieved 83% accuracy (with zero change to the modeling code). Previously, he was a senior scientist at Amazon Web Services developing AutoML and Deep Learning algorithms that now power ML applications at hundreds of companies.
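As a rough illustration of what automatically flagging mislabeled examples before re-training can look like, the sketch below applies a simplified confident-learning-style rule in plain Python. It is not Cleanlab's implementation: the function name `flag_label_issues`, the per-class mean-probability threshold, and the toy data are all assumptions made for this example. In practice the predicted probabilities would come from out-of-sample (cross-validated) predictions of a model such as XGBoost.

```python
from typing import List


def flag_label_issues(labels: List[int], pred_probs: List[List[float]]) -> List[int]:
    """Return indices of examples whose given label looks suspicious.

    Hypothetical helper: a simplified illustration of the confident-learning
    idea, not Cleanlab's actual algorithm.
    """
    n_classes = len(pred_probs[0])

    # Per-class threshold: mean predicted probability of class k
    # over the examples that are labeled k.
    thresholds = []
    for k in range(n_classes):
        probs_k = [p[k] for p, y in zip(pred_probs, labels) if y == k]
        thresholds.append(sum(probs_k) / len(probs_k) if probs_k else 1.0)

    issues = []
    for i, (p, y) in enumerate(zip(pred_probs, labels)):
        # Classes whose predicted probability reaches that class's threshold.
        confident = [k for k in range(n_classes) if p[k] >= thresholds[k]]
        # Flag the example if the model is confident about some class,
        # but the given label is not among those classes.
        if confident and y not in confident:
            issues.append(i)
    return issues


# Toy data (made up for illustration): six examples, two classes.
labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],  # labeled 0, but the model strongly prefers class 1
    [0.2, 0.8],
    [0.1, 0.9],
    [0.3, 0.7],
]

suspect = flag_label_issues(labels, pred_probs)
# Drop the flagged examples; the same model code can then be re-trained on the rest.
clean_labels = [y for i, y in enumerate(labels) if i not in suspect]
print(suspect)
```

The modeling code stays untouched in this workflow; only the training set changes, which is the point the excerpt makes about data-centric improvement.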


Predict football punt and kickoff return yards with fat-tailed distribution using GluonTS

Flipboard

The player tracking data contains each player’s position, direction, acceleration, and more (in x,y coordinates). There are around 3,000 punt plays and 4,000 kickoff plays from four NFL seasons (2018–2021). The data distributions for punts and kickoffs are different.


Present and future of data cubes: an European EO perspective

Mlearning.ai

It can be gradually “enriched”, so the typical hierarchy of data is: Raw data ↓ Cleaned data ↓ Analysis-ready data ↓ Decision-ready data ↓ Decisions. For example, vector maps of the roads of an area coming from different sources are the raw data.


Tableau: 9 years a Leader in Gartner Magic Quadrant for Analytics and Business Intelligence Platforms

Tableau

We also reached some incredible milestones with Tableau Prep, our easy-to-use, visual, self-service data prep product. In 2020, we added the ability to write to external databases so you can use clean data anywhere. Tableau Prep can now be used across more use cases and directly in the browser.


Create high-quality datasets with Amazon SageMaker Ground Truth and FiftyOne

AWS Machine Learning Blog

In the following sections, we demonstrate how to: visualize the dataset in FiftyOne; clean the dataset with filtering and image deduplication in FiftyOne; pre-label the cleaned data with zero-shot classification in FiftyOne; label the smaller curated dataset with Ground Truth; inject labeled results from Ground Truth into FiftyOne and (..)
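To make the deduplication step more concrete, here is a minimal sketch that removes byte-identical images by hashing their contents with the Python standard library. It does not use FiftyOne's own tooling; the function name `deduplicate_by_hash` and the in-memory `images` mapping are assumptions for illustration, and this approach will not catch near-duplicates such as resized or re-encoded copies.

```python
import hashlib
from typing import Dict, List


def deduplicate_by_hash(images: Dict[str, bytes]) -> List[str]:
    """Keep the first file name seen for each distinct byte content.

    Hypothetical helper: exact-duplicate removal only, not FiftyOne's method.
    """
    seen = set()
    kept = []
    for name, data in images.items():
        # Identical bytes always produce the same SHA-256 digest.
        digest = hashlib.sha256(data).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(name)
    return kept


# Toy stand-ins for image files (made-up names and contents).
images = {
    "a.jpg": b"\x01\x02",
    "b.jpg": b"\x03",
    "a_copy.jpg": b"\x01\x02",  # same bytes as a.jpg
}

kept = deduplicate_by_hash(images)
print(kept)
```

Shrinking the dataset this way before labeling means fewer images need to be sent to annotators.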


Capital One’s data-centric solutions to banking business challenges

Snorkel AI

To borrow another example from Andrew Ng, improving the quality of data can have a tremendous impact on model performance. This is to say that clean data can better teach our models. Another benefit of clean, informative data is that we may also be able to achieve equivalent model performance with much less data.