Sat. Sep 07, 2019 - Fri. Sep 13, 2019

Many Heads Are Better Than One: The Case For Ensemble Learning

KDnuggets

While ensembling techniques are notoriously hard to set up, operate, and explain, with the latest modeling, explainability and monitoring tools, they can produce more accurate and stable predictions. And better predictions can be better for business.
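As a rough illustration of why combining models can help, the sketch below computes the accuracy of a simple majority vote. It assumes the models are independent and equally accurate, which real ensembles rarely are, and the function name is hypothetical rather than taken from the article.

```python
from math import comb


def majority_vote_accuracy(n_models: int, p_correct: float) -> float:
    """Probability that a strict majority of n independent models is correct.

    Assumption: every model is right with the same probability p_correct
    and errors are independent across models.
    """
    if n_models % 2 == 0:
        raise ValueError("use an odd number of models to avoid ties")
    needed = n_models // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n_models, k) * p_correct**k * (1 - p_correct) ** (n_models - k)
        for k in range(needed, n_models + 1)
    )


# Accuracy of the vote as more 60%-accurate models are added.
for n in (1, 3, 11, 51):
    print(n, round(majority_vote_accuracy(n, 0.6), 3))
```

Under these assumptions the vote improves as models are added. Correlated errors shrink that gain, which is one reason monitoring and explainability tooling matter in practice.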

A Data Scientist’s Guide to 8 Types of Sampling Techniques

Analytics Vidhya

Overview: Sampling is a popular statistical concept; learn how it works in this article. We will also talk about eight different types of [...]

Everything you want to know about GDPR’s Right to be Forgotten in Blockchain

Dataconomy

What is the big problem with the right to be forgotten (right to erasure, Article 17) under the GDPR? Because blockchain is generally immutable and the GDPR requires personal data to be deleted, many people conclude that it is impossible to store any kind of personal data on [...]

The Role of Big Data In The Maintenance Industry

Smart Data Collective

As industry buzzwords go, “Big Data” has become seemingly ubiquitous. Everyone wants to use big data to improve their operation, and the maintenance department is no exception. Maintenance teams are beginning to embrace big data and analytics to improve performance. By emphasizing big data, they can establish predictive maintenance programs, which reduce downtime and save on maintenance costs.

Big Data 110

What’s New in Apache Airflow® 3.0—And How Will It Reshape Your Data Workflows?

Speaker: Tamara Fingerlin, Developer Advocate

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.

Train sklearn 100x Faster

KDnuggets

As compute gets cheaper and time to market for machine learning solutions becomes more critical, we’ve explored options for speeding up model training. One of those solutions is to combine elements from Spark and scikit-learn into our own hybrid solution.

Become a Video Analysis Expert: A Simple Approach to Automatically Generating Highlights using Python

Analytics Vidhya

Overview: Build your own highlights package in Python using a simple approach. That’s right: learn how automatic highlight generation works without using machine [...]

Python 284

More Trending

How Big Data Is Transforming Social Media Marketing

Smart Data Collective

Big Data is one of the most impressive tech advancements to hit the marketing world in recent memory. While it has been tossed around as a buzzword in certain circles, Big Data is much more than just a phrase. For a definition, Oracle recommends Gartner’s 2001 description of Big Data as data that contains greater variety, arriving in increasing volumes and with ever-higher velocity.

Big Data 106

Classification vs Prediction

KDnuggets

It is important to distinguish prediction and classification. In many decision-making contexts, classification represents a premature decision, because classification combines prediction and decision making and usurps the decision maker in specifying costs of wrong decisions.
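One way to read the distinction is to keep the predicted probability and the decision as separate steps, so the decision maker supplies the costs. The sketch below is a minimal illustration under that reading. The function names and cost values are hypothetical and do not come from the article.

```python
def decision_threshold(cost_false_positive: float, cost_false_negative: float) -> float:
    """Probability above which acting has the lower expected cost.

    Acting costs (1 - p) * cost_false_positive in expectation, and not acting
    costs p * cost_false_negative, so acting wins when
    p > cost_fp / (cost_fp + cost_fn).
    """
    return cost_false_positive / (cost_false_positive + cost_false_negative)


def decide(prob_positive: float, cost_false_positive: float, cost_false_negative: float) -> bool:
    """Turn a predicted probability into a decision using caller-supplied costs."""
    return prob_positive > decision_threshold(cost_false_positive, cost_false_negative)


predicted_risk = 0.30  # output of some probabilistic model (assumed value)

# Same prediction, different decisions once the costs differ.
print(decide(predicted_risk, cost_false_positive=1.0, cost_false_negative=1.0))
print(decide(predicted_risk, cost_false_positive=1.0, cost_false_negative=9.0))
```

A classifier with a fixed 0.5 cutoff would hard-code the first case. Keeping the probability lets each decision maker apply their own costs.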

4 Key Aspects of a Data Science Project Every Data Scientist and Leader Should Know

Analytics Vidhya

Overview: A data-science-driven product consists of multiple aspects every leader needs to be aware of. Machine learning algorithms are one part of a whole [...]

AI Simplified: Supervised Machine Learning

DataRobot

It is well known that the AI revolution is transforming industries and businesses around the world. In this AI Simplified video, we define supervised machine learning and share some ways the military leverages this technology to maintain safety and ensure preparedness.

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage

There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.

5 Reasons Why You Should Store Big Data In The Cloud

Smart Data Collective

Gone are the days when information could only be stored on traditional remote servers in a secluded location. Today, the in-thing is cloud data storage, where information and data are stored electronically online. With this approach, you can store unlimited data online (in the cloud) and access it anywhere. Several essays and many articles have been written on storage clouds and benefits of the cloud, but this piece puts forward five of the biggest benefits that yo

The 5 Graph Algorithms That Data Scientists Should Know

KDnuggets

In this post, I am going to be talking about some of the most important graph algorithms you should know and how to implement them using Python.

Algorithm 307

WNS Analytics Wizard 2019: Top 3 Winners’ Solutions from our Biggest Data Science Hackathon

Analytics Vidhya

Overview: Here’s a unique data science challenge we don’t come across often: a marketing analytics hackathon! We bring you the top 3 inspiring [...]

3 Reasons to Ditch Excel for FP&A Data Consolidation & Validation

DataRobot Blog

Financial Planning and Analysis (FP&A) business professionals are responsible for mapping out a company’s financial future. They transform company goals into actionable plans by analyzing the current state of financial management affairs, then take the time to create a roadmap plan that details how to reach the destination. Creating those plans requires ingesting massive amounts of data resources, aggregating, cleansing, and standardizing that data, and then performing analysis on the finis

How to Modernize Manufacturing Without Losing Control

Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives

Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri

AI Drives The Inception Of Three Cutting-Edge Smart Home Products

Smart Data Collective

Artificial intelligence is coming to our homes. A growing number of people use smart devices that are developed with state-of-the-art AI technology. The market for smart homes is going to rise as new AI advances bring big changes to the industry. One survey from last year found that only 12-16% of homes in the United States are equipped with smart devices.

AI 97

10 Great Python Resources for Aspiring Data Scientists

KDnuggets

This is a collection of 10 interesting resources in the form of articles and tutorials for the aspiring data scientist new to Python, meant to provide both insight and practical instruction when starting on your journey.

Talking with Coz: Pure Origins and the Future of Storage

DataCentric podcast

Want to hear a good origin story? Or about the future of data? You're in luck. As Pure Storage heads into its annual Pure Accelerate Conference in Austin next week, it's looking to celebrate its 10th anniversary: 10 years in which Pure has grown from a seed-stage start-up to a ~$4B publicly traded company. And Pure continues to be a disrupter in the storage industry.

Scikit-Learn vs mlr for Machine Learning

KDnuggets

How does the scikit-learn machine learning library for Python compare to the mlr package for R? Follow along with a machine learning workflow through each approach, and see if you can gain a competitive advantage by knowing both frameworks.

Automation, Evolved: Your New Playbook for Smarter Knowledge Work

Speaker: Frank Taliano

Documents are the backbone of enterprise operations, but they are also a common source of inefficiency. From buried insights to manual handoffs, document-based workflows can quietly stall decision-making and drain resources. For large, complex organizations, legacy systems and siloed processes create friction that AI is uniquely positioned to resolve.

There is No Free Lunch in Data Science

KDnuggets

There is no such thing as a free lunch in life or data science. Here, we'll explore some science philosophy and discuss the No Free Lunch theorems to find out what they mean for the field of data science.

Common Machine Learning Obstacles

KDnuggets

In this blog, Seth DeLand of MathWorks discusses two of the most common obstacles, which relate to choosing the right classification model and eliminating data overfitting.

BERT is changing the NLP landscape

KDnuggets

BERT is changing the NLP landscape and making chatbots much smarter by enabling computers to better understand speech and respond intelligently in real-time.

AI 303

A 2019 Guide to Speech Synthesis with Deep Learning

KDnuggets

In this article, we’ll look at research and model architectures that have been written and developed to do just that using deep learning.

The 2nd Generation of Innovation Management: A Survival Guide

Speaker: Chris Townsend, VP of Product Marketing, Wellspring

Over the past decade, companies have embraced innovation with enthusiasm—Chief Innovation Officers have been hired, and in-house incubators, accelerators, and co-creation labs have been launched. CEOs have spoken with passion about “making everyone an innovator” and the need “to disrupt our own business.” But after years of experimentation, senior leaders are asking: Is this still just an experiment, or are we in it for the long haul?

A Friendly Introduction to Support Vector Machines

KDnuggets

This article explains the Support Vector Machines (SVM) algorithm in an easy way.

Version Control for Data Science: Tracking Machine Learning Models and Datasets

KDnuggets

I am a Git god, why do I need another version control system for Machine Learning Projects?

The State of Transfer Learning in NLP

KDnuggets

This post expands on the NAACL 2019 tutorial on Transfer Learning in NLP organized by Matthew Peters, Swabha Swayamdipta, Thomas Wolf, and Sebastian Ruder. This post highlights key insights and takeaways and provides updates based on recent work.

Can graph machine learning identify hate speech in online social networks?

KDnuggets

Online hate speech is a complex subject. Follow this demonstration using state-of-the-art graph neural network models to detect hateful users based on their activities on the Twitter social network.

How to Achieve High-Accuracy Results When Using LLMs

Speaker: Ben Epstein, Stealth Founder & CTO | Tony Karrer, Founder & CTO, Aggregage

When tasked with building a fundamentally new product line with deeper insights than previously achievable for a high-value client, Ben Epstein and his team faced a significant challenge: how to harness LLMs to produce consistent, high-accuracy outputs at scale. In this new session, Ben will share how he and his team engineered a system (based on proven software engineering approaches) that employs reproducible test variations (via temperature 0 and fixed seeds), and enables non-LLM evaluation m

OpenStreetMap Data to ML Training Labels for Object Detection

KDnuggets

I am really interested in creating a tight, clean pipeline for disaster relief applications, where we can use something like crowd sourced building polygons from OSM to train a supervised object detector to discover buildings in an unmapped location.

ML 270

How DeepMind and Waymo are Using Evolutionary Competition to Train Self-Driving Vehicles

KDnuggets

Recently, Alphabet’s subsidiaries Waymo and DeepMind partnered to find a more efficient process for training self-driving vehicle algorithms, and their work took them back to one of the cornerstones of our history as a species: evolution.

Algorithm 242

Ensemble Methods for Machine Learning: AdaBoost

KDnuggets

It turns out that if we ask the weak algorithm to create a whole bunch of classifiers (all weak by definition) and then combine them all, what may emerge is a stronger classifier.
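The idea of combining weak classifiers can be sketched with a small AdaBoost loop over one-dimensional decision stumps. This is a minimal illustration using only the standard library, not the article's implementation. The toy data set and all function names are assumptions chosen so that no single stump can classify every point.

```python
import math


def stump_predict(threshold, polarity, x):
    """Weak classifier: predict `polarity` left of the threshold, its opposite otherwise."""
    return polarity if x < threshold else -polarity


def best_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the stump with the lowest weighted error."""
    sorted_xs = sorted(set(xs))
    thresholds = [(a + b) / 2 for a, b in zip(sorted_xs, sorted_xs[1:])]
    best = None
    for threshold in thresholds:
        for polarity in (1, -1):
            error = sum(
                w for x, y, w in zip(xs, ys, weights)
                if stump_predict(threshold, polarity, x) != y
            )
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    """Fit `rounds` stumps, reweighting the examples after each one."""
    n = len(xs)
    weights = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        error, threshold, polarity = best_stump(xs, ys, weights)
        error = min(max(error, 1e-12), 1 - 1e-12)  # guard against log(0)
        alpha = 0.5 * math.log((1 - error) / error)  # vote weight of this stump
        model.append((alpha, threshold, polarity))
        # Misclassified points gain weight, correctly classified points lose it.
        weights = [
            w * math.exp(-alpha * y * stump_predict(threshold, polarity, x))
            for x, y, w in zip(xs, ys, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


def predict(model, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(t, p, x) for alpha, t, p in model)
    return 1 if score >= 0 else -1


# Toy labels that no single threshold can separate (assumed example data).
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

model = adaboost(xs, ys, rounds=3)
print([predict(model, x) for x in xs] == ys)
```

On this toy data the best single stump misclassifies three of the ten points, while the weighted vote of three stumps fits all of them. Real data will not behave this neatly.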

Data Driven Government – Agenda, Washington, DC, Sep 25

KDnuggets

Data Driven Government is coming to Washington, DC, Sep 26, and includes a stellar lineup of experts who will share the emerging trends and best practices of government agencies in the current use of data analytics to enhance mission outcomes. Use code KDNUGGETS to get 15% off.

Analytics 224
Apache Airflow® Best Practices: DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!