Sat. Sep 24, 2022 - Fri. Sep 30, 2022

How to Correctly Select a Sample From a Huge Dataset in Machine Learning

KDnuggets

We explain how choosing a small, representative dataset from a large population can improve model training reliability.

Get to Know All About Evaluation Metrics

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction: Evaluation metrics are used to measure the quality of the model. Selecting an appropriate evaluation metric is important because it can impact your selection of a model or decide whether to put your model into production. The importance of cross-validation: Are evaluation metrics […].

A Year After: Has Blockchain Changed Advertising by 2022?

Smart Data Collective

The last decade made a pretty bold promise to digital advertising, which more than other industries suffers from insufficient transparency and a fraudulent environment. The IAB Tech Lab conferences, in particular, frequently gathered blockchain evangelists and ad tech experts who discussed how this technology would finally drive authentication to programmatic chains.

Design principles for data analysis

FlowingData

To teach, learn, and measure the process of analysis more concretely, Lucy D’Agostino McGowan, Roger D. Peng, and Stephanie C. Hicks explain their work in the Journal of Computational and Graphical Statistics: The design principles for data analysis are qualities or characteristics that are relevant to the analysis and can be observed or measured.

What’s New in Apache Airflow® 3.0—And How Will It Reshape Your Data Workflows?

Speaker: Tamara Fingerlin, Developer Advocate

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.

Become an AI Artist Using Phraser and Stable Diffusion

KDnuggets

Generate the prompt using Phraser and create realistic art using the Diffusion model.

Data Warehousing with Snowflake and Other Alternatives

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction: Over the past few years, Snowflake has grown from a virtual unknown to a retailer with thousands of customers. Businesses have adopted Snowflake as a migration from on-premise enterprise data warehouses (such as Teradata) or as a more flexibly scalable and easier-to-manage alternative to […].

More Trending

Maps of wildfire smoke pollution

FlowingData

Wildfire obviously damages the areas it comes in direct contact with, but wildfire smoke can stretch much farther. Based on research by Childs et al., Mira Rojanasakul, for The New York Times, shows how pollution from smoke spread between 2006 and 2020. My kids’ rooms still have air filters from a few years ago, when a fire many miles away made the sky orange and our indoor environment smoky.

Welcome to TensorFlow!

KDnuggets

TensorFlow in Action teaches you to construct, train, and deploy deep learning models using TensorFlow 2. In this practical tutorial, you’ll build reusable skills hands-on as you create production-ready applications.

Top Blockchain Interview Questions

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction: Blockchain technology is a decentralized, distributed ledger that preserves a record of digital asset ownership. It is a means to save data and information in a secure digital format. Blockchains are well known for their critical function in cryptocurrency systems like Bitcoin, […].

Data-Driven Companies Leverage OCR for Optimal Data Quality

Smart Data Collective

OCR is the latest technology that data-driven companies are leveraging to extract data more effectively. There are a number of benefits of using it to your company’s advantage. OCR and Other Data Extraction Tools Have Promising ROIs for Brands. Big data is changing the state of modern business. A growing number of companies have leveraged big data to cut costs, improve customer engagement, have better compliance rates and earn solid brand reputations.

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage

There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.

How to Draw and Use Polygons in R

FlowingData

You can use straightforward functions in R to draw certain shapes, such as circles, squares, and rectangles. However, sometimes you need to draw a more complicated shape or one that’s based on data. Become a member for access to this — plus tutorials, courses, and guides.

Lessons from a Senior Data Scientist

KDnuggets

The aim of this article was for me to gain a deeper insight into the life of a senior data scientist and how their experience can be used as lessons for up-and-coming data scientists.

Analysis of Australian Shark Attacks

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction: Recently I searched for an interesting dataset to learn something new. After searching for a long time, I got a dataset on Shark Attacks in Australia. This dataset contains about 1,100+ shark bites and attempted shark bites between 1791 and early 2022, […].

Data-Driven Marketers Must Configure Outlook Data Files

Smart Data Collective

Big data has changed the marketing profession in extraordinary ways. Global companies spent over $3.2 billion on marketing analytics software last year. This figure is expected to grow in the future. There are many different ways that marketers can leverage data analytics to create successful marketing strategies. One of the biggest benefits is in the realm of email marketing.

How to Modernize Manufacturing Without Losing Control

Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives

Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri

Trajectories of celebratory gunfire

FlowingData

When someone fires a gun into the air, the bullet travels thousands of feet in elevation. Gravity pulls the bullet back down, and it accelerates fast enough to penetrate a human skull by the time it reaches ground-level. Acceleration and trajectory vary by type of gun and the shot angle. 1Point21 Interactive shows the variation and dangers with a visual explainer.

5 Python Interview Questions & Answers

KDnuggets

The Python coding questions challenge your problem-solving and programming skills.

Blockchain: Proof-of-Stake (PoS)

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction: Proof-of-stake is a cryptocurrency consensus mechanism for processing transactions and creating new blocks in the blockchain. A consensus mechanism is a method for validating records in a distributed database and keeping the database secure. In the case of cryptocurrency, the database is […].

Safety and Security Tips To Know in the Era of Big Data

Smart Data Collective

Today, data has become more critical than it has ever been in the past. We have talked about the importance of investing in good data collection methodologies. There are a growing number of risks with big data. Some of them stem from security issues if data is compromised. There are also physical safety issues associated with using the hardware that big data depends on.

Automation, Evolved: Your New Playbook for Smarter Knowledge Work

Speaker: Frank Taliano

Documents are the backbone of enterprise operations, but they are also a common source of inefficiency. From buried insights to manual handoffs, document-based workflows can quietly stall decision-making and drain resources. For large, complex organizations, legacy systems and siloed processes create friction that AI is uniquely positioned to resolve.

Days-since tickers for all the natural disasters

FlowingData

You know those signs in workplaces that keep track of days since injury? Making use of NASA APIs, Neal Agarwal used that concept to keep track of natural disasters. As of this writing, it’s been 9,691,764 days since the last Apocalyptic Volcanic Eruption (VEI 8). Pretty good.

Top 5 Machine Learning Practices Recommended by Experts

KDnuggets

This article is intended to help beginners improve their model structure by listing the best practices recommended by machine learning experts.

Concept of Cryptography in Blockchain

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction: Cryptography is a way of securing data against unauthorized access. In the blockchain, cryptography is used to secure transactions between two nodes in the blockchain network. As mentioned above, there are two main concepts in blockchain: cryptography and hashing. Cryptography encrypts messages in […].

Top 4 Blockchain Trends Shaping Business in 2022

Smart Data Collective

An increasing number of businesses are interested in investing in blockchain technology. The technology is attracting the attention of global business executives due to its huge real-world applications. In addition, blockchain applications are more scalable and secure compared to traditional apps. Enterprise blockchain will greatly benefit businesses due to the continual expansion of digital ecosystems.

The 2nd Generation of Innovation Management: A Survival Guide

Speaker: Chris Townsend, VP of Product Marketing, Wellspring

Over the past decade, companies have embraced innovation with enthusiasm—Chief Innovation Officers have been hired, and in-house incubators, accelerators, and co-creation labs have been launched. CEOs have spoken with passion about “making everyone an innovator” and the need “to disrupt our own business.” But after years of experimentation, senior leaders are asking: Is this still just an experiment, or are we in it for the long haul?

Mapping climate-related hazards in real-time

FlowingData

Bringing in data from various federal agencies: Climate Mapping for Resilience and Adaptation (CMRA) integrates information from across the federal government to help people consider their local exposure to climate-related hazards. People working in community organizations or for local, Tribal, state, or Federal governments can use the site to help them develop equitable climate resilience plans to protect people, property, and infrastructure.

A Day in the Life of a Data Scientist: Expert vs. Beginner

KDnuggets

Let’s learn more about what a Data Scientist gets up to.

7 Questions You Can Expect in Data Science Interview

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction: Data science job interviews need special skills. The candidates who succeed in landing employment are often not the ones with the best technical abilities but those who can pair such capabilities with interview acumen. Although data science is […].

AI Technology Helps App Marketplaces Compete with App Store

Smart Data Collective

AI has become one of the most important gamechangers for businesses and customers relying on mobile technology. This is one of the reasons companies are spending over $328 billion on AI technology. One of the many reasons that AI is changing the landscape of mobile technology is that it helps develop and distribute apps more easily than ever. We previously talked about some of the ways that AI is making it easier to develop new mobile apps.

How to Achieve High-Accuracy Results When Using LLMs

Speaker: Ben Epstein, Stealth Founder & CTO | Tony Karrer, Founder & CTO, Aggregage

When tasked with building a fundamentally new product line with deeper insights than previously achievable for a high-value client, Ben Epstein and his team faced a significant challenge: how to harness LLMs to produce consistent, high-accuracy outputs at scale. In this new session, Ben will share how he and his team engineered a system (based on proven software engineering approaches) that employs reproducible test variations (via temperature 0 and fixed seeds), and enables non-LLM evaluation m

Tree Talk

FlowingData

Kelton Sears used a vertical scroll upwards to think about trees and time.

Top Posts September 19-25: 7 Machine Learning Portfolio Projects to Boost the Resume

KDnuggets

7 Machine Learning Portfolio Projects to Boost the Resume • How to Select Rows and Columns in Pandas Using [ ], .loc, iloc, .at and .iat • Decision Tree Algorithm, Explained • Free SQL and Database Course • 5 Tricky SQL Queries Solved.

Dummies Guide to Writing a Custom Loss Function in Tensorflow

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction: Have you ever encountered a situation where you felt the need to use a custom loss function in your machine learning model? Maybe you had to experiment with a new loss function while writing a research paper or to handle a new business case. […].

AI Technology is Changing Outbound Calling for Better or Worse

Smart Data Collective

Last year, HubSpot published an article on the benefits of using AI for call center management. More businesses are taking advantage of this opportunity. Automated outbound calls can save you a lot of time and money as an organization, by automating the frequently repeated calling processes. For instance, having your phone system automatically ask a user for their basic information can be much more efficient than having your agents do the same.

Apache Airflow® Best Practices: DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!