Sat. Oct 01, 2022 - Fri. Oct 07, 2022

The ABCs of NLP, From A to Z

KDnuggets

There is no shortage of tools today that can help you through the steps of natural language processing, but if you want to get a handle on the basics, this is a good place to start. Read about the ABCs of NLP, all the way from A to Z.

Three R Libraries for Automated EDA

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. With the increasing use of technology, data accumulation is faster than ever due to connected smart devices. These devices continuously collect and transmit data that can be processed, transformed, and stored for later use. This collected data, known as big data, holds valuable […].

The maximizing effect of ad hoc reports: Ad hoc reporting explained

Dataconomy

If used properly, data presents a multitude of opportunities for people and businesses aiming to enhance their operational effectiveness, business intelligence, profitability, and long-term success. Today, failing to use digital data to your advantage could have severe effects on your company. It’s like trying to navigate a busy roadway while.

How Formula 1 Teams Leverage Big Data for Success

Smart Data Collective

We have previously talked about ways that big data is changing the world of sports. Formula 1 teams are among those most affected. Ever since the Oakland A’s switched their recruitment policy from a player’s running speed and strength to a more sophisticated and nuanced look at the on-base slugging percentage, the world of sports has become more and more accustomed to using analytics in team-building.

What’s New in Apache Airflow® 3.0—And How Will It Reshape Your Data Workflows?

Speaker: Tamara Fingerlin, Developer Advocate

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.

Key-Value Databases, Explained

KDnuggets

Among the four big NoSQL database types, key-value stores are probably the most popular due to their simplicity and fast performance. Let’s further explore how key-value stores work and what their practical uses are.
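
As a rough illustration of the idea behind a key-value store, the sketch below keeps values in an in-memory dictionary and looks them up by an opaque key. The class name `KeyValueStore` and its methods are hypothetical and do not come from the linked article or from any particular database; real stores add persistence, expiry, and replication on top of this basic contract.

```python
from typing import Any, Dict, Optional


class KeyValueStore:
    """Toy in-memory key-value store (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        # All data lives in one dictionary: key -> value.
        self._data: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        """Insert the value, overwriting any existing value for the key."""
        self._data[key] = value

    def get(self, key: str, default: Optional[Any] = None) -> Any:
        """Return the value stored under key, or default if the key is absent."""
        return self._data.get(key, default)

    def delete(self, key: str) -> bool:
        """Remove the key; return True if it existed, False otherwise."""
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
store.put("user:42", {"name": "Ada"})
print(store.get("user:42"))
```

The store never inspects the value, which is part of why lookups by key stay simple and fast; the trade-off is that querying by anything other than the key requires extra indexing.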

Using MongoDB with Pandas, NumPy, and PyArrow

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. If you are a data scientist or a Python developer who sometimes wears the data scientist hat, you have likely had to work with some of these tools & technologies: Pandas, NumPy, PyArrow, and MongoDB. If you are new to these terms, […].

More Trending

Wide range of data exploration tools

FlowingData

Simon Willison asked a straightforward question about the tools people use: If someone gives you a CSV file with 100,000 rows in it, what tools do you use to start exploring and understanding that data? Then he expanded the question, asking what people use for files with 1 million rows, 10 million rows, and 1 billion rows. Browse the thousands of replies, and you quickly see that (1) there are many options to explore a dataset and (2) many people feel that what they’re using is the best op

Machine Learning for Everybody!

KDnuggets

Who is machine learning for? Everybody!

Key Components and Challenges of Data Lakes

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Today, the term data lake is most commonly used to describe an ecosystem of IT tools and processes (infrastructure as a service, software as a service, etc.) that work together to make processing and storing large volumes of data easy. An ecosystem consists of […].

Essential Productivity Hacks in Cloud-Centric Workplaces

Smart Data Collective

The market for cloud technology is growing remarkably. One study shows spending on cloud services doubled between 2017 and 2020 from $30 billion to $60 billion. Cloud technology is changing the face of the modern workplace. More companies than ever are leveraging the cloud to boost productivity, improve customer service strategies and streamline the research and development process.

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage

There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.

Accessible visualization with Olli JavaScript library

FlowingData

The Olli library aims to make it easier for developers to improve the accessibility of existing charts: Olli is an open-source library for converting data visualizations into accessible text structures for screen reader users. Starting with an existing visualization specification created with a supported toolkit, Olli produces a keyboard-navigable tree view with descriptions at varying levels of detail.

NLP Interview Questions

KDnuggets

What is NLP, and what types of NLP questions can you expect in NLP-related job interviews?

Sentiment Analysis Using VADER

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. A business or a brand’s success depends solely on customer satisfaction. If the customer does not like the product, you may have to work on the product to make it more efficient. So, for you to identify this, you will be […].

Cloud Helps Russian Developers Gain Global Popularity

Smart Data Collective

From creating world-famous games like Tetris to being recognized by Google and Facebook with some of the most prestigious programmer awards, Russian developers have come a long way to becoming global leaders in programming. The country is known for its large pool of skilled and expert programmers who are capable of accomplishing the most challenging tasks.

How to Modernize Manufacturing Without Losing Control

Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives

Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri

RecSys 2022: Recap, Favorite Papers, and Lessons

Eugene Yan

My three favorite papers, 17 paper summaries, and ML and non-ML lessons.

Hyperparameter Tuning Using Grid Search and Random Search in Python

KDnuggets

A comprehensive guide on optimizing model hyperparameters with Scikit-Learn.

Real-time Challenges of Machine Learning Projects

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Machine learning projects can be extremely challenging in the IT industry. Several factors can make them difficult, including the volume of data that needs to be processed, the complexity of the algorithms involved, and the need to ensure that the systems are […].

China’s fishing patterns shift globally

FlowingData

China’s fish supply is running low along its own coast, so the country has shifted its fishing activities globally. The New York Times visualized the shift with animated maps.

Automation, Evolved: Your New Playbook for Smarter Knowledge Work

Speaker: Frank Taliano

Documents are the backbone of enterprise operations, but they are also a common source of inefficiency. From buried insights to manual handoffs, document-based workflows can quietly stall decision-making and drain resources. For large, complex organizations, legacy systems and siloed processes create friction that AI is uniquely positioned to resolve.

Discovering novel algorithms with AlphaTensor

DeepMind

In our paper, published today in Nature, we introduce AlphaTensor, the first artificial intelligence (AI) system for discovering novel, efficient, and provably correct algorithms for fundamental tasks such as matrix multiplication. This sheds light on a 50-year-old open question in mathematics about finding the fastest way to multiply two matrices. This paper is a stepping stone in DeepMind’s mission to advance science and unlock the most fundamental problems using AI.

AI in FinTech: Managing the Finance of the Future

KDnuggets

Digital transformation is evolving, and so is the fintech industry, which is adopting AI trends to gain benefits such as optimized productivity, increased ROI, and enhanced security.

Demystifying NoSQL: Your Complete Interview Guide

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. In data science, learning about databases is inevitable. In fact, as a data science expert, you have to learn how to work with databases, run queries quickly, and more. There is no way around it! There are two things you need to know. Learn […].

Coffee versus tea in charts

FlowingData

Anahad O’Connor, Aaron Steckelberg and Garland Potts, for The Washington Post, made charts that compare the benefits of coffee and tea. But let’s be honest here. All we really want to see in a battle between coffee and tea is an anthropomorphic bean and leaf wrestle.

The 2nd Generation of Innovation Management: A Survival Guide

Speaker: Chris Townsend, VP of Product Marketing, Wellspring

Over the past decade, companies have embraced innovation with enthusiasm—Chief Innovation Officers have been hired, and in-house incubators, accelerators, and co-creation labs have been launched. CEOs have spoken with passion about “making everyone an innovator” and the need “to disrupt our own business.” But after years of experimentation, senior leaders are asking: Is this still just an experiment, or are we in it for the long haul?

Best of Tableau Web: September 2022

Tableau

Mark Bradbourne, National Solutions Engineer, Tableau. Bronwen Boyd. September 30, 2022 - 8:07pm. October 1, 2022. Welcome to the Best of the Tableau Web! Each month we showcase the amazing outputs from the Tableau Community, including blogs, podcasts, and even videos. This month we take things to the next level and recognize community members who have written the book(s) on data!

How to Get Up and Running with SQL – A List of Free Learning Resources

KDnuggets

We have compiled a list of the top free resources to help new data practitioners learn SQL. These include free online courses and resources to get the most out of your SQL skills.

Reduce Equation of Quantum Physics Using Artificial Intelligence

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Using artificial intelligence (AI), physicists have reduced a quantum physics problem that required 100,000 equations to a bite-size task that requires only four equations. Researchers at the US-based Flatiron Institute trained a machine learning tool to grasp the physics of electrons moving on […].

Unemployed data scientist

FlowingData

It seems a lot of data scientists have either left or were laid off from their jobs during the past few months. Jacqueline Nolis and Emily Robinson, data scientists who hosted a podcast and wrote a book on building a career in the field, happened to be in the lot. So naturally, they brought back the podcast for a bonus episode on their experiences with sudden unemployment and the job search.

How to Achieve High-Accuracy Results When Using LLMs

Speaker: Ben Epstein, Stealth Founder & CTO | Tony Karrer, Founder & CTO, Aggregage

When tasked with building a fundamentally new product line with deeper insights than previously achievable for a high-value client, Ben Epstein and his team faced a significant challenge: how to harness LLMs to produce consistent, high-accuracy outputs at scale. In this new session, Ben will share how he and his team engineered a system (based on proven software engineering approaches) that employs reproducible test variations (via temperature 0 and fixed seeds), and enables non-LLM evaluation m

3 Ways to Process CSV Files in Python

KDnuggets

This article is about 3 ways you can process a CSV file using Python.
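
For readers who want a starting point, here is a minimal sketch of one common approach using only Python's standard-library `csv` module. It is not necessarily one of the three ways the linked article covers; the sample data and the helper names `read_rows` and `average_score` are made up for illustration.

```python
import csv
import io

# Hypothetical sample data standing in for a CSV file on disk.
SAMPLE = "name,score\nAda,90\nGrace,85\n"


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]


def average_score(rows):
    """Average the 'score' column; csv returns strings, so convert first."""
    return sum(float(row["score"]) for row in rows) / len(rows)


rows = read_rows(SAMPLE)
print(rows)
print(average_score(rows))
```

To read from an actual file, the same `csv.DictReader` can wrap a file object opened with `open(path, newline="")` instead of the `io.StringIO` buffer.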

Apache Kafka Use Cases and Installation Guide

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Today, we expect web applications to respond to user queries quickly, if not immediately. As applications cover more aspects of our daily lives, it is increasingly difficult to provide users with a quick response. Caching is used to solve […].

Difficulties reading the cone of uncertainty

FlowingData

It seems that there is always surprise when a hurricane makes landfall in some areas, which some attribute to poor forecast communication with the cone on a map that shows possible paths. Scott Dance and Amudalat Ajasa for The Washington Post discuss the challenges that people have reading the cone of uncertainty: Indeed, many residents and authorities have said Ian’s track surprised them, even though the cone for days included the storm’s eventual landfall point on its southern edge.

Apache Airflow® Best Practices: DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!