Sat. Jun 25, 2022 - Fri. Jul 01, 2022

Celebrating Women in Leadership Roles in the Tech Industry

KDnuggets

The technology industry, specifically, continues to close the gender gap.

Stemming vs Lemmatization in NLP: Must-Know Differences

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction: In the field of Natural Language Processing (NLP), lemmatization and stemming are text normalization techniques. These techniques are used to prepare words, text, and documents for further processing. Languages such as English and Hindi consist of several words which are often derived […].

Analysis of compound curse words used on Reddit

FlowingData

As you know, Reddit is typically a sophisticated place of kind and pleasant conversation. So Colin Morris analyzed the usage of compound pejoratives in Reddit comments: The full “matrix” of combinations is surprisingly dense. Of the ~4,800 possible compounds, more than half occurred in at least one comment. The most frequent compound, dumbass, appears in 3.6 million comments, but there’s also a long tail of many rare terms, including 444 hapax legomena (terms which appear only once

8 Reasons Data-Driven Companies Are Utilizing Email Marketing

Smart Data Collective

Big data is at the heart of all successful, modern marketing strategies. Companies that engage in email marketing have discovered that big data is particularly effective. When you are running a data-driven company, you should seriously consider investing in email marketing campaigns. Keep reading to learn more about the benefits. Data-Driven Companies are Discovering the Benefits of Investing in Email Marketing.

Big Data 145

What’s New in Apache Airflow® 3.0—And How Will It Reshape Your Data Workflows?

Speaker: Tamara Fingerlin, Developer Advocate

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.

Making Sense of CRISP-ML(Q): The Machine Learning Lifecycle Process

KDnuggets

Learn about the standard process for building sustainable machine learning applications.

How to Become a Blockchain Developer?

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction: Although blockchain is still in its infancy, the opportunities for developers to contribute are not just exciting but also numerous. Many industries, including supply chain, automotive, and finance, have adopted blockchain, but it is not without problems. When a cryptocurrency, namely Bitcoin, […].

More Trending

3 Smart Technologies Boosting Energy Efficiency Worldwide

Smart Data Collective

The growth of smart technology is one of the most beneficial trends brought on by advances in AI. It is projected that there will be over 77 million smart homes in the United States by 2025. Smart technology is also being used by businesses and government institutions around the world. Many factors are driving the demand for smart technology. The quest for efficient and sustainable energy usage is one of the defining technological challenges of the modern age — especially as we find ourselves in

AI 133

24 SQL Questions You Might See on Your Next Interview

KDnuggets

Preparing for the SQL job interview can be overwhelming enough. You don’t need someone telling you that you need to know everything on top of that! Be smart and focus on preparing the SQL questions that appear most often at the job interview.

SQL 400

Data Driven Culture: A Far-fetched Goal for Organizations

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction: Creating a collaborative, data-driven culture is one of the most important goals of many modern organizations. A data-driven culture is when data is used to make decisions at every level of the organization. A data-driven culture is about replacing the gut feeling […].

Introduction to statistical learning

FlowingData

An Introduction to Statistical Learning, by Gareth James, Daniela Witten, Trevor Hastie, and Rob Tibshirani: As the scale and scope of data collection continue to increase across virtually all fields, statistical learning has become a critical toolkit for anyone who wishes to understand data. An Introduction to Statistical Learning provides a broad and less technical treatment of key topics in statistical learning.

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage

There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.

Using Instagram Highlight Covers in Your Data-Driven Marketing Strategy

Smart Data Collective

Modern marketing strategies rely heavily on big data. One study found that retailers that use big data have 2.7 times greater brand awareness than those that don’t. Big data is even more important for companies that depend on social media marketing. Geoffrey Moore tweeted about this in 2012 when he said: “Without big data analytics, companies are blind and deaf, wandering out onto the Web like deer on a freeway.”

Big Data 132

Data Preparation with SQL Cheatsheet

KDnuggets

If your raw data is in a SQL-based data lake, why spend the time and money to export the data into a new platform for data prep?

SQL 400

20 SQL Coding Interview Questions

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction: SQL stands for Structured Query Language. It’s a programming language used to query and manage relational database management systems (RDBMS). SQL skills are in high demand because SQL is used by many organizations in a large variety of software applications.

SQL 337

Visualising Knowledge

FlowingData

Visualising Knowledge is an open book from PBL Netherlands Environmental Assessment Agency, based on 25 years of making charts: PBL data visualisation is about visualising research results, using graphs, maps, diagrams and infographics. Over the years, the variety in types of visualisation formats has greatly increased. In addition, visualisations have to be presented in an increasing number of different media: from figures in reports to interactive visualisations that are easy to read on smart

How to Modernize Manufacturing Without Losing Control

Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives

Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri

Ways Businesses Can Boost Logistics Performance with Analytics

Smart Data Collective

Smart companies realize that analytics technology needs to be at the core of their business models. One of the most important ways that analytics can help companies thrive is by improving their logistics. Analytics Technology Helps Companies Bolster their Logistics Strategies. If you were cryogenically frozen twenty years ago, upon awakening, you’d probably be more shocked to learn that you can place an order on the internet and get it the same day, than you would about the world’s billionaires

Analytics 126

7 Steps to Mastering Python for Data Science

KDnuggets

Here’s how you can learn to code in Python from scratch in 7 easy steps.

Python 392

Introduction to Memcached using Python

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction: Memcached is a high-performance distributed caching system. It is an in-memory key-value data store, which makes it a type of NoSQL database. Memcached is used by tech giants like Facebook, Twitter, Instagram, and Netflix. In my previous article, I explained Redis which […].

Python 336

Personal life dashboard

FlowingData

Felix Krause tracks many metrics of his life, both manually and passively, and puts the data in one database. He put up a subset of the data on an updating site that shows where he is, what he’s eaten, how he’s feeling, the time he spent on the computer, and plenty more. After three years, he concluded it was not worth the time: Overall, having spent a significant amount of time building this project, scaling it up to the size it’s at now, as well as analysing the data, the main concl

Database 124

Automation, Evolved: Your New Playbook for Smarter Knowledge Work

Speaker: Frank Taliano

Documents are the backbone of enterprise operations, but they are also a common source of inefficiency. From buried insights to manual handoffs, document-based workflows can quietly stall decision-making and drain resources. For large, complex organizations, legacy systems and siloed processes create friction that AI is uniquely positioned to resolve.

10 Essential Data-Driven B2B Email Marketing Strategies

Smart Data Collective

Big data technology is leading to a lot of changes in the field of marketing. A growing number of marketers are exploring the benefits of big data as they strive to improve their branding and outreach strategies. Email marketing is one of the disciplines that has been heavily touched by big data. If you want to make the most of your big data strategy, you should keep reading to learn how to incorporate data into email marketing.

Big Data 124

Essential Math for Data Science: Eigenvectors and Application to PCA

KDnuggets

In this article, you’ll learn about the eigendecomposition of a matrix.

Custom Named Entity Recognition using spaCy v3

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction to Named Entity Recognition: A named entity is a ‘real-world object’ that is assigned a name, for example, a person, organization, or location. For more details, check my previous article on fine-tuning BERT for NER. All in all, NER can be summarized as […].

Population change in the UK

FlowingData

The Office for National Statistics for the UK published an interactive to show how population has changed: The population of England and Wales has increased by more than 3.5 million in the 10 years leading up to Census 2021. Using the first results from this census, we look at which places have seen the biggest increases and decreases, which areas had the largest growth in different age groups, and how your chosen local authority area compares with others.

The 2nd Generation of Innovation Management: A Survival Guide

Speaker: Chris Townsend, VP of Product Marketing, Wellspring

Over the past decade, companies have embraced innovation with enthusiasm—Chief Innovation Officers have been hired, and in-house incubators, accelerators, and co-creation labs have been launched. CEOs have spoken with passion about “making everyone an innovator” and the need “to disrupt our own business.” But after years of experimentation, senior leaders are asking: Is this still just an experiment, or are we in it for the long haul?

4 IT Management Best Practices Data-Driven Businesses Must Practice

Smart Data Collective

Data-driven businesses are far more successful than companies that don’t utilize data to their advantage. Unfortunately, they often find that managing their data effectively can be a challenge. Companies that rely on big data need a reliable IT department. You have to make sure that your IT infrastructure is adequately equipped to handle the volume of data your company will be processing and that it will be properly secured.

Big Data 122

Statistics and Probability for Data Science

KDnuggets

In this article, we discuss the importance of statistics and probability in data science and machine learning.

Top 15 Important Data Science Interview Questions

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction: Job interviews can be scary if you are a fresher, especially if you are interviewing for interdisciplinary roles like Data Science and Machine Learning. The tension, the doubt if you will get a yes or […].

Visualization Tools and Learning Resources, June 2022 Roundup

FlowingData

Welcome to issue #195 of The Process, the newsletter for FlowingData members that looks closer at how the charts get made. I’m Nathan Yau, and every month I collect useful tools and resources to help you visualize data better. This is the good stuff for June. Become a member for access to this — plus tutorials, courses, and guides.

How to Achieve High-Accuracy Results When Using LLMs

Speaker: Ben Epstein, Stealth Founder & CTO | Tony Karrer, Founder & CTO, Aggregage

When tasked with building a fundamentally new product line with deeper insights than previously achievable for a high-value client, Ben Epstein and his team faced a significant challenge: how to harness LLMs to produce consistent, high-accuracy outputs at scale. In this new session, Ben will share how he and his team engineered a system (based on proven software engineering approaches) that employs reproducible test variations (via temperature 0 and fixed seeds), and enables non-LLM evaluation m

Fintech App Development Discover the Benefits of Using AI

Smart Data Collective

The rapid pace of digitization has caused fintech markets to boom around the world. The fintech market was worth over $112 billion last year, and is projected to be worth over $333 billion by 2028. During this wave of disruption, successful business owners and startup founders need to understand the technologies that are driving the industry forward. Artificial intelligence is one of the most important trends pushing the envelope of what’s possible with fintech.

Top Posts June 20-26: 20 Basic Linux Commands for Data Science Beginners

KDnuggets

Also: Decision Tree Algorithm, Explained; 15 Python Coding Interview Questions You Must Know For Data Science; Naïve Bayes Algorithm: Everything You Need to Know; KDnuggets Top Posts for May 2022: 9 Free Harvard Courses to Learn Data Science in 2022.

Linear Algebra for Data Science With Python

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction: Linear Algebra, a branch of mathematics, is very useful in Data Science. We can mathematically operate on large amounts of data by using Linear Algebra. Most algorithms used in ML use Linear Algebra, especially matrices. As most of the data is […].

Why You Should Write Weekly 15-5s

Eugene Yan

15 minutes a week to document your work, increase visibility, and earn trust.

Apache Airflow® Best Practices: DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!