Sat. Feb 05, 2022 - Fri. Feb 11, 2022

Managing Your Reusable Python Code as a Data Scientist

KDnuggets

Here are a few approaches that I have settled on for managing my own reusable Python code as a data scientist, presented from most to least general code use, and aimed at beginners.

Workflow of MLOps: Part 2 | Model Building

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. This is the 2nd blog of the MLOps series. Introduction: This article is part of an ongoing blog series on Machine Learning Operations (MLOps). In the previous article, we went through the introduction to MLOps. We have seen differences in traditional software development in […].

Deepdub closes fresh round for dubbing AI that dubs movies, shows, and games

Dataconomy

Dubbing, where recordings in other languages are lip-synced and mixed with a show’s original soundtrack, is an exploding business. One localization platform, Zoo Digital, saw revenues jump by 73% to $28.6 million in July 2018 compared to the year prior. Another, BTI Studios, told Television Business International that dubbing grew from 3%.

AI 240

The Invention of Battlezone

Hacker News

Three-dimensional displays first appeared on computer screens in the 1960s, and very large machines could manipulate those images in real time, but it was not until 1980 that a video-game player could maneuver at will through an imaginary landscape, wreaking havoc until brought to an untimely end by enemy tanks. Battlezone, a first-person tank game, was made possible by a vector display unit used by Atari Inc., Sunnyvale, Calif., in Asteroids, which came out the previous year.

153

What’s New in Apache Airflow® 3.0—And How Will It Reshape Your Data Workflows?

Speaker: Tamara Fingerlin, Developer Advocate

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.

How to Learn Math for Machine Learning

KDnuggets

So how much math do you need to know in order to work in the data science industry? The answer: Not as much as you think.

Optimal Resource Allocation using Python

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Objective: “True optimization is the revolutionary contribution of modern research to decision processes” – George Dantzig. This article discusses solving a resource allocation problem using linear programming in Python. We will find an optimal value for a linear objective function subject to several linear constraints.

Python 328

More Trending

Age of Moms When Kids are Born

FlowingData

People have kids at a wide range of ages, but the moments tend towards where we are in life. There are social norms and biological norms. Based on data from the National Center for Health Statistics, we can see how these ranges shift by child number.

145

The Complete Collection of Data Science Cheat Sheets – Part 1

KDnuggets

A collection of cheat sheets that will help you prepare for a technical interview, assessment tests, class presentation, and help you revise core data science concepts.

11 Extensions to Power Up your Jupyter Notebook

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. […].

Cloud Technology Makes Virtual Assistants More Beneficial than Ever

Smart Data Collective

More companies are relying on cloud technology than ever before. They are discovering the benefits of using the cloud to utilize data and facilitate communications between employees, customers, contractors and other stakeholders. One of the underappreciated benefits of cloud technology is that it makes it easier to work with virtual assistants. Savvy executives and small business owners realize that virtual assistants can perform many important tasks a lot more efficiently.

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage

There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.

Bubble tea combinations, a visual breakdown

FlowingData

Walk into a boba shop and usually you’ll see a large menu that lists the options for your tea, milk, toppings, ice, and sweetness. With all the variations, you get a lot of combinations. Julia Janicki and Daisy Chung broke it down with an interactive that takes you through the steps.

145

Build a Web Scraper with Python in 5 Minutes

KDnuggets

In this article, I will show you how to create a web scraper from scratch in Python.

Python 399

Folder Management in Python

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Overview: “You’re either the one that creates the automation or you’re getting automated.” – Tom Preston-Werner. Automation affects almost every aspect of modern life, and it can be used in any industry. Automation minimizes human input and eliminates repetitive tasks.

Python 326

DirectX Visualization Optimizes Analytics Algorithmic Traders

Smart Data Collective

Learn how DirectX visualization can improve your study and assessment of different trading instruments for maximum productivity and profitability. Analytics technology has become an invaluable aspect of modern financial trading. A growing number of traders are using increasingly sophisticated data mining and machine learning tools to develop a competitive edge.

Analytics 145

How to Modernize Manufacturing Without Losing Control

Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives

Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri

How to Make a Line Chart with a Color Gradient in R

FlowingData

It’s typically straightforward to make and read a line chart. The position on the line represents a value, and the slope between points represents a rate of change. Usually a line chart that represents a single time series uses a solid color for the line. But while messing with a heatmap, which uses color as its primary visual encoding, I was curious what you could show if you introduced a color scheme to a line chart.

143

Junior Data Scientist: The Next Level

KDnuggets

Junior, Mid-Level, and Senior Data Scientists differ in their level of experience. This article will go through the expectations for all job roles and what is required to move up the ladder.

Heart Disease Prediction using Machine Learning

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Overview: In this article, we will work closely with heart disease prediction. To do so, we will look into the heart disease dataset and derive various insights from it that help us know the weightage of each feature and […].

5 Data Security Strategies Businesses Should Implement

Smart Data Collective

We have witnessed some horrifying data breaches over the last year. One of the worst was when a team of Chinese hackers penetrated the security of Microsoft Exchange and accessed the accounts of over 250,000 global organizations. The Colonial Pipeline and SolarWinds were also victims of hackers. While large corporations like these will continue to be targets for data breaches, small businesses are also at risk.

136

Automation, Evolved: Your New Playbook for Smarter Knowledge Work

Speaker: Frank Taliano

Documents are the backbone of enterprise operations, but they are also a common source of inefficiency. From buried insights to manual handoffs, document-based workflows can quietly stall decision-making and drain resources. For large, complex organizations, legacy systems and siloed processes create friction that AI is uniquely positioned to resolve.

Good Redundant – The Process 176

FlowingData

Welcome to issue #176 of The Process, the newsletter for FlowingData members about how the charts get made. I’m Nathan Yau, and this week I’m thinking about using more color, and more generally, using more encodings to show the same thing in one chart. Become a member for access to this, plus tutorials, courses, and guides.

142

The Not-so-Sexy SQL Concepts to Make You Stand Out

KDnuggets

Databases are the houses of our data and data scientists HAVE TO HAVE A KEY! In this article, I discuss some lesser-known SQL concepts that data scientists do not familiarize themselves with.

SQL 313

Different Types of Cross-Validations in Machine Learning

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction: Model Development is a critical stage in the life cycle of a Data Science project. We attempt to train various forms of Machine Learning models on our data set, either supervised or unsupervised, depending on the Business Problem. Given many models available for […].

Stop paying for APIs to calculate distances and use this Open Source tool!

Applied Data Science

How to use OSRM to calculate distances reliably and for free. Calculating distances between a set of coordinates is something that regularly comes up in Data Science projects. Whether it is planning routes for delivery services, or measuring a customer’s willingness to travel to certain locations, getting an accurate measure of distance is always key.

The 2nd Generation of Innovation Management: A Survival Guide

Speaker: Chris Townsend, VP of Product Marketing, Wellspring

Over the past decade, companies have embraced innovation with enthusiasm—Chief Innovation Officers have been hired, and in-house incubators, accelerators, and co-creation labs have been launched. CEOs have spoken with passion about “making everyone an innovator” and the need “to disrupt our own business.” But after years of experimentation, senior leaders are asking: Is this still just an experiment, or are we in it for the long haul?

Modernized version of a mid-19th century encyclopedia

FlowingData

Between 1849 and 1851, J.G. Heck published a 10-part encyclopedia called Iconographic Encyclopædia covering a wide range of topics in science and art. Nicholas Rougeux, who likes to web-ify old works, restored Heck’s 13,000-plus illustrations and restructured the encyclopedia for the browser. All it took was hours of manual labor spread out over 13 months.

122

5 Ways to Apply AI to Small Data Sets

KDnuggets

When applied correctly, AI algorithms can produce results on small data sets that are free of human errors and false results. Here are some methods to apply AI to small data sets.

AI 298

Exploratory Data Analysis in Python

Analytics Vidhya

Overview: understanding how EDA is done in Python; the various steps involved in Exploratory Data Analysis; performing EDA on a given dataset. Introduction: Exploratory data analysis, popularly known as EDA, is a process of performing some initial investigations on the dataset to discover the structure and the content of the given dataset. It is often […].

Data Security Standards Are Evolving in Response to Rising Threats

Smart Data Collective

Cybersecurity is a growing concern. In 2018 alone, over 1,200 data breaches were orchestrated and nearly 450 million records were compromised. There will be more pressure to improve cybersecurity as these threats escalate. We have a problem with data security. We are more than 70 years into the digital age. We have found that data is one of the most important assets a person or company can have, and the threats of destruction and theft are constantly looming.

How to Achieve High-Accuracy Results When Using LLMs

Speaker: Ben Epstein, Stealth Founder & CTO | Tony Karrer, Founder & CTO, Aggregage

When tasked with building a fundamentally new product line with deeper insights than previously achievable for a high-value client, Ben Epstein and his team faced a significant challenge: how to harness LLMs to produce consistent, high-accuracy outputs at scale. In this new session, Ben will share how he and his team engineered a system (based on proven software engineering approaches) that employs reproducible test variations (via temperature 0 and fixed seeds), and enables non-LLM evaluation m

Past redlining still seen in the present

FlowingData

In the 1930s, a group called the Home Owners’ Loan Corporation went to cities classifying neighborhoods based on the “risk” of defaulting on loans. Areas deemed highest risk were marked with red ink on a map, and these areas tended to be non-white. The classification, redlining, was made illegal, but you can still see the effects today, as shown by Ryan Best and Elena Mejía with these interactive maps for FiveThirtyEight.

122

The motivation behind using graph convolutions

KDnuggets

This article is an excerpt from Machine Learning with PyTorch and Scikit-Learn, the new book from the widely acclaimed and bestselling Python Machine Learning series, fully updated and expanded to cover PyTorch, transformers, graph neural networks, and best practices.

Guide On Customer Churn: Don’t Just Predict, Prevent it!

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction: Phonepe and Google Pay (Tez) are ubiquitous names in the Indian payment ecosystem and the top two players in the area. According to the Phonepe Pulse report, it has 133 million monthly active users as of July ’21. For the Q3-21 quarter, the total transactions were 526.8 Cr […].

Guidelines on Trading Cryptocurrency Over the Blockchain

Smart Data Collective

Cryptocurrencies are evolving with new technology and growing interest. The blockchain has made cryptocurrencies very valuable investments for millions of people all over the world. If you want to invest in bitcoin, ethereum or other cryptocurrencies, then you can easily purchase them over the blockchain. You just need to know which exchange to use and what steps to take to process your transactions.

105

Apache Airflow® Best Practices: DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!