As data science evolves and grows, the demand for skilled data scientists is also rising. A data scientist’s role is to extract insights and knowledge from data and to use this information to inform decisions and drive business growth.
It allows people with excess computing resources to sell them to data scientists in exchange for cryptocurrencies. Data scientists can access remote computing power through sophisticated networks. This feature helps automate many parts of the data preparation and data model development process.
The primary aim is to make sense of the vast amounts of data generated daily by combining statistical analysis, programming, and data visualization. It is divided into three primary areas: data preparation, data modeling, and data visualization.
LLMOps facilitates the streamlined deployment, continuous monitoring, and ongoing maintenance of large language models. Similar to traditional Machine Learning Ops (MLOps), LLMOps necessitates a collaborative effort involving data scientists, DevOps engineers, and IT professionals.
By combining the capabilities of LLM function calling and Pydantic data models, you can dynamically extract metadata from user queries. Knowledge base – You need a knowledge base created in Amazon Bedrock with ingested data and metadata. In her free time, she likes to go for long runs along the beach.
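As a rough illustration of the pattern described above (a sketch only, not the Bedrock integration itself), here is how a Pydantic model can define the structure an LLM tool call is asked to fill; the `QueryMetadata` fields are hypothetical:

```python
from typing import Optional

from pydantic import BaseModel, Field


# Hypothetical metadata schema a function-calling LLM could be asked to populate
class QueryMetadata(BaseModel):
    topic: str = Field(description="Main subject of the user query")
    year: Optional[int] = Field(default=None, description="Year filter, if any")
    department: Optional[str] = Field(default=None, description="Org unit mentioned")


# The JSON schema is what you would pass as the function/tool definition
schema = QueryMetadata.model_json_schema()

# Validating a (hypothetical) tool-call response returned by the model
raw = {"topic": "revenue", "year": 2023}
meta = QueryMetadata(**raw)
print(meta.topic, meta.year)  # revenue 2023
```

Because Pydantic validates the parsed arguments, malformed model output fails fast instead of silently propagating into the metadata filter.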
This required custom integration efforts, along with complex AWS Identity and Access Management (IAM) policy management, further complicating the model governance process. ML development – This phase of the ML lifecycle should be hosted in an isolated environment for model experimentation and building the candidate model.
Learn how data scientists use ChatGPT, a potent OpenAI language model, to improve their operations. ChatGPT is essential in the domains of natural language processing, modeling, data analysis, data cleaning, and data visualization. It facilitates exploratory data analysis and provides quick insights.
However, many data scientists and business analysts can’t readily lean on automated regression techniques like logistic regression and linear regression. This stems, largely, from the fact that there are certain data regulations in place when it comes to marketing tech and predictive analytics software.
The blog is based on the webinar Deploying Gen AI in Production with NVIDIA NIM & MLRun with Amit Bleiweiss, Senior Data Scientist at NVIDIA, and Yaron Haviv, co-founder and CTO, and Guy Lecker, ML Engineering Team Lead at Iguazio (acquired by McKinsey). When developers and data scientists need a gen AI app/tech playground.
It simplifies feature access for model training and inference, significantly reducing the time and complexity involved in managing data pipelines. Additionally, Feast promotes feature reuse, so the time spent on data preparation is reduced greatly. Saurabh Gupta is a Principal Engineer at Zeta Global.
Amazon SageMaker Data Wrangler reduces the time it takes to collect and prepare data for machine learning (ML) from weeks to minutes. We are happy to announce that SageMaker Data Wrangler now supports using Lake Formation with Amazon EMR to provide this fine-grained data access restriction.
Data-centric AI, in his opinion, is based on the following principles: it’s time to focus on the data, because after all the progress achieved in algorithms, it’s now time to spend more time on the data; and inconsistent data labels are common, since reasonable, well-trained people can see things differently. The choice is yours.
See also Thoughtworks’ guide to Evaluating MLOps Platforms. End-to-end MLOps platforms provide a unified ecosystem that streamlines the entire ML workflow, from data preparation and model development to deployment and monitoring. Check out the Kubeflow documentation. Can you render audio/video?
In today’s landscape, AI is becoming a major focus in developing and deploying machine learning models. It isn’t just about writing code or creating algorithms — it requires robust pipelines that handle data, model training, deployment, and maintenance. Model Training: Running computations to learn from the data.
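A minimal sketch of the model-training step named above, assuming scikit-learn and a synthetic dataset rather than any particular production pipeline:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Data stage: a synthetic stand-in for the pipeline's data-handling step
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Training stage: run the computation that learns from the data
model = LogisticRegression().fit(X_train, y_train)

# Evaluation stage: score on held-out data before any deployment step
acc = model.score(X_test, y_test)
print(f"held-out accuracy: {acc:.2f}")
```

In a real pipeline each of these stages would be a separate, versioned step so that retraining and redeployment can be automated.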
ODSC West 2024 showcased a wide range of talks and workshops from leading data science, AI, and machine learning experts. This blog highlights some of the most impactful AI slides from the world’s best data science instructors, focusing on cutting-edge advancements in AI, data modeling, and deployment strategies.
At Tableau, we wanted to understand use cases and common issues from our most advanced data scientists to general data consumers. While not exhaustive, here are additional capabilities to consider as part of your data management and governance solution: Data preparation. Data modeling.
Summary: The fundamentals of Data Engineering encompass essential practices like data modelling, warehousing, pipelines, and integration. Understanding these concepts enables professionals to build robust systems that facilitate effective data management and insightful analysis. What is Data Engineering?
New machines are added continuously to the system, so we had to make sure our model can handle prediction on new machines that have never been seen in training. Data preprocessing and feature engineering In this section, we discuss our methods for data preparation and feature engineering.
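One common way to tolerate machines never seen in training, sketched here with scikit-learn’s `OneHotEncoder` (the machine IDs are hypothetical, not from the article), is to encode unknown categories as an all-zero row rather than raising an error:

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Machine IDs observed during training (hypothetical)
train_ids = np.array([["m1"], ["m2"], ["m3"]])
enc = OneHotEncoder(handle_unknown="ignore")
enc.fit(train_ids)

# A brand-new machine at prediction time maps to an all-zero row
# instead of failing, so the model can still produce a prediction
new_row = enc.transform(np.array([["m4"]])).toarray()
print(new_row)  # [[0. 0. 0.]]
```

The all-zero encoding effectively treats every unseen machine as a generic "unknown", which is often an acceptable fallback until the model is retrained.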
Who This Book Is For This book is for practitioners in charge of building, managing, maintaining, and operationalizing the ML process end to end: Data science / AI / ML leaders: Heads of Data Science, VPs of Advanced Analytics, AI Leads, etc. Exploratory data analysis (EDA) and modeling.
Data engineers, data scientists, and other data professionals have been racing to implement gen AI into their engineering efforts. Data Pipeline - Manages and processes various data sources. Application Pipeline - Manages requests and data/model validations. LLMOps is MLOps for LLMs.
In the case of professional Data Analysts, who might be engaged in performing experiments on data, standard SQL tools are required. Data Analysts need deeper knowledge of SQL to understand relational databases like Oracle, Microsoft SQL Server, and MySQL. Moreover, SQL is an important tool for conducting data preparation and data wrangling.
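A small self-contained example of that kind of SQL-based preparation, using Python’s built-in sqlite3 with a toy table (the sales data is hypothetical):

```python
import sqlite3

# In-memory database with a toy sales table (hypothetical data)
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (region TEXT, amount REAL)")
con.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100.0), ("east", 50.0), ("west", 75.0)],
)

# Typical preparation/wrangling query: aggregate, filter, and order
rows = con.execute(
    "SELECT region, SUM(amount) AS total FROM sales "
    "GROUP BY region HAVING total > 60 ORDER BY region"
).fetchall()
print(rows)  # [('east', 150.0), ('west', 75.0)]
```

The same GROUP BY / HAVING pattern carries over directly to Oracle, SQL Server, and MySQL, which is why SQL fluency transfers well across engines.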
You need to make that model available to the end users, monitor it, and retrain it for better performance if needed. This collaboration of ML and operations teams is what you call MLOps and focuses on streamlining the process of deploying the ML models to production, along with maintaining and monitoring them.
Understanding up front which preprocessing techniques and algorithm types provide best results reduces the time to develop, train, and deploy the right model. It plays a crucial role in every model’s development process and allows data scientists to focus on the most promising ML techniques.
Model Evaluation and Tuning After building a Machine Learning model, it is crucial to evaluate its performance to ensure it generalises well to new, unseen data. Model evaluation and tuning involve several techniques to assess and optimise model accuracy and reliability.
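A minimal sketch of evaluation plus tuning in that spirit, assuming scikit-learn: a grid search over tree depth, with 5-fold cross-validation guarding against overfitting to any one split.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Tune max_depth with 5-fold cross-validation; each candidate is scored
# on held-out folds, so the chosen depth reflects generalisation, not fit
grid = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": [2, 3, 4]},
    cv=5,
)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```

After tuning, the final check should still be a score on data the grid search never saw, since the cross-validated score itself is mildly optimistic.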
However, as your model development process becomes more complex and involves numerous experiments and iterations, keeping track of your progress, managing experiments, and collaborating effectively with team members becomes increasingly challenging. First, ML models are becoming increasingly complex and require a lot of data to train.
It requires significant effort in terms of data preparation, exploration, processing, and experimentation, which involves trying out algorithms and hyperparameters. This is because those algorithms have proven great results on a benchmark dataset, whereas your business problem, and hence your data, is different.
The platform typically includes components for the ML ecosystem like data management, feature stores, experiment trackers, a model registry, a testing environment, model serving, and model management. They include: 1 Data (or input) pipeline. 2 Model (or training) pipeline.
The current team is very high functioning (MD + data scientist combos, former ASF board member, Google and Amazon engineers, Stanford LLM researchers, etc.) Experience integrating AI/ML models into production systems (LLMs, transformers, fine-tuning, etc.). Strong system design, data modeling, and architectural thinking.
Data scientists can analyze detailed results with SageMaker Clarify visualizations in Notebooks, SageMaker Model Cards, and PDF reports. The following figure shows the end-to-end LLMOps lifecycle. In LLMOps, the main differences from MLOps are that model selection and model evaluation involve different processes and metrics.
This includes responsible AI, Gartner’s concept of AI TRiSM (AI Trust, Risk and Security Management) and Sovereign AI. AI engineering - AI is being democratized for developers and engineers, expanding beyond the limited pool of data scientists. AI Agents and multi-agent systems.