Get Pumped For ODSC West 2023 With Highlights from Last Year!

Aug 11, 2023

There’s a lot to consider when learning about data science, from which topics are relevant to which format suits you best. To help you build AI better, we’ve created a list of the top ten virtual talks from last year’s ODSC West so you can explore a variety of topics and hear from speakers of different backgrounds. All you need to do is sign up for a free Ai+ Training account and then you’ll have access to the sessions! Watch all sessions here.

DS/AI for Incident Response & Threat Hunting with CHRYSALIS & DAISY

Jess Garcia | CEO, Security & Forensics Analyst, Incident Responder | One eSecurity, Senior Instructor | SANS Institute

There is a lot of talk about the use of AI in cybersecurity these days. Lots of cybersecurity vendors claim that their products use AI for detecting and stopping threats, but very little information is available on how they do it. In this talk, you’ll learn how to transparently use AI in Incident Response and Threat Hunting with the help of the DS4N6 toolset (DAISY VM & CHRYSALIS) and learn about the most useful ML algorithms for this purpose.

Denoising Diffusion-based Generative Modeling

Stefano Ermon, PhD | Assistant Professor | Stanford University

Diffusion-based generative models, such as DALL·E 2, have achieved exceptional image generation quality. Unlike other generative models based on explicit representations of probability distributions (e.g., autoregressive) or implicit sampling procedures (e.g., GANs), diffusion models directly learn the vector field of gradients of the log data density (the scores). This framework allows flexible architectures, and requires no sampling during training or the use of adversarial training methods. These score-based generative models enable exact likelihood evaluation, achieve state-of-the-art sample quality, and can be used to improve performance in a variety of inverse problems, including medical imaging.
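To make the idea of a score concrete, here is a minimal sketch of sampling by following a score function with added noise (unadjusted Langevin dynamics). It is not the method from the talk: in a real diffusion model the score is estimated by a neural network across many noise levels, whereas this toy assumes a one-dimensional Gaussian whose score is known in closed form. The names `gaussian_score` and `langevin_sample`, the target mean and standard deviation, and the step settings are all illustrative assumptions.

```python
import math
import random


def gaussian_score(x, mu=3.0, sigma=1.0):
    """Score (gradient of the log-density) of N(mu, sigma^2) at x.

    Assumption: a closed-form stand-in for a learned score network.
    """
    return -(x - mu) / sigma**2


def langevin_sample(score, steps=300, step_size=0.05, rng=random):
    """Draw one approximate sample by repeatedly stepping along the score
    and adding Gaussian noise (unadjusted Langevin dynamics)."""
    x = rng.gauss(0.0, 1.0)  # start from pure noise
    for _ in range(steps):
        noise = rng.gauss(0.0, 1.0)
        x = x + 0.5 * step_size * score(x) + math.sqrt(step_size) * noise
    return x


rng = random.Random(0)
samples = [langevin_sample(gaussian_score, rng=rng) for _ in range(1000)]
mean = sum(samples) / len(samples)
std = math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))
print(f"sample mean ~ {mean:.2f}, sample std ~ {std:.2f}")
```

Even though the sampler never evaluates the density itself, the samples should concentrate around the target mean and spread, which is the intuition behind generating data from scores alone.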

Orchestrating Data Assets instead of Tasks, with Dagster

Sandy Ryza | Lead Engineer — Dagster Project | Elementl

Asset-based orchestration works well with modern data stack tools like dbt, Meltano, Airbyte, and Fivetran, because those tools already think in terms of assets. Attendees of this session will learn how to build and maintain data pipelines in a way that makes their datasets and ML models dramatically easier to trust and evolve.

An Intuition-Based Approach to Reinforcement Learning

Oswald Campesato | Founder | iQuarkt

Reinforcement learning (RL) has achieved remarkable success in various tasks, such as defeating all-human teams in massively multiplayer (MMP) games, advancing robotics, and producing astonishing results on the protein folding problem in chemistry. Expertise in RL requires strong knowledge of machine learning, statistics, and several areas of mathematics. Moreover, RL contains many concepts that seem “fuzzy” and can therefore be challenging for beginners. This session provides the intuition behind various RL concepts, such as the exploit/explore trade-off and maximization of expected reward, along with real-life examples of these concepts.
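As a rough illustration of the exploit/explore trade-off and of maximizing expected reward, here is a minimal epsilon-greedy sketch on a multi-armed bandit. It is not taken from the session; the function name `epsilon_greedy_bandit`, the arm means, the unit-variance Gaussian rewards, and the value of `epsilon` are all assumptions chosen for the example.

```python
import random


def epsilon_greedy_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Run epsilon-greedy on a bandit with Gaussian rewards (std 1.0).

    Returns (estimated mean reward per arm, number of pulls per arm).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # running average reward of each arm
    counts = [0] * n_arms       # how often each arm was pulled
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random arm, even one that currently looks bad.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest estimated reward.
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental update of the running mean for the chosen arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts


# Hypothetical arms; the last one pays the most on average.
estimates, counts = epsilon_greedy_bandit([0.0, 0.5, 2.0])
best_arm = max(range(len(counts)), key=lambda a: counts[a])
print("pulls per arm:", counts, "most-pulled arm:", best_arm)
```

With `epsilon=0` the agent can lock onto whichever arm happened to pay first; a small amount of exploration lets it discover the arm with the highest expected reward and then spend most of its pulls there.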

Riding the Tailwind of NLP Explosion

Rongyao Huang | Lead Data Scientist | CB Insights

In this talk, we’ll share how we modernized our NLP stack at CBI R&D and the challenges we met along the way. Part I will walk you through the timeline and milestones of NLP evolution, highlighting significant trends after the “attention” revolution. Part II will discuss battle-ready lessons gained using transformer models across various tasks and languages, leveraging open-source libraries such as Hugging Face Transformers and PyTorch Lightning.

A Primer on Causal AI

Robert Osazuwa Ness, PhD | Senior Researcher | Microsoft

Causal inference is increasingly an indispensable tool of data science, machine learning, and data-driven decision-making. In this talk, Robert will present the state-of-play in causal machine learning and will cover the problems that matter in practice, with emphasis on the tech and retail industries. He will also talk about trends in open-source tools for causal inference. Finally, he’ll show examples from DoWhy and its sister package EconML, which together form the PyTorch of causal inference.

Emerging Approaches to AI Governance: Tech-Led vs Policy-Led

Ilana Golbin | Director | PwC Emerging Technologies and Responsible AI Lead

Over the past few years, many have become more familiar with the potential risks posed by the improper deployment and usage of AI/ML systems. Companies of almost all sizes and across almost all sectors have seen examples of major AI failures, leading to significant decay in trust in these systems. As a result, stakeholders across organizations have taken an interest in remediating these risks and getting a handle on AI, that is, in owning AI governance. Some are drawn to technical capabilities which promise solutions to ethical problems and enable quality. Others rely on existing compliance and policy methods to enforce standards. In this session, we will describe what these different approaches look like, the pros and cons of each, and considerations for building a robust framework around AI governance that engages technical, business, and compliance teams.

Cloud Directions, MLOps, and Production Data Science

Joe Hellerstein, PhD | Jim Gray Professor of Computer Science | University of California, Berkeley

Recent trends in cloud technology, including serverless computing, promise new approaches for abstracting away infrastructure. Unfortunately, these offerings fall short of the challenge of MLOps. In this talk, Joe will cover some of the important promises and weaknesses of current cloud offerings, and describe research from Berkeley’s RISELab and the resulting open-source Aqueduct system, which are putting Production Data Science at the fingertips of anyone working with data and models.

Robust and Equitable Uncertainty Estimation

Aaron Roth, PhD | Professor of Computer and Cognitive Science | University of Pennsylvania

In this lecture, we will describe a new technique that addresses some common problems: a way to produce prediction sets for arbitrary black-box prediction methods. These sets have correct empirical coverage even when the data distribution changes in arbitrary, unanticipated ways, and the coverage stays correct even when we zoom in on demographic groups that can be arbitrary and intersecting.
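For readers new to prediction sets, here is a minimal sketch of standard split conformal prediction, a simpler baseline than the technique described in the lecture. It assumes the calibration and test data are exchangeable, so it does not handle the distribution shift or group-wise coverage the talk addresses. The `predict` function, the synthetic data, and all names are hypothetical.

```python
import math
import random


def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # 1-indexed rank
    if rank > n:
        return float("inf")  # too few calibration points for this alpha
    return sorted(scores)[rank - 1]


def predict(x):
    """Hypothetical black-box model; any trained predictor could go here."""
    return 2.0 * x


rng = random.Random(0)


def sample(n):
    """Synthetic data: y = 2x + unit Gaussian noise (an assumption)."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(x, 2.0 * x + rng.gauss(0, 1)) for x in xs]


calibration = sample(1000)
test_points = sample(2000)
alpha = 0.1  # target miscoverage rate, i.e. aim for about 90% coverage

# Nonconformity score: absolute residual of the black-box prediction.
scores = [abs(y - predict(x)) for x, y in calibration]
q = conformal_quantile(scores, alpha)

# The prediction set for x is the interval [predict(x) - q, predict(x) + q].
covered = sum(abs(y - predict(x)) <= q for x, y in test_points)
coverage = covered / len(test_points)
print(f"interval half-width: {q:.2f}, empirical coverage: {coverage:.3f}")
```

The only thing required of the model is the ability to call it, which is why the approach works for black-box predictors; the coverage guarantee comes entirely from the held-out calibration residuals.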

Check out the sessions now

The above sessions cover a bit of everything under the data science umbrella to help you build AI better, and were all highly rated by attendees of the ODSC West virtual conference. You can check all of them out for free. If you’re itching for a conference experience in real-time, then you can get an in-person or virtual pass for ODSC West this October 30th-November 2nd while tickets are 60% off.

Originally posted on OpenDataScience.com

Read more data science articles on OpenDataScience.com, including tutorials and guides from beginner to advanced levels! Subscribe to our weekly newsletter here and receive the latest news every Thursday. You can also get data science training on-demand wherever you are with our Ai+ Training platform. Subscribe to our fast-growing Medium Publication too, the ODSC Journal, and inquire about becoming a writer.

Written by ODSC - Open Data Science