10 Can’t-Miss Sessions on Language Models Coming to ODSC West 2023

Oct 4, 2023

LLMs and Generative AI have dominated both the industry and everyday life this year. At ODSC West this October 30th to November 2nd, our goal is to prepare you for the new technologies, applications, and necessary skills ushered in by this change. At the conference, you’ll find hands-on training sessions, workshops, and talks on LLMs, Generative AI, and prompt engineering. Find a selection of our confirmed sessions below.

Personalizing LLMs with a Feature Store

Jim Dowling | CEO | Hopsworks

This session will show you how to personalize LLMs using a feature store and prompt engineering. You will walk through building a free, serverless, personalized example LLM application using Hopsworks, an open-source feature store with a built-in vector database. Along the way, you will look at how to build prompt templates, how to fill in those templates with real-time context data, and how to incorporate documents from vector databases into prompts using a combination of user input and historical user data from the feature store.

Evaluation Techniques for Large Language Models

Rajiv Shah, PhD | Machine Learning Engineer | Hugging Face

Selecting the right LLM for your needs has become increasingly complex. During this tutorial, you’ll learn about the practical tools and best practices for evaluating and choosing LLMs.

You will explore the existing research on the capabilities of LLMs versus small traditional ML models, as well as several techniques, including evaluation suites like the EleutherAI Harness, head-to-head competition approaches, and using LLMs for evaluating other LLMs. Finally, you will touch on subtle factors that affect evaluation, including the role of prompts, tokenization, requirements for factual accuracy, and model bias and ethics.

Building an Expert Question/Answer Bot with Open Source Tools and LLMs

Chris Hoge | Head of Community | Heartex

In addition to an “understanding” of the world, LLMs inherit biases that are hard to understand or control. This issue must be addressed as they are incorporated into real-world applications. This session will explore how Label Studio, LangChain, Chroma, and Gradio can be employed as tools for continuous improvement, specifically in building a Question-Answering (QA) system.

Understanding the Landscape of Large Models

Lukas Biewald | CEO and Co-founder | Weights & Biases

Join this session to explore the current landscape of large models from GPT-3 to Stable Diffusion. You’ll also discuss how the teams behind some of the open-source projects are using W&B to accelerate their work.

Democratizing Fine-tuning of Open-Source Large Models with Joint Systems Optimization

Kabir Nagrecha | PhD Student | UC San Diego

This session will provide an overview of the core ideas behind Saturn, how it works on a technical level to reduce runtimes and costs, and the process of using Saturn for large-model fine-tuning. You’ll explore how Saturn can accelerate and optimize large-model workloads in just a few lines of code, and hear about some high-value real-world use cases from industry and academia.

Building LLM-powered Knowledge Workers over Your Data with LlamaIndex

Jerry Liu | Co-founder and CEO | LlamaIndex

LLMs offer new ways for you to search for, interact with, and generate content. In this workshop, you’ll cover how LlamaIndex enables you to build LLM-powered search and retrieval systems, as well as more automated knowledge workers capable of interfacing with your data sources in more sophisticated ways. You’ll see how to build both a simple QA bot and an automated workflow agent.

General and Efficient Self-supervised Learning with data2vec

Michael Auli | Principal Research Scientist at FAIR | Director at Meta AI

This session will explore data2vec, a framework for general self-supervised learning that uses the same learning method for speech, NLP, and computer vision. Instead of predicting modality-specific targets such as words, visual tokens, or units of human speech that are local in nature, data2vec predicts contextualized latent representations that contain information from the entire input. Experiments on the major benchmarks of speech recognition, image classification, and natural language understanding demonstrate new state-of-the-art results or performance competitive with predominant approaches.

Towards Explainable and Language-Agnostic LLMs

Walid S. Saba | Senior Research Scientist | Institute for Experiential AI at Northeastern University

To address the challenges of true language understanding and lack of explainability, this session will explore combining the strength of symbolic representations with a successful bottom-up reverse engineering of language at scale. In other words, it argues for a bottom-up reverse engineering of language in a symbolic setting. Several authors have hinted at what this project amounts to, and the session will discuss in some detail how it could be accomplished.

Beyond Demos and Prototypes: How to Build Production-Ready Applications Using Open-Source LLMs

Suhas Pai | Chief Technology Officer | Bedrock AI

This workshop will explore the landscape of open-source LLMs and provide a playbook on how to effectively utilize them to build production-ready applications. You will learn how to choose an LLM that best fits your task, explore several fine-tuning techniques that enable you to adapt the LLM to your domain of interest, and discuss techniques to deal with reasoning limitations, hallucinations, bias, and fairness issues.

Large Language Models — Common Pitfalls & Challenges

Nils Reimers | Director of Machine Learning | Cohere.ai

This session will introduce how Large Language Models (LLMs) can be connected to your data via semantic search, with a focus on the many pitfalls and challenges. You’ll explore how some of them can be solved with the right technologies, and discuss others that remain open problems.

Sign me up!

To attend these and many more expert-led sessions on LLMs, Generative AI, Machine Learning, NLP, Deep Learning, Data Engineering, and more, join us at ODSC West in just a few weeks. Register now to take advantage of our 40% off sale, which ends Friday.

Originally posted on OpenDataScience.com


Written by ODSC - Open Data Science