Eugene Yan

How to Unit Test Machine Learning Code & Models

Eugene Yan

How it differs from unit testing typical software and some guidelines

How to Generate Synthetic Data for Pretraining and Finetuning

Eugene Yan

Distillation vs. self-improvement across the three stages of language model training.

296

Language Modeling Reading List (to Start Your Paper Club)

Eugene Yan

Some fundamental papers and a one-sentence summary for each; start your own paper club!

277

2023 in Review

Eugene Yan

An expanded charter, lots of writing and speaking, and finally learning to snowboard.

174

Push Notifications - What to Push, What Not to Push, and How Often

Eugene Yan

Sending helpful & engaging pushes, filtering annoying pushes, and finding the frequency sweet spot.

318

Finetuning on Out-of-Domain Data to Detect Factual Inconsistency

Eugene Yan

Or how we can bootstrap on open-source, permissive-use data and collect fewer labeled samples.

229

Reflections on AI Engineer Summit 2023

Eugene Yan

The biggest deployment challenges, backward compatibility, multi-modality, and SF work ethic.

AI 315