Datasets for Fine-Tuning Large Language Models, Prompt Engineering Use Cases, and How to Ace the Data Science Interview

ODSC - Open Data Science
Published in ODSCJournal
5 min read · Feb 1, 2024

10 Datasets for Fine-Tuning Large Language Models

In this blog post, we will explore ten valuable datasets that can assist you in fine-tuning or training your LLM.

Chi-Square Goodness of Fit Test

In this article, we will dive into the math behind the chi-square goodness of fit test and walk through an example problem.

Exploring AI Innovation in Singapore and Beyond with Laurence Liew

In this podcast, join Laurence Liew, Director of AI Innovation and an AI trailblazer, on a journey into the heart of AI innovation in Singapore and beyond.

Meta Introduces ‘Prompt Engineering with Llama 2’

Meta AI introduced “Prompt Engineering with Llama 2,” an interactive guide designed specifically for the Llama community.

Industry, Opinion, Career Advice

6 Prompt Engineering Use Cases That Employers are Looking For

What do prompt engineering jobs actually want you to do? Here are some common prompt engineering use cases that employers are looking for.

These Companies are Changing Biotech & Biopharma with AI in 2024

Let’s take a look at a few companies that are using AI in biotech and biopharma so you can take inspiration for your organization.

How to Ace the Data Science Interview in 2024

In this interview, Nick Singh discusses three major topics: how to stand out in a competitive field, how to ace the data science interview, and how to follow up after an interview.

Boston Code and Coffee @ Northeastern

Saturday, February 10, 12:00 PM EST

Code and Coffee is an inclusive, informal co-working and networking session. People of all skill levels attend, and we love it that way. Many people (optionally) bring projects to work on, and many other people (optionally) socialize the entire time. It’s entirely up to you!

Buy any ODSC East in-person or virtual pass and get access to our popular LLMs and Prompt Engineering certification series for free!

Register by Friday to get this deal!

Data Science & AI News

ODSC’s AI Weekly Recap: Week of January 26th

This week’s AI Weekly Recap is all about an author admitting to using AI to help write a book, and Meta’s quest for general intelligence.

Higher Education Institutions Look to Add AI Into Lessons

At multiple universities and colleges, AI looks set to be welcomed with open arms after an initial period of suspension.

Prestigious Japanese Literary Award Winner Rie Kudan Admits to Using AI

Author Rie Kudan, after winning one of Japan’s most prestigious literary awards, has admitted to using ChatGPT to aid her in her work.

ODSC Highlights

Hands-On Training Sessions Coming to ODSC East

Data science training can be the difference maker for those looking to make moves in their career. Here are a few standout sessions coming to ODSC East this April.

ODSC East Call for Volunteers

Become a valued part of the ODSC Community and connect with an incredibly motivated group of Data Science enthusiasts!

Data Engineering Summit Call for Speakers

At our second annual Data Engineering Summit, Ai+ and ODSC are partnering to bring together the leading experts in data engineering and thousands of practitioners to explore different strategies for making data actionable. Whether you are a data engineer or someone who does data engineering on the side, you’ll find new tools and techniques you can apply to your work immediately at this data engineering conference.

Weekly Recap Newsletter

Want to get a weekly digest of AI news from around the world every Friday? Sign up for our new newsletter here!

New Podcast Episode: Troubleshooting Large Language Models with Amber Roberts

Join Amber Roberts and Sheamus McGovern for a wide-ranging discussion on the challenges and possible solutions for troubleshooting unstructured data, in their recent interview, Troubleshooting Large Language Models.

Spotify | SoundCloud | Apple

Video of the Week: PyTorch 2.1 — New Developments with Supriya Rao

Supriya Rao, an Engineering Manager at Meta, unveils PyTorch 2.1’s new features in compile, distributed, inference, export, and edge technologies, along with optimization techniques like quantization and pruning.

Upcoming Webinars:

Interview “The Modern Data Science Development Toolkit”

Fri, Feb 2, 2024 12:00 PM EST

Join entrepreneur and machine learning expert Greg Michaelson and Sheamus McGovern on February 2nd for a lively discussion on the modern data science development toolkit. Discover the tools, skills, and techniques that will help you improve your end-to-end development process.

Building predictive marketing lists using feature engineering

Tue, Feb 6, 2024 12:00 PM — 1:00 PM EST

Join dotData’s Demand Generation Manager, Brandon Bednar, and dotData Data Scientist Sharada Narayanan as they walk you through a hands-on example of how to use your in-house first-party data to build predictive audiences for targeted advertising on social channels like LinkedIn.

AI Panel “Multimodal AI: Business and Technical Insights”

Weds, Feb 7, 2024 6:00 PM — 8:00 PM EST

AI will soon expand from tables and text to images, video, and speech. From robotics to financial services to the future of learning, AI is transforming the way industries innovate and deliver business value. How will the fusion of different modalities unlock new AI use cases and encourage new startup development? What are the technical obstacles to new multimodal AI use cases (inference costs, model performance, training data, etc.)? Join our panel for a conversation diving into what’s right around the corner in AI.

Space-Time Hotspot Analysis of a Human Mobility Index using CARTO

Tue, Feb 13, 2024 12:00 PM — 1:00 PM EST

In this webinar, we will analyze a large-scale human mobility dataset across the entire UK to provide actionable insights into the dynamics of human mobility. Using CARTO, we will identify high-intensity hotspots emerging in urban centers and key transportation hubs, reflecting areas of heightened activity. Conversely, cold spots will reveal regions with unusual mobility fluctuations, necessitating further investigation.

Inference Benchmarking of Prominent Open-Source LLMs

Tue, Feb 27, 2024 12:00 PM — 1:00 PM EST

In this upcoming webinar, we delve into the inference benchmarking of prominent open-source Large Language Models such as the 13B and 70B Llama-2. We used a diverse range of compute shapes available in Oracle Cloud Infrastructure (OCI), including Intel, AMD, and ARM CPUs and NVIDIA GPUs.

ODSC - Open Data Science

Our passion is bringing thousands of the best and brightest data scientists together under one roof for an incredible learning and networking experience.