Do you know how well your model is doing? Evaluate your LLMs

Cheuk Ting Ho

Natural Language Processing & Audio (incl. Generative AI NLP)
Python Skill: Advanced
Domain Expertise: Advanced

We will begin with a short refresher on the Hugging Face Transformers library, covering basic LLM inference and fine-tuning. The core of the workshop will introduce Lighteval, an efficient and powerful LLM evaluation framework, and provide in-depth practice with it. Participants will learn how to use Lighteval to compare various LLMs available on the Hugging Face Hub using a range of pre-built tasks and metrics.

Finally, we will delve into advanced evaluation techniques, focusing on creating custom tasks and metrics tailored to unique, real-world application requirements. Participants will learn how to prepare custom datasets on the Hugging Face Hub and integrate them into Lighteval for precise, domain-specific evaluation. By the end of this workshop, you will possess the practical skills to rigorously evaluate, benchmark, and fine-tune your LLMs with confidence.
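To make the idea of a custom metric concrete, here is a minimal sketch in plain Python. It deliberately does not use Lighteval's API: the names `normalize`, `exact_match`, `evaluate`, `toy_model`, and the `SAMPLES` data are all hypothetical and chosen for illustration only. It shows the general shape of a domain-specific evaluation: a small dataset of question/answer pairs, a scoring function, and an aggregate score.

```python
from typing import Callable, Dict, List


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def exact_match(prediction: str, reference: str) -> float:
    """Sample-level metric: 1.0 if the normalized strings are identical, else 0.0."""
    return 1.0 if normalize(prediction) == normalize(reference) else 0.0


def evaluate(model: Callable[[str], str], samples: List[Dict[str, str]]) -> float:
    """Corpus-level score: mean exact match over all samples (0.0 for an empty set)."""
    scores = [exact_match(model(s["question"]), s["answer"]) for s in samples]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical samples standing in for a custom dataset.
SAMPLES = [
    {"question": "What is the capital of France?", "answer": "Paris"},
    {"question": "What is 2 + 2?", "answer": "4"},
]


def toy_model(prompt: str) -> str:
    """Stand-in for an LLM call: a fixed lookup, so the sketch runs offline."""
    answers = {
        "What is the capital of France?": "  paris ",
        "What is 2 + 2?": "5",
    }
    return answers.get(prompt, "")


score = evaluate(toy_model, SAMPLES)
print(f"exact match: {score:.2f}")
```

In a real setup, `toy_model` would be replaced by a call to an actual LLM and the samples would be loaded from a dataset on the Hugging Face Hub; an evaluation framework such as Lighteval takes care of that plumbing.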

Prerequisites:

- Experience coding in Python (with Python installed on your local machine)
- Basic understanding of machine learning and LLMs
- Experience with Hugging Face Transformers preferred but not necessary
- A Hugging Face Hub account (sign up for free)
- A modern computer that can fine-tune small LLMs locally

Cheuk Ting Ho

After a career as a Data Scientist and Developer Advocate, Cheuk dedicated her work to the open-source community. She currently works as a developer advocate for JetBrains. She co-founded Humble Data, a Python workshop for beginners that has been held around the world. Cheuk also started and hosts a Python podcast, PyPodCats, which highlights the achievements of underrepresented members of the community. She served on the EuroPython Society board for two years and is now a fellow and director of the Python Software Foundation.