class: impact
Presentation based on 1.md
Generated Presentation
??? Welcome everyone. This presentation, automatically generated from the course material titled '1.md', will walk you through the key topics discussed in the document. Let's begin.
Introduction
??? Welcome to the first chapter of our course. In this introductory section, we'll set the stage for what you can expect to learn and explore throughout this journey.
Welcome to the 🤗 Course!
This course will teach you about large language models (LLMs) and natural language processing (NLP) using libraries from the Hugging Face ecosystem (🤗 Transformers, 🤗 Datasets, 🤗 Tokenizers, and 🤗 Accelerate) as well as the Hugging Face Hub. It's completely free and without ads.
??? Welcome to the Hugging Face course! This course is designed to teach you about large language models and natural language processing using the Hugging Face ecosystem. We'll be exploring libraries like Transformers, Datasets, Tokenizers, and Accelerate, as well as the Hugging Face Hub. The best part? This course is completely free and ad-free.
Understanding NLP and LLMs
While this course was originally focused on NLP (Natural Language Processing), it has evolved to emphasize Large Language Models (LLMs), which represent the latest advancement in the field.
What's the difference?
- NLP (Natural Language Processing) is the broader field focused on enabling computers to understand, interpret, and generate human language. NLP encompasses many techniques and tasks such as sentiment analysis, named entity recognition, and machine translation.
- LLMs (Large Language Models) are a powerful subset of NLP models characterized by their massive size, extensive training data, and ability to perform a wide range of language tasks with minimal task-specific training. Models like the Llama, GPT, or Claude series are examples of LLMs that have revolutionized what's possible in NLP.
Throughout this course, you'll learn about both traditional NLP concepts and cutting-edge LLM techniques, as understanding the foundations of NLP is crucial for working effectively with LLMs.
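To make this distinction concrete, here is a minimal sketch using the 🤗 Transformers `pipeline()` function. The checkpoints are assumptions (the pipeline defaults and the small GPT-2 model standing in for a larger LLM), not necessarily the models used later in this course.

```python
from transformers import pipeline

# Classic NLP: compact, task-specific models, one per task
sentiment = pipeline("sentiment-analysis")            # pipeline's default sentiment checkpoint
print(sentiment("I love this course!"))

ner = pipeline("ner", aggregation_strategy="simple")  # pipeline's default NER checkpoint
print(ner("Hugging Face was founded in New York."))

# LLM-style usage: one general-purpose generative model, steered by the prompt alone
generator = pipeline("text-generation", model="gpt2") # small model standing in for a true LLM
print(generator("Sentiment analysis is the task of", max_new_tokens=20))
```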
??? Let's start by understanding the difference between NLP and LLMs. NLP, or Natural Language Processing, is a broad field that focuses on enabling computers to understand, interpret, and generate human language. It encompasses various techniques and tasks like sentiment analysis and machine translation. On the other hand, LLMs, or Large Language Models, are a subset of NLP models known for their massive size and ability to perform a wide range of language tasks with minimal task-specific training. Throughout this course, we'll explore both traditional NLP concepts and cutting-edge LLM techniques.
What to expect?
- Chapters 1 to 4 provide an introduction to the main concepts of the 🤗 Transformers library. By the end of this part of the course, you will be familiar with how Transformer models work and will know how to use a model from the Hugging Face Hub, fine-tune it on a dataset, and share your results on the Hub! A short sketch of this workflow follows this list.
- Chapters 5 to 8 teach the basics of 🤗 Datasets and 🤗 Tokenizers before diving into classic NLP tasks and LLM techniques. By the end of this part, you will be able to tackle the most common language processing challenges by yourself.
- Chapter 9 goes beyond NLP to cover how to build and share demos of your models on the 🤗 Hub. By the end of this part, you will be ready to showcase your 🤗 Transformers application to the world!
- Chapters 10 to 12 dive into advanced LLM topics like fine-tuning, curating high-quality datasets, and building reasoning models.
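As a preview of that Chapters 1 to 4 workflow, here is a minimal sketch. The checkpoint (`bert-base-uncased`), dataset (GLUE MRPC), and default hyperparameters are illustrative assumptions, not necessarily the exact examples the course uses.

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Load a pretrained model and its tokenizer from the Hugging Face Hub
checkpoint = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Load and tokenize a dataset from the Hub (paraphrase detection, as an example)
raw = load_dataset("glue", "mrpc")
tokenized = raw.map(
    lambda batch: tokenizer(batch["sentence1"], batch["sentence2"], truncation=True),
    batched=True,
)

# Fine-tune; add push_to_hub=True to TrainingArguments (after logging in) to share results
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="test-trainer"),
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    tokenizer=tokenizer,
)
trainer.train()
```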
This course:
- Requires a good knowledge of Python
- Is better taken after an introductory deep learning course
- Does not expect prior PyTorch or TensorFlow knowledge, though some familiarity will help
??? Now, let's talk about what you can expect from this course. We'll start with an introduction to the Transformers library, followed by exploring Datasets and Tokenizers. We'll then dive into classic NLP tasks and LLM techniques. By the end of this course, you'll be equipped to tackle common language processing challenges and even build and share your own models. This course assumes a good knowledge of Python and is best taken after an introductory deep learning course.
Who are we?
About the authors:
- Abubakar Abid: Completed his PhD at Stanford in applied machine learning. Founded Gradio, acquired by Hugging Face.
- Ben Burtenshaw: Machine Learning Engineer at Hugging Face, PhD in NLP from the University of Antwerp.
- Matthew Carrigan: Machine Learning Engineer at Hugging Face, previously at Parse.ly and Trinity College Dublin.
- Lysandre Debut: Machine Learning Engineer at Hugging Face, core maintainer of the π€ Transformers library.
- Sylvain Gugger: Research Engineer at Hugging Face, co-author of Deep Learning for Coders with fastai and PyTorch.
- Dawood Khan: Machine Learning Engineer at Hugging Face, co-founder of Gradio.
- Merve Noyan: Developer Advocate at Hugging Face, focused on democratizing machine learning.
- Lucile Saulnier: Machine Learning Engineer at Hugging Face, involved in NLP research projects.
- Lewis Tunstall: Machine Learning Engineer at Hugging Face, co-author of Natural Language Processing with Transformers.
- Leandro von Werra: Machine Learning Engineer at Hugging Face, co-author of Natural Language Processing with Transformers.
??? Let me introduce you to the authors of this course. We have a diverse team of experts, including Abubakar Abid, who founded Gradio; Ben Burtenshaw, with a PhD in NLP; and many others who bring a wealth of knowledge and experience to this course.
FAQ
- Does taking this course lead to a certification? Currently, no certification is available, but a program is in development.
- How much time should I spend on this course? Each chapter is designed for one week of study, with 6-8 hours of work per week.
- Where can I ask a question? Click the "Ask a question" banner to be redirected to the Hugging Face forums.
- Where can I get the code for the course? Click the banner to run code in Google Colab or Amazon SageMaker Studio Lab.
- How can I contribute to the course? Open an issue on the course repo or help translate the course.
- Can I reuse this course? Yes, under the Apache 2 license.
??? Before we proceed, let's address some frequently asked questions. We're working on a certification program, but it's not available yet. Each chapter is designed for one week of study, with 6-8 hours of work per week. If you have questions, you can ask them on the Hugging Face forums. The code for the course is available on GitHub, and you can contribute by opening issues or helping with translations. Finally, this course is released under the Apache 2 license, so feel free to reuse it.
Let's Go
In this chapter, you will learn:
- How to use the `pipeline()` function to solve NLP tasks such as text generation and classification
- About the Transformer architecture
- How to distinguish between encoder, decoder, and encoder-decoder architectures and use cases
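As a quick preview of the `pipeline()` function, here is a minimal sketch with one illustrative checkpoint per architecture family; the model choices are assumptions, not necessarily the examples this chapter uses.

```python
from transformers import pipeline

# Encoder-only (e.g. BERT): understanding tasks such as classification and fill-mask
fill_mask = pipeline("fill-mask", model="bert-base-uncased")
print(fill_mask("Transformer models are [MASK] at many NLP tasks."))

# Decoder-only (e.g. GPT-2): generative tasks such as text generation
generator = pipeline("text-generation", model="gpt2")
print(generator("In this course you will learn", max_new_tokens=15))

# Encoder-decoder (e.g. T5): sequence-to-sequence tasks such as translation
translator = pipeline("translation_en_to_fr", model="t5-small")
print(translator("This course is completely free."))
```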
???
Now that we've covered the basics, let's dive into what you'll learn in this chapter. We'll explore the pipeline()
function for solving NLP tasks, understand the Transformer architecture, and learn to distinguish between different model architectures and their use cases.
class: center, middle
Thank You!
??? That concludes the material covered in this presentation, generated from the provided course document. Thank you for your time and attention. Are there any questions?