Merge pull request #797 from huggingface/bump_release
Bump_release
burtenshaw authored Feb 20, 2025
2 parents fc92c8a + 3a49d7e commit 5adc473
Showing 4 changed files with 39 additions and 14 deletions.
6 changes: 3 additions & 3 deletions chapters/en/chapter11/1.mdx
@@ -8,7 +8,7 @@ Chat templates structure interactions between users and AI models, ensuring consistent

## 2️⃣ Supervised Fine-Tuning

-Supervised Fine-Tuning (SFT) is a critical process for adapting pre-trained language models to specific tasks. It involves training the model on a task-specific dataset with labeled examples. For a detailed guide on SFT, including key steps and best practices, see [The supervised fine-tuning section of the TRL documentation](https://huggingface.co/docs/trl/en/sft_trainer).
+Supervised Fine-Tuning (SFT) is a critical process for adapting pre-trained language models to specific tasks. It involves training the model on a task-specific dataset with labeled examples. For a detailed guide on SFT, including key steps and best practices, see [the supervised fine-tuning section of the TRL documentation](https://huggingface.co/docs/trl/en/sft_trainer).

## 3️⃣ Low Rank Adaptation (LoRA)

@@ -25,9 +25,9 @@ Evaluation is a crucial step in the fine-tuning process. It allows us to measure
## References

- [Transformers documentation on chat templates](https://huggingface.co/docs/transformers/main/en/chat_templating)
-- [Script for Supervised Fine-Tuning in TRL](https://github.com/huggingface/trl/blob/main/examples/scripts/sft.py)
+- [Script for Supervised Fine-Tuning in TRL](https://github.com/huggingface/trl/blob/main/trl/scripts/sft.py)
- [`SFTTrainer` in TRL](https://huggingface.co/docs/trl/main/en/sft_trainer)
- [Direct Preference Optimization Paper](https://arxiv.org/abs/2305.18290)
-- [Supervised Fine-Tuning with TRL](https://huggingface.co/docs/trl/main/en/tutorials/supervised_finetuning)
+- [Supervised Fine-Tuning with TRL](https://huggingface.co/docs/trl/sft_trainer)
- [How to fine-tune Google Gemma with ChatML and Hugging Face TRL](https://github.com/huggingface/alignment-handbook)
- [Fine-tuning LLM to Generate Persian Product Catalogs in JSON Format](https://huggingface.co/learn/cookbook/en/fine_tuning_llm_to_generate_persian_product_catalogs_in_json_format)
2 changes: 1 addition & 1 deletion chapters/en/chapter11/2.mdx
@@ -28,7 +28,7 @@ Instruction tuned models are trained to follow a specific conversational structure
To make a base model behave like an instruct model, we need to format our prompts in a consistent way that the model can understand. This is where chat templates come in. ChatML is one such template format that structures conversations with clear role indicators (system, user, assistant). Here's a guide on [ChatML](https://huggingface.co/HuggingFaceTB/SmolLM2-135M-Instruct/blob/e2c3f7557efbdec707ae3a336371d169783f1da1/tokenizer_config.json#L146).

<Tip warning={true}>
-When using an instruct model, always verify you're using the correct chat template format. Using the wrong template can result in poor model performance or unexpected behavior. The easiest way to ensure this is to check the model tokenizer configuration on the Hub. For example, the `SmolLM2-135M-Instruct` model uses [this configuration](https://huggingface.co/HuggingFaceTB/SmolLM2-135M-Instruct/blob/e2c3f7557efbdec707ae3a336371d169783f1da1/tokenizer_config.json#L146).
+When using an instruct model, always verify you're using the correct chat template format. Using the wrong template can result in poor model performance or unexpected behavior. The easiest way to ensure this is to check the model tokenizer configuration on the Hub. For example, the `SmolLM2-135M-Instruct` model uses <a href="https://huggingface.co/HuggingFaceTB/SmolLM2-135M-Instruct/blob/e2c3f7557efbdec707ae3a336371d169783f1da1/tokenizer_config.json#L146">this configuration</a>.
</Tip>
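
As a quick illustration of what the template does (a sketch, not part of this diff; the checkpoint and messages are just examples), the tokenizer's own template can be applied with `apply_chat_template`:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M-Instruct")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is a chat template?"},
]

# Render the conversation into the single formatted string the model expects.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```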

### Common Template Formats
36 changes: 31 additions & 5 deletions chapters/en/chapter11/3.mdx
@@ -109,7 +109,7 @@ import torch
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load dataset
-dataset = load_dataset("HuggingFaceTB/smoltalk")
+dataset = load_dataset("HuggingFaceTB/smoltalk", "all")

# Configure trainer
training_args = SFTConfig(
@@ -119,7 +119,7 @@ training_args = SFTConfig(
    learning_rate=5e-5,
    logging_steps=10,
    save_steps=100,
-    evaluation_strategy="steps",
+    eval_strategy="steps",
    eval_steps=50,
)

@@ -129,7 +129,7 @@ trainer = SFTTrainer(
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
-    tokenizer=tokenizer,
+    processing_class=tokenizer,
)

# Start training
@@ -142,7 +142,33 @@ When using a dataset with a "messages" field (like the example above), the SFTTrainer

## Packing the Dataset

-The SFTTrainer supports example packing to optimize training efficiency through the `ConstantLengthDataset` utility class. This feature allows multiple short examples to be packed into the same input sequence, maximizing GPU utilization during training. To enable packing, simply set `packing=True` in the SFTConfig constructor. When using packed datasets with `max_steps`, be aware that you may train for more epochs than expected depending on your packing configuration. You can customize how examples are combined using a formatting function - particularly useful when working with datasets that have multiple fields like question-answer pairs. For evaluation datasets, you can disable packing by setting `eval_packing=False` in the SFTConfig. Here's a basic example:
+The SFTTrainer supports example packing to optimize training efficiency. This feature allows multiple short examples to be packed into the same input sequence, maximizing GPU utilization during training. To enable packing, simply set `packing=True` in the SFTConfig constructor. When using packed datasets with `max_steps`, be aware that you may train for more epochs than expected depending on your packing configuration. You can customize how examples are combined using a formatting function - particularly useful when working with datasets that have multiple fields like question-answer pairs. For evaluation datasets, you can disable packing by setting `eval_packing=False` in the SFTConfig. Here's a basic example of customizing the packing configuration:

```python
# Configure packing
training_args = SFTConfig(packing=True)

trainer = SFTTrainer(model=model, train_dataset=dataset, args=training_args)

trainer.train()
```
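
Since the text above also mentions `eval_packing`, here is a minimal sketch of packing the training set while leaving evaluation examples unpacked, assuming the same `SFTConfig` as before:

```python
# Pack training examples, but evaluate on unpacked examples.
training_args = SFTConfig(packing=True, eval_packing=False)
```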

When packing a dataset with multiple fields, you can define a custom formatting function to combine the fields into a single input sequence. This function should take an example and return a string with the formatted text. Here's an example of a custom formatting function:

```python
def formatting_func(example):
    text = f"### Question: {example['question']}\n ### Answer: {example['answer']}"
    return text


training_args = SFTConfig(packing=True)
trainer = SFTTrainer(
    "facebook/opt-350m",
    train_dataset=dataset,
    args=training_args,
    formatting_func=formatting_func,
)
```

## Monitoring Training Progress

@@ -346,5 +372,5 @@ You've learned how to fine-tune models using SFT! To continue your learning:
## Additional Resources

- [TRL Documentation](https://huggingface.co/docs/trl)
-- [SFT Examples Repository](https://github.com/huggingface/trl/tree/main/examples/sft)
+- [SFT Examples Repository](https://github.com/huggingface/trl/blob/main/trl/scripts/sft.py)
- [Fine-tuning Best Practices](https://huggingface.co/docs/transformers/training)
9 changes: 4 additions & 5 deletions chapters/en/chapter11/4.mdx
@@ -43,7 +43,6 @@ model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)
lora_model = PeftModel.from_pretrained(model, "ybelkada/opt-350m-lora")
```

-<!-- TODO: Add image -->
![lora_load_adapter](https://github.com/huggingface/smol-course/raw/main/3_parameter_efficient_finetuning/images/lora_adapter.png)
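
As a usage sketch (not part of this diff), the adapter-wrapped model generates like any causal language model; the prompt below is just an example:

```python
from transformers import AutoTokenizer

# Load the tokenizer that matches the base model referenced above.
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
inputs = tokenizer("Preheat the oven to 350 degrees", return_tensors="pt")
outputs = lora_model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```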

## Fine-tune LLM using `trl` and the `SFTTrainer` with LoRA
@@ -107,9 +106,9 @@ trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
-    peft_config=lora_config,  # LoRA configuration
+    peft_config=peft_config,  # LoRA configuration
    max_seq_length=max_seq_length,  # Maximum sequence length
-    tokenizer=tokenizer,
+    processing_class=tokenizer,
)
```
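
The `peft_config` referenced in this hunk is defined outside the visible diff; a minimal sketch of what such a LoRA configuration might look like (the hyperparameter values here are illustrative, not the chapter's):

```python
from peft import LoraConfig

peft_config = LoraConfig(
    r=16,  # rank of the low-rank update matrices
    lora_alpha=32,  # scaling factor applied to the adapter update
    lora_dropout=0.05,  # dropout applied to the LoRA layers
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)
```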

@@ -168,6 +167,6 @@ tokenizer.save_pretrained("path/to/save/merged_model")
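
The merge step itself sits above this hunk's context line; a minimal sketch of how LoRA weights are typically folded into the base model with PEFT, assuming the adapter shown earlier in the chapter:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base model, attach the adapter, then merge the LoRA weights
# into the base weights so no PEFT dependency is needed at inference.
base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")
model = PeftModel.from_pretrained(base_model, "ybelkada/opt-350m-lora")
merged_model = model.merge_and_unload()
merged_model.save_pretrained("path/to/save/merged_model")
```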

# Resources

-- [LORA: LOW-RANK ADAPTATION OF LARGE LANGUAGE MODELS](https://arxiv.org/pdf/2106.09685)
+- [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/pdf/2106.09685)
- [PEFT Documentation](https://huggingface.co/docs/peft)
-- [Hugging Face blog post on PEFT](https://huggingface.co/blog/peft)
+- [Hugging Face blog post on PEFT](https://huggingface.co/blog/peft)
