diff --git a/images/high-momentum.png b/images/high-momentum.png
new file mode 100644
index 0000000..93a1a12
Binary files /dev/null and b/images/high-momentum.png differ
diff --git a/images/low-momentum.png b/images/low-momentum.png
new file mode 100644
index 0000000..4c62ceb
Binary files /dev/null and b/images/low-momentum.png differ
diff --git a/notebooks/deep-learning/Introduction to Deep Learning with PyTorch.ipynb b/notebooks/deep-learning/Introduction to Deep Learning with PyTorch.ipynb
index 4ce101d..c61bc70 100644
--- a/notebooks/deep-learning/Introduction to Deep Learning with PyTorch.ipynb
+++ b/notebooks/deep-learning/Introduction to Deep Learning with PyTorch.ipynb
@@ -3588,30 +3588,120 @@
  "- You might overshoot the bottom and crash\n",
  "- In ML terms: Your model might miss the optimal solution and bounce around, never finding the best weights\n",
  "- You might even \"fly off the mountain\" (model weights explode)\n",
+ "![Impact of the learning rate - high learning rate](../../images/Impact-of-the-learning-rate-high-learning-rate.png)\n",
  "\n",
  "**Too Low Learning Rate (Skiing Too Slowly)**\n",
  "- Like inching down the mountain at a snail's pace\n",
  "- You'll eventually get there, but it'll take forever\n",
  "- In ML terms: Training will be very slow and might get stuck in small dips before reaching the bottom\n",
+ "![Impact of the learning rate - small learning rate](../../images/Impact-of-the-learning-rate-small-learning-rate.png)\n",
  "\n",
  "**Just Right Learning Rate**\n",
  "- Moving at a controlled, steady pace\n",
  "- Allows you to adjust your path while making good progress\n",
  "- Typical values often start around 0.01 or 0.001\n",
  "\n",
- "## Momentum: Your Skiing Inertia\n",
+ "Let's continue with our skiing analogy to understand momentum in more detail!\n",
+ "\n",
+ "## Understanding Momentum: The Skiing Inertia\n",
+ "\n",
+ "Imagine you're wearing a heavy backpack while skiing. The weight of this backpack represents momentum - it affects how you move down the slope and how easily you can change direction.\n",
+ "\n",
+ "## High Momentum (0.9 - 0.99)\n",
+ "\n",
+ "**What It's Like**\n",
+ "- Like skiing with a heavy backpack\n",
+ "- Once you start moving in a direction, it's harder to make sudden turns\n",
+ "- You maintain more of your previous direction and speed\n",
+ "\n",
+ "![high-momentum](../../images/high-momentum.png)\n",
+ "\n",
+ "**Benefits**\n",
+ "- Helps push through flat spots or small uphill sections\n",
+ "- Like using your built-up speed to glide over small bumps\n",
+ "- Great for escaping shallow valleys (local minima) in the loss landscape\n",
+ "- Training tends to be more stable and faster overall\n",
+ "\n",
+ "**Potential Problems**\n",
+ "- Might overshoot the bottom of the slope\n",
+ "- Harder to make precise adjustments\n",
+ "- Like trying to stop quickly with a heavy backpack - it takes longer\n",
+ "- Could miss the optimal solution if it requires sharp turns\n",
+ "\n",
+ "## Low Momentum (0.1 - 0.5)\n",
+ "\n",
+ "**What It's Like**\n",
+ "- Like skiing with a light backpack\n",
+ "- Easier to make quick direction changes\n",
+ "- Less influenced by your previous movement\n",
+ "\n",
+ "![low-momentum](../../images/low-momentum.png)\n",
+ "\n",
+ "**Benefits**\n",
+ "- More precise control over your movement\n",
+ "- Better for navigating tricky, winding paths\n",
+ "- Useful when you need to make careful adjustments\n",
+ "- Good for fine-tuning near the optimal solution\n",
  "\n",
- "Momentum is like your forward motion that helps you glide through small bumps and dips in the snow:\n",
+ "**Potential Problems**\n",
+ "- Might get stuck in small dips\n",
+ "- Progress can be slower\n",
+ "- Like stopping at every little bump in the snow\n",
+ "- More susceptible to getting trapped in local minima\n",
  "\n",
- "**How Momentum Works**\n",
- "- It's like keeping some of your previous direction and speed\n",
- "- Helps you push through small obstacles (local minima)\n",
- "- Smooths out your descent, making it less jerky\n",
+ "## Zero Momentum (0.0)\n",
  "\n",
- "**Benefits of Momentum**\n",
- "- Helps you get through flat spots on the slope\n",
- "- Prevents you from getting stuck in small snow banks\n",
- "- In ML terms: Helps escape local minima and speeds up convergence\n",
+ "**What It's Like**\n",
+ "- Like skiing with no backpack at all\n",
+ "- Each move is independent of previous moves\n",
+ "- Pure SGD without any memory of past updates\n",
+ "\n",
+ "**When to Use It**\n",
+ "- When you want very precise control\n",
+ "- In simple landscapes with clear paths to the minimum\n",
+ "- When dealing with noisy or unpredictable data\n",
+ "\n",
+ "## Real-World Scenarios\n",
+ "\n",
+ "**High Momentum Works Best When**\n",
+ "- Your loss landscape is like a smooth, wide valley\n",
+ "- You're dealing with consistent patterns in your data\n",
+ "- You want faster convergence\n",
+ "- Like skiing down a long, gentle slope where you can use your momentum effectively\n",
+ "\n",
+ "**Low Momentum Works Best When**\n",
+ "- Your loss landscape is tricky with lots of turns\n",
+ "- You're near the optimal solution and need precision\n",
+ "- Your data has lots of variation\n",
+ "- Like skiing through a technical course where you need careful control\n",
+ "\n",
+ "## Common Momentum Values and Their Effects\n",
+ "\n",
+ "| Momentum Value | Behavior | Best Used For |\n",
+ "|----------------|----------|---------------|\n",
+ "| 0.9 | Standard choice, good balance | Most training scenarios |\n",
+ "| 0.99 | Very aggressive momentum | Large datasets, smooth loss landscapes |\n",
+ "| 0.5 | Moderate momentum | When high momentum is too aggressive |\n",
+ "| 0.0 | No momentum | Precise control needed |\n",
+ "\n",
+ "## Practical Tips\n",
+ "\n",
+ "**Starting Out**\n",
+ "- Begin with momentum = 0.9\n",
+ "- Like starting on a gentler slope before tackling steeper ones\n",
+ "- Observe how your model behaves\n",
+ "\n",
+ "**Adjusting Momentum**\n",
+ "- If training is unstable: Lower the momentum\n",
+ "- If training is too slow: Try increasing momentum\n",
+ "- If close to convergence: Consider reducing momentum for fine-tuning\n",
+ "\n",
+ "**Warning Signs**\n",
+ "- If your loss is bouncing wildly: Your momentum might be too high\n",
+ "- If progress is very slow: Your momentum might be too low\n",
+ "- If you're overshooting repeatedly: Consider reducing both momentum and learning rate\n",
+ "\n",
+ "Remember: Just like a skilled skier adjusts their technique based on the terrain, you'll need to adjust momentum based on your model's behavior and your training data's characteristics. The goal is to find that sweet spot where you're making steady progress while maintaining control!\n",
  "\n",
  "## Finding the Right Balance\n",
  "\n",
@@ -3636,9 +3726,7 @@
  "\n",
  "**When It Goes Wrong**\n",
  "- Loss might oscillate wildly (too high learning rate)\n",
- "![Impact of the learning rate - high learning rate](../../images/Impact-of-the-learning-rate-high-learning-rate.png)\n",
  "- Training might take forever (too low learning rate)\n",
- "![Impact of the learning rate - small learning rate](../../images/Impact-of-the-learning-rate-small-learning-rate.png)\n",
  "- Model might get stuck (no momentum when needed)\n",
  "- Like either tumbling down the mountain or never making it to the bottom\n",
  "\n",