From 4487058cc6b39670055e0609e4c86cdd26689449 Mon Sep 17 00:00:00 2001 From: mohamedsaeed Date: Tue, 25 Jun 2024 00:48:28 +0300 Subject: [PATCH] Edit monitor env in unit1 handson --- notebooks/unit1/unit1.ipynb | 176 ++++++++++++++++++------------------ 1 file changed, 88 insertions(+), 88 deletions(-) diff --git a/notebooks/unit1/unit1.ipynb b/notebooks/unit1/unit1.ipynb index 6f5f103b..06d62b08 100644 --- a/notebooks/unit1/unit1.ipynb +++ b/notebooks/unit1/unit1.ipynb @@ -31,6 +31,9 @@ }, { "cell_type": "markdown", + "metadata": { + "id": "x7oR6R-ZIbeS" + }, "source": [ "### The environment 🎮\n", "\n", @@ -39,19 +42,16 @@ "### The library used 📚\n", "\n", "- [Stable-Baselines3](https://stable-baselines3.readthedocs.io/en/master/)" - ], - "metadata": { - "id": "x7oR6R-ZIbeS" - } + ] }, { "cell_type": "markdown", - "source": [ - "We're constantly trying to improve our tutorials, so **if you find some issues in this notebook**, please [open an issue on the Github Repo](https://github.com/huggingface/deep-rl-class/issues)." - ], "metadata": { "id": "OwEcFHe9RRZW" - } + }, + "source": [ + "We're constantly trying to improve our tutorials, so **if you find some issues in this notebook**, please [open an issue on the Github Repo](https://github.com/huggingface/deep-rl-class/issues)." 
+ ] }, { "cell_type": "markdown", @@ -72,14 +72,14 @@ }, { "cell_type": "markdown", + "metadata": { + "id": "Ff-nyJdzJPND" + }, "source": [ "## This notebook is from Deep Reinforcement Learning Course\n", "\n", "\"Deep" - ], - "metadata": { - "id": "Ff-nyJdzJPND" - } + ] }, { "cell_type": "markdown", @@ -120,14 +120,14 @@ }, { "cell_type": "markdown", + "metadata": { + "id": "HoeqMnr5LuYE" + }, "source": [ "## A small recap of Deep Reinforcement Learning 📚\n", "\n", "\"The" - ], - "metadata": { - "id": "HoeqMnr5LuYE" - } + ] }, { "cell_type": "markdown", @@ -157,6 +157,9 @@ }, { "cell_type": "markdown", + "metadata": { + "id": "qDploC3jSH99" + }, "source": [ "# Let's train our first Deep Reinforcement Learning agent and upload it to the Hub 🚀\n", "\n", @@ -167,23 +170,20 @@ "To find your result, go to the [leaderboard](https://huggingface.co/spaces/huggingface-projects/Deep-Reinforcement-Learning-Leaderboard) and find your model, **the result = mean_reward - std of reward**\n", "\n", "For more information about the certification process, check this section 👉 https://huggingface.co/deep-rl-course/en/unit0/introduction#certification-process" - ], - "metadata": { - "id": "qDploC3jSH99" - } + ] }, { "cell_type": "markdown", + "metadata": { + "id": "HqzznTzhNfAC" + }, "source": [ "## Set the GPU 💪\n", "\n", "- To **accelerate the agent's training, we'll use a GPU**. To do that, go to `Runtime > Change Runtime type`\n", "\n", "\"GPU" - ], - "metadata": { - "id": "HqzznTzhNfAC" - } + ] }, { "cell_type": "markdown", @@ -215,14 +215,14 @@ }, { "cell_type": "code", - "source": [ - "!apt install swig cmake" - ], + "execution_count": null, "metadata": { "id": "yQIGLPDkGhgG" }, - "execution_count": null, - "outputs": [] + "outputs": [], + "source": [ + "!apt install swig cmake" + ] }, { "cell_type": "code", @@ -237,65 +237,65 @@ }, { "cell_type": "markdown", + "metadata": { + "id": "BEKeXQJsQCYm" + }, "source": [ "During the notebook, we'll need to generate a replay video. 
To do so, with colab, **we need to have a virtual screen to be able to render the environment** (and thus record the frames).\n", "\n", "Hence the following cell will install virtual screen libraries and create and run a virtual screen 🖥" - ], - "metadata": { - "id": "BEKeXQJsQCYm" - } + ] }, { "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "j5f2cGkdP-mb" + }, + "outputs": [], "source": [ "!sudo apt-get update\n", "!sudo apt-get install -y python3-opengl\n", "!apt install ffmpeg\n", "!apt install xvfb\n", "!pip3 install pyvirtualdisplay" - ], - "metadata": { - "id": "j5f2cGkdP-mb" - }, - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", - "source": [ - "To make sure the new installed libraries are used, **sometimes it's required to restart the notebook runtime**. The next cell will force the **runtime to crash, so you'll need to connect again and run the code starting from here**. Thanks to this trick, **we will be able to run our virtual screen.**" - ], "metadata": { "id": "TCwBTAwAW9JJ" - } + }, + "source": [ + "To make sure the new installed libraries are used, **sometimes it's required to restart the notebook runtime**. The next cell will force the **runtime to crash, so you'll need to connect again and run the code starting from here**. 
Thanks to this trick, **we will be able to run our virtual screen.**" + ] }, { "cell_type": "code", - "source": [ - "import os\n", - "os.kill(os.getpid(), 9)" - ], + "execution_count": null, "metadata": { "id": "cYvkbef7XEMi" }, - "execution_count": null, - "outputs": [] + "outputs": [], + "source": [ + "import os\n", + "os.kill(os.getpid(), 9)" + ] }, { "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "BE5JWP5rQIKf" + }, + "outputs": [], "source": [ "# Virtual display\n", "from pyvirtualdisplay import Display\n", "\n", "virtual_display = Display(visible=0, size=(1400, 900))\n", "virtual_display.start()" - ], - "metadata": { - "id": "BE5JWP5rQIKf" - }, - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -581,12 +581,12 @@ }, { "cell_type": "markdown", - "source": [ - "\"Stable" - ], "metadata": { "id": "HLlClRW37Q7e" - } + }, + "source": [ + "\"Stable" + ] }, { "cell_type": "markdown", @@ -776,7 +776,7 @@ "outputs": [], "source": [ "#@title\n", - "eval_env = Monitor(gym.make(\"LunarLander-v2\"))\n", + "eval_env = Monitor(gym.make(\"LunarLander-v2\", render_mode='rgb_array'))\n", "mean_reward, std_reward = evaluate_policy(model, eval_env, n_eval_episodes=10, deterministic=True)\n", "print(f\"mean_reward={mean_reward:.2f} +/- {std_reward}\")" ] @@ -939,6 +939,11 @@ }, { "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "I2E--IJu8JYq" + }, + "outputs": [], "source": [ "import gymnasium as gym\n", "\n", @@ -974,15 +979,13 @@ " eval_env=eval_env, # Evaluation Environment\n", " repo_id=repo_id, # id of the model repository from the Hugging Face Hub (repo_id = {organization}/{repo_name} for instance ThomasSimonini/ppo-LunarLander-v2\n", " commit_message=commit_message)\n" - ], - "metadata": { - "id": "I2E--IJu8JYq" - }, - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", + "metadata": { + "id": "T79AEAWEFIxz" + }, "source": [ "Congrats 🥳 you've just trained and uploaded your 
first Deep Reinforcement Learning agent. The script above should have displayed a link to a model repository such as https://huggingface.co/osanseviero/test_sb3. When you go to this link, you can:\n", "* See a video preview of your agent at the right.\n", @@ -993,10 +996,7 @@ "Under the hood, the Hub uses git-based repositories (don't worry if you don't know what git is), which means you can update the model with new versions as you experiment and improve your agent.\n", "\n", "Compare the results of your LunarLander-v2 with your classmates using the leaderboard 🏆 👉 https://huggingface.co/spaces/huggingface-projects/Deep-Reinforcement-Learning-Leaderboard" - ], - "metadata": { - "id": "T79AEAWEFIxz" - } + ] }, { "cell_type": "markdown", @@ -1028,25 +1028,25 @@ }, { "cell_type": "markdown", + "metadata": { + "id": "bhb9-NtsinKB" + }, "source": [ "Because the model I downloaded from the Hub was trained with Gym (the former version of Gymnasium), we need to install Shimmy, an API conversion tool that will help us run the environment correctly.\n", "\n", "Shimmy Documentation: https://github.com/Farama-Foundation/Shimmy" - ], - "metadata": { - "id": "bhb9-NtsinKB" - } + ] }, { "cell_type": "code", - "source": [ - "!pip install shimmy" - ], + "execution_count": null, "metadata": { "id": "03WI-bkci1kH" }, - "execution_count": null, - "outputs": [] + "outputs": [], + "source": [ + "!pip install shimmy" + ] }, { "cell_type": "code", @@ -1086,17 +1086,17 @@ }, { "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "PAEVwK-aahfx" + }, + "outputs": [], "source": [ "#@title\n", "eval_env = Monitor(gym.make(\"LunarLander-v2\"))\n", "mean_reward, std_reward = evaluate_policy(model, eval_env, n_eval_episodes=10, deterministic=True)\n", "print(f\"mean_reward={mean_reward:.2f} +/- {std_reward}\")" - ], - "metadata": { - "id": "PAEVwK-aahfx" - }, - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -1154,12 +1154,12 @@ "metadata": {
"accelerator": "GPU", "colab": { - "private_outputs": true, - "provenance": [], "collapsed_sections": [ "QAN7B0_HCVZC", "BqPKw3jt_pG5" - ] + ], + "private_outputs": true, + "provenance": [] }, "gpuClass": "standard", "kernelspec": {
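The substantive change in this patch is passing `render_mode='rgb_array'` to `gym.make` when building the wrapped `eval_env`, so that `render()` returns pixel frames the replay-video recorder can capture instead of trying to open a window on the headless Colab runtime. The sketch below illustrates that contract without requiring Gymnasium or Stable-Baselines3 to be installed; `ToyEnv` is a hypothetical stand-in for a Gymnasium environment, not the library's actual implementation.

```python
import numpy as np

class ToyEnv:
    """Minimal Gymnasium-style sketch of the render_mode contract.

    With render_mode="rgb_array", render() returns an HxWx3 uint8 frame that a
    video recorder can consume; with render_mode=None it returns nothing, which
    is why the patched notebook passes render_mode explicitly to gym.make."""

    metadata = {"render_modes": ["rgb_array"]}

    def __init__(self, render_mode=None):
        assert render_mode is None or render_mode in self.metadata["render_modes"]
        self.render_mode = render_mode  # fixed at construction, as in Gymnasium
        self._t = 0

    def reset(self, seed=None):
        self._t = 0
        return 0.0, {}  # (observation, info)

    def step(self, action):
        self._t += 1
        terminated = self._t >= 10
        # (observation, reward, terminated, truncated, info)
        return float(self._t), 1.0, terminated, False, {}

    def render(self):
        if self.render_mode == "rgb_array":
            # A blank 64x64 frame; a real env would draw the scene here.
            return np.zeros((64, 64, 3), dtype=np.uint8)
        return None

# Roll out one episode and collect frames, the way a replay recorder would.
env = ToyEnv(render_mode="rgb_array")
obs, info = env.reset(seed=0)
frames = []
terminated = False
while not terminated:
    obs, reward, terminated, truncated, info = env.step(0)
    frames.append(env.render())
print(len(frames), frames[0].shape)  # → 10 (64, 64, 3)
```

In Gymnasium the render mode is fixed when the environment is constructed, which is why the patch adds the argument inside `gym.make("LunarLander-v2", render_mode='rgb_array')` rather than at the `render()` call site.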