
docs: polish documentation (#74)
Co-authored-by: muchvo <[email protected]>
Gaiejj and muchvo authored Aug 23, 2023
1 parent 9823134 commit 20e7d31
Showing 68 changed files with 1,259 additions and 148 deletions.
167 changes: 157 additions & 10 deletions README.md
@@ -19,7 +19,7 @@
<a href="https://github.com/PKU-Alignment/safety-gymnasium#why-safety-gymnasium">Why Safety-Gymnasium?</a> |
<a href="https://www.safety-gymnasium.com">Documentation</a> |
<a href="https://github.com/PKU-Alignment/safety-gymnasium#installation">Install guide</a> |
<a href="https://github.com/PKU-Alignment/safety-gymnasium#customize-your-environments">Customization</a>
<a href="https://github.com/PKU-Alignment/safety-gymnasium#customize-your-environments">Customization</a> | <a href="https://sites.google.com/view/safety-gymnasium">Video</a>
</p>

Safety-Gymnasium is a highly scalable and customizable Safe Reinforcement Learning (SafeRL) library.
@@ -69,12 +69,12 @@ Here is a list of all the environments we support for now:
<tbody>
<tr>
<td rowspan="4">Safe Navigation</td>
<td>Goal[012]</td>
<td>Button[012]</td>
<td rowspan="4">Point, Car, Doggo, Racecar, Ant</td>
<td rowspan="4">SafetyPointGoal1-v0</td>
</tr>
<tr>
<td>Button[012]</td>
<td>Goal[012]</td>
</tr>
<tr>
<td>Push[012]</td>
@@ -83,15 +83,95 @@ Here is a list of all the environments we support for now:
<td>Circle[012]</td>
</tr>
<tr>
<td>Velocity</td>
<td>Safe Velocity</td>
<td>Velocity</td>
<td>HalfCheetah, Hopper, Swimmer, Walker2d, Ant, Humanoid</td>
<td>SafetyAntVelocity-v1</td>
</tr>
<tr>
<td rowspan="7">Safe Vision</td>
<td>BuildingButton[012]</td>
<td rowspan="7">Point, Car, Doggo, Racecar, Ant</td>
<td rowspan="7">SafetyFormulaOne1-v0</td>
</tr>
<tr>
<td>BuildingGoal[012]</td>
</tr>
<tr>
<td>BuildingPush[012]</td>
</tr>
<tr>
<td>FadingEasy[012]</td>
</tr>
<tr>
<td>FadingHard[012]</td>
</tr>
<tr>
<td>Race[012]</td>
</tr>
<tr>
<td>FormulaOne[012]</td>
</tr>
<tr>
<td rowspan="8">Safe Multi-Agent</td>
<td>MultiGoal[012]</td>
<td>Multi-Point, Multi-Ant</td>
<td>SafetyAntMultiGoal1-v0</td>
</tr>
<tr>
<td>Multi-Agent Velocity</td>
<td>6x1HalfCheetah, 2x3HalfCheetah, 3x1Hopper, 2x1Swimmer, 2x3Walker2d, 2x4Ant, 4x2Ant, 9|8Humanoid</td>
<td>Safety2x4AntVelocity-v0</td>
</tr>
<tr>
<td>FreightFrankaCloseDrawer(Multi-Agent)</td>
<td rowspan="2">FreightFranka</td>
<td rowspan="2">FreightFrankaCloseDrawer(Multi-Agent)</td>
</tr>
<tr>
<td>FreightFrankaPickAndPlace(Multi-Agent)</td>
</tr>
<tr>
<td>ShadowHandCatchOver2UnderarmSafeFinger(Multi-Agent)</td>
<td rowspan="4">ShadowHands</td>
<td rowspan="4">ShadowHandCatchOver2UnderarmSafeJoint(Multi-Agent)</td>
</tr>
<tr>
<td>ShadowHandCatchOver2UnderarmSafeJoint(Multi-Agent)</td>
</tr>
<tr>
<td>ShadowHandOverSafeFinger(Multi-Agent)</td>
</tr>
<tr>
<td>ShadowHandOverSafeJoint(Multi-Agent)</td>
</tr>
<tr>
<td rowspan="6">Safe Isaac Gym</td>
<td>FreightFrankaCloseDrawer</td>
<td rowspan="2">FreightFranka</td>
<td rowspan="2">FreightFrankaCloseDrawer</td>
</tr>
<tr>
<td>FreightFrankaPickAndPlace</td>
</tr>
<tr>
<td>ShadowHandCatchOver2UnderarmSafeFinger</td>
<td rowspan="4">ShadowHands</td>
<td rowspan="4">ShadowHandCatchOver2UnderarmSafeJoint</td>
</tr>
<tr>
<td>ShadowHandCatchOver2UnderarmSafeJoint</td>
</tr>
<tr>
<td>ShadowHandOverSafeFinger</td>
</tr>
<tr>
<td>ShadowHandOverSafeJoint</td>
</tr>
</tbody>
</table>

Here are some screenshots of the Safe Navigation tasks.
Here are some screenshots of the **Safe Navigation** tasks.

#### Agents

@@ -292,15 +372,82 @@ Here are some screenshots of the Safe Navigation tasks.
</tbody>
</table>

### Vision-base Safe RL
### Vision-based Safe RL

Vision-based safety reinforcement learning lacks realistic scenarios.
Vision-based SafeRL lacks realistic scenarios.
Although the original `Safety-Gym` could minimally support visual input, the scenarios were too similar.
To facilitate the validation of visual-based safety reinforcement learning algorithms, we have developed a set of realistic vision-based SafeRL tasks, which are currently being validated on the baseline.
To facilitate the validation of visual-based SafeRL algorithms, we have developed a set of realistic vision-based SafeRL tasks, which are currently being validated on the baseline.

For the appetizer, the images are as follows:

<img src="https://github.com/PKU-Alignment/safety-gymnasium/raw/HEAD/images/vision_input.png" width="100%"/>
<table class="docutils align-default">
<tbody>
<tr class="row-odd">
<td>
<figure class="align-default">
<a class="reference external image-reference"><img
alt="https://github.com/PKU-Alignment/safety-gymnasium/raw/HEAD/docs/_static/images/race0.jpeg"
src="https://github.com/PKU-Alignment/safety-gymnasium/raw/HEAD/docs/_static/images/race0.jpeg" style="width: 230px;"></a>
</figure>
<p class="centered">
<strong><a class="reference internal"><span class="std std-ref">Race0</span></a></strong>
</p>
</td>
<td>
<figure class="align-default">
<a class="reference external image-reference"><img
alt="https://github.com/PKU-Alignment/safety-gymnasium/raw/HEAD/docs/_static/images/race1.jpeg"
src="https://github.com/PKU-Alignment/safety-gymnasium/raw/HEAD/docs/_static/images/race1.jpeg" style="width: 230px;"></a>
</figure>
<p class="centered">
<strong><a class="reference internal"><span class="std std-ref">Race1</span></a></strong>
</p>
</td>
<td>
<figure class="align-default">
<a class="reference external image-reference"><img
alt="https://github.com/PKU-Alignment/safety-gymnasium/raw/HEAD/docs/_static/images/race2.jpeg"
src="https://github.com/PKU-Alignment/safety-gymnasium/raw/HEAD/docs/_static/images/race2.jpeg" style="width: 230px;"></a>
</figure>
<p class="centered">
<strong><a class="reference internal"><span class="std std-ref">Race2</span></a></strong>
</p>
</td>
</tr>
<tr class="row-odd">
<td>
<figure class="align-default">
<a class="reference external image-reference"><img
alt="https://github.com/PKU-Alignment/safety-gymnasium/raw/HEAD/docs/_static/images/formula_one0.jpeg"
src="https://github.com/PKU-Alignment/safety-gymnasium/raw/HEAD/docs/_static/images/formula_one0.jpeg" style="width: 230px;"></a>
</figure>
<p class="centered">
<strong><a class="reference internal"><span class="std std-ref">FormulaOne0</span></a></strong>
</p>
</td>
<td>
<figure class="align-default">
<a class="reference external image-reference"><img
alt="https://github.com/PKU-Alignment/safety-gymnasium/raw/HEAD/docs/_static/images/formula_one1.jpeg"
src="https://github.com/PKU-Alignment/safety-gymnasium/raw/HEAD/docs/_static/images/formula_one1.jpeg" style="width: 230px;"></a>
</figure>
<p class="centered">
<strong><a class="reference internal"><span class="std std-ref">FormulaOne1</span></a></strong>
</p>
</td>
<td>
<figure class="align-default">
<a class="reference external image-reference"><img
alt="https://github.com/PKU-Alignment/safety-gymnasium/raw/HEAD/docs/_static/images/formula_one2.jpeg"
src="https://github.com/PKU-Alignment/safety-gymnasium/raw/HEAD/docs/_static/images/formula_one2.jpeg" style="width: 230px;"></a>
</figure>
<p class="centered">
<strong><a class="reference internal"><span class="std std-ref">FormulaOne2</span></a></strong>
</p>
</td>
</tr>
</tbody>
</table>

### Environment Usage

@@ -417,7 +564,7 @@ apt-get install python3-opengl

We construct a highly expandable framework of code so that you can easily comprehend it and design your environments to facilitate your research with no more than 100 lines of code on average.

For details, please refer to our documentation.
For details, please refer to our [documentation](https://www.safety-gymnasium.com/en/latest/components_of_environments/tasks/task_example.html).
Here is a minimal example:

```python
# (example collapsed in the diff view)
```
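The minimal example itself is collapsed in this diff view. As a hedged stand-in (not the repository's actual snippet), the sketch below imitates the Gymnasium-style interaction loop that Safety-Gymnasium documents, in which `step()` returns an extra `cost` signal alongside the usual `reward`. The stub environment and its numbers are purely illustrative; with the real library you would instead create an environment such as `safety_gymnasium.make("SafetyPointGoal1-v0")`.

```python
# Hedged sketch: a toy stand-in for a Safety-Gymnasium environment.
# The one structural difference from plain Gymnasium is that step()
# returns (obs, reward, cost, terminated, truncated, info) -- the
# `cost` term reports safety violations separately from the reward.
import random


class StubSafeEnv:
    """Illustrative stub; not part of the safety-gymnasium package."""

    def reset(self, seed=None):
        random.seed(seed)
        obs, info = [0.0, 0.0], {}
        return obs, info

    def step(self, action):
        obs = [random.random(), random.random()]
        reward = 1.0                   # task-progress signal
        cost = float(obs[0] > 0.9)     # safety-violation signal (extra term)
        terminated, truncated, info = False, False, {}
        return obs, reward, cost, terminated, truncated, info


env = StubSafeEnv()
obs, info = env.reset(seed=0)
episode_reward, episode_cost = 0.0, 0.0
for _ in range(10):
    action = [0.0]  # a real agent would sample from env.action_space
    obs, reward, cost, terminated, truncated, info = env.step(action)
    episode_reward += reward
    episode_cost += cost
    if terminated or truncated:
        break
print(episode_reward, episode_cost)
```

SafeRL algorithms consume the reward and cost streams separately, which is why the loop accumulates them in two counters rather than folding cost into the reward.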
1 change: 0 additions & 1 deletion docs/api/bases.md
@@ -5,7 +5,6 @@ title: Bases
# Bases

```{toctree}
:hidden:
bases/underlying.md
bases/base_task.md
bases/base_agent.md
1 change: 0 additions & 1 deletion docs/api/utils.md
@@ -5,6 +5,5 @@ title: Utils
# Utils

```{toctree}
:hidden:
utils/random_generator.md
```
2 changes: 1 addition & 1 deletion docs/components_of_environments/agents.rst
@@ -1,7 +1,7 @@
Agents
======

A set of unified agents for tasks has been designed, which are an important part of the environment. Their features are described in detail in this section.
A set of unified agents for tasks has been designed, which is an important part of the environment. Their features are described in detail in this section.

Safe Navigation & Vision
------------------------
2 changes: 1 addition & 1 deletion docs/components_of_environments/agents/doggo.rst
@@ -16,7 +16,7 @@ Doggo
:width: 200px
.. centered:: right

Doggo is a quadrupedal robot with bilateral symmetry. Each of the four legs has two controls at the hip, for azimuth and elevation relative to the torso, and one in the knee, controlling angle. It is designed such that a uniform random policy should keep the robot from falling over and generate some travel.
Doggo is a quadrupedal robot with bilateral symmetry. Each of the four legs has two controls at the hip, for azimuth and elevation relative to the torso, and one in the knee, controlling angle. It is designed such that a uniform random policy should keep the robot from falling over and generating some travel.

+---------------------------------+--------------------------------+
| **Specific Action Space** | Box(-1.0, 1.0, (12,), float64) |
2 changes: 1 addition & 1 deletion docs/components_of_environments/agents/freight_franka.rst
@@ -69,7 +69,7 @@ Specific Observations
+-----------------+-------------------------------------------------------------------------------------------------------------+
| 10 - 19 | Joint DOF velocities |
+-----------------+-------------------------------------------------------------------------------------------------------------+
| 20 - 22 | Relative pose between the Franka robot's root and the hand rigid body tensor |
| 20 - 22 | Relative pose between the Franka robot's root and the hand's rigid body tensor |
+-----------------+-------------------------------------------------------------------------------------------------------------+
| 23 - 32 | Actions taken by the robot in the joint space |
+-----------------+-------------------------------------------------------------------------------------------------------------+
2 changes: 1 addition & 1 deletion docs/components_of_environments/agents/racecar.rst
@@ -16,7 +16,7 @@ Racecar
:width: 200px
.. centered:: right

A robot closer to realistic car dynamics, moving in three dimensions, it has one velocity servo and one position servo, one to adjust the rear wheel speed to the target speed and the other to adjust the front wheel steering angle to the target angle. Racecar references the widely known MIT Racecar project's dynamics model. For it to accomplish the specified goal, it must coordinate the relationship between the steering angle of the tires and the speed, just like a human driving a car.
A robot closer to realistic car dynamics, moving in three dimensions, has one velocity servo and one position servo, one to adjust the rear wheel speed to the target speed and the other to adjust the front wheel steering angle to the target angle. Racecar references the widely known MIT Racecar project's dynamics model. For it to accomplish the specified goal, it must coordinate the relationship between the steering angle of the tires and the speed, just like a human driving a car.

+---------------------------------+-------------------------------------------------------------------+
| **Specific Action Space** | Box([-20. -0.785], [20. 0.785], (2,), float64) |
2 changes: 1 addition & 1 deletion docs/components_of_environments/agents/shadowhands.rst
@@ -18,7 +18,7 @@ ShadowHands
.. centered:: right


Shadow Dexterous Hand, designed by `Shadow Robot <https://www.shadowrobot.com/dexterous-hand-series/>`__, allowing researchers to manipulate tools and objects with greater precision and control. The Shadow Dexterous Hand has 24 joints. It has 20 degrees of freedom, greater than that of a human hand. It has been designed to have a range of movement equivalent to that of a typical human being. The four fingers of the hand contain two one-axis joints connecting the distal phalanx, middle phalanx and proximal phalanx and one universal joint connecting the finger to the metacarpal. The little finger has an extra one-axis joint on the metacarpal to provide the Hand with a palm curl movement. The thumb contains one one-axis joint connecting the distal phalanx to the proximal phalanx, one universal joint connecting the thumb to the metacarpal and one one-axis joint on the bottom of the metacarpal to provide a palm curl movement.
Shadow Dexterous Hand, designed by `Shadow Robot <https://www.shadowrobot.com/dexterous-hand-series/>`__, allows researchers to manipulate tools and objects with greater precision and control. The Shadow Dexterous Hand has 24 joints. It has 20 degrees of freedom, greater than that of a human hand. It has been designed to have a range of movement equivalent to that of a typical human being. The four fingers of the hand contain two one-axis joints connecting the distal phalanx, middle phalanx and proximal phalanx and one universal joint connecting the finger to the metacarpal. The little finger has an extra one-axis joint on the metacarpal to provide the Hand with a palm curl movement. The thumb contains one one-axis joint connecting the distal phalanx to the proximal phalanx, one universal joint connecting the thumb to the metacarpal and one one-axis joint on the bottom of the metacarpal to provide a palm curl movement.



2 changes: 1 addition & 1 deletion docs/components_of_environments/objects.rst
@@ -130,7 +130,7 @@ Both lidars are designed to target a specific class of targets and will ignore o
where :math:`\alpha` is the decay factor.

.. hint::
In the lidar_conf data class of task, the lidar category can be switched by modifying the lidar_type, but Natural lidar will be significantly more difficult.
In the lidar_conf data class of the task, the lidar category can be switched by modifying the lidar_type, but Natural lidar will be significantly more difficult.

Group mechanism
^^^^^^^^^^^^^^^
2 changes: 1 addition & 1 deletion docs/components_of_environments/objects/geom.rst
@@ -180,7 +180,7 @@ Constraints

.. _Sigwalls_out_of_boundary_cost:

- out_of_boundary_cost: When agent crosses the boundary from inside the circular domain outward, it generates cost: ``1``
- out_of_boundary_cost: When the agent crosses the boundary from inside the circular domain outward, it generates cost: ``1``

.. _Fixedwalls:

2 changes: 1 addition & 1 deletion docs/components_of_environments/tasks.rst
@@ -253,7 +253,7 @@ Safe Isaac Gym

.. Note::

By harnessing the rapid parallel capabilities of Isaac Gym, we are able to explore more realistic and challenging environments, unveiling and examining the potentialities of SafeRL. All tasks in Safe Isaac Gym are configured to support both **single-agent** and **multi-agent** settings. The single-agent and multi-agent algorithms from `SafePO <https://github.com/PKU-Alignment/Safe-Policy-Optimization>`__ can be seamlessly implemented in these respective environments.
By harnessing the rapid parallel capabilities of Isaac Gym, we can explore more realistic and challenging environments, unveiling and examining the potentialities of SafeRL. All tasks in Safe Isaac Gym are configured to support both **single-agent** and **multi-agent** settings. The single-agent and multi-agent algorithms from `SafePO <https://github.com/PKU-Alignment/Safe-Policy-Optimization>`__ can be seamlessly implemented in these respective environments.


.. list-table::
