Remove DATA_DIR references
aliberts committed Nov 28, 2024
1 parent 2556960 commit d6b4429
Showing 13 changed files with 37 additions and 35 deletions.
2 changes: 1 addition & 1 deletion .github/PULL_REQUEST_TEMPLATE.md
@@ -21,7 +21,7 @@ Provide a simple way for the reviewer to try out your changes.

Examples:
```bash
-DATA_DIR=tests/data pytest -sx tests/test_stuff.py::test_something
+pytest -sx tests/test_stuff.py::test_something
```
```bash
python lerobot/scripts/train.py --some.option=true
8 changes: 1 addition & 7 deletions .github/workflows/nightly-tests.yml
@@ -7,10 +7,8 @@ on:
schedule:
- cron: "0 2 * * *"

-env:
-  DATA_DIR: tests/data
# env:
# SLACK_API_TOKEN: ${{ secrets.SLACK_API_TOKEN }}

jobs:
run_all_tests_cpu:
name: CPU
@@ -30,13 +28,9 @@ jobs:
working-directory: /lerobot
steps:
- name: Tests
-env:
-  DATA_DIR: tests/data
run: pytest -v --cov=./lerobot --disable-warnings tests

- name: Tests end-to-end
-env:
-  DATA_DIR: tests/data
run: make test-end-to-end


3 changes: 0 additions & 3 deletions .github/workflows/test.yml
@@ -29,7 +29,6 @@ jobs:
name: Pytest
runs-on: ubuntu-latest
env:
-  DATA_DIR: tests/data
MUJOCO_GL: egl
steps:
- uses: actions/checkout@v4
@@ -70,7 +69,6 @@ jobs:
name: Pytest (minimal install)
runs-on: ubuntu-latest
env:
-  DATA_DIR: tests/data
MUJOCO_GL: egl
steps:
- uses: actions/checkout@v4
@@ -108,7 +106,6 @@ jobs:
name: End-to-end
runs-on: ubuntu-latest
env:
-  DATA_DIR: tests/data
MUJOCO_GL: egl
steps:
- uses: actions/checkout@v4
14 changes: 7 additions & 7 deletions README.md
@@ -153,10 +153,12 @@ python lerobot/scripts/visualize_dataset.py \
--episode-index 0
```

-or from a dataset in a local folder with the root `DATA_DIR` environment variable (in the following case the dataset will be searched for in `./my_local_data_dir/lerobot/pusht`)
+or from a dataset in a local folder with the `root` option and the `--local-files-only` flag (in the following case the dataset will be searched for in `./my_local_data_dir/lerobot/pusht`)
```bash
-DATA_DIR='./my_local_data_dir' python lerobot/scripts/visualize_dataset.py \
+python lerobot/scripts/visualize_dataset.py \
--repo-id lerobot/pusht \
+--root ./my_local_data_dir \
+--local-files-only 1 \
--episode-index 0
```

@@ -208,12 +210,10 @@ dataset attributes:

A `LeRobotDataset` is serialised using several widespread file formats for each of its parts, namely:
- hf_dataset stored using Hugging Face datasets library serialization to parquet
-- videos are stored in mp4 format to save space or png files
-- episode_data_index saved using `safetensor` tensor serialization format
-- stats saved using `safetensor` tensor serialization format
-- info are saved using JSON
+- videos are stored in mp4 format to save space
+- metadata are stored in plain json/jsonl files

-Dataset can be uploaded/downloaded from the HuggingFace hub seamlessly. To work on a local dataset, you can set the `DATA_DIR` environment variable to your root dataset folder as illustrated in the above section on dataset visualization.
+Datasets can be uploaded/downloaded from the HuggingFace hub seamlessly. To work on a local dataset, you can use the `local_files_only` argument and specify its location with the `root` argument if it's not in the default `~/.cache/huggingface/lerobot` location.

### Evaluate a pretrained policy

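The `root` / `local_files_only` pattern that replaces `DATA_DIR` in the README can also be used directly from Python. A minimal sketch, assuming the import path `lerobot.common.datasets.lerobot_dataset` from this revision and a dataset previously saved under `./my_local_data_dir/lerobot/pusht`:

```python
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

# With root="./my_local_data_dir", the dataset "lerobot/pusht" is searched for
# in ./my_local_data_dir/lerobot/pusht, matching the README example above.
# local_files_only=True skips any attempt to fetch the dataset from the hub.
dataset = LeRobotDataset(
    "lerobot/pusht",
    root="./my_local_data_dir",
    local_files_only=True,
)

# Assumed torch-style Dataset behavior: __len__ gives the number of frames.
print(len(dataset))
```

Omitting `root` falls back to the default cache location `~/.cache/huggingface/lerobot` mentioned in the README change.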
5 changes: 2 additions & 3 deletions examples/10_use_so100.md
@@ -218,7 +218,7 @@ python lerobot/scripts/visualize_dataset_html.py \

Now try to replay the first episode on your robot:
```bash
-DATA_DIR=data python lerobot/scripts/control_robot.py replay \
+python lerobot/scripts/control_robot.py replay \
--robot-path lerobot/configs/robot/so100.yaml \
--fps 30 \
--repo-id ${HF_USER}/so100_test \
@@ -229,7 +229,7 @@ DATA_DIR=data python lerobot/scripts/control_robot.py replay \

To train a policy to control your robot, use the [`python lerobot/scripts/train.py`](../lerobot/scripts/train.py) script. A few arguments are required. Here is an example command:
```bash
-DATA_DIR=data python lerobot/scripts/train.py \
+python lerobot/scripts/train.py \
dataset_repo_id=${HF_USER}/so100_test \
policy=act_so100_real \
env=so100_real \
@@ -245,7 +245,6 @@ Let's explain it:
3. We provided an environment as argument with `env=so100_real`. This loads configurations from [`lerobot/configs/env/so100_real.yaml`](../lerobot/configs/env/so100_real.yaml).
4. We provided `device=cuda` since we are training on a Nvidia GPU, but you can also use `device=mps` if you are using a Mac with Apple silicon, or `device=cpu` otherwise.
5. We provided `wandb.enable=true` to use [Weights and Biases](https://docs.wandb.ai/quickstart) for visualizing training plots. This is optional but if you use it, make sure you are logged in by running `wandb login`.
-6. We added `DATA_DIR=data` to access your dataset stored in your local `data` directory. If you dont provide `DATA_DIR`, your dataset will be downloaded from Hugging Face hub to your cache folder `$HOME/.cache/hugginface`. In future versions of `lerobot`, both directories will be in sync.

Training should take several hours. You will find checkpoints in `outputs/train/act_so100_test/checkpoints`.

5 changes: 2 additions & 3 deletions examples/11_use_moss.md
@@ -218,7 +218,7 @@ python lerobot/scripts/visualize_dataset_html.py \

Now try to replay the first episode on your robot:
```bash
-DATA_DIR=data python lerobot/scripts/control_robot.py replay \
+python lerobot/scripts/control_robot.py replay \
--robot-path lerobot/configs/robot/moss.yaml \
--fps 30 \
--repo-id ${HF_USER}/moss_test \
@@ -229,7 +229,7 @@ DATA_DIR=data python lerobot/scripts/control_robot.py replay \

To train a policy to control your robot, use the [`python lerobot/scripts/train.py`](../lerobot/scripts/train.py) script. A few arguments are required. Here is an example command:
```bash
-DATA_DIR=data python lerobot/scripts/train.py \
+python lerobot/scripts/train.py \
dataset_repo_id=${HF_USER}/moss_test \
policy=act_moss_real \
env=moss_real \
@@ -245,7 +245,6 @@ Let's explain it:
3. We provided an environment as argument with `env=moss_real`. This loads configurations from [`lerobot/configs/env/moss_real.yaml`](../lerobot/configs/env/moss_real.yaml).
4. We provided `device=cuda` since we are training on a Nvidia GPU, but you can also use `device=mps` if you are using a Mac with Apple silicon, or `device=cpu` otherwise.
5. We provided `wandb.enable=true` to use [Weights and Biases](https://docs.wandb.ai/quickstart) for visualizing training plots. This is optional but if you use it, make sure you are logged in by running `wandb login`.
-6. We added `DATA_DIR=data` to access your dataset stored in your local `data` directory. If you dont provide `DATA_DIR`, your dataset will be downloaded from Hugging Face hub to your cache folder `$HOME/.cache/hugginface`. In future versions of `lerobot`, both directories will be in sync.

Training should take several hours. You will find checkpoints in `outputs/train/act_moss_test/checkpoints`.

3 changes: 1 addition & 2 deletions examples/7_get_started_with_real_robot.md
@@ -868,7 +868,7 @@ Your robot should replicate movements similar to those you recorded. For example

To train a policy to control your robot, use the [`python lerobot/scripts/train.py`](../lerobot/scripts/train.py) script. A few arguments are required. Here is an example command:
```bash
-DATA_DIR=data python lerobot/scripts/train.py \
+python lerobot/scripts/train.py \
dataset_repo_id=${HF_USER}/koch_test \
policy=act_koch_real \
env=koch_real \
@@ -915,7 +915,6 @@ env:
It should match your dataset (e.g. `fps: 30`) and your robot (e.g. `state_dim: 6` and `action_dim: 6`). We are still working on simplifying this in future versions of `lerobot`.
4. We provided `device=cuda` since we are training on a Nvidia GPU, but you could use `device=mps` to train on Apple silicon.
5. We provided `wandb.enable=true` to use [Weights and Biases](https://docs.wandb.ai/quickstart) for visualizing training plots. This is optional but if you use it, make sure you are logged in by running `wandb login`.
-6. We added `DATA_DIR=data` to access your dataset stored in your local `data` directory. If you dont provide `DATA_DIR`, your dataset will be downloaded from Hugging Face hub to your cache folder `$HOME/.cache/hugginface`. In future versions of `lerobot`, both directories will be in sync.

For more information on the `train` script see the previous tutorial: [`examples/4_train_policy_with_script.md`](../examples/4_train_policy_with_script.md)

3 changes: 1 addition & 2 deletions examples/9_use_aloha.md
@@ -125,7 +125,7 @@ python lerobot/scripts/control_robot.py replay \

To train a policy to control your robot, use the [`python lerobot/scripts/train.py`](../lerobot/scripts/train.py) script. A few arguments are required. Here is an example command:
```bash
-DATA_DIR=data python lerobot/scripts/train.py \
+python lerobot/scripts/train.py \
dataset_repo_id=${HF_USER}/aloha_test \
policy=act_aloha_real \
env=aloha_real \
@@ -141,7 +141,6 @@ Let's explain it:
3. We provided an environment as argument with `env=aloha_real`. This loads configurations from [`lerobot/configs/env/aloha_real.yaml`](../lerobot/configs/env/aloha_real.yaml). Note: this yaml defines 18 dimensions for the `state_dim` and `action_dim`, corresponding to 18 motors, not 14 motors as used in previous Aloha work. This is because, we include the `shoulder_shadow` and `elbow_shadow` motors for simplicity.
4. We provided `device=cuda` since we are training on a Nvidia GPU.
5. We provided `wandb.enable=true` to use [Weights and Biases](https://docs.wandb.ai/quickstart) for visualizing training plots. This is optional but if you use it, make sure you are logged in by running `wandb login`.
-6. We added `DATA_DIR=data` to access your dataset stored in your local `data` directory. If you dont provide `DATA_DIR`, your dataset will be downloaded from Hugging Face hub to your cache folder `$HOME/.cache/hugginface`. In future versions of `lerobot`, both directories will be in sync.

Training should take several hours. You will find checkpoints in `outputs/train/act_aloha_test/checkpoints`.

2 changes: 1 addition & 1 deletion lerobot/scripts/control_robot.py
@@ -73,7 +73,7 @@
- Train on this dataset with the ACT policy:
```bash
-DATA_DIR=data python lerobot/scripts/train.py \
+python lerobot/scripts/train.py \
policy=act_koch_real \
env=koch_real \
dataset_repo_id=$USER/koch_pick_place_lego \
11 changes: 9 additions & 2 deletions lerobot/scripts/visualize_dataset.py
@@ -207,11 +207,17 @@ def main():
required=True,
help="Episode to visualize.",
)
+parser.add_argument(
+"--local-files-only",
+type=int,
+default=0,
+help="Use local files only. By default, this script will try to fetch the dataset from the hub if it exists.",
+)
parser.add_argument(
"--root",
type=Path,
default=None,
help="Root directory for a dataset stored locally (e.g. `--root data`). By default, the dataset will be loaded from hugging face cache folder, or downloaded from the hub if available.",
help="Root directory for the dataset stored locally (e.g. `--root data`). By default, the dataset will be loaded from hugging face cache folder, or downloaded from the hub if available.",
)
parser.add_argument(
"--output-dir",
@@ -269,9 +275,10 @@ def main():
kwargs = vars(args)
repo_id = kwargs.pop("repo_id")
root = kwargs.pop("root")
+local_files_only = kwargs.pop("local_files_only")

logging.info("Loading dataset")
-dataset = LeRobotDataset(repo_id, root=root, local_files_only=True)
+dataset = LeRobotDataset(repo_id, root=root, local_files_only=local_files_only)

visualize_dataset(dataset, **vars(args))

10 changes: 9 additions & 1 deletion lerobot/scripts/visualize_dataset_html.py
@@ -234,6 +234,12 @@ def main():
required=True,
help="Name of hugging face repositery containing a LeRobotDataset dataset (e.g. `lerobot/pusht` for https://huggingface.co/datasets/lerobot/pusht).",
)
+parser.add_argument(
+"--local-files-only",
+type=int,
+default=0,
+help="Use local files only. By default, this script will try to fetch the dataset from the hub if it exists.",
+)
parser.add_argument(
"--root",
type=Path,
@@ -282,7 +288,9 @@ def main():
kwargs = vars(args)
repo_id = kwargs.pop("repo_id")
root = kwargs.pop("root")
-dataset = LeRobotDataset(repo_id, root=root, local_files_only=True)
+local_files_only = kwargs.pop("local_files_only")
+
+dataset = LeRobotDataset(repo_id, root=root, local_files_only=local_files_only)
visualize_dataset_html(dataset, **kwargs)


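A note on the design of the new flag in both visualization scripts: `--local-files-only` is declared with `type=int` and `default=0` rather than as a store-true switch, so it is invoked as `--local-files-only 1` and the integer is forwarded to `LeRobotDataset` as a truthy or falsy value. A small standalone sketch of that parsing behavior (the parser below is hypothetical, mirroring the pattern in the diff):

```python
import argparse

# Hypothetical standalone parser mirroring the --local-files-only flag added
# to visualize_dataset.py and visualize_dataset_html.py in this commit.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--local-files-only",
    type=int,
    default=0,
    help="Use local files only (pass 1 to skip fetching from the hub).",
)

# argparse maps the dashed flag to the attribute `local_files_only`; the value
# parses as int 1, which is truthy when forwarded to LeRobotDataset.
args = parser.parse_args(["--local-files-only", "1"])
assert args.local_files_only == 1
```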
2 changes: 1 addition & 1 deletion tests/test_policies.py
@@ -383,7 +383,7 @@ def test_backward_compatibility(env_name, policy_name, extra_overrides, file_nam
include a report on what changed and how that affected the outputs.
2. Go to the `if __name__ == "__main__"` block of `tests/scripts/save_policy_to_safetensors.py` and
add the policies you want to update the test artifacts for.
-3. Run `DATA_DIR=tests/data python tests/scripts/save_policy_to_safetensors.py`. The test artifact
+3. Run `python tests/scripts/save_policy_to_safetensors.py`. The test artifact
should be updated.
4. Check that this test now passes.
5. Remember to restore `tests/scripts/save_policy_to_safetensors.py` to its original state.
4 changes: 2 additions & 2 deletions tests/test_push_dataset_to_hub.py
@@ -5,7 +5,7 @@
Example to run backward compatiblity tests locally:
```
-DATA_DIR=tests/data python -m pytest --run-skipped tests/test_push_dataset_to_hub.py::test_push_dataset_to_hub_pusht_backward_compatibility
+python -m pytest --run-skipped tests/test_push_dataset_to_hub.py::test_push_dataset_to_hub_pusht_backward_compatibility
```
"""

@@ -330,7 +330,7 @@ def test_push_dataset_to_hub_format(required_packages, tmpdir, raw_format, repo_
],
)
@pytest.mark.skip(
"Not compatible with our CI since it downloads raw datasets. Run with `DATA_DIR=tests/data python -m pytest --run-skipped tests/test_push_dataset_to_hub.py::test_push_dataset_to_hub_pusht_backward_compatibility`"
"Not compatible with our CI since it downloads raw datasets. Run with `python -m pytest --run-skipped tests/test_push_dataset_to_hub.py::test_push_dataset_to_hub_pusht_backward_compatibility`"
)
def test_push_dataset_to_hub_pusht_backward_compatibility(tmpdir, raw_format, repo_id):
_, dataset_id = repo_id.split("/")
