Showing 7 changed files with 92 additions and 23 deletions.

@@ -1,4 +1,4 @@
-.. _dataset-filters:
+.. _filters:

#########
Filters

@@ -1,42 +1,41 @@
-.. _dataset-operations:
+.. _operations:

############
Operations
############

Operations are blocks of YAML code that translate a list of dates into
fields.

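The recipe files referenced by the `literalinclude` directives below (`input.yaml`, `concat.yaml` and `pipe.yaml`) are not included in this commit view. As a rough, hypothetical sketch of where an operation sits, assuming a recipe layout with top-level `dates` and `input` blocks (the exact keys may differ between versions):

.. code:: yaml

   # Hypothetical recipe fragment, not taken from this repository.
   # The dates block defines the dates to be translated into fields;
   # the operation under "input" describes how the fields are built.
   dates:
     start: 2020-01-01
     end: 2020-12-31
     frequency: 6h

   input:
     join: []   # an operation such as join, concat or pipe goes here
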
******
join
******

The join is the process of combining data from several sources. Each
source is expected to provide different variables at the same dates.

.. literalinclude:: input.yaml
   :language: yaml

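The content of `input.yaml` is not shown here. As a hedged illustration only, a join might combine two sources like this; the `mars` source name and its keys are assumptions, not taken from the actual file:

.. code:: yaml

   # Hypothetical sketch of a join (not the actual input.yaml).
   # Each source provides different variables at the same dates.
   input:
     join:
       - mars:
           param: [2t, 10u, 10v]
           levtype: sfc
       - mars:
           param: [t, q, z]
           levtype: pl
           level: [500, 850]
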
********
concat
********

The concatenation is the process of combining different sets of
operations that handle different dates. This is typically used to build
a dataset that spans several years, when several sources are involved,
each providing a different period.

.. literalinclude:: concat.yaml
   :language: yaml

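Again, `concat.yaml` itself is not shown. A hypothetical sketch of a concatenation, in which each branch carries its own `dates` block and covers a different period (the `grib` source and its `path` key are assumptions):

.. code:: yaml

   # Hypothetical sketch of a concat (not the actual concat.yaml).
   # Each branch covers a different period but provides the same variables.
   input:
     concat:
       - dates:
           start: 2000-01-01
           end: 2009-12-31
         grib:
           path: archive-2000s.grib
       - dates:
           start: 2010-01-01
           end: 2020-12-31
         grib:
           path: archive-2010s.grib
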
******
pipe
******

The pipe is the process of transforming fields using filters. The first
step of a pipe is typically a source, a join or another pipe. The
following steps are filters.

.. literalinclude:: pipe.yaml
   :language: yaml

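The `pipe.yaml` file is not included either. A purely illustrative sketch of a pipe whose first step is a join and whose following step is a filter; the filter name `rename` and its arguments are placeholders:

.. code:: yaml

   # Hypothetical sketch of a pipe (not the actual pipe.yaml).
   # The first step produces fields; the following steps are filters.
   input:
     pipe:
       - join:
           - mars:
               param: [10u, 10v]
           - mars:
               param: [2t]
       - rename:   # placeholder filter name
           param:
             2t: t2m
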
@@ -1,4 +1,4 @@
-.. _dataset-sources:
+.. _sources:

#########
Sources

@@ -1,3 +1,5 @@
+.. _overview:
+
##########
Overview
##########

@@ -1,3 +1,61 @@
.. _using-introduction:

##############
Introduction
##############

.. warning::

   The code below still mentions the old name of the package,
   `ecml_tools`. This will be updated once the package is renamed to
   `anemoi-datasets`.

An *Anemoi* dataset is a thin wrapper around a zarr_ store that is
optimised for training data-driven weather forecasting models. It is
organised in such a way that I/O operations are minimised (see
:ref:`overview`).

.. _zarr: https://zarr.readthedocs.io/

To open a dataset, you can use the `open_dataset` function.

.. code:: python

   from anemoi_datasets import open_dataset

   ds = open_dataset("path/to/dataset.zarr")

You can then access the data in the dataset using the `ds` object as if
it were a NumPy array.

.. code:: python

   print(ds.shape)
   print(len(ds))
   print(ds[0])
   print(ds[10:20])

One of the main features of the *anemoi-datasets* package is the ability
to subset or combine datasets.

.. code:: python

   from anemoi_datasets import open_dataset

   ds = open_dataset("path/to/dataset.zarr", start=2000, end=2020)

In that case, a dataset is created that only contains the data between
the years 2000 and 2020. Combining is done by passing multiple paths to
the `open_dataset` function:

.. code:: python

   from anemoi_datasets import open_dataset

   ds = open_dataset("path/to/dataset1.zarr", "path/to/dataset2.zarr")

In the latter case, the datasets are combined along the time dimension
or the variable dimension, depending on the structure of the datasets.