rompy-binary-datasources

Overview

The rompy_binary_datasources package is an extension to the main rompy package. It provides source classes that wrap in-memory binary objects such as pandas DataFrames and xarray Datasets. These classes are kept out of the main package to avoid issues with OpenAPI schema generation, while remaining fully functional for users who need to work with these data types.

Integration with rompy

This package integrates with the main rompy package through an import stub mechanism. When users try to use the binary data source classes from the main package without this package installed, they receive an error message directing them to install it.

Key features:

  • Transparent integration: Classes behave as if they were part of the main package
  • Helpful error messages: Clear guidance on how to install this package when needed
  • Backward compatibility: Maintains existing import paths for code that expects these classes in the main package
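The stub idea described above can be illustrated with a minimal sketch. This is not rompy's actual implementation: the names `_MissingBinaryDatasource` and `load_or_stub` are hypothetical, and the sketch only assumes that the real classes are importable from the `rompy_binary_datasources` module when the package is installed.

```python
import importlib


class _MissingBinaryDatasource:
    """Hypothetical placeholder for a class that lives in an optional package."""

    def __init__(self, class_name, package="rompy_binary_datasources"):
        self._class_name = class_name
        self._package = package

    def __call__(self, *args, **kwargs):
        # Instantiating the placeholder fails with installation guidance.
        raise ImportError(
            f"{self._class_name} requires the optional package "
            f"'{self._package}'. Install it with: pip install {self._package}"
        )


def load_or_stub(class_name, package="rompy_binary_datasources"):
    """Return the real class if the package is importable, else a stub."""
    try:
        module = importlib.import_module(package)
        return getattr(module, class_name)
    except (ImportError, AttributeError):
        return _MissingBinaryDatasource(class_name, package)


# Resolves to the real class when the package is installed,
# otherwise to a stub that raises ImportError on use.
SourceDataset = load_or_stub("SourceDataset")
```

With this pattern, code that imports the name keeps working at import time, and the failure only surfaces, with an actionable message, when someone actually tries to construct the class.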

Installation

To install rompy_binary_datasources:

$ pip install rompy_binary_datasources

The package is designed to be used alongside the main rompy package:

$ pip install rompy rompy_binary_datasources
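To confirm that both packages are importable after installation, a quick check along the following lines may help. It is a sketch using only the standard library; `missing_packages` is a hypothetical helper, and the module names are taken from the import statements used in this README.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(["rompy", "rompy_binary_datasources"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("Both packages are importable.")
```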

Classes Provided

SourceDataset

A source class for wrapping existing xarray Dataset objects:

import xarray as xr
from rompy_binary_datasources import SourceDataset

# Create a dataset
ds = xr.Dataset(...)

# Wrap it in a SourceDataset
source = SourceDataset(obj=ds)

# Use it in rompy workflows
# ...

SourceTimeseriesDataFrame

A source class for wrapping pandas DataFrame timeseries objects:

import pandas as pd
from rompy_binary_datasources import SourceTimeseriesDataFrame

# Create a timeseries DataFrame
df = pd.DataFrame(...)
df.index = pd.DatetimeIndex(...)
df.index.name = "time"

# Wrap it in a SourceTimeseriesDataFrame
source = SourceTimeseriesDataFrame(obj=df)

# Use it in rompy workflows
# ...

Usage Examples

Working with in-memory datasets

import numpy as np
import pandas as pd
import xarray as xr
from rompy_binary_datasources import SourceDataset
from rompy.core.data import Data

# Create an xarray Dataset
times = pd.date_range("2023-01-01", "2023-01-10", freq="1D")
lats = np.linspace(-90, 90, 19)
lons = np.linspace(-180, 180, 37)

ds = xr.Dataset(
    data_vars={
        "temperature": (["time", "lat", "lon"], np.random.rand(len(times), len(lats), len(lons))),
    },
    coords={
        "time": times,
        "lat": lats,
        "lon": lons,
    }
)

# Wrap in SourceDataset
source = SourceDataset(obj=ds)

# Use in rompy Data object
data = Data(
    variables=["temperature"],
    source=source
)

# Use the data in rompy workflows
# ...

Working with timeseries DataFrames

import pandas as pd
import numpy as np
from rompy_binary_datasources import SourceTimeseriesDataFrame
from rompy.core.data import Data

# Create a timeseries DataFrame
times = pd.date_range("2023-01-01", "2023-01-10", freq="1h")  # lowercase "h"; "H" is deprecated in recent pandas
df = pd.DataFrame({
    "temperature": np.random.rand(len(times)),
    "humidity": np.random.rand(len(times)) * 100,
}, index=times)
df.index.name = "timestamp"  # Must have a name

# Wrap in SourceTimeseriesDataFrame
source = SourceTimeseriesDataFrame(obj=df)

# Use in rompy Data object
data = Data(
    variables=["temperature", "humidity"],
    source=source
)

# Use the data in rompy workflows
# ...
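The comment in the example above notes that the DataFrame index must have a name. A small pre-check like the one below can catch a missing name or a non-datetime index before wrapping. This is a sketch only: `check_timeseries_index` is a hypothetical helper, not part of rompy or rompy_binary_datasources, and this README does not state which checks SourceTimeseriesDataFrame performs itself.

```python
import pandas as pd


def check_timeseries_index(df: pd.DataFrame) -> pd.DataFrame:
    """Raise early if the index is not a named DatetimeIndex (hypothetical helper)."""
    if not isinstance(df.index, pd.DatetimeIndex):
        raise TypeError(
            "expected a DatetimeIndex, got " + type(df.index).__name__
        )
    if df.index.name is None:
        raise ValueError("the index must have a name, e.g. df.index.name = 'time'")
    return df


times = pd.date_range("2023-01-01", periods=3, freq="D")
df = pd.DataFrame({"temperature": [1.0, 2.0, 3.0]}, index=times)
df.index.name = "time"

# Returns the same DataFrame unchanged when the index is valid.
checked = check_timeseries_index(df)
```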

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

Free software: BSD license
