Thank you for your interest in contributing to marEx! This document provides guidelines for contributing to the Marine Extremes Detection and Tracking package.
- Getting Started
- Development Environment Setup
- Contribution Workflow
- Code Style Guidelines
- Testing Requirements
- Documentation Guidelines
- Release Process
- Getting Help
- Python 3.10 or higher
- Git
- Familiarity with oceanographic data analysis (helpful but not required)
- Basic understanding of xarray, Dask, and scientific Python ecosystem
We welcome various types of contributions:
- Bug fixes: Fix issues in existing code
- Feature enhancements: Improve existing functionality
- New features: Add new capabilities for marine extreme detection/tracking
- Documentation: Improve documentation, tutorials, or examples
- Performance improvements: Optimise algorithms or memory usage
- Testing: Add or improve test coverage
- Examples: Create/Share new example notebooks or workflows
```bash
# Fork the repository on GitHub, then clone your fork
git clone https://github.com/YOUR_USERNAME/marEx.git
cd marEx

# Add the upstream repository as a remote
git remote add upstream https://github.com/wienkers/marEx.git
```

We recommend using conda or mamba for managing dependencies:
```bash
# Create a new environment
conda create -n marex-dev python=3.10
conda activate marex-dev

# Install the package in development mode with all dependencies
pip install -e ".[dev,full]"
```

Alternatively, use pip with a virtual environment:
```bash
# Create and activate virtual environment
python -m venv marex-dev
source marex-dev/bin/activate  # On Windows: marex-dev\Scripts\activate

# Install development dependencies
pip install -e ".[dev,full]"
```

Pre-commit hooks ensure code quality and consistency:
```bash
# Install pre-commit hooks
pre-commit install

# Run pre-commit on all files to check setup
pre-commit run --all-files
```

Verify your setup:

```bash
# Run tests to ensure everything works
pytest

# Check code formatting
black --check marEx/
flake8 marEx/

# Verify imports work
python -c "import marEx; print(marEx.__version__)"
```
```bash
# Ensure your main branch is up to date
git checkout main
git pull upstream main

# Create a new branch for your feature/fix
git checkout -b feature/your-feature-name
# or
git checkout -b fix/issue-number-description
```

- Follow the code style guidelines
- Write tests for new functionality
- Update documentation as needed
- Keep commits focused and atomic
```bash
# Run the full test suite
pytest

# Run specific test categories
pytest -m "not slow"                           # Skip slow tests during development
pytest tests/test_gridded_preprocessing.py -v  # Run specific test file

# Check code coverage
coverage run -m pytest
coverage report -m
```

- Add docstrings to new functions/classes
- Update relevant documentation files
- Add examples if introducing new features
```bash
# Stage your changes
git add .

# Pre-commit hooks will run automatically
git commit -m "Add feature: brief description of changes"

# If pre-commit hooks modify files, stage and commit again
git add .
git commit -m "Apply pre-commit hook fixes"
```
```bash
# Push your branch to your fork
git push origin feature/your-feature-name

# Create a pull request on GitHub
# Fill out the pull request template completely
```

- Functions must accept Dask arrays: All processing functions should validate `is_dask_collection(da.data)` and raise informative errors for non-Dask arrays
- Memory efficiency: Strategically use `.persist()` and `wait()` to manage the Dask task graph and memory
- Grid type support: Ideally, support both structured (3D: time, lat, lon) and unstructured (2D: time, cell) grids; however, focus on structured grids for initial development
- Docstrings: All public functions/classes should have comprehensive docstrings
- Type hints: Add type hints where practical
- Examples: Include usage examples in docstrings for complex functions
- Parameter documentation: Document all parameters, their types, and expected values
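Putting these standards together, a NumPy-style docstring might look like the following. The function and its parameters are invented for illustration; they are not part of the marEx API.

```python
import numpy as np


def rolling_climatology(data: np.ndarray, window: int = 11) -> np.ndarray:
    """
    Compute a rolling-mean climatology along the time axis.

    Parameters
    ----------
    data : np.ndarray
        Input field with time as the leading dimension.
    window : int, optional
        Width of the rolling window in time steps (default 11).

    Returns
    -------
    np.ndarray
        Smoothed field with the same shape as `data`.

    Examples
    --------
    >>> clim = rolling_climatology(np.ones((365, 10)), window=31)
    >>> clim.shape
    (365, 10)
    """
    kernel = np.ones(window) / window
    return np.apply_along_axis(
        lambda x: np.convolve(x, kernel, mode="same"), 0, data
    )
```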
marEx uses pytest with the following test organisation:
```
tests/
├── __init__.py
├── conftest.py                          # Shared fixtures and configuration
├── test_gridded_preprocessing.py        # Gridded data preprocessing tests
├── test_gridded_tracking.py             # Gridded data tracking tests
├── test_unstructured_preprocessing.py   # Unstructured data tests
├── test_plotx.py                        # Plotting functionality tests
├── test_integration.py                  # Integration tests
└── data/                                # Test data files
```
Tests are organised using pytest markers:
- `@pytest.mark.slow`: Computationally expensive tests (skip with `-m "not slow"`)
- `@pytest.mark.integration`: End-to-end workflow tests
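Applying a marker is a one-line decorator. For example (the test name here is hypothetical):

```python
import pytest


@pytest.mark.slow
def test_full_resolution_tracking():
    """Expensive end-to-end run; excluded by `pytest -m "not slow"`."""
    ...
```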
- Test new functionality: All new features must have corresponding tests
- Test edge cases: Include tests for boundary conditions and error cases
- Test with both grid types: Test structured and unstructured data where applicable
- Use fixtures: Leverage existing fixtures for common test data
- Mock external dependencies: Use mocking for expensive operations or external services
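The edge-case and mocking guidelines above can be sketched as follows. `detect_extremes` is a toy stand-in, not the real marEx routine, and the mocked I/O is deliberately trivial.

```python
from unittest import mock

import numpy as np


def detect_extremes(sst: np.ndarray, threshold: float) -> np.ndarray:
    """Toy stand-in for an extreme-detection routine (hypothetical)."""
    if sst.size == 0:
        raise ValueError("sst must not be empty")
    return sst > threshold


def test_detect_extremes_edge_cases():
    # Boundary condition: values exactly at the threshold are not extreme
    assert not detect_extremes(np.array([1.0]), threshold=1.0).any()
    # Error case: empty input raises an informative error
    try:
        detect_extremes(np.array([]), threshold=1.0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for empty input")


def test_expensive_io_is_mocked():
    # Mock an expensive loader instead of touching the file system
    with mock.patch("builtins.open", mock.mock_open(read_data="sst")) as m:
        with open("fake.nc") as f:
            assert f.read() == "sst"
        m.assert_called_once_with("fake.nc")
```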
```bash
# Run all tests
pytest

# Run specific test suites (as in CI)
pytest tests/test_gridded_preprocessing.py -v --tb=short
pytest tests/test_unstructured_preprocessing.py -v --tb=short
pytest tests/test_gridded_tracking.py -v --tb=short

# Skip slow tests during development
pytest -m "not slow"

# Run only integration tests
pytest -m integration

# Run with coverage
coverage run -m pytest
coverage report -m
coverage html  # Generate HTML coverage report
```

Use the existing test fixtures in `conftest.py`:
```python
def test_my_function(dask_client, sample_sst_data):
    """Test using shared fixtures."""
    with dask_client:
        result = my_function(sample_sst_data)
        assert result is not None
```

marEx documentation is built with Sphinx and includes:
- API documentation: Auto-generated from docstrings
- User guides: Step-by-step tutorials
- Examples: Jupyter notebooks demonstrating workflows
- Development documentation: This file and related developer resources
```bash
cd docs/
make html   # Build HTML documentation
make clean  # Clean build artifacts

# View documentation
open _build/html/index.html      # macOS
xdg-open _build/html/index.html  # Linux
```

- Docstring format: Use NumPy-style docstrings
- Examples in docstrings: Include working code examples
- Cross-references: Use Sphinx cross-references for linking
- Jupyter notebooks: Keep example notebooks clean and well-documented
- User-focused: Write documentation from the user's perspective
- API changes: Docstrings are automatically included
- New features: Add examples to relevant user guide sections
- Tutorials: Create new Jupyter notebooks in `docs/tutorials/`
marEx uses setuptools_scm for automatic versioning based on git tags:
- Development versions: `0.2.0.dev10+g1234567` (based on commits since the last tag)
- Release versions: `0.2.0` (based on git tags)
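setuptools_scm is typically enabled via `pyproject.toml`; a minimal sketch is shown below (marEx's actual configuration may differ):

```toml
[build-system]
requires = ["setuptools>=64", "setuptools_scm>=8"]
build-backend = "setuptools.build_meta"

# The presence of this table enables git-tag-based versioning
[tool.setuptools_scm]
```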
1. Prepare release:

   ```bash
   # Update CHANGELOG.md with new version
   # Ensure all tests pass
   pytest
   # Check documentation builds
   cd docs/ && make html
   ```

2. Create release:

   ```bash
   # Tag the release
   git tag -a v0.2.0 -m "Release version 0.2.0"
   git push upstream v0.2.0
   ```

3. Build and distribute:

   ```bash
   # Build package
   python -m build
   # Upload to PyPI (maintainers only)
   twine upload dist/*
   ```
- All tests pass on all supported Python versions
- Documentation builds without warnings
- Version tag created and pushed
- GitHub release created with release notes
- Package uploaded to PyPI
- Documentation: https://marex.readthedocs.io/
- GitHub Issues: Report bugs or request features
- GitHub Discussions: Ask questions or discuss ideas
- Email: Contact maintainers for sensitive issues
When reporting issues:
- Use the provided issue templates
- Include minimal reproducible examples
- Provide environment information (Python version, OS, package versions)
- Include full error messages and tracebacks
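A small helper like the following can gather the environment details reviewers usually need. It is a sketch using only the standard library; the package list is illustrative.

```python
import platform
import sys
from importlib.metadata import PackageNotFoundError, version


def environment_report(packages=("marEx", "xarray", "dask", "numpy")):
    """Collect version info to paste into a bug report."""
    info = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    for pkg in packages:
        try:
            info[pkg] = version(pkg)
        except PackageNotFoundError:
            info[pkg] = "not installed"
    return info


for key, value in environment_report().items():
    print(f"{key}: {value}")
```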
When requesting features:
- Describe the scientific use case
- Explain why existing functionality doesn't meet your needs
- Provide examples of the desired API
- Consider contributing the implementation
Contributors are recognised in:
- GitHub contributors list
- Release notes
- Documentation acknowledgments
- Academic publications (for significant contributions)
Thank you for contributing to marEx!