Contributing to marEx

Thank you for your interest in contributing to marEx! This document provides guidelines for contributing to the Marine Extremes Detection and Tracking package.

Getting Started
Development Environment Setup
Contribution Workflow
Code Style Guidelines
Testing Requirements
Documentation Guidelines
Release Process
Getting Help

Getting Started

Prerequisites

Python 3.10 or higher
Git
Familiarity with oceanographic data analysis (helpful but not required)
Basic understanding of xarray, Dask, and the scientific Python ecosystem

Types of Contributions

We welcome various types of contributions:

Bug fixes: Fix issues in existing code
Feature enhancements: Improve existing functionality
New features: Add new capabilities for marine extreme detection/tracking
Documentation: Improve documentation, tutorials, or examples
Performance improvements: Optimise algorithms or memory usage
Testing: Add or improve test coverage
Examples: Create or share new example notebooks or workflows

Development Environment Setup

1. Fork and Clone the Repository

# Fork the repository on GitHub, then clone your fork
git clone https://github.com/YOUR_USERNAME/marEx.git
cd marEx

# Add the upstream repository as a remote
git remote add upstream https://github.com/wienkers/marEx.git

2. Create a Development Environment

We recommend using conda or mamba for managing dependencies:

# Create a new environment
conda create -n marex-dev python=3.10
conda activate marex-dev

# Install the package in development mode with all dependencies
pip install -e ".[dev,full]"

Alternatively, use pip with a virtual environment:

# Create and activate virtual environment
python -m venv marex-dev
source marex-dev/bin/activate  # On Windows: marex-dev\Scripts\activate

# Install development dependencies
pip install -e ".[dev,full]"

3. Install Pre-commit Hooks

Pre-commit hooks ensure code quality and consistency:

# Install pre-commit hooks
pre-commit install

# Run pre-commit on all files to check setup
pre-commit run --all-files

4. Verify Installation

# Run tests to ensure everything works
pytest

# Check code formatting
black --check marEx/
flake8 marEx/

# Verify imports work
python -c "import marEx; print(marEx.__version__)"

Contribution Workflow

1. Create a Feature Branch

# Ensure your main branch is up to date
git checkout main
git pull upstream main

# Create a new branch for your feature/fix
git checkout -b feature/your-feature-name
# or
git checkout -b fix/issue-number-description

2. Make Your Changes

Follow the code style guidelines
Write tests for new functionality
Update documentation as needed
Keep commits focused and atomic

3. Test Your Changes

# Run the full test suite
pytest

# Run specific test categories
pytest -m "not slow"  # Skip slow tests during development
pytest tests/test_gridded_preprocessing.py -v  # Run specific test file

# Check code coverage
coverage run -m pytest
coverage report -m

4. Update Documentation

Add docstrings to new functions/classes
Update relevant documentation files
Add examples if introducing new features

5. Commit Your Changes

# Stage your changes
git add .

# Pre-commit hooks will run automatically
git commit -m "Add feature: brief description of changes"

# If pre-commit hooks modify files, stage and commit again
git add .
git commit -m "Apply pre-commit hook fixes"

6. Push and Create Pull Request

# Push your branch to your fork
git push origin feature/your-feature-name

# Create a pull request on GitHub
# Fill out the pull request template completely

Code Style Guidelines

Code Quality Standards

Functions must accept Dask arrays: All processing functions should check is_dask_collection(da.data) and raise an informative error when the input is not Dask-backed
Memory efficiency: Use .persist() and wait() (from dask.distributed) deliberately to control the size of the Dask task graph and memory use
Grid type support: Ideally, support both structured (3D: time, lat, lon) and unstructured (2D: time, cell) grids; however, focus on structured grids for initial development
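The Dask check described above can be sketched as a small guard function. This is a minimal illustration, not marEx's actual implementation: the helper name `require_dask_backed`, its `name` parameter, and the wording of the error message are all assumptions. Only `is_dask_collection` comes from Dask itself.

```python
from dask.base import is_dask_collection


def require_dask_backed(da, name="da"):
    """Return ``da`` unchanged if its data is a Dask collection, else raise.

    ``da`` is assumed to be an xarray.DataArray, but any object with a
    ``.data`` attribute works. Hypothetical helper, not part of the marEx API.
    """
    if not is_dask_collection(da.data):
        # Tell the user what was received and how to fix it.
        raise TypeError(
            f"{name!r} must be backed by a Dask array, got "
            f"{type(da.data).__name__}. Chunk it first, e.g. "
            f"{name}.chunk({{'time': 30}})."
        )
    return da
```

A DataArray built directly from a NumPy array would fail this check, while the same array after `.chunk(...)` would pass. The chunk size in the message is only a placeholder.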

Documentation Standards

Docstrings: All public functions/classes should have comprehensive docstrings
Type hints: Add type hints where practical
Examples: Include usage examples in docstrings for complex functions
Parameter documentation: Document all parameters, their types, and expected values
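As an illustration of these standards, here is a sketch of a fully documented function written in the NumPy docstring style that the Documentation Guidelines call for. The function `exceedance_fraction` is hypothetical and exists only to show the layout: type hints, documented parameters, and a runnable example.

```python
from collections.abc import Sequence


def exceedance_fraction(values: Sequence[float], threshold: float) -> float:
    """Return the fraction of values strictly above a threshold.

    Hypothetical helper used only to illustrate docstring layout.

    Parameters
    ----------
    values : sequence of float
        Non-empty sequence of samples, e.g. temperature anomalies.
    threshold : float
        Value a sample must exceed to be counted.

    Returns
    -------
    float
        Fraction between 0.0 and 1.0.

    Raises
    ------
    ValueError
        If ``values`` is empty.

    Examples
    --------
    >>> exceedance_fraction([0.1, 0.5, 1.2, 2.0], threshold=1.0)
    0.5
    """
    if len(values) == 0:
        raise ValueError("values must not be empty")
    # Each comparison is a bool, which sums as 0 or 1.
    return sum(v > threshold for v in values) / len(values)
```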

Testing Requirements

Test Structure

marEx uses pytest with the following test organisation:

tests/
├── __init__.py
├── conftest.py                    # Shared fixtures and configuration
├── test_gridded_preprocessing.py  # Gridded data preprocessing tests
├── test_gridded_tracking.py       # Gridded data tracking tests
├── test_unstructured_preprocessing.py  # Unstructured data tests
├── test_plotx.py                  # Plotting functionality tests
├── test_integration.py            # Integration tests
└── data/                          # Test data files

Test Categories

Tests are organised using pytest markers:

@pytest.mark.slow: Computationally expensive tests (skip with -m "not slow")
@pytest.mark.integration: End-to-end workflow tests
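Applying the markers is a matter of decorating the test function. The sketch below uses hypothetical test names and placeholder bodies. Real tests would exercise marEx functions instead.

```python
import pytest


@pytest.mark.slow
def test_tracking_long_record():
    """Placeholder for a computationally expensive test."""
    assert sum(range(1_000)) == 499_500


@pytest.mark.integration
def test_preprocess_then_track():
    """Placeholder for an end-to-end workflow test."""
    stages = ["preprocess", "track"]
    assert stages[0] == "preprocess" and stages[-1] == "track"
```

With these decorators in place, `pytest -m "not slow"` deselects the first test and `pytest -m integration` selects only the second.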

Writing Tests

Test Requirements

Test new functionality: All new features must have corresponding tests
Test edge cases: Include tests for boundary conditions and error cases
Test with both grid types: Test structured and unstructured data where applicable
Use fixtures: Leverage existing fixtures for common test data
Mock external dependencies: Use mocking for expensive operations or external services
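The mocking and edge-case points can be sketched with the standard library's `unittest.mock`. The function under test, `count_events`, and its `load_labels` argument are hypothetical stand-ins for a marEx routine that depends on an expensive data read.

```python
from unittest import mock


def count_events(load_labels):
    """Hypothetical function under test: count distinct non-zero labels."""
    labels = load_labels()
    return len({label for label in labels if label != 0})


def test_count_events_with_mocked_loader():
    # The Mock replaces an expensive read with a small in-memory list.
    fake_loader = mock.Mock(return_value=[0, 1, 1, 2, 0, 3])
    assert count_events(fake_loader) == 3
    fake_loader.assert_called_once_with()


def test_count_events_empty_input():
    # Edge case: no data should give zero events, not an error.
    assert count_events(mock.Mock(return_value=[])) == 0
```

Passing the loader in as an argument keeps the test free of patching. Where the dependency is imported inside the module under test, `mock.patch` would be the usual alternative.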

Running Tests

# Run all tests
pytest

# Run specific test suites (as in CI)
pytest tests/test_gridded_preprocessing.py -v --tb=short
pytest tests/test_unstructured_preprocessing.py -v --tb=short
pytest tests/test_gridded_tracking.py -v --tb=short

# Skip slow tests during development
pytest -m "not slow"

# Run only integration tests
pytest -m integration

# Run with coverage
coverage run -m pytest
coverage report -m
coverage html  # Generate HTML coverage report

Test Data

Use the existing test fixtures in conftest.py:

def test_my_function(dask_client, sample_sst_data):
    """Test using shared fixtures."""
    # my_function is a placeholder for the function under test.
    with dask_client:
        result = my_function(sample_sst_data)
        assert result is not None

Documentation Guidelines

Documentation Structure

marEx documentation is built with Sphinx and includes:

API documentation: Auto-generated from docstrings
User guides: Step-by-step tutorials
Examples: Jupyter notebooks demonstrating workflows
Development documentation: This file and related developer resources

Building Documentation

cd docs/
make html  # Build HTML documentation
make clean  # Clean build artifacts

# View documentation
open _build/html/index.html  # macOS
xdg-open _build/html/index.html  # Linux

Documentation Standards

Docstring format: Use NumPy-style docstrings
Examples in docstrings: Include working code examples
Cross-references: Use Sphinx cross-references for linking
Jupyter notebooks: Keep example notebooks clean and well-documented
User-focused: Write documentation from the user's perspective

Adding New Documentation

API changes: Updated docstrings are picked up automatically by the API documentation build
New features: Add examples to relevant user guide sections
Tutorials: Create new Jupyter notebooks in docs/tutorials/

Release Process

Version Management

marEx uses setuptools_scm for automatic versioning based on git tags:

Development versions: 0.2.0.dev10+g1234567 (based on commits since last tag)
Release versions: 0.2.0 (based on git tags)
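To make the two version shapes concrete, the sketch below splits a version string into its parts. It is a simplified illustration covering only the two forms listed above, not a full PEP 440 parser, and the function name `describe_version` and the dictionary keys are assumptions.

```python
import re

# Simplified pattern for the two version shapes listed above only.
_VERSION_RE = re.compile(
    r"^(?P<base>\d+\.\d+\.\d+)"
    r"(?:\.dev(?P<distance>\d+)\+g(?P<commit>[0-9a-f]+))?$"
)


def describe_version(version: str) -> dict:
    """Split a version string into base version, dev distance, and commit."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    distance = match.group("distance")
    return {
        "base_version": match.group("base"),
        "is_dev": distance is not None,
        "commits_since_tag": int(distance) if distance is not None else 0,
        "commit": match.group("commit"),  # None for tagged releases
    }
```

For `"0.2.0.dev10+g1234567"` this reports a development build 10 commits past the last tag at commit `1234567`. For `"0.2.0"` it reports a tagged release.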

Release Workflow

Prepare release:

# Update CHANGELOG.md with new version
# Ensure all tests pass
pytest

# Check documentation builds
cd docs/ && make html

Create release:

# Tag the release
git tag -a v0.2.0 -m "Release version 0.2.0"
git push upstream v0.2.0

Build and distribute:

# Build package
python -m build

# Upload to PyPI (maintainers only)
twine upload dist/*

Release Checklist

All tests pass on all supported Python versions
Documentation builds without warnings
Version tag created and pushed
GitHub release created with release notes
Package uploaded to PyPI

Getting Help

Resources

Documentation: https://marex.readthedocs.io/
GitHub Issues: Report bugs or request features
GitHub Discussions: Ask questions or discuss ideas
Email: Contact maintainers for sensitive issues

Issue Reporting

When reporting issues:

Use the provided issue templates
Include minimal reproducible examples
Provide environment information (Python version, OS, package versions)
Include full error messages and tracebacks
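The environment information can be gathered with a short standard-library script such as the sketch below. The function name `environment_report` and the default package list are assumptions. Adjust the list to whatever is relevant to the issue.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages=("marEx", "xarray", "dask", "numpy")):
    """Return Python, OS, and package versions as a block of text."""
    lines = [
        f"Python: {sys.version.split()[0]}",
        f"OS: {platform.platform()}",
    ]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            version = "not installed"
        lines.append(f"{name}: {version}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(environment_report())
```

Paste the printed output into the issue alongside the traceback.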

Feature Requests

When requesting features:

Describe the scientific use case
Explain why existing functionality doesn't meet your needs
Provide examples of the desired API
Consider contributing the implementation

Recognition

Contributors are recognised in:

GitHub contributors list
Release notes
Documentation acknowledgements
Academic publications (for significant contributions)

Thank you for contributing to marEx!
