diff --git a/docs/design_docs/index.md b/docs/design_docs/index.md index 525594c31d..136dc3f6a6 100644 --- a/docs/design_docs/index.md +++ b/docs/design_docs/index.md @@ -6,6 +6,11 @@ :titlesonly: true shared_steps +unified_base_mesh +unified_mesh_prepare_coastline +unified_mesh_prepare_river_network +unified_mesh_build_sizing_field +unified_mesh_create_base_mesh vector_reconstruction template ``` diff --git a/docs/design_docs/unified_base_mesh.md b/docs/design_docs/unified_base_mesh.md new file mode 100644 index 0000000000..801c73c5d7 --- /dev/null +++ b/docs/design_docs/unified_base_mesh.md @@ -0,0 +1,784 @@ +# Unified Mesh: Global Base Mesh Workflow + +date: 2026/04/13 + +Contributors: + +- Xylar Asay-Davis +- Codex + +## Summary + +This design proposes a Polaris workflow for creating a global, spherical MPAS +base mesh for the E3SM land, river, ocean and sea-ice models using JIGSAW. The +starting point is the work-in-progress +[`mpas_land_mesh`](https://github.com/changliao1025/mpas_land_mesh) +package, which currently combines geospatial preprocessing, JSON-based +configuration, JIGSAW setup, mesh generation and ad hoc job creation in a +single standalone workflow. + +The Polaris implementation should preserve the relevant parts of that workflow +while translating them into shared steps, Polaris configuration files and +existing MPAS/JIGSAW infrastructure. In particular, the design should reuse +existing functionality in `polaris.mesh`, `mpas_tools` and the existing +`e3sm/init` topography remap and cull tasks wherever practical, rather than carrying +forward the standalone workflow's JSON configuration system or broad utility +modules. + +The initial focus is a feature-aware global base mesh whose resolution can be +informed by coastline and river-network data and whose output is directly +usable by downstream Polaris tasks such as `e3sm/init` topography remapping and mesh +culling. 
This design began as a first draft because +[`mpas_land_mesh`](https://github.com/changliao1025/mpas_land_mesh) +is still evolving, but the stage-specific Polaris workflow pieces described +here are now implemented: + +- coastline preparation, Polaris pull request + ; +- river-network preparation, Polaris pull request + ; and +- sizing-field construction, Polaris pull request + . + +The document therefore records both the intended workflow interfaces and the +current implementation state. Polaris now has shared coastline, +river-network, sizing-field, and base-mesh steps together with standalone +unified base-mesh tasks and explicit downstream remap and cull task variants. + +This document should be treated as an umbrella design for the overall workflow. +As the work is refined, we expect to add more focused design documents for +stages such as `prepare_coastline`, `prepare_river_network`, +`build_sizing_field`, and the final base-mesh stage. These stage names are +only working names for now and should not be treated as final task, step, class +or component names. + +The stage-level shared products should be built on a small set of supported +regular lon/lat target grids rather than on arbitrary default resolutions. A +short list of supported target-grid tiers is likely important for caching and +reuse of expensive shared steps such as coastline preparation, +river-network preparation, and topography remapping. + +Success means that Polaris gains a documented path to build a global MPAS base +mesh with feature-aware resolution controls, using Polaris-native setup and run +machinery, and that the resulting mesh can be consumed by existing downstream +E3SM workflows without an extra conversion stage. 
+ +## Detailed Stage Designs + +The four stage-specific unified-mesh design documents provide the detailed +workflow design that this umbrella document summarizes: + +- [Unified Mesh: Coastline Preparation](unified_mesh_prepare_coastline.md) + describes the shared coastline-preparation workflow. +- [Unified Mesh: River Network Preparation](unified_mesh_prepare_river_network.md) + describes the shared river-network preprocessing workflow. +- [Unified Mesh: Sizing-Field Construction](unified_mesh_build_sizing_field.md) + describes how the upstream shared products are combined into the raster + sizing field used for final mesh generation. +- [Unified Mesh: Base-Mesh Creation and Downstream Integration](unified_mesh_create_base_mesh.md) + describes the final base-mesh stage together with downstream topography + remap and land or ocean culling variants. + +## Requirements + +### Requirement: Global Spherical MPAS Base Mesh + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +Polaris shall support creation of a global, spherical MPAS base mesh suitable +for the needs of the E3SM land, river, ocean and sea-ice models. + +The workflow shall support meshes whose resolution varies spatially in response +to model needs rather than being limited to quasi-uniform meshes. + +The primary output of the workflow shall be an MPAS mesh in standard MPAS form. + +### Requirement: Downstream E3SM Interoperability + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The generated base mesh shall be usable as input to downstream E3SM and +Polaris tools, including the existing topography remap and cull workflows. + +The workflow shall not require a separate ad hoc conversion step before the +mesh can be passed to those downstream tools. 
+ +### Requirement: Feature-Aware Resolution Control + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The workflow shall support resolution control based on geospatial features that +are important for a unified land-river-ocean mesh. At a minimum, the first +implementation shall support coastline and river-network information. + +The design shall allow additional feature classes such as watershed +boundaries, lakes or dams to be added later without redesigning the full +workflow. + +### Requirement: Shared Target-Grid Tiers and Cacheable Preprocessing + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The shared preprocessing stages of the workflow shall operate on a small +discrete set of supported regular lon/lat target grids rather than on an +arbitrary default resolution. + +Within a given workflow instance, the same selected target-grid tier shall be +used consistently by `prepare_coastline`, `prepare_river_network`, and +`build_sizing_field`. + +The first design should favor a short supported list, likely two or three +tiers, so shared-step outputs can be cached and reused effectively. + +### Requirement: Polaris-Native Configuration and Execution + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The workflow shall be expressed as Polaris steps and tasks and configured with +Polaris' ini-style configuration files. + +The workflow shall support standard Polaris setup, shared-step reuse, +provenance and machine execution patterns. + +### Requirement: Selective Migration and Maintainability + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The Polaris implementation shall prefer existing Polaris, `mpas_tools`, +JIGSAW and conda-forge capabilities wherever practical. 
+ +Migration from +[`mpas_land_mesh`](https://github.com/changliao1025/mpas_land_mesh) +shall focus on the specific algorithms and +helpers needed for the Polaris workflow rather than wholesale reuse of general +utility modules or standalone workflow infrastructure. + +## Algorithm Design + +### Algorithm Design: Global Spherical MPAS Base Mesh + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The existing spherical JIGSAW workflow in `polaris.mesh` should be the starting +point for the new capability. The current `SphericalBaseStep` already handles +the parts of the workflow that are generic to MPAS spherical mesh generation: +writing the JIGSAW inputs, invoking JIGSAW, converting the JIGSAW triangles to +an MPAS mesh, updating MPAS fields such as `cellWidth`, and creating +`graph.info`. + +The unified base-mesh workflow should therefore focus on creating the +feature-aware mesh-spacing description rather than replacing the existing +JIGSAW-to-MPAS path. In the simplest formulation, the workflow builds a +global lon/lat-based sizing field and then reuses the existing spherical mesh +step to convert that sizing field into a JIGSAW mesh and finally into MPAS +form. + +This keeps the core mesh-generation algorithm close to existing Polaris +patterns and minimizes the amount of new meshing infrastructure that must be +maintained on the E3SM timeline. + +### Algorithm Design: Downstream E3SM Interoperability + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The output contract for the workflow should be aligned with what downstream +Polaris tasks already consume. The immediate target is the standard MPAS base +mesh plus associated graph file used by the existing E3SM topography remap and +cull tasks. + +Because the remap and cull tasks already operate on `base_mesh.nc`, the design +should treat that file as the primary authoritative output. 
Any additional +intermediate products needed for land or river workflows, such as cleaned +feature vectors or rasterized masks, should remain separate artifacts rather +than becoming a replacement mesh format. + +This requirement argues for producing a standard base mesh first and layering +additional land/river products around it, not embedding workflow-specific +assumptions into the base-mesh format. + +### Algorithm Design: Feature-Aware Resolution Control + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The feature-aware part of the workflow should be decomposed into two stages: +feature preprocessing and sizing-field construction. + +Feature preprocessing converts raw source datasets into cleaned global inputs +that are stable enough to drive mesh sizing. Based on the current +[`mpas_land_mesh`](https://github.com/changliao1025/mpas_land_mesh) +workflow, the first supported sources should be: + +- a coastline mask derived first from the existing `e3sm/init/topo` + topography product and its land/ocean masking logic, so the unified mesh + uses the same coastline interpretation as downstream topography remap and + cull workflows. A Natural Earth-derived coastline should remain available as + a fallback if the topo-derived coastline proves unsuitable, and +- a simplified global river network derived from HydroRIVERS or an equivalent + source. + +Sizing-field construction then combines a baseline resolution with local +refinement targets derived from those preprocessed features. The precise blend +function can evolve, but the first implementation should be framed as a global +sizing field on a regular lon/lat grid because that matches the existing +Polaris spherical JIGSAW workflow. + +For coastline-driven refinement, a signed-distance formulation on the sphere +should be considered the preferred first approach. 
If a coastline curve or +region can be derived cleanly from the `e3sm/init/topo` land/ocean +interpretation, `mpas_tools.mesh.creation.signed_distance` or a closely +related method can be used to build smooth coastal transition zones and inland +or oceanward buffers directly from spherical geometry. This approach is +promising because it matches existing Polaris mesh patterns and may avoid some +of the raster-buffer and antimeridian complexity present in the standalone +workflow. + +The design should assume that coastline and river controls are modular inputs +to the sizing-field builder. Additional controls for watersheds, lakes or dams +should enter through the same interface rather than through new one-off mesh +builders. + +### Algorithm Design: Shared Target-Grid Tiers and Cacheable Preprocessing + +Date last modified: 2026/04/22 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The workflow should standardize on a small set of supported regular lon/lat +target-grid resolutions for all shared preprocessing products. These are: + +- 0.25 degree; +- 0.125 degree; +- 0.0625 degree; and +- 0.03125 degree. + +This range is expected to cover needs from quick testing that is not +scientifically validated (unified meshes with coarser than ~120 km resolution) +to our highest resolutions (finer than ~5 km). + +The selected target-grid tier should be a cross-cutting workflow choice. It +should control the resolution used for shared `e3sm/init/topo/combine` +lat/lon products, coastline preprocessing, river-network preprocessing, and +the final sizing field. This avoids mismatched products between stages and +makes cache reuse straightforward. + +The design should not prevent future support for custom target-grid +resolutions. However, arbitrary resolutions should not be the default +workflow path until there is a clear need, because they weaken cache reuse and +make the shared-step product space harder to manage. 
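To make the tier idea concrete, the following is a minimal sketch of how a discrete tier could map to grid dimensions and to a cache-friendly directory name. The function names, the cell-centered grid convention and the subdirectory format are illustrative assumptions, not the Polaris implementation.

```python
# Supported tiers in degrees, as listed in this section.
TARGET_GRID_RESOLUTIONS = (0.25, 0.125, 0.0625, 0.03125)


def grid_shape(resolution):
    """Return (nlat, nlon) for a global grid at this resolution.

    Assumes a cell-centered grid covering the full sphere; the actual
    Polaris products may use a different convention.
    """
    nlat = round(180.0 / resolution)
    nlon = round(360.0 / resolution)
    return nlat, nlon


def tier_subdir(resolution):
    """Hypothetical subdirectory name that doubles as a cache key."""
    return f"lat_lon_{resolution:.5f}_degree"


for res in TARGET_GRID_RESOLUTIONS:
    print(tier_subdir(res), grid_shape(res))
```

Because each tier yields a distinct name, products from different tiers cannot collide in a shared work directory, which is the property the short tier list is meant to guarantee.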
+ +### Algorithm Design: Polaris-Native Configuration and Execution + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The standalone workflow currently uses JSON templates, repeated key mutation +and explicit HPC-script generation. Polaris already has suitable +abstractions: config sections, shared steps, cached outputs, work-directory +layout and machine-aware job submission. + +The algorithmic structure of the new workflow should therefore be a dependency +graph of Polaris steps, not a mutable configuration file plus a generated +driver script. A natural decomposition is: + +1. preprocess coastline inputs; +2. preprocess river-network inputs; +3. assemble a unified sizing field; +4. generate the spherical JIGSAW mesh and convert it to MPAS form; and +5. optionally pass the base mesh into downstream remap and cull tasks. + +This step decomposition matches Polaris' execution model and supports reuse of +shared expensive products across multiple tasks. + +### Algorithm Design: Selective Migration and Maintainability + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The migration strategy should begin with an audit of +[`mpas_land_mesh`](https://github.com/changliao1025/mpas_land_mesh) +capabilities grouped into three categories: + +- functionality already available in Polaris or `mpas_tools`, +- functionality available from direct use of conda-forge packages, and +- functionality that truly requires targeted extraction or reimplementation. + +The current standalone package includes broad helper modules such as +`utilities/vector.py`, JSON configuration managers and job-script generators. +Those are useful in the standalone context but should not be treated as the +default implementation strategy in Polaris. + +Instead, new shared helpers should be introduced only when a focused algorithm +cannot be expressed clearly with existing package APIs or current Polaris +utilities. 
This keeps the eventual Polaris implementation smaller, easier to +review and more adaptable as +[`mpas_land_mesh`](https://github.com/changliao1025/mpas_land_mesh) +continues to change. + +River-network simplification and river-driven meshing deserve special caution +in this migration strategy. Because that part of the workflow is the least +well-understood, the first Polaris design should preserve the corresponding +[`mpas_land_mesh`](https://github.com/changliao1025/mpas_land_mesh) +algorithms more closely than the coastline path whenever practical. + +## Implementation + +### Implementation: Global Spherical MPAS Base Mesh + +Date last modified: 2026/04/27 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The current implementation uses `UnifiedBaseMeshStep` in +`polaris.mesh.spherical.unified.base_mesh`. This subclass of +`QuasiUniformSphericalMeshStep` overrides `build_cell_width_lat_lon()` to read +`cellWidth`, `lon`, and `lat` directly from `sizing_field.nc`, and it links the +upstream `clipped_river_network.geojson` product for direct JIGSAW geometry +constraints. + +This keeps the implementation on the existing `SphericalBaseStep.run()` path +for JIGSAW invocation, conversion to MPAS form and graph-file creation. The +expected output naming remains `base_mesh.nc` as the primary mesh product and +`graph.info` produced alongside it. + +The task wiring is also now implemented. `get_unified_base_mesh_steps()` builds +the full shared-step chain for one named unified mesh, and `BaseMeshTask` +provides the standalone task wrapper that runs that chain end to end. + +### Implementation: Downstream E3SM Interoperability + +Date last modified: 2026/04/27 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The new unified base-mesh step is shaped so it can be passed directly to +existing E3SM tasks that expect a `SphericalBaseStep`-like dependency. 
In +practice, this means keeping the same mesh and graph-file outputs and the same +basic interface expected by the current remap and cull tasks. + +The design should avoid introducing a special mesh post-processing task whose +only purpose is to translate the new workflow back into the format already +expected by `polaris.tasks.e3sm.init.topo.remap` and +`polaris.tasks.e3sm.init.topo.cull`. + +That connection is now implemented. The `add_remap_topo_tasks()` and +`add_cull_topo_tasks()` factories iterate over the named unified meshes, +retrieve the shared unified base-mesh steps, and register explicit remap and +cull tasks without an extra mesh translation stage. + +### Implementation: Feature-Aware Resolution Control + +Date last modified: 2026/04/27 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The current step decomposition is now implemented: + +- `prepare_coastline`: derive a coastline representation suitable for mesh + refinement, using the `e3sm/init/topo` coastline as the first-choice source + and Natural Earth as a fallback; +- `prepare_river_network`: simplify and filter a global river dataset into + source-level products, target-grid-ready products, and mesh-conditioned river + products for final cell-placement control; +- `build_sizing_field`: combine baseline ocean and land resolution choices with + coastline and river refinement controls on a global lon/lat grid, ideally + using signed-distance fields where that simplifies the definition of + transition zones and buffers; and +- `unified_base_mesh`: consume the sizing field and the mesh-conditioned river + geometry and create the MPAS base mesh. + +Related enabling work is already in place on the sibling +`add-lat-lon-topo-combine` branch, which adds shared lat-lon +`e3sm/init/topo/combine` support and extends `CombineStep` accordingly. That +branch does not implement the unified-mesh workflow itself, but it reduces +risk for the shared target-grid preprocessing described here. 
See Polaris pull +request . + +The stage-level implementations now make the intended sequence concrete. The +coastline workflow provides shared lat-lon coastline products on the supported +target-grid resolutions. The river workflow provides shared source-level, +target-grid, and mesh-conditioned preprocessing products that can be +coordinated with those coastline products. The sizing-field workflow consumes +the target-grid coastline and river products for each named unified mesh and +writes a JIGSAW-ready `sizing_field.nc`. The base-mesh workflow then consumes +that sizing field together with the mesh-conditioned river geometry. This makes +the implemented sequence: + +1. build or reuse shared lat-lon topography products; +2. derive coastline products on the selected shared target grid; +3. prepare river-network products that use the same shared target grid and a + consistent coastline interpretation, including mesh-conditioned river + geometry for the final stage; +4. combine coastline and river refinement signals into a unified sizing field; + and +5. read that sizing field through `UnifiedBaseMeshStep` while passing the + conditioned river geometry into the final spherical base-mesh generation. + +Coastline, river and sizing-field preprocessing now fit the same shared-step +pattern and exchange products on a common target grid. + +Even if the final implementation uses several shared steps, Polaris should +present them as one coherent workflow rather than as unrelated utilities. The +cleanest first implementation is to keep the shared steps together under the +`mesh` component in one common subtree such as +`mesh/spherical/unified/...`. That mirrors existing Polaris practice where a +shared base-mesh step lives in the `mesh` component and tasks provide a thin +wrapper around it. A separate `river` component would make more sense only if +Polaris later grows river-focused workflows that stand on their own apart from +base-mesh generation. 
+ +As this workflow matures, more targeted design documents should be added for +the stage-level algorithms and interfaces, especially `prepare_coastline`, +`prepare_river_network`, and `build_sizing_field`. The final stage is now +covered by the separate `unified_mesh_create_base_mesh.md` design, which +describes final mesh creation, standalone visualization, and downstream remap +and culling integration. + +The preprocessing steps write clear intermediate products that are useful for +debugging and caching, including both source-level vector products and lat-lon +target-grid products. These intermediate products are explicit enough to +support shared reuse between stages while still keeping `build_sizing_field` as +the main integration point. + +`build_sizing_field` now has a concrete contract. It is defined as the step +that takes: + +- the selected target-grid tier; +- the background land and ocean resolution choices for the mesh; +- the outputs of `prepare_coastline`, such as coastline geometry, masks or + signed-distance fields; +- the outputs of `prepare_river_network`, such as retained flowlines, outlet + information, masks, or other river-refinement products; and +- configuration controlling how these refinement signals are blended, + including background cell widths, transition distances and optional feature + toggles. + +Its output is a single regular lon/lat `cellWidth` field in +`sizing_field.nc`, together with diagnostic candidate fields and active-control +metadata. Framing it this way makes clear that `build_sizing_field` is not +another ad hoc resolution option like the current quasi-uniform mesh choices. +Instead, it is the integration point between shared feature preprocessing and +the existing `SphericalBaseStep` machinery. The downstream mesh-generation +step can consume the resulting `cellWidth` field without needing to know +whether refinement came from coastlines, rivers or later feature classes. 
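As a rough illustration of the `build_sizing_field` contract described above, the sketch below blends a background width with coastline and river candidates by taking the pointwise minimum. The function names, the linear ramp, the sign convention (negative signed distance means ocean) and all parameter names are assumptions made for illustration; the actual blend function in Polaris may differ.

```python
import numpy as np


def transition(distance, fine, coarse, width):
    """Ramp linearly from `fine` at distance 0 to `coarse` at `width`."""
    weight = np.clip(np.abs(distance) / width, 0.0, 1.0)
    return fine + (coarse - fine) * weight


def build_cell_width(signed_distance, river_mask, ocean_km, land_km,
                     coast_km, river_km, coast_transition_m):
    """Combine candidate fields into one cellWidth array (in km)."""
    # Assumed convention: negative signed distance is ocean, positive is land.
    background = np.where(signed_distance < 0.0, ocean_km, land_km)
    # Coastline candidate relaxes to the local background away from the coast.
    coast = transition(signed_distance, coast_km, background,
                       coast_transition_m)
    # River candidate applies only where the target-grid river mask is set.
    river = np.where(river_mask, river_km, np.inf)
    # The finest requested resolution wins at each grid point.
    return np.minimum(background, np.minimum(coast, river))
```

Taking the minimum of independent candidate fields keeps each control modular: a later feature class such as lakes would only need to contribute one more candidate array.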
+ +For coastline processing, the current implementation derives the coastline +from the same topography inputs used in `e3sm/init/topo`, because that gives +the strongest consistency with downstream masking and culling. It constructs a +signed-distance field on the sphere from raster coastline transitions and uses +that field to define smooth resolution transitions. The implementation also +applies shared critical passages and land blockages before flood filling the +candidate ocean mask, which is necessary to keep important connected seas in +the ocean domain and to close known artificial openings. A fallback path based +on Natural Earth should still be retained in case the topo-derived coastline +is too noisy, too expensive to generate, or otherwise unsuitable for driving +mesh refinement. + +The first implementation targets coastline and river inputs only. The +configuration and internal APIs should nonetheless leave room for later steps +that prepare watershed boundaries, lake boundaries or dam data if those prove +necessary. + +The selected target-grid resolution is now treated as part of this interface. +The preprocessing and sizing-field steps exchange products on one shared grid, +not on independently chosen grids. 
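The flood fill with critical passages and land blockages described in this section can be sketched as below. This is a simplified pure-Python illustration on a small lat/lon grid; the function name, the mask layout, the seed-point argument and the choice that blockages take precedence over passages are assumptions, not a description of the Polaris code.

```python
from collections import deque


def flood_fill_ocean(candidate, passages, blockages, seed):
    """Return the ocean mask connected to `seed`, given as (row, col).

    All masks are lists of lists of bool on the same grid, with rows as
    latitude and columns as longitude.
    """
    nlat, nlon = len(candidate), len(candidate[0])
    # Passages force cells open; blockages force cells closed.
    # Letting blockages win where both are set is an assumption.
    open_cell = [
        [(candidate[j][i] or passages[j][i]) and not blockages[j][i]
         for i in range(nlon)]
        for j in range(nlat)
    ]
    ocean = [[False] * nlon for _ in range(nlat)]
    if not open_cell[seed[0]][seed[1]]:
        return ocean
    ocean[seed[0]][seed[1]] = True
    queue = deque([seed])
    while queue:
        j, i = queue.popleft()
        for dj, di in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            jn = j + dj
            inn = (i + di) % nlon  # longitude is periodic
            if 0 <= jn < nlat and open_cell[jn][inn] and not ocean[jn][inn]:
                ocean[jn][inn] = True
                queue.append((jn, inn))
    return ocean
```

Cells that are below sea level but not connected to the seed, such as inland depressions, stay out of the ocean mask, while a passage can reconnect a sea that the raster would otherwise isolate.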
+ +### Implementation: Polaris-Native Configuration and Execution + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The standalone JSON configuration files should be translated into Polaris +config sections, for example: + +- `[unified_mesh]` for overall workflow choices, target-grid-tier selection, + and supported feature toggles; +- `[coastline]` for coastline-source selection, fallback behavior and any + thresholds related to coastline cleaning or simplification, as well as + signed-distance transition and buffer parameters; +- `[river_network]` for river simplification and filtering controls; +- `[sizing_field]` for background resolutions and feature-composition + parameters; and +- `[spherical_mesh]` for the final JIGSAW and MPAS mesh settings already used + by Polaris. + +The workflow should rely on Polaris work directories and machine support rather +than carrying forward `jigsawcase`, `change_json_key_value()` or generated +standalone job scripts. + +For the first implementation, the full shared-step chain should live in the +existing `mesh` component, because the workflow's primary public product is a +base mesh and because Polaris shared steps are organized most clearly when +their directories live at the highest common level where all consuming tasks +can find them. In practice, the task that exposes the workflow should be a +thin wrapper that links together shared steps such as `prepare_coastline`, +`prepare_river_network`, `build_sizing_field`, and the final +`unified_base_mesh` step, all under one mesh-oriented subtree. + +This recommendation does not rule out a later `river` or `land` component. +If Polaris eventually adds reusable river preprocessing, diagnostics or +standalone river-data products outside this mesh workflow, those could justify +a separate component. 
Even in that case, the interface should still make the +unified base-mesh workflow look like one pipeline, with `build_sizing_field` +remaining the explicit handoff from feature products to the generic spherical +mesh generator. + +### Implementation: Shared Target-Grid Tiers and Cacheable Preprocessing + +Date last modified: 2026/04/22 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The current shared implementation exposes target-grid choice through the +supported lat-lon resolutions in +`polaris.mesh.spherical.unified.LAT_LON_TARGET_GRID_RESOLUTIONS`, currently +`0.25`, `0.125`, `0.0625`, and `0.03125` degrees. + +The standalone coastline and river task families both iterate over that same +tuple when constructing per-resolution tasks, and the shared step subdirectory +layout includes the formatted resolution name. In practice, this means the +resolution itself is already part of the cache key and of the work-directory +layout, which is exactly the behavior the design intended. + +### Implementation: Selective Migration and Maintainability + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The implementation effort should begin with a short function-by-function audit +of [`mpas_land_mesh`](https://github.com/changliao1025/mpas_land_mesh) to +decide what should be: + +- reused from Polaris or `mpas_tools`, +- replaced with direct use of external packages, or +- extracted into small Polaris helpers. + +The following parts of +[`mpas_land_mesh`](https://github.com/changliao1025/mpas_land_mesh) appear +unlikely to be appropriate for direct migration: + +- JSON configuration management in `utilities/config_manager.py`; +- standalone case and job infrastructure in `classes/jigsawcase.py`; and +- broad general-purpose utility layers such as + [`mpas_land_mesh`](https://github.com/changliao1025/mpas_land_mesh)/ + `utilities/vector.py`. 
+ +Candidate targeted extractions may still be needed for items such as +geographic buffering, antimeridian-safe geometry handling or specific river +network simplification logic if those capabilities are not already available in +the chosen package stack. If helper code is brought over, it should remain +small, step-focused and colocated with the consuming workflow unless it quickly +proves reusable. + +For the river-network path in particular, targeted extraction or close +reimplementation is likely preferable to an early redesign of the underlying +algorithm. + +## Testing + +### Testing and Validation: Global Spherical MPAS Base Mesh + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The workflow should include an integration test that creates a coarse unified +global mesh and verifies that `base_mesh.nc` and `graph.info` are produced. + +Validation should confirm that the resulting file is a valid MPAS mesh and that +the feature-aware step reuses the standard JIGSAW-to-MPAS conversion path. + +### Testing and Validation: Downstream E3SM Interoperability + +Date last modified: 2026/04/27 + +Contributors: + +- Xylar Asay-Davis +- Codex + +At least one regression-style task should pass the generated base mesh into the +existing topography remap workflow, and ideally also the cull workflow, without +any manual conversion or edits in the work directory. + +Success for this requirement is not that the unified mesh produces final tuned +science results on the first attempt, but that the mesh product is accepted by +the existing downstream infrastructure as a standard MPAS base mesh. + +Because coastline consistency is a key motivation for the preferred source, +validation should also check that the coastline product used for refinement is +derived from the same topography interpretation used downstream when the +first-choice path is selected. 
+ +Unit tests now verify that the explicit unified-mesh remap and cull +task variants are registered for each named unified mesh and that the coarsest +unified mesh selects the low-resolution cubed-sphere topography path. + +What remains is an end-to-end execution test that runs a generated unified mesh +through remapping and culling on real products. + +### Testing and Validation: Feature-Aware Resolution Control + +Date last modified: 2026/04/27 + +Contributors: + +- Xylar Asay-Davis +- Codex + +Automated coverage for feature-aware preprocessing now includes the +coastline, river and sizing-field stages. + +The current unit tests verify the coastline preprocessing contract, key +convention-specific behavior, and critical-passage or land-blockage handling. +They also verify that river preprocessing preserves major network structure, +produces consistent target-grid river and outlet products, and buffers +rasterized river channels by a physical distance when configured. + +Sizing-field tests verify that `build_sizing_field` composes constant and +latitude-dependent ocean backgrounds with land, coastline, river-channel and +river-outlet controls, that the shared grid is used by the step-setup logic, +and that `UnifiedBaseMeshStep` can read `sizing_field.nc`. + +Base-mesh tests now verify that conditioned river geometry is passed through to +JIGSAW line constraints and that one standalone base-mesh task is registered +for each named unified mesh. River tests also verify the coastline-aware river +conditioning performed for the final mesh stage. + +The remaining gap is executable end-to-end coverage: the current automated +tests are still mostly unit-level and do not yet include a smoke test that runs +the full shared-step chain through JIGSAW and downstream remap or cull steps. 
+ +### Testing and Validation: Shared Target-Grid Tiers and Cacheable Preprocessing + +Date last modified: 2026/04/25 + +Contributors: + +- Xylar Asay-Davis +- Codex + +Current unit tests verify some parts of this contract: the river and +sizing-field setup helpers use mesh-specific shared subdirectories, shared configs +are reused when the same product is requested more than once, and the named +mesh configs provide the required `resolution_latlon` options. + +Tests should still verify that all supported target-grid tiers produce the +expected lon/lat dimensions, that dependent shared steps reuse cached outputs +when the same tier is selected, and that switching tiers produces separate +products rather than silently reusing incompatible cached artifacts. + +### Testing and Validation: Polaris-Native Configuration and Execution + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The new workflow should be validated through standard Polaris setup and run +commands, showing that configuration is expressed entirely through Polaris +config files and that shared preprocessing steps can be reused by dependent +tasks. + +If the workflow is split across multiple components, tests should also verify +that the dependency chain remains clear to users through `polaris list +--verbose` and standard work-directory links. + +### Testing and Validation: Selective Migration and Maintainability + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +Any helper code extracted from +[`mpas_land_mesh`](https://github.com/changliao1025/mpas_land_mesh) +should receive targeted tests that protect the specific behavior Polaris +depends on. + +The first implementation should also document which external conda-forge +packages were chosen in place of direct code migration so future contributors +can understand why a given helper was or was not carried over from the +standalone workflow. 
diff --git a/docs/design_docs/unified_mesh_build_sizing_field.md b/docs/design_docs/unified_mesh_build_sizing_field.md new file mode 100644 index 0000000000..644bd46e82 --- /dev/null +++ b/docs/design_docs/unified_mesh_build_sizing_field.md @@ -0,0 +1,547 @@ +# Unified Mesh: Sizing-Field Construction + +date: 2026/04/13 + +Contributors: + +- Xylar Asay-Davis +- Codex + +## Summary + +This design describes the shared `build_sizing_field` step and associated task +that can run that shared step on its own for the unified global base-mesh +workflow. The purpose of the step is to combine baseline mesh-resolution +choices with coastline and river controls into a single global lon/lat sizing +field that can be passed directly to the final spherical JIGSAW mesh step. + +The shared sizing-field workflow is implemented in Polaris pull request +. + +The design assumes that `prepare_coastline` and `prepare_river_network` have +already converted raw source datasets into shared products with explicit +interfaces. Those stages are implemented in Polaris pull requests + and +. `build_sizing_field` +consumes those products directly rather than mixing raw-data interpretation, +feature preprocessing, and mesh-sizing logic in one place. + +Feature refinement is expressed as clearly as practical in the sizing field +itself. For coastline refinement, this points strongly toward explicit raster +candidate fields. For rivers, the current implementation uses target-grid +river masks to drive sizing-field refinement. The separate direct use of river +geometry in final mesh generation is now handled downstream by the +mesh-conditioned river products from `prepare_river_network` and by +`create_base_mesh`. `build_sizing_field` continues to own only the raster river +controls. 
+ +Success means that Polaris gains a documented, reusable sizing-field workflow +whose inputs from earlier steps are clear, whose outputs are directly usable by +the final mesh step, and whose diagnostics make it easy to see why a given +region is refined. + +## Workflow Context + +The overall unified-mesh workflow is described in +[Unified Mesh: Global Base Mesh Workflow](unified_base_mesh.md). + +The upstream unified-mesh workflow designs are: + +- [Unified Mesh: Coastline Preparation](unified_mesh_prepare_coastline.md) +- [Unified Mesh: River Network Preparation](unified_mesh_prepare_river_network.md) + +The downstream unified-mesh workflow design is: + +- [Unified Mesh: Base-Mesh Creation and Downstream Integration](unified_mesh_create_base_mesh.md) + +## Requirements + +### Requirement: JIGSAW-Ready Global Sizing Field + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +`build_sizing_field` shall produce a global sizing field on a regular lon/lat +grid that can be consumed directly by the final spherical mesh-generation +step. + +The sizing field shall encode the raster part of the requested spatial +variation in target mesh resolution and shall interoperate cleanly with any +retained feature geometry that the final mesh step uses directly. + +### Requirement: Explicit Consumption of Shared Coastline and River Products + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +`build_sizing_field` shall consume the outputs of `prepare_coastline` and +`prepare_river_network` through explicit interfaces. + +The sizing-field step shall not need to re-read raw coastline, raw topography, +or raw HydroRIVERS source datasets in the standard workflow. 
+ +### Requirement: Composable Feature-Based Resolution Controls + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The workflow shall support a baseline resolution pattern together with local +refinement controls for coastline and river features. + +The first design shall support separate control of at least: + +- background ocean resolution; +- background land resolution; +- coastline refinement and transition zones; and +- river-channel and river-outlet refinement. + +The design shall allow additional feature classes such as watershed +boundaries, lakes, or dams to be added later without redesigning the full +sizing-field logic. + +### Requirement: Compatibility with Shared Target-Grid Tiers + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The sizing field shall be defined on the same supported target-grid tier used +by the upstream shared preprocessing steps. + +The first design shall work with a small discrete set of supported target-grid +resolutions rather than assuming arbitrary default resolutions. + +### Requirement: Standalone Sizing-Field Task + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +Polaris shall provide a task that runs the shared `build_sizing_field` step +and the shared steps it depends on (e.g. `prepare_coastline` and +`prepare_river_network`). + +The standalone task shall make it practical to inspect candidate refinement +fields and the final sizing field without running the full unified mesh +workflow. + +The same shared step and configuration shall be reusable from the full unified +workflow when settings match. + +## Algorithm Design + +### Algorithm Design: JIGSAW-Ready Global Sizing Field + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The sizing field should be built on a regular lon/lat grid using the shared +target-grid tier selected for the workflow. 
The resulting field should be in +the same basic form already expected by Polaris spherical mesh generation: +`cellWidth(lat, lon)` or an equivalent gridded `h(x)` product. + +The output should therefore be a directly inspectable and cacheable artifact +rather than an implicit side effect of JIGSAW geometry handling. This makes +the final `unified_base_mesh` step simpler because it only needs to consume the +finished sizing field and convert it into a JIGSAW mesh and then an MPAS mesh. + +### Algorithm Design: Explicit Consumption of Shared Coastline and River Products + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The intended input contract should be explicit: + +- from `prepare_coastline`: a land/ocean mask on the selected target grid and + a signed coastal-distance field, together with any needed coastline-edge + diagnostics; and +- from `prepare_river_network`: a simplified vector river network suitable for + downstream geometry use, plus target-grid river-channel and river-outlet + masks, together with outlet metadata. + +With this contract, `build_sizing_field` can focus on mesh-resolution logic +rather than source-data interpretation. + +The first design should avoid making `prepare_river_network` responsible for +the full river-refinement policy. If `build_sizing_field` needs a river +distance field, it can derive that distance from the simplified river products +it consumes. At the same time, the first Polaris design should explicitly +retain the existing standalone use of river geometry in the final mesh step. + +### Algorithm Design: Composable Feature-Based Resolution Controls + +Date last modified: 2026/04/26 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The first sizing-field algorithm should be framed as a set of candidate fields +combined into a final mesh-spacing field. + +The background field should be constructed first. 
A reasonable first design is +to use the land/ocean mask from `prepare_coastline` to choose between: + +- an ocean background, which may be constant, may reuse existing Polaris + latitude-dependent functions such as `RRS_CellWidthVsLat()`, or may be + delegated to a mesh-specific sizing-field step for more complex regional + ocean profiles; and +- a land background, which may be constant at first. + +Feature refinement should then be expressed as additional candidate fields: + +- a coastline candidate derived from the signed coastal-distance field, with + configurable transition widths and potentially different treatment on the + land and ocean sides; +- a river candidate derived from distance to the simplified river-channel + network or, in the simplest first pass, from the channel mask itself; and +- an outlet candidate derived from the river-outlet mask, since outlets may + merit stronger or separate refinement. + +The final sizing field should be the pointwise minimum of the background field +and all active feature candidates. This is a clearer design than sequential +overwrites because it makes each contribution explicit and guarantees that +adding a new feature control cannot accidentally coarsen the mesh. + +For coastline refinement, this is also where the Polaris design can diverge +most clearly from the current standalone workflow by favoring explicit raster +candidate fields. For rivers, however, the first Polaris design should be more +conservative. In +[`mpas_land_mesh`](https://github.com/changliao1025/mpas_land_mesh), river +influence is split between raster products and separate geometry handling. +Because that behavior is the least well-understood part of the workflow, +Polaris should preserve that division of labor as much as practical in the +early implementation. + +In that formulation, `build_sizing_field` still owns the raster candidate +fields associated with rivers and outlets.
The final `unified_base_mesh` step +may additionally pass simplified river geometry to JIGSAW to preserve existing +cell-placement behavior, but that is not part of the implemented +`build_sizing_field` workflow yet. + +If abrupt changes remain after candidate-field composition, the first design +may include a light regularization or smoothing stage, but that should be a +small post-processing step on the final field, not a substitute for clear +feature definitions. + +### Algorithm Design: Compatibility with Shared Target-Grid Tiers + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +`build_sizing_field` should not choose its own grid resolution independently. +Instead, it should consume the selected workflow target-grid tier and produce +its output on that same grid. + +The first design should therefore support a small discrete set of target-grid +tiers shared with `prepare_coastline` and `prepare_river_network`. This keeps +the interfaces between stages simple and makes cached reuse of expensive +preprocessing products practical. + +### Algorithm Design: Standalone Sizing-Field Task + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The standalone task should be a thin wrapper around the shared +`build_sizing_field` step rather than a separate implementation path. + +The task should depend on the selected coastline and river products and should +write diagnostics that make the sizing-field composition easy to inspect, for +example the background field, coastline candidate, river candidate, outlet +candidate, and final field. + +Because the task wraps the shared step, the same sizing-field products can +later be reused by the final mesh step and the full unified workflow when +configuration choices match. 
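The candidate-field composition described in this Algorithm Design section, together with the diagnostics listed for the standalone task, can be sketched roughly as follows. This is a minimal illustration, not the Polaris implementation: the function name `compose_sizing_field`, its parameters, the dictionary keys, the default widths, and the symmetric linear coastline ramp are all assumptions made for the example.

```python
import numpy as np


def compose_sizing_field(land_mask, coast_distance_km, channel_mask,
                         outlet_mask, ocean_km=30.0, land_km=10.0,
                         coast_km=10.0, transition_km=100.0,
                         channel_km=10.0, outlet_km=10.0):
    """Combine candidate cell-width fields (km) by pointwise minimum.

    Hypothetical helper for illustration only; names and defaults are
    assumptions, not Polaris config options.
    """
    # Background: choose land or ocean spacing from the land/ocean mask.
    background = np.where(land_mask, land_km, ocean_km)

    # Coastline candidate: linear ramp from coast_km at the coastline to
    # the background at transition_km. Using abs() treats the land and
    # ocean sides symmetrically, which is a simplification.
    frac = np.clip(np.abs(coast_distance_km) / transition_km, 0.0, 1.0)
    coastline = (1.0 - frac) * coast_km + frac * background

    # Mask-based candidates: inactive cells get +inf so they never win.
    river_channel = np.where(channel_mask, channel_km, np.inf)
    river_outlet = np.where(outlet_mask, outlet_km, np.inf)

    candidates = np.stack(
        [background, coastline, river_channel, river_outlet])
    return {
        'background': background,
        'coastline': coastline,
        'river_channel': river_channel,
        'river_outlet': river_outlet,
        # The minimum can never exceed the background, so adding a
        # candidate cannot coarsen the mesh.
        'cellWidth': candidates.min(axis=0),
        # Index of the winning candidate; ties go to the earliest entry.
        'active_control': candidates.argmin(axis=0),
    }


fields = compose_sizing_field(
    land_mask=np.array([[False, False, True, True]]),
    coast_distance_km=np.array([[-200.0, -50.0, 50.0, 200.0]]),
    channel_mask=np.array([[False, False, False, True]]),
    outlet_mask=np.zeros((1, 4), dtype=bool),
    channel_km=5.0,
)
```

Keeping every candidate in the returned dictionary is what makes the composition inspectable: each diagnostic can be plotted next to the final field to see which control set the spacing in a given region.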
+ +## Implementation + +### Implementation: JIGSAW-Ready Global Sizing Field + +Date last modified: 2026/04/27 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The current implementation is organized under +`polaris/tasks/mesh/spherical/unified/sizing_field/` as: + +- `build.py` for the shared `BuildSizingFieldStep` and + `build_sizing_field_dataset()` composition function; +- `steps.py` for shared-step construction; +- `task.py` and `tasks.py` for standalone task registration; +- `configs.py` for unified-mesh config loading; and +- `viz.py` for diagnostic plots. + +The shared step writes `sizing_field.nc`. The main output is `cellWidth` on the +shared `lat`/`lon` grid, with units of km. The dataset also includes +diagnostic fields: + +- `background_cell_width`; +- `ocean_background_cell_width`; +- `land_river_cell_width`; +- `pre_coastline_cell_width`; +- `coastline_cell_width`; +- `coastal_transition_delta`; +- `river_channel_cell_width`; +- `river_outlet_cell_width`; and +- `active_control`. + +The downstream base-mesh implementation now uses `UnifiedBaseMeshStep` in +`polaris.mesh.spherical.unified.base_mesh`. That step reads `cellWidth`, +`lon`, and `lat` from `sizing_field.nc`, links mesh-conditioned river geometry, +and then reuses the existing spherical mesh-generation machinery. + +### Implementation: Explicit Consumption of Shared Coastline and River Products + +Date last modified: 2026/04/25 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The implementation keeps step interfaces explicit and avoids reintroducing +raw-dataset dependencies inside `build_sizing_field`. + +The sibling `add-lat-lon-topo-combine` branch already implements the shared +lat-lon `e3sm/init/topo/combine` tasks at 1.0, 0.25, 0.125, 0.0625 and +0.03125 degree and the associated `CombineStep` support. That branch provides +the upstream target-grid topography path assumed by this design. See Polaris +pull request . 
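As a rough illustration of what a small discrete set of target-grid tiers implies for array sizes, the sketch below maps each resolution listed above to a grid shape. It assumes a cell-centered global grid with no repeated end points; the actual Polaris grids may instead be node-based and carry one extra row and column. The names `SUPPORTED_RESOLUTIONS_DEG` and `latlon_grid_shape` are hypothetical.

```python
# Resolutions (degrees) taken from the lat-lon combine tasks listed above.
SUPPORTED_RESOLUTIONS_DEG = (1.0, 0.25, 0.125, 0.0625, 0.03125)


def latlon_grid_shape(resolution_deg):
    """Return (nlat, nlon) for a supported tier.

    Assumes cell-centered coordinates spanning the full globe, which is
    an assumption of this sketch rather than a documented Polaris choice.
    """
    if resolution_deg not in SUPPORTED_RESOLUTIONS_DEG:
        raise ValueError(
            f'Unsupported target-grid resolution: {resolution_deg}')
    nlat = round(180.0 / resolution_deg)
    nlon = round(360.0 / resolution_deg)
    return nlat, nlon


shapes = {res: latlon_grid_shape(res) for res in SUPPORTED_RESOLUTIONS_DEG}
```

Rejecting unsupported resolutions up front is one simple way to keep cached shared products from being keyed on arbitrary grids.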
+ +`BuildSizingFieldStep.setup()` links only two upstream data products: +`coastline.nc` from the selected coastline convention and `river_network.nc` +from the corresponding river lat-lon step. It does not read raw topography or +HydroRIVERS data. + +### Implementation: Composable Feature-Based Resolution Controls + +Date last modified: 2026/04/26 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The implementation builds the final field from explicit candidate fields. +`build_sizing_field_dataset()` now composes a precomputed ocean background +with land, river, coastline, and active-control products rather than deciding +which ocean algorithm to use itself. + +The generic `BuildSizingFieldStep` supports only two built-in ocean +backgrounds: `constant` and `rrs_latitude`. The previous `ec_latitude` +branch has been removed from the unified-mesh path and now raises a clear +error if requested. More complex ocean backgrounds are handled by +mesh-family implementations selected from the unified-mesh config. + +The first concrete specialized path is +`ocn_so_12to30km_lnd_10km_riv_10km`, which uses the `so_region` +mesh family. That family reuses the existing Southern Ocean +`high_res_region.geojson` and the shared Southern Ocean background helper from +`polaris.mesh.base.so` to build a 12 km to 30 km regional ocean profile on +the unified target grid. Land starts from a configurable constant background. + +River channel and river outlet candidates are mask-based. They are controlled +separately with `enable_river_channel_refinement`, `river_channel_km`, +`enable_river_outlet_refinement`, and `river_outlet_km`. + +Coastline refinement uses the signed-distance field from `prepare_coastline`. +The current algorithm composes land and river controls first, then applies a +linear coastline transition on the land side using +`coastline_transition_land_km`. 
The coastline target is the local ocean +background, so the coastal buffer can either coarsen or refine nearby land +and river controls depending on the adjacent ocean resolution. + +`active_control` records the winning control with +`0=background 1=coastline 2=river_channel 3=river_outlet`. Dataset attributes +also count how many river-channel and river-outlet mask cells are finer than, +equal to, or coarser than the background. + +### Implementation: Compatibility with Shared Target-Grid Tiers + +Date last modified: 2026/04/26 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The implementation uses the named unified-mesh configs in +`polaris/mesh/spherical/unified/` to select the shared target-grid tier. +The currently implemented named meshes are: + +- `ocn_240km_lnd_240km_riv_240km`, using 0.25 degree; +- `ocn_30km_lnd_10km_riv_10km`, using 0.125 degree; +- `ocn_rrs_6to18km_lnd_12km_riv_6km`, using 0.03125 degree; and +- `ocn_so_12to30km_lnd_10km_riv_10km`, using 0.0625 degree. + +The sizing-field shared-step subdirectory includes the mesh name: +`spherical/unified//sizing_field/build`. This makes the mesh +configuration part of the work-directory layout and cache key. + +### Implementation: Standalone Sizing-Field Task + +Date last modified: 2026/04/26 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The implementation adds `SizingFieldTask`, a lightweight task wrapper around +the shared steps. For each named mesh, the standalone task links: + +- the shared lat-lon topography combine step; +- the shared coastline step and an optional coastline visualization step; +- the mesh-specific shared river source step; +- the mesh-specific shared river lat-lon step; and +- the selected sizing-field build step plus its visualization step. + +The sizing-field factory remains config-driven.
It discovers mesh configs from +`polaris.mesh.spherical.unified`, then reads the `mesh_family` declared in the +mesh config and lets the generic `BuildSizingFieldStep` delegate ocean +background construction and any extra inputs to that mesh-family +implementation. + +`add_build_sizing_field_tasks()` registers one standalone sizing-field task per +named mesh. The task-specific path is +`spherical/unified//sizing_field/task`. + +## Testing + +### Testing and Validation: JIGSAW-Ready Global Sizing Field + +Date last modified: 2026/04/27 + +Contributors: + +- Xylar Asay-Davis +- Codex + +Current unit tests in `tests/mesh/spherical/unified/test_sizing_field.py` +verify that `build_sizing_field_dataset()` writes a `cellWidth` field with +the expected values for representative configurations and that +`UnifiedBaseMeshStep` reads `cellWidth`, `lon`, and `lat` from +`sizing_field.nc`. + +There is not yet an end-to-end task-level test that passes this sizing field +through JIGSAW and verifies `base_mesh.nc` and `graph.info`. + +### Testing and Validation: Explicit Consumption of Shared Coastline and River Products + +Date last modified: 2026/04/25 + +Contributors: + +- Xylar Asay-Davis +- Codex + +Unit tests build synthetic coastline and river products and pass them directly +to `build_sizing_field_dataset()`. The step implementation links only +`coastline.nc` and `river_network.nc`, so raw topography and raw HydroRIVERS +inputs are outside the sizing-field interface. + +The remaining validation gap is task-level: the full standalone sizing-field +task should be run on real shared coastline and river outputs for each named +mesh. 
+ +### Testing and Validation: Composable Feature-Based Resolution Controls + +Date last modified: 2026/04/26 + +Contributors: + +- Xylar Asay-Davis +- Codex + +Current unit tests check: + +- a uniform 240 km case with no active refinement; +- a split 30 km ocean and 10 km land/river case; +- an RRS latitude-dependent ocean background; +- rejection of `ec_latitude` in the generic unified sizing-field path; +- coastline transition composition using signed distance; +- river outlet controls composed before coastline transitions; +- `active_control` values for representative cells; +- discovery and task registration for + `ocn_so_12to30km_lnd_10km_riv_10km`; and +- the Southern Ocean specialized sizing-field step linking the shared + GeoJSON and producing a nonuniform ocean background that is finer in the + Southern Ocean than outside it. + +There is not yet validation on full global products showing that coastline, +river-channel, and outlet controls influence the intended real-world regions +with the intended relative strengths. + +### Testing and Validation: Compatibility with Shared Target-Grid Tiers + +Date last modified: 2026/04/25 + +Contributors: + +- Xylar Asay-Davis +- Codex + +Current unit tests verify that `get_sizing_field_config()` uses the unified +mesh configs, that the code that constructs the sizing-field step uses mesh-specific +subdirectories, and that the registered standalone task count matches the +number of named unified-mesh configs. + +There is not yet a full run validating all supported target-grid dimensions or +cache reuse across setup/run workflows. + +### Testing and Validation: Standalone Sizing-Field Task + +Date last modified: 2026/04/25 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The standalone task is now implemented as the primary place to inspect the +component refinement fields and the final sizing field before they are used in +the full unified workflow. Unit tests verify task registration for each named +mesh.
The visualization step writes `sizing_field_overview.png`. + +Smoke tests for the full sizing-field workflow are being performed on Frontier. +An update will follow once satisfactory results are available. diff --git a/docs/design_docs/unified_mesh_create_base_mesh.md b/docs/design_docs/unified_mesh_create_base_mesh.md new file mode 100644 index 0000000000..380278cd4a --- /dev/null +++ b/docs/design_docs/unified_mesh_create_base_mesh.md @@ -0,0 +1,687 @@ +# Unified Mesh: Base-Mesh Creation and Downstream Integration + +date: 2026/04/26 + +Contributors: + +- Xylar Asay-Davis +- Codex + +## Summary + +This design describes the shared final `create_base_mesh` step for the +unified global mesh workflow, the standalone tasks that run that step for each +named unified mesh, and the downstream workflow variants that consume the new +base meshes for topography remapping and mesh culling. + +The shared `prepare_coastline`, `prepare_river_network`, and +`build_sizing_field` stages described in the earlier design documents are now +implemented, and the final stage described here is implemented as a shared +unified base-mesh step, standalone base-mesh tasks for each named mesh, and +explicit downstream topography-remap and cull task variants. + +The current implementation provides standalone base-mesh tasks for the +four currently defined named unified meshes in +`polaris.mesh.spherical.unified`, all of which currently use the +`calving_front` Antarctic coastline convention. At the same time, the shared +infrastructure should remain compatible with any supported coastline +convention, even if only `calving_front` is exercised in the current automated +tests.
+ +Success means that Polaris can create each current unified base mesh as a +standard MPAS mesh, inspect the result together with the input sizing field +and retained river geometry, and pass the produced mesh directly into explicit +downstream topography-remap and land or ocean culling task variants without +an ad hoc conversion stage. + +## Workflow Context + +The overall unified-mesh workflow is described in +[Unified Mesh: Global Base Mesh Workflow](unified_base_mesh.md). + +The upstream unified-mesh workflow designs are: + +- [Unified Mesh: Coastline Preparation](unified_mesh_prepare_coastline.md) +- [Unified Mesh: River Network Preparation](unified_mesh_prepare_river_network.md) +- [Unified Mesh: Sizing-Field Construction](unified_mesh_build_sizing_field.md) + +There are no later stage-specific unified-mesh design documents downstream of +this one in the current series. This document itself covers the final +base-mesh stage together with downstream remap and culling integration. + +## Requirements + +### Requirement: Final JIGSAW-to-MPAS Unified Base Mesh + +Date last modified: 2026/04/25 + +Contributors: + +- Xylar Asay-Davis +- Codex + +Polaris shall support a final unified-mesh stage that creates a global, +spherical MPAS base mesh from the shared unified-mesh products. + +The primary output of that stage shall be a standard MPAS base mesh that can +be consumed directly by existing MPAS and E3SM tooling. + +The final stage shall preserve the spatially varying resolution described by +the unified sizing field. + +### Requirement: Explicit Consumption of Shared Unified-Mesh Products + +Date last modified: 2026/04/27 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The final base-mesh stage shall consume the outputs of `build_sizing_field` +and the mesh-conditioned river products from `prepare_river_network` through +explicit shared interfaces. 
+ +The standard workflow shall not need to re-read or reinterpret raw topography, +raw coastline, or raw HydroRIVERS source datasets inside the final mesh +generation stage, nor should it perform its own coastline-aware river clipping +inside the final mesh step. + +The downstream remap and culling workflow variants shall likewise consume the +resulting MPAS base mesh through explicit task interfaces rather than through +manual work-directory edits. + +### Requirement: River-Geometry Influence on Final Cell Placement + +Date last modified: 2026/04/26 + +Contributors: + +- Xylar Asay-Davis +- Codex + +Retained river geometry shall influence final mesh generation directly rather +than only through the raster sizing field. + +The requirement is on the resulting behavior, namely that final cell placement +can reflect the retained river network, especially along important channels and +near outlets. + +### Requirement: River Snapping Shall Not Refine Coastal Ocean Resolution + +Date last modified: 2026/04/26 + +Contributors: + +- Xylar Asay-Davis +- Codex + +Snapping cell centers to retained river-network geometry shall not introduce +finer-than-intended ocean resolution along the coastline. + +In particular, the final workflow shall prevent river-geometry treatment near +the coast from pulling neighboring ocean cells into a locally over-refined +state that would constrain the ocean time step relative to the requested mesh +design. + +This requirement is on the realized ocean mesh, not just on the input sizing +field. The base-mesh stage shall preserve the intended coastal ocean +resolution even when river geometry is used to improve inland cell placement. + +### Requirement: Shared Final Step and Per-Mesh Standalone Tasks + +Date last modified: 2026/04/26 + +Contributors: + +- Xylar Asay-Davis +- Codex + +Polaris shall provide one shared final base-mesh step that can be reused by +multiple workflows. 
+ +Polaris shall also provide one standalone task per named unified mesh defined +by the config files in `polaris.mesh.spherical.unified`. + +The first implementation shall cover the four currently defined named meshes. +The shared design shall remain compatible with additional named meshes and with +supported Antarctic coastline conventions without requiring a different code +path for each one. + +The standalone tasks shall run the shared final step together with the shared +prerequisite steps they depend on. + +### Requirement: Standalone Visualization for Mesh and Inputs + +Date last modified: 2026/04/25 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The standalone unified base-mesh task shall include a visualization step that +makes it practical to inspect the generated mesh together with the main inputs +that controlled it. + +At a minimum, the standalone visualization shall show the final MPAS mesh, the +input lat-lon sizing field, and the retained river geometry. + +That visualization step shall run in the standalone base-mesh task but shall +not run by default in other workflows that reuse the shared final step. + +### Requirement: Downstream Remap and Culling Variants for Unified Meshes + +Date last modified: 2026/04/25 + +Contributors: + +- Xylar Asay-Davis +- Codex + +Polaris shall provide explicit downstream workflow variants that use each +unified base mesh as input to topography remapping and mesh culling. + +These downstream variants shall cover remapped topography on the unified base +mesh, land and ocean masks on that base mesh, and the resulting culled land +and ocean meshes. + +The downstream variants shall be expressed as standard Polaris tasks rather +than as manual follow-on instructions. 
+ +## Algorithm Design + +### Algorithm Design: Final JIGSAW-to-MPAS Unified Base Mesh + +Date last modified: 2026/04/25 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The sizing-field stage already defines the main raster contract for the final +mesh stage: a regular lon/lat `cellWidth` field in `sizing_field.nc`. The +final mesh stage should treat that product as the authoritative raster spacing +input and should reuse the existing Polaris spherical-mesh path for converting +JIGSAW output into a standard MPAS mesh. + +In other words, the final stage should add only the logic that is truly new to +the unified workflow: consuming the shared products, incorporating retained +river geometry into final mesh generation, and wiring the result into +standalone and downstream tasks. It should not redesign the existing +JIGSAW-to-MPAS conversion path already present in `SphericalBaseStep` and +`QuasiUniformSphericalMeshStep`. + +### Algorithm Design: Explicit Consumption of Shared Unified-Mesh Products + +Date last modified: 2026/04/27 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The intended final-stage input contract is: + +- `sizing_field.nc` from `build_sizing_field` as the authoritative raster + spacing field; +- mesh-conditioned vector river geometry and outlet metadata from + `prepare_river_network` for direct final-stage geometry use and for + visualization; and +- the named unified-mesh configuration, including the selected target-grid + tier and Antarctic coastline convention, for consistent downstream labeling + and task selection. + +The final stage should not go back to raw source data to infer these products +again. That keeps the workflow layered in the same way as the earlier design +documents: source interpretation belongs in shared preprocessing steps, sizing +policy belongs in `build_sizing_field`, coastline-aware river conditioning +belongs in `prepare_river_network`, and final mesh generation belongs in +`create_base_mesh`. 
+ +The downstream topography-remap and culling variants should then consume the +generated `base_mesh.nc` through the same standard interfaces used by existing +Polaris `e3sm/init` workflows. The design should favor task composition over +special one-off scripts. + +### Algorithm Design: River-Geometry Influence on Final Cell Placement + +Date last modified: 2026/04/27 + +Contributors: + +- Xylar Asay-Davis +- Codex + +Retained river geometry must influence final mesh generation directly. The +sizing field already expresses raster refinement around rivers and outlets, but +the +standalone reference workflow suggests that raster refinement alone is not the +whole story when the goal is to place cell centers well along river channels. + +The design should therefore keep two distinct river signals in the final mesh +stage: + +- a raster resolution signal from `build_sizing_field`; and +- a vector geometry signal from the conditioned river network prepared for the + selected mesh. + +The design should follow the algorithmic approach used by the standalone +reference solution in +[`mpas_land_mesh`](https://github.com/changliao1025/mpas_land_mesh) +for using river-network geometry to place cell centers. We do not require a +byte-for-byte match to the standalone implementation, but we do want Polaris to +preserve that reference workflow's geometry-driven method for river alignment +and outlet treatment rather than substitute a different first-cut approach. + +Because outlet regions are especially sensitive, the geometry path should also +leave room for stronger treatment near retained outlets than along the generic +channel network if later tuning shows that is needed. + +### Algorithm Design: River Snapping Shall Not Refine Coastal Ocean Resolution + +Date last modified: 2026/04/27 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The simplest way to satisfy this requirement is to trim or ignore retained +river geometry before it reaches the final base-mesh stage. 
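One way to picture that trimming is the sketch below, which drops river points that lie too close to the coast. It is an illustration under several assumptions, not the Polaris implementation: it assumes a coastal signed-distance field on a uniform cell-centered lon/lat grid, assumes distances are positive on land, filters individual points rather than clipping line geometry, and uses the hypothetical names `sample_periodic` and `clip_river_points`.

```python
import numpy as np


def sample_periodic(field, lon, lat, pt_lon, pt_lat):
    """Bilinearly sample field[lat, lon] at points, wrapping in longitude.

    Assumes uniformly spaced cell-center coordinates in degrees, with lon
    covering 360 degrees without a repeated end column.
    """
    dlon = lon[1] - lon[0]
    dlat = lat[1] - lat[0]
    # Fractional column index; the modulo makes longitude periodic.
    x = ((pt_lon - lon[0]) % 360.0) / dlon
    i0 = np.floor(x).astype(int) % lon.size
    i1 = (i0 + 1) % lon.size
    wx = x - np.floor(x)
    # Fractional row index, clamped so polar points use the edge rows.
    y = np.clip((pt_lat - lat[0]) / dlat, 0.0, lat.size - 1.0)
    j0 = np.minimum(np.floor(y).astype(int), lat.size - 2)
    j1 = j0 + 1
    wy = y - j0
    south = (1.0 - wx) * field[j0, i0] + wx * field[j0, i1]
    north = (1.0 - wx) * field[j1, i0] + wx * field[j1, i1]
    return (1.0 - wy) * south + wy * north


def clip_river_points(pt_lon, pt_lat, coast_distance_km, lon, lat,
                      clip_distance_km):
    """Keep only river points at least clip_distance_km inland."""
    # Assumed sign convention: positive distance on land, negative in ocean.
    distance = sample_periodic(coast_distance_km, lon, lat, pt_lon, pt_lat)
    keep = distance >= clip_distance_km
    return pt_lon[keep], pt_lat[keep]
```

Wrapping the column index means a point just west of the dateline is interpolated between the last and first grid columns, so it is treated the same way as a point anywhere else on the globe.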
+ +The design should therefore use the coastal signed-distance field already +produced on the shared lat-lon grid during river-network preparation to +evaluate retained river geometry points before they are written to the +base-mesh-facing river products. + +Any river-network geometry that falls within the configured coastal clipping +zone should be excluded from the geometry-driven snapping path. In other words, +the river geometry consumed by `create_base_mesh` should already stop inland of +the coastline by a configurable clip distance consistent with the intended +coastal transition treatment. + +The current implementation keeps this cutoff explicit in the river workflow as +`base_mesh_clip_distance_km` rather than deriving it directly from +`coastline_transition_land_km`. + +Because the target field is periodic in longitude, the interpolation used for +this cutoff should account for longitude periodicity so that river features +near the dateline are handled consistently with those elsewhere on the globe. + +### Algorithm Design: Shared Final Step and Per-Mesh Standalone Tasks + +Date last modified: 2026/04/26 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The final stage should follow the same pattern as the new coastline, river, +and sizing-field stages: one shared implementation plus thin standalone task +wrappers. + +The shared step should be parameterized by the named unified-mesh config. The +code that registers standalone tasks should iterate over the named mesh configs in +`polaris.mesh.spherical.unified` and register one standalone task per mesh. +With the current configs, that means one standalone task each for: + +- `ocn_240km_lnd_240km_riv_240km`; +- `ocn_30km_lnd_10km_riv_10km`; +- `ocn_rrs_6to18km_lnd_12km_riv_6km`; and +- `ocn_so_12to30km_lnd_10km_riv_10km`.
+ +Each standalone task should compose the shared prerequisite steps in the same +workflow instance: coastline preparation, river-network preparation, +sizing-field construction, final mesh creation, and standalone visualization. +The design should still permit other workflows to reuse the shared final step +without paying the cost of standalone diagnostics by default. + +### Algorithm Design: Standalone Visualization for Mesh and Inputs + +Date last modified: 2026/04/25 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The standalone visualization step should combine the most important views that +would otherwise require multiple task families to inspect together. + +At a minimum, the visualization should include: + +- the generated MPAS mesh, preferably in a way that makes local resolution and + feature alignment easy to see; +- the input sizing field on its lat-lon grid, because the sizing-field + workflow's own visualization may not be run in the same work directory; and +- the retained river geometry overlaid with the mesh or with a closely related + diagnostic view. + +The step may also include coastline or mask diagnostics, but those are not the +core requirement for this design because the dedicated coastline workflow +already covers them. The important point is that the final-stage standalone +task must make it possible to see both the requested raster resolution pattern +and the realized mesh in one place. + +### Algorithm Design: Downstream Remap and Culling Variants for Unified Meshes + +Date last modified: 2026/04/25 + +Contributors: + +- Xylar Asay-Davis +- Codex + +Once a unified base mesh exists as a standard MPAS mesh, the next workflow +steps are conceptually straightforward. The design should therefore treat them +as part of the same planned capability even if they are implemented as +separate task families. 
+ +For each named unified mesh, Polaris should provide downstream task variants +that: + +- remap topography to the new base mesh; +- derive land and ocean masks on that base mesh using the selected coastline + interpretation; and +- produce culled land and ocean meshes. + +These downstream variants should reuse as much of the existing `e3sm/init` +task machinery as practical. Their main new responsibility should be to wire in +the unified base mesh and any mesh-specific configuration, not to reimplement +topography remapping or culling algorithms. + +## Implementation + +### Implementation: Final JIGSAW-to-MPAS Unified Base Mesh + +Date last modified: 2026/04/27 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The current implementation uses +`polaris.mesh.spherical.unified.base_mesh.UnifiedBaseMeshStep`. That class +reads `cellWidth`, `lat`, and `lon` from `sizing_field.nc`, links the prepared +`clipped_river_network.geojson` product, and reuses the existing spherical +mesh-generation machinery. + +The shared-step factory in +`polaris/tasks/mesh/spherical/unified/base_mesh/steps.py` wires the upstream +coastline, river source, river lat-lon, river base-mesh, and sizing-field +steps together for one named mesh, and `BaseMeshTask` exposes that chain as a +standalone task. + +The important point is to keep the raster sizing-field handoff simple and to +isolate the new behavior in the final unified-mesh stage. + +### Implementation: Explicit Consumption of Shared Unified-Mesh Products + +Date last modified: 2026/04/27 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The software layout under `polaris/tasks/mesh/spherical/unified/base_mesh/` is +now concrete, with modules such as: + +- `viz.py` for standalone visualization; +- `steps.py` for shared-step setup helpers; +- `task.py` and `tasks.py` for standalone task wrappers; and +- `base_mesh.cfg` for shared configuration options specific to final mesh + generation and visualization. 
+ +The shared build step should link upstream `sizing_field.nc` and the +conditioned river vector products from the river workflow, rather than +re-reading raw source datasets. In practice, `UnifiedBaseMeshStep.setup()` links +`sizing_field.nc` from `build_sizing_field` and `clipped_river_network.geojson` +from `PrepareRiverForBaseMeshStep`. The standalone task composes the already +established shared prerequisites in the same style as the current sizing-field +task. + +For downstream workflows, the implementation favors thin task variants around +existing `e3sm/init/topo` remap and cull machinery, with the unified base mesh +linked as the upstream mesh input. + +### Implementation: River-Geometry Influence on Final Cell Placement + +Date last modified: 2026/04/27 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The current implementation reads the conditioned river geometry produced by +`prepare_river_network` and applies it during final mesh creation. +`UnifiedBaseMeshStep.make_jigsaw_mesh()` follows the approach in +[`mpas_land_mesh`](https://github.com/changliao1025/mpas_land_mesh) +for using river-network geometry to influence cell-center placement, while +building on the existing Polaris raster HFUN workflow and JIGSAW-to-MPAS +conversion path. + +The implementation should not aim for byte-for-byte parity with the standalone +reference. However, it should preserve the same basic algorithmic approach, +with the standalone reference serving as the primary guide for river alignment +and outlet treatment. + +To keep river snapping from distorting coastal ocean resolution, the coastline- +aware clipping now happens upstream in `PrepareRiverForBaseMeshStep`. The final +base-mesh step consumes the already conditioned `clipped_river_network` +product and converts those line features into JIGSAW `edge2` constraints. 
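As a rough sketch of that last conversion, assume the clipped river product is a GeoJSON `FeatureCollection` of `LineString` features in lon/lat degrees. Each line can then be flattened into a shared vertex array plus an array of index pairs, which is the general shape of data an `edge2` constraint needs. The helper name `lines_to_edge2`, the radians conversion, and the in-memory layout are assumptions for illustration, not the actual `UnifiedBaseMeshStep` code.

```python
import numpy as np


def lines_to_edge2(geojson):
    """Flatten LineString features into vertices and index pairs.

    Hypothetical helper for illustration only. The real step may also
    deduplicate shared vertices or handle MultiLineString features.
    """
    points = []
    edges = []
    for feature in geojson['features']:
        geometry = feature['geometry']
        if geometry['type'] != 'LineString':
            # this sketch ignores everything except simple lines
            continue
        coords = geometry['coordinates']
        start = len(points)
        # keep only lon and lat, dropping any optional third coordinate
        points.extend((c[0], c[1]) for c in coords)
        # connect consecutive vertices of this line
        edges.extend(
            (start + i, start + i + 1) for i in range(len(coords) - 1)
        )
    # assumption: the spherical geometry is expressed in radians
    vert2 = np.radians(np.array(points, dtype=float).reshape(-1, 2))
    edge2 = np.array(edges, dtype=np.int32).reshape(-1, 2)
    return vert2, edge2
```

Separate lines never share an index pair in this sketch, so disjoint river segments stay disjoint constraints.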
+ +### Implementation: Shared Final Step and Per-Mesh Standalone Tasks + +Date last modified: 2026/04/27 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The standalone task registration follows the same mesh-config discovery pattern +already used by the unified sizing-field and river tasks. + +In practice, `add_unified_base_mesh_tasks()` iterates over `UNIFIED_MESH_NAMES` +from `polaris.mesh.spherical.unified.configs` and registers one standalone +`base_mesh__task` per mesh. + +The standalone tasks include the visualization step by default. Other +workflows that reuse the shared final step depend only on the build step unless +they explicitly opt into diagnostics. + +The first implementation should assume the currently defined named meshes use +`calving_front`, but the shared-step and task-registration code should avoid +hard-coding that convention so future mesh configs can select others. + +### Implementation: Standalone Visualization for Mesh and Inputs + +Date last modified: 2026/04/25 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The visualization step should write a small set of durable, easy-to-review +artifacts rather than relying on interactive inspection alone. + +A reasonable first set is: + +- one or more figures of the MPAS mesh at global and regional scales; +- a figure of the lat-lon `cellWidth` field from `sizing_field.nc`; and +- figures that overlay retained river geometry on top of the mesh or on top of + a related final-stage diagnostic. + +The implementation does not need to duplicate every diagnostic already present +in the upstream coastline or river workflows. Its job is to make the final +handoff from requested mesh controls to realized mesh structure easy to assess. 
+ +### Implementation: Downstream Remap and Culling Variants for Unified Meshes + +Date last modified: 2026/04/27 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The downstream work is organized as explicit task variants keyed by the same +named unified meshes used by the standalone base-mesh tasks. + +`add_remap_topo_tasks()` and `add_cull_topo_tasks()` both iterate over +`UNIFIED_MESH_NAMES`, retrieve the shared unified base-mesh step from the mesh +component, and register `e3sm/init` remap and cull tasks that reuse the +existing topography-remap and cull steps. Where the downstream workflows need +mesh-specific defaults, those come from the same named unified-mesh configs. + +This design is intentionally broader than "just create the base mesh" because +the real value of the new mesh appears only when the mesh enters the existing +topography and culling pipeline. Treating those downstream task variants as +part of the same planned capability keeps the workflow boundary honest. + +## Testing + +### Testing and Validation: Final JIGSAW-to-MPAS Unified Base Mesh + +Date last modified: 2026/04/27 + +Contributors: + +- Xylar Asay-Davis +- Codex + +Current automated coverage includes unit tests for `UnifiedBaseMeshStep` and +for standalone base-mesh task registration, but not yet a coarse end-to-end +smoke test that runs JIGSAW and produces `base_mesh.nc` and `graph.info`. + +Validation should confirm that the final task uses the standard +JIGSAW-to-MPAS conversion path and that the result is a valid MPAS mesh. + +### Testing and Validation: Explicit Consumption of Shared Unified-Mesh Products + +Date last modified: 2026/04/27 + +Contributors: + +- Xylar Asay-Davis +- Codex + +Tests should verify that the final build step links only the shared upstream +products it needs, especially `sizing_field.nc` and retained river vector +artifacts, and does not reach back to raw source datasets. 
+ +Current unit tests verify the first part of that contract by exercising the +base-mesh and river shared-step factories and by confirming that downstream +remap and cull variants are registered for each named unified mesh. + +Task-level execution tests should still verify that the remap and cull variants +accept the produced unified base mesh through standard task interfaces. + +### Testing and Validation: River-Geometry Influence on Final Cell Placement + +Date last modified: 2026/04/27 + +Contributors: + +- Xylar Asay-Davis +- Codex + +This requirement needs more than a file-exists test. Automated validation +should include at least one focused check that would fail if the final stage +ignored retained river geometry and used only the raster sizing field. + +The current unit tests verify that the prepared clipped river geometry is +converted into JIGSAW line constraints. The precise mesh-quality check can +still evolve. Examples include a comparison against a raster-only control mesh, +a diagnostic that measures mesh alignment near retained channels, or a small +regression case that verifies a known outlet or main-stem placement pattern. + +### Testing and Validation: River Snapping Shall Not Refine Coastal Ocean Resolution + +Date last modified: 2026/04/27 + +Contributors: + +- Xylar Asay-Davis +- Codex + +Validation for this requirement should include an ocean-focused check on the +realized mesh, not only on the retained river inputs or on the raster sizing +field. + +One required diagnostic is that `dcEdge` after culling to the ocean-only mesh +must not show a band of higher resolution along the coastline than elsewhere +in the mesh, beyond what is expected from the intended ocean resolution +pattern. 
+ +Automated coverage should also include at least one focused regression test of +the river-conditioning step, verifying that river segments are clipped before +they reach the coastline clipping zone and that periodic longitude handling +does not break the cutoff near the dateline. + +Current unit tests cover inland clipping of retained segments and outlet +removal near the coastline. What is still missing is a realized-ocean-mesh +regression that checks `dcEdge` after ocean culling. + +### Testing and Validation: Shared Final Step and Per-Mesh Standalone Tasks + +Date last modified: 2026/04/27 + +Contributors: + +- Xylar Asay-Davis +- Codex + +Tests should verify that task registration produces one standalone base-mesh +task per named unified mesh and that the tasks load the intended named config. + +Current unit tests verify that task registration produces one standalone +base-mesh task per named unified mesh, that the visualization step is included, +and that the shared final step and config are reused when multiple dependent +requests target the same mesh product. + +### Testing and Validation: Standalone Visualization for Mesh and Inputs + +Date last modified: 2026/04/25 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The standalone task tests should verify that visualization artifacts are +produced and that the visualization step reads both the input sizing field and +the generated base mesh. + +If practical, tests should also verify that river geometry is included in the +visualization path so the final diagnostic package really covers the mesh, +resolution field, and retained river inputs together. 
+ +### Testing and Validation: Downstream Remap and Culling Variants for Unified Meshes + +Date last modified: 2026/04/27 + +Contributors: + +- Xylar Asay-Davis +- Codex + +At least one integration-style test should run a coarse unified base mesh into +the downstream topography-remap and cull variants and verify that the expected +intermediate and final products are produced. + +Success for this requirement is not tuned scientific quality on the first +attempt. It is that the unified mesh products pass cleanly into the existing +downstream pipeline and produce the expected remapped topography, masks, and +culled land and ocean meshes. + +Current unit tests already verify that the explicit unified remap and cull task +variants are registered for each named mesh and that the coarsest unified mesh +selects the expected low-resolution topography path. \ No newline at end of file diff --git a/docs/design_docs/unified_mesh_prepare_coastline.md b/docs/design_docs/unified_mesh_prepare_coastline.md new file mode 100644 index 0000000000..4951458ccc --- /dev/null +++ b/docs/design_docs/unified_mesh_prepare_coastline.md @@ -0,0 +1,617 @@ +# Unified Mesh: Coastline Preparation + +date: 2026/04/13 + +Contributors: + +- Xylar Asay-Davis +- Codex + +## Summary + +This design describes the shared `prepare_coastline` step and an associated +task that can run that shared step on its own for the unified global base-mesh +workflow. The purpose of the step is to create a single coastline +interpretation that downstream steps can reuse, especially +`prepare_river_network` and `build_sizing_field`. + +The shared coastline workflow is implemented in Polaris pull request +. + +The preferred first source for coastline information is the combined +topography already used in `e3sm/init/topo`, because that gives the strongest +consistency with downstream topography remapping and culling. 
The resulting +coastline products should be defined on the same regular lon/lat grid that +`build_sizing_field` will consume. + +The implementation keeps the shared coastline interface raster-first. In +particular, the public output contract uses target-grid masks and +coastal-distance fields rather than a persisted polygonal coastline product. +If temporary contour extraction is ever needed internally, it should remain an +implementation detail rather than the main workflow artifact. + +Success means that Polaris gains a documented, reusable coastline-preparation +workflow whose outputs can be consumed directly by downstream steps and whose +standalone task makes it practical to inspect and iterate on coastline choices +without running the full unified mesh workflow. + +## Workflow Context + +The overall unified-mesh workflow is described in +[Unified Mesh: Global Base Mesh Workflow](unified_base_mesh.md). + +There are no earlier stage-specific unified-mesh design documents upstream of +this coastline workflow in the current series. + +The downstream unified-mesh workflow designs are: + +- [Unified Mesh: River Network Preparation](unified_mesh_prepare_river_network.md) +- [Unified Mesh: Sizing-Field Construction](unified_mesh_build_sizing_field.md) +- [Unified Mesh: Base-Mesh Creation and Downstream Integration](unified_mesh_create_base_mesh.md) + +## Requirements + +### Requirement: Raster-First Coastline Products for Downstream Steps + +Date last modified: 2026/04/13 + +Contributors: + +- Xylar Asay-Davis +- Codex + +`prepare_coastline` shall provide a shared coastline representation that can +be consumed directly by both `prepare_river_network` and +`build_sizing_field`. + +The shared product shall retain both land/ocean classification and coastal +proximity information over the global domain. 
+ +The target-grid topography and any coastline-derived sizing inputs shall be +finer than the local destination mesh resolution whenever coastline fidelity +matters, rather than merely matching it. In particular, coarse remapped +topography can produce an unacceptably degraded coastline because of bilinear +interpolation, so a 1-degree product should not be treated as generally +adequate for coastline preparation. + +The downstream steps shall not need to reinterpret raw coastline or raw +topography source datasets independently. + +### Requirement: Topography-Consistent and Explicit Coastline Definition + +Date last modified: 2026/04/25 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The preferred coastline definition shall be consistent with the combined +topography interpretation already used by the existing `e3sm/init/topo` +workflow. + +The treatment of floating Antarctic ice shall be explicit and reproducible, +rather than being left implicit in overlapping land and ocean masks. + +The coastline workflow shall derive an exclusive ocean mask by starting from +the ocean side and flood filling connected ocean regions, so the ocean +interpretation remains contiguous and disconnected depressions are not +misidentified as ocean simply because their remapped topography falls below +sea level. + +The coastline workflow shall support critical ocean passages and critical land +blockages that are applied before flood fill. Critical passages are needed to +keep semi-enclosed seas such as the Mediterranean connected to the ocean +domain when the remapped topography would otherwise close them; critical land +blockages are needed to close known artificial openings that should remain +land for the selected coastline interpretation. + +The coastline workflow shall support multiple explicit Antarctic coastline +definitions within the shared design rather than baking in only the first +consumer's needs. 
+ +If the topography-derived coastline proves unsuitable for some workflows, the +design shall allow an alternate source such as Natural Earth without changing +the downstream interface. + +### Requirement: Global Coastal Distance on the Sphere + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The coastline product shall support smooth coastal transition zones for mesh +sizing on the sphere, including across the antimeridian. + +The coastal-distance definition shall be suitable for the regular lon/lat grid +used by `build_sizing_field`. + +The first design shall avoid assuming that planar buffering or planar +Euclidean distance is adequate on a periodic global grid. + +### Requirement: Standalone Coastline Task + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +Polaris shall provide a task that runs the shared `prepare_coastline` step and +the shared steps it depends on (e.g. `e3sm/init/topo/combine`). + +The standalone task shall make it practical to inspect coastline outputs and +compare coastline options without running the full unified mesh workflow. + +The same shared step and configuration shall be reusable from the full unified +workflow when settings match. + +## Algorithm Design + +### Algorithm Design: Raster-First Coastline Products for Downstream Steps + +Date last modified: 2026/04/14 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The authoritative coastline products should be defined on the same regular +lon/lat grid that `build_sizing_field` will use. This implies that target-grid +selection should happen once in shared configuration, not independently inside +each downstream step. + +The preferred upstream source is the existing `e3sm/init/topo/combine` +workflow, because `CombineStep` already supports `target_grid = lat_lon`. +Rather than inventing a separate remap path, the coastline workflow should +reuse that capability to obtain combined topography on the target grid. 
+ +The target-grid choice should be constrained by coastline fidelity, not only +by downstream convenience. Because the coastline is inferred from remapped +topography, the remapped product and any derived sizing array should be +meaningfully finer than the local destination mesh spacing. A 0.25-degree +product is useful as a cheaper inspection tier, but testing shows it is too +coarse for scientifically valid coastline preparation because semi-enclosed +basins such as the Mediterranean can disappear. The 1-degree product is +valuable mainly for very coarse mesh workflows such as smoke-test meshes near +240 km, but `prepare_coastline` should support four coastline target-grid +tiers: 0.25, 0.125, 0.0625, and 0.03125 degree. The shared lat-lon combine +workflow should support those same four resolutions plus the 1.0-degree +smoke-test tier. + +The shared output contract should remain raster-first. The first design should +assume outputs such as: + +- combined topography on the target grid, either as a direct dependency or as + a shared input artifact, not necessarily a new coastline output; +- one convention-specific coastline product per supported Antarctic + convention, each containing exclusive land/ocean masks on the shared target + grid; +- coastline-cell or coastline-edge indicators for those conventions, plus any + lightweight boundary-sample diagnostics needed by downstream steps; and +- signed coastal-distance fields for those conventions. + +With this contract, `prepare_river_network` can use the mask or coastline-edge +information for the convention chosen by workflow config, while +`build_sizing_field` can consume the corresponding signed-distance field +directly. + +This approach avoids making a polygonal coastline product part of the public +interface. If temporary contour extraction is ever needed for an internal +experiment, it should not become the required downstream artifact. 
+ +### Algorithm Design: Topography-Consistent and Explicit Coastline Definition + +Date last modified: 2026/04/25 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The preferred coastline definition should start from the combined topography +fields already used downstream, especially `base_elevation`, `ice_mask`, and +`grounded_mask`. + +Outside Antarctica, or more generally where floating ice is absent, the coast +can be interpreted as the zero contour of `base_elevation` after remapping to +the target lon/lat grid. + +Around Antarctica, the existing topography masking logic does not define a +single exclusive coastline by itself because floating ice contributes to the +land interpretation while the water below it may still contribute to the ocean +interpretation. The coastline workflow should therefore define an explicit +Antarctic convention instead of inheriting that ambiguity. + +The first design should produce three related Antarctic coastline products from +the same remapped topography inputs and mask-building logic: + +- `calving_front`, where floating ice is treated as land for coastline + purposes, so the ocean excludes Antarctic ice-shelf cavities and the + coastline follows the calving front; +- `grounding_line`, where floating ice is treated as ocean for coastline + purposes, so the ocean includes Antarctic ice-shelf cavities and the + coastline follows the grounding line; and +- `bedrock_zero`, where ocean additionally includes grounded Antarctic ice + below sea level, so the coastline follows the zero contour of bedrock. + +These three products should be generated together and cached together rather +than treated as separate future workflow branches. Omega may initially consume +only `calving_front`, but the unified mesh design should preserve the other two +because static cavities, wetting-and-drying, and dynamic grounding-line work +are expected downstream use cases. 
+ +The coastline step should expose these variants through separate but +simultaneously generated products, and downstream steps should explicitly +choose which convention to consume through workflow configuration. This is +expected to align naturally with different unified-mesh variants, such as +meshes that exclude Antarctic ice-shelf cavities and meshes that include them. + +An exclusive ocean mask should not be inferred solely from a local threshold +such as `base_elevation < 0`. Instead, each Antarctic convention should first +define a candidate ocean mask and then perform a flood fill from trusted +ocean-side seed cells to determine the connected ocean region. Before flood +fill, default critical transects from `geometric_features` should be +rasterized onto the target grid. Transects tagged as critical land blockages +remove cells from the candidate ocean mask, while transects tagged as critical +passages add cells to it. This gives the flood fill enough connectivity +information to include major semi-enclosed seas and enough blockage +information to prevent known false ocean connections. + +The first design should seed from all candidate-ocean cells on the +northernmost latitude row. Cells that are below sea level but disconnected +from the global ocean should remain on the land side of the partition unless a +later workflow explicitly decides otherwise. This flood-fill step is important +both in Antarctica and elsewhere for preserving a contiguous ocean +interpretation. + +If one default must be chosen early for existing downstream workflows, +`calving_front` appears to be the safer first shared product because it gives a +single land-ocean partition that is more naturally aligned with land and river +outlet logic. However, the standalone task should make it easy to compare that +default with the other two shared products before the full workflow commits to +consumer-specific assumptions. 
+ +If the topography-derived coastline proves too noisy, too expensive, or +otherwise unsuitable, a fallback source such as Natural Earth should be +rasterized onto the same target grid and normalized into the same output +contract. In this way, downstream steps can remain agnostic about the +coastline source. + +### Algorithm Design: Global Coastal Distance on the Sphere + +Date last modified: 2026/04/14 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The preferred first algorithm is to compute coastal distance directly from the +exclusive raster mask on the periodic lon/lat grid, rather than requiring a +persisted vector geometry product. + +The basic formulation should be: + +1. For each requested coastline convention, construct a candidate ocean mask + on the target grid from the remapped topography fields. +2. Flood fill from trusted ocean-side seed cells to obtain an exclusive, + ocean-connected land/ocean mask. +3. Identify coastline transitions wherever neighboring grid cells switch + between land and ocean, wrapping in longitude across the antimeridian. +4. Represent each coastline transition by one or more boundary samples located + on the corresponding grid-cell edges. +5. Convert the boundary samples and all target-grid points to Cartesian + coordinates on the sphere. +6. Use nearest-neighbor search in Cartesian space to estimate the unsigned + distance from each grid point to the nearest coastline sample. +7. Apply the sign from the exclusive land/ocean mask. + +This formulation has two advantages for the present design. First, it keeps +the public interface raster-based. Second, it turns antimeridian handling into +a periodic-neighbor problem on the target grid rather than a vector-topology +problem. 
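Steps 3 and 4 of the formulation above can be sketched as follows. This is a minimal illustration, assuming a regular global grid with 1-D cell-center coordinates in degrees and one boundary sample per transition edge. The function name `coastline_edge_samples` is hypothetical.

```python
import numpy as np


def coastline_edge_samples(ocean_mask, lon, lat):
    """Return (lon, lat) samples on cell edges where land meets ocean.

    ocean_mask has shape (nlat, nlon). lon and lat are 1-D cell centers
    in degrees on a regular grid whose longitudes span the full globe.
    """
    mask = np.asarray(ocean_mask, dtype=bool)
    dlon = lon[1] - lon[0]

    # east edges: compare each cell with its eastern neighbor, wrapping the
    # last column onto the first so the antimeridian is not a special case
    east = mask != np.roll(mask, -1, axis=1)
    j, i = np.nonzero(east)
    east_lon = lon[i] + 0.5 * dlon
    east_lat = lat[j]

    # north edges: compare each row with the next one (no wrap in latitude)
    north = mask[:-1, :] != mask[1:, :]
    j, i = np.nonzero(north)
    north_lon = lon[i]
    north_lat = 0.5 * (lat[j] + lat[j + 1])

    sample_lon = np.concatenate([east_lon, north_lon])
    sample_lat = np.concatenate([east_lat, north_lat])
    # keep sample longitudes in [-180, 180)
    sample_lon = (sample_lon + 180.0) % 360.0 - 180.0
    return sample_lon, sample_lat
```

Because the east-neighbor comparison uses a periodic roll, a coastline that crosses the antimeridian produces samples exactly as it would anywhere else on the grid.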
+ +The initial distance estimate can follow the same boundary-sample and KD-tree +style already used in `mpas_tools.mesh.creation.signed_distance`, but with the +boundary samples extracted from raster coastline transitions instead of from +vector geometry. If later testing shows that this approximation is too noisy +or too inaccurate, we can refine the boundary sampling or temporarily extract +contours internally without changing the external workflow contract. + +The sign convention should be recorded explicitly. For example, the workflow +can define negative distance over land and positive distance over ocean, or the +reverse, as long as `build_sizing_field` interprets it consistently. + +### Algorithm Design: Standalone Coastline Task + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The standalone task should be a thin wrapper around the shared +`prepare_coastline` step rather than a separate implementation path. + +The task will likely depend on a shared target-grid topography product, ideally +reused from the existing `combine_topo` capability on a lat/lon grid. From +there, the task can run the shared coastline step and any lightweight +diagnostic or visualization steps that prove useful. + +This standalone task is important for design iteration. It provides a place to +compare topography-derived and fallback coastlines, to compare Antarctic +conventions, and to inspect the target-grid mask and signed-distance products +without also running river preprocessing, sizing-field construction, or mesh +generation. + +Because the task wraps the shared step, the same outputs can later be reused +by the full unified workflow when configuration choices match. 
+ +## Implementation + +### Implementation: Raster-First Coastline Products for Downstream Steps + +Date last modified: 2026/04/27 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The current implementation adds a shared coastline-preparation workflow in +`polaris.tasks.mesh.spherical.unified.coastline` and reuses the shared lat-lon +combined-topography steps through `get_lat_lon_topo_steps()` rather than +adding a separate remap path. + +That enabling work was already put in place by the shared lat-lon combined- +topography support in Polaris pull request +, which added shared lat-lon +combined-topography tasks and steps at 1.0, 0.25, 0.125, 0.0625, and 0.03125 +degree, with combined outputs including `base_elevation`, `ice_draft`, +`ice_thickness`, `ice_mask`, and `grounded_mask`. `prepare_coastline` now +treats those shared lat-lon topography products as the authoritative upstream +inputs for the preferred topo-derived path. + +The implemented coastline workflow supports every one of those resolutions +except the 1.0-degree smoke-test product, giving four coastline target-grid +tiers. Standalone coastline tasks exist for 0.25, 0.125, 0.0625, and 0.03125 +degree. The expected usage is that 0.25 degree remains the cheaper inspection +tier, while 0.125, 0.0625, and 0.03125 degree are the scientifically credible +coastline tiers. See Polaris pull request +. + +The shared coastline step writes one convention-specific NetCDF file for each +of `calving_front`, `grounding_line`, and `bedrock_zero`. Each file currently +contains: + +- `ocean_mask` and `signed_distance`; and +- metadata including the coastline convention, target-grid type and + resolution, source type, mask threshold, sea-level threshold, flood-fill + seed strategy, sign convention, and text descriptions of the coastline-edge + and distance definitions. + +The current implementation also records the source combined-topography file +and source step in the output attributes.
This satisfies part of the intended +lightweight metadata and diagnostics contract for recording the selected +target-grid tier, source type, mask thresholds, flood-fill seed strategy, and +sign convention. The implementation still does not write boundary samples as a +public product, so any later reuse of those samples by +`prepare_river_network` remains future work. + +The implemented source path is only the topo-derived one so far. A Natural +Earth fallback has not been added yet. + +### Implementation: Topography-Consistent and Explicit Coastline Definition + +Date last modified: 2026/04/18 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The current implementation always generates all three Antarctic coastline +products in one run and writes them as separate files that downstream steps +can select from explicitly. + +The implemented topo-derived path is organized around the following concrete +operations: + +1. read `base_elevation`, `ice_mask`, and `grounded_mask` from the shared + combined-topography dataset on the target lat-lon grid; +2. threshold the remapped `ice_mask` and `grounded_mask` arrays with the + configurable `mask_threshold` option, whose default is 0.5; +3. build candidate ocean masks for `calving_front`, `grounding_line`, and + `bedrock_zero`; +4. optionally rasterize critical land blockages and passages from + `geometric_features` and apply them to the candidate ocean masks before + flood fill; +5. flood fill by labeling connected candidate-ocean regions, merging labels + that wrap across the eastern and western grid edges, and keeping only + regions connected to the northernmost latitude row; +6. derive the transient coastline-edge diagnostics needed for signed-distance + sampling; and +7. write one output file per convention.
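The connected-region operation in the list above (label, merge across the east-west seam, keep what touches the northernmost row) can be sketched with `scipy.ndimage.label`. This is a minimal illustration rather than the Polaris implementation. It assumes latitude increases along axis 0, so the northernmost row is the last one, and that any critical passages and land blockages have already been applied to the candidate mask. The function name `connected_ocean` is hypothetical.

```python
import numpy as np
from scipy import ndimage


def connected_ocean(candidate):
    """Keep candidate-ocean cells connected to the northernmost row.

    candidate has shape (nlat, nlon), with latitude increasing along
    axis 0 and longitude periodic along axis 1.
    """
    candidate = np.asarray(candidate, dtype=bool)
    # the default structuring element gives 4-connected regions in 2-D
    labels, count = ndimage.label(candidate)

    # small union-find to merge labels that touch across the seam
    parent = np.arange(count + 1)

    def find(k):
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    for west, east in zip(labels[:, 0], labels[:, -1]):
        if west > 0 and east > 0:
            parent[find(west)] = find(east)

    roots = np.array([find(k) for k in range(count + 1)])
    merged = roots[labels]

    # seed from every candidate-ocean cell on the northernmost row
    seeds = np.unique(merged[-1, :])
    seeds = seeds[seeds > 0]
    return np.isin(merged, seeds)
```

Cells that are below sea level but isolated from the seeded region drop out of the result, which matches the intent that disconnected depressions stay on the land side.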
+ +The current candidate-mask definitions are: + +- `calving_front`: below sea level and not covered by ice; +- `grounding_line`: below sea level and not covered by grounded ice; and +- `bedrock_zero`: below sea level, regardless of ice state. + +Outside Antarctica, where `ice_mask` and `grounded_mask` are effectively zero, +the three candidate masks reduce to the same open-ocean interpretation. + +The implemented workflow still does not include a Natural Earth fallback. It +does, however, write diagnostics that make the mask-building process auditable +through the `viz` step, including final ocean-mask and signed-distance plots +for each convention. + +The default configuration sets `include_critical_transects = True`, so the +shared critical land blockages and passages are included in normal task runs. + +### Implementation: Global Coastal Distance on the Sphere + +Date last modified: 2026/04/18 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The current implementation starts from the raster land/ocean masks, identifies +coastline locations where neighboring cells switch between land and ocean, and +computes spherical distance from each target-grid cell to the nearest such +coastline sample without introducing a persisted vector coastline product. + +In practice, the implemented workflow: + +1. derives coastline transitions directly from the exclusive ocean masks; +2. builds transient east-edge and north-edge coastline diagnostics; +3. places coastline samples at east-edge angular midpoints and north-edge + latitudinal midpoints; +4. converts target-grid cell centers and coastline samples to Cartesian + coordinates on the sphere; +5. uses `scipy.spatial.cKDTree` to compute nearest-sample chord distances, + then converts those to spherical arc distance; and +6. applies the sign convention of negative over land and positive over ocean. + +Signed-distance fields are currently generated for all three conventions in +every run. 
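As a rough illustration of steps 4 through 6, the sketch below converts longitude and latitude to unit-sphere Cartesian coordinates, queries `scipy.spatial.cKDTree` for the nearest coastline sample, and converts the chord length to arc length. For a unit sphere a chord of length c subtends an angle of 2 arcsin(c / 2). The function names and the Earth radius value are assumptions made for this example and are not taken from the Polaris code.

```python
import numpy as np
from scipy.spatial import cKDTree

EARTH_RADIUS = 6371229.0  # meters; assumed value for this sketch


def lon_lat_to_xyz(lon_deg, lat_deg):
    """Convert degrees to Cartesian points on the unit sphere."""
    lon = np.deg2rad(lon_deg)
    lat = np.deg2rad(lat_deg)
    return np.stack(
        [np.cos(lat) * np.cos(lon), np.cos(lat) * np.sin(lon), np.sin(lat)],
        axis=-1,
    )


def signed_distance(cell_lon, cell_lat, ocean_mask, sample_lon, sample_lat):
    """Signed arc distance (m) from each cell to the nearest sample.

    Hypothetical helper: negative over land, positive over ocean.
    cell_lon, cell_lat, ocean_mask : 2D arrays of the same shape
    sample_lon, sample_lat : 1D arrays of coastline sample positions
    """
    cells = lon_lat_to_xyz(cell_lon.ravel(), cell_lat.ravel())
    samples = lon_lat_to_xyz(sample_lon, sample_lat)

    # Nearest neighbor in 3D gives the straight-line chord on the unit sphere.
    chord, _ = cKDTree(samples).query(cells)

    # Chord to arc: theta = 2 * arcsin(chord / 2), then scale by the radius.
    arc = 2.0 * np.arcsin(np.clip(0.5 * chord, 0.0, 1.0)) * EARTH_RADIUS

    sign = np.where(ocean_mask.ravel(), 1.0, -1.0)
    return (sign * arc).reshape(ocean_mask.shape)
```

Because the search happens in Cartesian space, cells on either side of the antimeridian find each other without any special handling of longitude wrapping.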
+ +The implemented `viz` step writes global and Antarctic binary plots of the +final `ocean_mask`, signed-distance plots for each convention, and +`debug_summary.txt`. + +The same rasterization machinery used for critical passages and land blockages +handles diagonal paths as four-connected raster paths and treats longitude as +periodic across the antimeridian. + +### Implementation: Standalone Coastline Task + +Date last modified: 2026/04/18 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The current implementation adds a lightweight task wrapper around the shared +steps and does not introduce a separate task-specific coastline algorithm. + +`LatLonCoastlineTask` depends on the selected shared lat-lon combined- +topography step and then adds the shared coastline step plus the shared `viz` +step with `include_viz=True`. The shared-step helper for this path is +`get_lat_lon_coastline_steps()`. + +The standalone task subdirectories are currently: + +- `spherical/unified/coastline/lat_lon/0.25000_degree/task` +- `spherical/unified/coastline/lat_lon/0.12500_degree/task` +- `spherical/unified/coastline/lat_lon/0.06250_degree/task` +- `spherical/unified/coastline/lat_lon/0.03125_degree/task` + +Each task links the shared `coastline.cfg` file and exposes the `combine_topo`, +`prepare`, and `viz` step directories within the task work directory. + +The task is therefore the current place to inspect whether a target-grid tier +is adequate for a given intended mesh resolution before using the product in a +later unified workflow. + +## Testing + +### Testing and Validation: Raster-First Coastline Products for Downstream Steps + +Date last modified: 2026/04/25 + +Contributors: + +- Xylar Asay-Davis +- Codex + +Automated tests now verify the public output contract at the dataset-builder +level. 
In `tests/mesh/spherical/unified/test_coastline.py`, the current +coverage confirms that the convention-specific coastline products expose the +expected variables, that the conventions are returned together in the +expected order, and that the +output metadata records items such as the coastline convention and flood-fill +seed strategy. + +There is not yet automated validation that compares different target-grid +tiers. Downstream contract coverage now exists at the unit-test level in the +river and sizing-field tests, but there is not yet a task-level integration +test that runs the full coastline-to-river-to-sizing-field chain. + +### Testing and Validation: Topography-Consistent and Explicit Coastline Definition + +Date last modified: 2026/04/25 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The current automated tests in this area are unit tests on synthetic target- +grid datasets rather than full global products. + +Those tests currently cover: + +- a case where `calving_front`, `grounding_line`, and `bedrock_zero` differ in + Antarctica; +- a disconnected below-sea-level basin that remains on the land side after + flood fill; +- a case confirming that the northernmost latitude row is used for flood-fill + seeding even when latitude values are ordered south to north; +- a critical land blockage that closes a narrow ocean connection; and +- a critical passage that connects an otherwise disconnected ocean region. + +The current tests do not yet include dedicated threshold-sensitivity cases or +full-resolution comparisons against realistic global datasets. + +### Testing and Validation: Global Coastal Distance on the Sphere + +Date last modified: 2026/04/25 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The current automated coverage checks the signed-distance field indirectly in +synthetic dataset tests.
Existing assertions confirm that the field is finite +for the tested cases and that the sign matches the intended convention of +negative over land and positive over ocean. + +There is an antimeridian-specific automated test for critical-transect +rasterization, but not yet for the signed-distance field itself. There is also +not yet a task-level baseline that checks the smoothness of the signed-distance +field on realistic global products. Manual inspection is still needed for +those cases. + +### Testing and Validation: Standalone Coastline Task + +Date last modified: 2026/04/25 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The standalone task is the intended place to inspect and compare coastline +choices before they are used in a later unified workflow, but there is not yet +an automated task-level smoke test for it. + +A test run on Frontier showed the expected behavior. It involved: + +- comparing 0.25, 0.125, 0.0625, and 0.03125 degree coastline fidelity; +- comparing the three Antarctic coastline conventions; +- inspecting the global and Antarctic coastline and signed-distance plots; and +- reviewing `debug_summary.txt` for ocean-mask counts and signed-distance + ranges. diff --git a/docs/design_docs/unified_mesh_prepare_river_network.md b/docs/design_docs/unified_mesh_prepare_river_network.md new file mode 100644 index 0000000000..3729217c96 --- /dev/null +++ b/docs/design_docs/unified_mesh_prepare_river_network.md @@ -0,0 +1,476 @@ +# Unified Mesh: River Network Preparation + +date: 2026/04/19 + +Contributors: + +- Xylar Asay-Davis +- Codex + +## Summary + +This design describes the shared `prepare_river_network` step and associated +tasks that can run the shared river steps on their own for the unified global +base-mesh workflow. The purpose of the step is to simplify a global river +dataset into products that can be consumed directly by `build_sizing_field` +without re-reading or reinterpreting the raw source data.
+ +The shared river-network workflow is implemented in Polaris pull request +. + +The preferred first source is HydroRIVERS or an equivalent global flowline +dataset. Unlike the standalone +[`mpas_land_mesh`](https://github.com/changliao1025/mpas_land_mesh) +workflow, the Polaris design makes the downstream interface explicit. In +particular, the workflow distinguishes between the authoritative simplified +river network, the target-grid products needed by `build_sizing_field`, and +the mesh-conditioned products needed by `create_base_mesh`, rather than +overloading a single raster with mixed semantics. + +Because river-network simplification and river-driven meshing are the parts of +the workflow where Xylar's design intuition is currently weakest, the first +Polaris design should preserve the +[`mpas_land_mesh`](https://github.com/changliao1025/mpas_land_mesh) +river algorithms as closely as is practical. + +The implementation aligns `prepare_river_network` with the shared target-grid +tier and coastline interpretation chosen for the workflow, so that river +outlets and coastal refinement can be made consistent. + +Success means that Polaris gains a documented, reusable river-network +preprocessing workflow that preserves the major hydrographic controls relevant +for mesh generation and makes its outputs easy to inspect and easy for +downstream steps to consume. + +## Workflow Context + +The overall unified-mesh workflow is described in +[Unified Mesh: Global Base Mesh Workflow](unified_base_mesh.md). 
+ +The upstream unified-mesh workflow design is: + +- [Unified Mesh: Coastline Preparation](unified_mesh_prepare_coastline.md) + +The downstream unified-mesh workflow designs are: + +- [Unified Mesh: Sizing-Field Construction](unified_mesh_build_sizing_field.md) +- [Unified Mesh: Base-Mesh Creation and Downstream Integration](unified_mesh_create_base_mesh.md) + +## Requirements + +### Requirement: Downstream-Ready River Network Products + +Date last modified: 2026/04/27 + +Contributors: + +- Xylar Asay-Davis +- Codex + +`prepare_river_network` shall provide source-level, target-grid, and +mesh-conditioned river products that can be consumed directly by +`build_sizing_field` and `create_base_mesh`. + +The shared products shall retain the major river-network information needed for +mesh refinement and direct cell-center placement, including channel locations +and outlet locations. + +The downstream sizing-field and base-mesh steps shall not need to rerun +HydroRIVERS filtering, network reconstruction, outlet discovery, or +coastline-aware river clipping and simplification. + +### Requirement: Hydrologically Meaningful Simplification + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The first implementation shall preserve the dominant global river outlets, +main stems, and major tributaries needed to inform mesh resolution. + +The design shall support filtering by drainage area and by proximity so the +retained network reflects the target mesh scale rather than the full source +dataset density. + +The simplification shall preserve connectivity and confluence structure rather +than reducing the product to disconnected local segments. + +Where practical, the first Polaris design shall preserve the existing +[`mpas_land_mesh`](https://github.com/changliao1025/mpas_land_mesh) +river-network algorithms rather than redesigning them. 
+ +### Requirement: Coastline-Consistent Outlets and Explicit Inland Sinks + +Date last modified: 2026/04/10 + +Contributors: + +- Xylar Asay-Davis +- Codex + +River outlets that drain to the ocean shall be made consistent with the +coastline interpretation selected in `prepare_coastline`. + +Endorheic basins and other inland sinks shall remain explicit rather than +being folded into the ocean-outlet logic. + +The workflow shall not assume that raw river-source outlet locations are +already perfectly consistent with the preferred coastline source. + +### Requirement: Standalone River-Network Task + +Date last modified: 2026/04/22 + +Contributors: + +- Xylar Asay-Davis +- Codex + +Polaris shall provide a standalone task for the shared source-level river +preprocessing and a standalone lat-lon task that runs the shared river +rasterization together with the shared steps it depends on (for example +`e3sm/init/topo/combine` and `prepare_coastline`). + +These standalone tasks shall make it practical to inspect retained basins, +outlets, target-grid river masks, and outlet-snapping diagnostics without +running the full unified mesh workflow. + +The same shared steps and configuration shall be reusable from the full unified +workflow when settings match. + +### Requirement: Reproducible Source Data Access + +Date last modified: 2026/04/19 + +Contributors: + +- Xylar Asay-Davis +- Codex + +All source datasets needed by `prepare_river_network` shall be obtained either +from documented public sources or, if that is not feasible, from the Polaris +database. + +The preferred implementation shall download raw source data from public +sources and perform any needed preprocessing within Polaris rather than +requiring users to provide local input-file paths. 
+ +Adding preprocessed artifacts to the Polaris database should be treated as a +fallback for cases where the source data are not publicly distributable or the +required preprocessing cannot be reproduced robustly within Polaris. + +## Algorithm Design + +### Algorithm Design: Downstream-Ready River Network Products + +Date last modified: 2026/04/27 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The current implementation separates source-level hydrographic products from +target-grid products rather than trying to make one step serve both roles. +This aligns with the design intent that downstream consumers should not need to +reinterpret HydroRIVERS or infer outlet semantics from one overloaded raster. + +At the source level, the workflow writes: + +- `source_river_network.geojson`, containing the converted HydroRIVERS source; +- `simplified_river_network.geojson`, containing retained segments with + `hyriv_id`, `main_riv`, `ord_stra`, `drainage_area`, `next_down`, + `endorheic`, `outlet_type`, and `outlet_hyriv_id`; and +- `retained_outlets.geojson`, containing the retained outlet points and their + basic classification. + +At the target-grid level, the workflow writes: + +- `river_network.nc`, with `river_channel_mask`, `river_outlet_mask`, + `river_ocean_outlet_mask`, and `river_inland_sink_mask`; and +- `river_outlets.geojson`, containing the snapped outlet points together with + source coordinates, snapped coordinates, snapped grid indices, snapping + distance, and `matched_to_ocean`. + +This is intentionally clearer than the standalone workflow's mixed raster +semantics. The present implementation does not yet add stream-order rasters or +basin IDs, but it does establish a clean product split that the +`build_sizing_field` implementation now consumes directly. 
+ +For base-mesh consumers, the workflow also writes a mesh-conditioned product +set: + +- `clipped_river_network.geojson`, containing river segments clipped inland of + the coastline and simplified for direct JIGSAW geometry use; +- `clipped_outlets.geojson`, containing only outlets that remain relevant after + that conditioning; and +- `clipped_river_network.nc`, containing masks regenerated from the clipped + network for diagnostics. + +These products are where the river workflow becomes aware of the selected +unified mesh and its direct cell-placement needs. `build_sizing_field` uses the +target-grid masks, while `create_base_mesh` consumes the conditioned vector +geometry. + +### Algorithm Design: Hydrologically Meaningful Simplification + +Date last modified: 2026/04/22 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The current Polaris implementation is a focused reimplementation built around +HydroRIVERS attributes such as `HYRIV_ID`, `MAIN_RIV`, `ORD_STRA`, +`UPLAND_SKM`, `NEXT_DOWN`, and `ENDORHEIC`. Its staged logic is: + +1. Filter source flowlines by a minimum drainage-area threshold tied to the + intended river-refinement scale. +2. Merge multiple source features with the same `hyriv_id` into one canonical + segment when needed. +3. Validate that the retained `NEXT_DOWN` graph is acyclic before attempting + basin traversal. +4. Identify candidate outlets from segments with `next_down == 0`, then retain + large, well-separated outlets based on geodesic distance while preserving + distinct inland sinks. +5. Traverse upstream iteratively from each retained outlet, keeping the + largest upstream segment at each confluence as the main stem. +6. Retain additional tributaries when either their drainage area exceeds a + configurable fraction of the main stem or their minimum distance from the + already retained basin skeleton exceeds the outlet-distance tolerance. + +The key point is that simplification should be basin-aware and topology-aware. 
+The Polaris design should preserve connectivity and confluences, not just apply +independent Douglas-Peucker style simplification to each source feature. + +### Algorithm Design: Coastline-Consistent Outlets and Explicit Inland Sinks + +Date last modified: 2026/04/22 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The simplified network is finalized in two phases: source-level retention and +target-grid reconciliation. The source-level step identifies retained outlet +points, and the lat-lon step then reconciles those points against the shared +coastline product. + +For ocean-draining basins, the current implementation searches for the nearest +ocean cell in `coastline.nc`, computes the haversine distance to that cell, and +marks the outlet as matched only if the distance is within the configured +`outlet_match_tolerance`. If no ocean cell is close enough, the outlet is still +snapped to the nearest grid cell but is recorded as `matched_to_ocean = false` +with the snapping distance preserved for diagnostics. + +Endorheic basins bypass ocean matching and are snapped to the nearest land cell +derived from the coastline `ocean_mask`. They retain the explicit +`inland_sink` classification in both the vector outlet metadata and the target- +grid masks. + +If an ocean-draining outlet cannot be matched within tolerance, the workflow +flags that basin through per-feature metadata and through the dataset attribute +`unmatched_ocean_outlets`. + +Once outlet reconciliation is complete, the simplified river network can be +rasterized onto the shared target grid. Rasterization should produce separate +channel and outlet masks rather than a single overloaded integer raster. + +### Algorithm Design: Standalone River-Network Task + +Date last modified: 2026/04/22 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The current standalone task design uses two thin wrappers rather than one +monolithic task. 
+ +`PrepareRiverNetworkTask` wraps only the shared source-level step and is the +right place to inspect HydroRIVERS conversion and source-grid-independent +simplification choices. `LatLonRiverNetworkTask` adds the shared lat-lon topo +combine step, the shared coastline step, the shared lat-lon river step, and an +optional visualization step so outlet matching and rasterization can be +inspected on a concrete target grid. + +This split keeps each task close to one layer of the interface while still +reusing the same shared steps that the `build_sizing_field` task consumes. + +## Implementation + +### Implementation: Downstream-Ready River Network Products + +Date last modified: 2026/04/27 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The file naming and class layout are now concrete. The river implementation is +organized under `polaris/tasks/mesh/spherical/unified/river/` as: + +- `source.py` for HydroRIVERS download, unpacking, shapefile conversion, and + source-level simplification; +- `lat_lon.py` for target-grid rasterization and outlet reconciliation; +- `base_mesh.py` for coastline-aware clipping and conditioning of retained + river geometry for final mesh generation; +- `viz.py` for diagnostic plotting and text summaries; +- `steps.py` for shared-step setup helpers; +- `task.py` for standalone task wrappers; and +- `river_network.cfg` for the shared configuration sections. + +This implementation prioritizes a clean output contract over carrying forward +the standalone workflow's mixed raster conventions. + +The first Polaris implementation should also avoid making the default workflow +depend on a user-supplied local source-file path. Instead, it should identify +the required public datasets and either download them directly or, only if +necessary, consume them from the Polaris database. 
+ +The source step obtains HydroRIVERS through `add_input_file()` using the public +archive URL in `[prepare_river_network]`, with the Polaris database still +available as a fallback cache location. The lat-lon step then consumes the +shared coastline dataset selected by `[prepare_river_lat_lon]`. The +`PrepareRiverForBaseMeshStep` consumes the simplified network together with the +selected coastline product and writes the clipped river geometry consumed by +the unified base-mesh step. + +### Implementation: Hydrologically Meaningful Simplification + +Date last modified: 2026/04/22 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The current simplification logic lives in +`simplify_river_network_feature_collection()` in +`polaris/tasks/mesh/spherical/unified/river/source.py`. It uses small focused +helpers for canonicalizing segments, validating downstream topology, filtering +outlets, and traversing retained basin structure. + +The implementation favors a compact Polaris-native reimplementation over a +direct migration of +[`mpas_land_mesh`](https://github.com/changliao1025/mpas_land_mesh) +helper layers. No clear defect emerged from the current unit tests, but this +remains an area where additional comparison against real HydroRIVERS output +would strengthen confidence. + +### Implementation: Coastline-Consistent Outlets and Explicit Inland Sinks + +Date last modified: 2026/04/22 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The current implementation keeps coastline matching and inland-sink treatment +explicit in both NetCDF and GeoJSON outputs. `river_network.nc` separates +channel cells, all outlet cells, ocean outlets, and inland sinks, and +`river_outlets.geojson` records both source and snapped positions together with +match status and snapping distance. 
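The ocean-outlet reconciliation described in the algorithm design can be sketched as follows. This is an illustration under stated assumptions rather than the Polaris code: `match_outlet`, its arguments, and most of the returned keys are hypothetical, the search is a brute-force scan over the whole grid, and the Earth radius is an assumed value. Only `matched_to_ocean` and the idea of an outlet-match tolerance come from the design above.

```python
import numpy as np

EARTH_RADIUS = 6371229.0  # meters; assumed value for this sketch


def haversine(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between points given in degrees."""
    lon1, lat1, lon2, lat2 = map(np.deg2rad, (lon1, lat1, lon2, lat2))
    a = (
        np.sin(0.5 * (lat2 - lat1)) ** 2
        + np.cos(lat1) * np.cos(lat2) * np.sin(0.5 * (lon2 - lon1)) ** 2
    )
    return 2.0 * EARTH_RADIUS * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


def match_outlet(lon, lat, grid_lon, grid_lat, ocean_mask, tolerance):
    """Snap one ocean-draining outlet to the target grid.

    Hypothetical helper. ocean_mask has shape (len(grid_lat), len(grid_lon));
    tolerance is in meters and stands in for `outlet_match_tolerance`.
    """
    lon2d, lat2d = np.meshgrid(grid_lon, grid_lat)
    dist = haversine(lon, lat, lon2d, lat2d)

    # Nearest ocean cell: land cells are excluded with an infinite distance.
    ocean_dist = np.where(ocean_mask, dist, np.inf)
    j, i = np.unravel_index(np.argmin(ocean_dist), ocean_dist.shape)
    matched = bool(ocean_dist[j, i] <= tolerance)

    if not matched:
        # Fall back to the nearest grid cell of any type, keeping the
        # snapping distance for diagnostics.
        j, i = np.unravel_index(np.argmin(dist), dist.shape)

    return {
        "matched_to_ocean": matched,
        "lat_index": int(j),
        "lon_index": int(i),
        "snap_distance": float(dist[j, i]),
    }
```

On a 0.03125-degree global grid a full scan per outlet is wasteful, so a spatial index or a windowed search around the outlet would be a natural refinement; the scan is used here only to keep the matching rule easy to read.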
+ +### Implementation: Standalone River-Network Task + +Date last modified: 2026/04/22 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The current implementation adds two lightweight task wrappers in +`polaris/tasks/mesh/spherical/unified/river/task.py` and avoids any separate +task-specific river-processing code path. `PrepareRiverNetworkTask` exposes the +shared source step, while `LatLonRiverNetworkTask` exposes the target-grid +workflow and diagnostics for each supported resolution. + +## Testing + +### Testing and Validation: Downstream-Ready River Network Products + +Date last modified: 2026/04/27 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The implementation now has unit tests for the source-level and target-grid +product contracts in `tests/mesh/spherical/unified/test_river.py`. These tests +verify that the expected masks and snapped-outlet metadata are written, that +ocean-outlet and inland-sink cases remain distinct, and that named unified-mesh +configs provide the required river options. + +Those tests also verify the coastline-aware conditioning used for base-mesh +products, including inland clipping, outlet removal near the coastline, and the +mesh-specific shared-step factory wiring for `river/base_mesh`. The base-mesh +tests then verify that `UnifiedBaseMeshStep` consumes the prepared +`clipped_river_network.geojson` product rather than raw river geometry. + +`build_sizing_field` unit tests consume the target-grid river masks. There is +still not a task-level integration test showing the full river workflow feeding +either the sizing-field task or the final base-mesh task on real data. + +### Testing and Validation: Hydrologically Meaningful Simplification + +Date last modified: 2026/04/25 + +Contributors: + +- Xylar Asay-Davis +- Codex + +Current unit tests validate whether major outlets, main stems, and major +tributaries are retained for representative synthetic networks, including deep +main stems and branching cases. 
They also verify that invalid cyclic +`NEXT_DOWN` graphs are rejected, that duplicate source features can be +converted from HydroRIVERS shapefile data, and that the HydroRIVERS archive +unpack path behaves as expected for a lightweight archive. + +What is still missing is validation against real HydroRIVERS subsets to ensure +the present heuristics retain scientifically appropriate networks across +different hydrographic settings. + +### Testing and Validation: Coastline-Consistent Outlets and Explicit Inland Sinks + +Date last modified: 2026/04/25 + +Contributors: + +- Xylar Asay-Davis +- Codex + +Current unit tests verify matched and unmatched ocean-outlet behavior, inland- +sink snapping to land cells, derivation of the land mask from `ocean_mask`, +and physical channel-buffering behavior. The visualization step also writes +`river_network_overview.png` and `debug_summary.txt`, which makes outlet +matching diagnostics straightforward to inspect in task runs. + +### Testing and Validation: Standalone River-Network Task + +Date last modified: 2026/04/25 + +Contributors: + +- Xylar Asay-Davis +- Codex + +The standalone tasks are now the primary implementation path for inspecting +river simplification and target-grid diagnostics. Unit tests verify that the +source and lat-lon task-registration code registers the expected named-mesh +tasks, uses mesh-specific subdirectories, and reuses shared configs and steps. + +Standalone smoke tests for each of the 3 supported unified meshes have been +run on Frontier, showing the expected rasterized river networks at each +resolution. Specific parameter choices still need to be fine-tuned.