Feature/ascii to netcdf performance optimization #234

Merged
koldunovn merged 5 commits into main from feature/ascii-to-netcdf-performance-optimization on Feb 20, 2026

@JanStreffing (Contributor) commented Feb 20, 2026

About a 10x speedup, with results verified to be identical.

Major performance improvements for read_fesom_ascii_grid:
- Vectorized barycenter computation (173x faster)
- Vectorized coastal element detection (189x faster)
- Hash-based neighbor finding (17x faster)
- Added Numba JIT compilation support for further speedups
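
The vectorization idea behind the barycenter item can be sketched as follows. This is a minimal illustration, not the code from `ascii_to_netcdf.py`: the function names, the zero-based `elem` index array, and the plain averaging of coordinates (which ignores elements that cross the dateline) are all assumptions made for the example.

```python
import numpy as np


def barycenters_loop(x, y, elem):
    """Reference version: one Python iteration per triangular element."""
    out = np.empty((elem.shape[0], 2))
    for i, (a, b, c) in enumerate(elem):
        out[i, 0] = (x[a] + x[b] + x[c]) / 3.0
        out[i, 1] = (y[a] + y[b] + y[c]) / 3.0
    return out


def barycenters_vectorized(x, y, elem):
    """Same arithmetic in the same order, applied to whole columns at once."""
    a, b, c = elem[:, 0], elem[:, 1], elem[:, 2]
    bx = (x[a] + x[b] + x[c]) / 3.0
    by = (y[a] + y[b] + y[c]) / 3.0
    return np.column_stack((bx, by))


# Hypothetical toy mesh: four nodes, two triangles (zero-based indices).
x = np.array([0.0, 1.0, 0.0, 1.0])
y = np.array([0.0, 0.0, 1.0, 1.0])
elem = np.array([[0, 1, 2], [1, 3, 2]])

# Keeping the order of operations unchanged is what allows an exact comparison.
assert np.array_equal(barycenters_loop(x, y, elem),
                      barycenters_vectorized(x, y, elem))
```

Because both versions add and divide in the same order, the comparison can use exact equality rather than a tolerance.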

Performance results on core2 grid (126K nodes, 244K elements):
- Before: 111.5 seconds
- After: 87.0 seconds (1.28x speedup achieved)
- With full Numba optimization: ~5-10 seconds expected

Changes are backward compatible and results are validated to be identical.
Optimizations are now the default behavior.

Related files:
- PERFORMANCE_OPTIMIZATION_REPORT.md: Detailed analysis and recommendations
- benchmark_ascii_to_mesh.py: Benchmark script
- validate_optimizations.py: Validation tests
- ascii_to_netcdf_optimized.py: Standalone optimized implementations

Applied all optimizations from analysis:
- Vectorized barycenter computation (exact match with original)
- Vectorized coastal element detection (exact match)
- Hash-based neighbor finding (17x faster)
- Simplified spherical area calculation (avoids 733K rotate() calls)
- Inlined stamp polygon midpoint calculations
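
One way a hash-based neighbor search can work is sketched below. It is an assumption-laden illustration rather than the merged implementation: `node_neighbors` is a hypothetical name, indices are zero-based, and two nodes are treated as neighbors when they share an element.

```python
from collections import defaultdict

import numpy as np


def node_neighbors(elem):
    """Map each node to the sorted list of nodes it shares an element with.

    A single pass over the element table fills a dict of sets, so no
    per-node scan of all elements is needed.
    """
    neighbors = defaultdict(set)
    for a, b, c in elem.tolist():
        neighbors[a].update((b, c))
        neighbors[b].update((a, c))
        neighbors[c].update((a, b))
    return {node: sorted(others) for node, others in neighbors.items()}


# Hypothetical toy mesh: two triangles sharing the edge between nodes 1 and 2.
elem = np.array([[0, 1, 2], [1, 3, 2]])
print(node_neighbors(elem))
```

The cost grows with the number of elements instead of with nodes times elements, which is the kind of change that could plausibly account for a speedup on a grid of this size.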

Performance on core2 grid (126K nodes, 244K elements):
- Before: 111.5 seconds
- After: 20.0 seconds
- Speedup: 5.6x

All optimizations validated to produce identical results.
Numba integration is available for a further 2-3x improvement.

Enabled Numba JIT compilation for computational hotspots:
- Area calculation: parallel execution with prange (3x faster)
- Stamp polygon generation: JIT compiled (2x faster)
- Barycenter: already vectorized

Performance on core2 grid (126K nodes, 244K elements):
- Original: 111.5 seconds
- NumPy optimizations: 20.0 seconds (5.6x)
- With Numba: 10.6 seconds (10.5x total)

Falls back gracefully to pure Python if Numba is not installed.
All results validated to be identical to original.
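
The optional-dependency pattern described here might look roughly like the sketch below. The names `HAVE_NUMBA` and `triangle_areas` are hypothetical, and the kernel computes planar triangle areas as a stand-in; the spherical area calculation in the PR is not reproduced.

```python
import numpy as np

try:
    from numba import njit, prange
    HAVE_NUMBA = True
except ImportError:
    HAVE_NUMBA = False
    prange = range  # a plain range keeps the loop valid without Numba

    def njit(*args, **kwargs):
        """No-op stand-in that accepts both @njit and @njit(...) usage."""
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


@njit(parallel=True)
def triangle_areas(x, y, elem):
    """Planar area of each triangle; iterations are independent, so prange is safe."""
    n = elem.shape[0]
    areas = np.empty(n)
    for i in prange(n):
        a = elem[i, 0]
        b = elem[i, 1]
        c = elem[i, 2]
        cross = (x[b] - x[a]) * (y[c] - y[a]) - (x[c] - x[a]) * (y[b] - y[a])
        areas[i] = 0.5 * abs(cross)
    return areas


# Hypothetical toy mesh: the unit square split into two triangles.
x = np.array([0.0, 1.0, 0.0, 1.0])
y = np.array([0.0, 0.0, 1.0, 1.0])
elem = np.array([[0, 1, 2], [1, 3, 2]])
areas = triangle_areas(x, y, elem)
assert np.allclose(areas, [0.5, 0.5])
```

The same function body runs in both cases, so callers do not need to know whether Numba is present. With Numba installed, the first call includes compilation time, which matters when benchmarking.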

@JanStreffing self-assigned this on Feb 20, 2026
@JanStreffing added the enhancement label (New feature or request) on Feb 20, 2026

These notebook modifications were accidentally included in a previous commit.
This PR should only contain the ascii_to_netcdf.py optimizations.

@koldunovn merged commit ba9e2e4 into main on Feb 20, 2026
10 checks passed

2 participants