Why Virtualize GeoTIFFs or COGs?

This repository contains a small set of demonstrations meant to prompt discussion about whether and how we should approach virtualizing GeoTIFFs and COGs.

First, some thoughts on why we should virtualize GeoTIFFs and/or COGs:

Provide faster access to non-cloud-optimized GeoTIFFs that contain some form of internal tiling, without any data duplication (see notebook #1).
Provide fully async I/O for both GeoTIFFs and COGs using Zarr-Python.
Allow loading a stack of GeoTIFFs/COGs into a data cube while minimizing the number of GET requests relative to using stackstac/xstac, thereby decreasing cost and increasing performance.
Give users a lazily loaded DataTree containing both the data and the overviews, so that scientists can use the overviews not only for tile-based visualization but also for quickly iterating on analytics.
Include ETags in the virtualized datasets to support reproducibility.
A motivation that's less clear to me, but may be possible, is using the virtualization layer to access COGs with disparate CRSs as a single dataset (zarr-developers/geozarr-spec#53).
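
To make the "no data duplication" and ETag points above more concrete, here is a minimal sketch of what a virtual reference could look like. It is not the implementation used in this repository. It assumes a single-band, internally tiled GeoTIFF whose TileOffsets and TileByteCounts tags have already been read, and it maps each tile to a chunk key of the form "row.col". The function names, the manifest layout, the bucket path, and the ETag value are all hypothetical and chosen for illustration only.

```python
import math


def build_chunk_manifest(path, etag, image_shape, tile_shape,
                         tile_offsets, tile_byte_counts):
    """Map each chunk key "row.col" to the byte range of the matching TIFF tile.

    Hypothetical helper: nothing is copied, the manifest only points at bytes
    that already exist inside the original file.
    """
    rows = math.ceil(image_shape[0] / tile_shape[0])
    cols = math.ceil(image_shape[1] / tile_shape[1])
    if len(tile_offsets) != rows * cols or len(tile_byte_counts) != rows * cols:
        raise ValueError("tile count does not match image and tile shapes")

    manifest = {}
    for i, (offset, length) in enumerate(zip(tile_offsets, tile_byte_counts)):
        # TIFF orders tiles left to right, then top to bottom (row-major).
        key = f"{i // cols}.{i % cols}"
        manifest[key] = {"path": path, "offset": offset,
                         "length": length, "etag": etag}
    return manifest


def range_request_headers(entry):
    """Build HTTP headers that fetch one chunk with a single ranged GET.

    If-Match asks the server to refuse the read if the object's ETag has
    changed since the reference was created.
    """
    last_byte = entry["offset"] + entry["length"] - 1
    return {"Range": f"bytes={entry['offset']}-{last_byte}",
            "If-Match": entry["etag"]}


# Illustrative values only: a 1024 x 1536 image with 512 x 512 tiles (2 x 3 tiles).
manifest = build_chunk_manifest(
    path="s3://example-bucket/scene.tif",  # hypothetical location
    etag='"abc123"',                       # hypothetical ETag
    image_shape=(1024, 1536),
    tile_shape=(512, 512),
    tile_offsets=[8, 1008, 2008, 3008, 4008, 5008],
    tile_byte_counts=[1000] * 6,
)
print(range_request_headers(manifest["1.2"]))
```

Each chunk read then becomes one ranged GET against the original file, with no separate request needed to re-read the TIFF header, and the stored ETag gives a way to detect that a file has changed since the references were generated.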

License

why-virtualize-geotiff is distributed under the terms of the MIT license.
