Commit f39c1c1

bitner and mmcfarland authored
Reorg0.7.0 (#160)
* Change directory structure to put pypgstac and pgstac under /src
* switch pypgstac to use hatch
* move migrations to the pgstac tree
* make symlink in pypgstac for migrations
* move pgstac.sql to src/pgstac/
* update scripts and docker setup
* Cleanup unused files. Adjust tests to work with new scripts
* update sql with partitioning rework and maintenance tooling
* fix: allow missing aws credential in pre-commit
* switch from methodtools to cachetools, remove commented out code
* add fix for #156

---------

Co-authored-by: Matt McFarland <[email protected]>
1 parent f42e233 commit f39c1c1

File tree

187 files changed

+7938
-2354
lines changed

.devcontainer/devcontainer.json

+5-25
@@ -1,26 +1,6 @@
 {
-    "name": "Ubuntu",
-    "build": {
-        "dockerfile": "../docker/Dockerfile",
-    },
-
-    // Set *default* container specific settings.json values on container create.
-    "settings": {
-        "terminal.integrated.shell.linux": "/bin/bash"
-    },
-
-    // Add the IDs of extensions you want installed when the container is created.
-    "extensions": [],
-
-    // Use 'forwardPorts' to make a list of ports inside the container available locally.
-    "forwardPorts": [5432],
-
-    // Use 'postCreateCommand' to run commands after the container is created.
-    //"postCreateCommand": "/docker-entrypoint.sh postgres",
-    "overrideCommand": false,
-
-    "containerEnv": {"POSTGRES_HOST_AUTH_METHOD": "trust","POSTGRES_USER":"postgres"},
-
-    // Comment out connect as root instead. More info: https://aka.ms/vscode-remote/containers/non-root.
-    "remoteUser": "root"
-}
+    "name": "PGStac",
+    "dockerComposeFile": "../docker-compose.yml",
+    "service": "pgstac",
+    "workspaceFolder": "/opt/src"
+}

.dockerignore

+11
@@ -5,3 +5,14 @@
 *.eggs
 venv/*
 */.direnv/*
+*/.ruff_cache/*
+*/.vscode/*
+*/.mypy_cache/*
+*/.pgadmin/*
+*/.ipynb_checkpoints/*
+*/.git/*
+*/.github/*
+*/env/*
+Dockerfile
+docker-compose.yml
+*/.devcontainer/*

.flake8

-4
This file was deleted.

.github/workflows/continuous-integration.yml

+19-3
@@ -6,11 +6,27 @@ on:
       - main
   pull_request:
 
+env:
+  REGISTRY: ghcr.io
+  IMAGE_NAME: ${{ github.repository }}
+  DOCKER_BUILDKIT: 1
+
 jobs:
   test:
     name: test
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v2
-      - name: Execute linters and test suites
-        run: ./scripts/cibuild
+      - uses: actions/checkout@v3
+      - uses: docker/setup-buildx-action@v1
+      - name: builder
+        id: builder
+        uses: docker/build-push-action@v2
+        with:
+          context: .
+          load: true
+          push: false
+          cache-from: type=gha
+          cache-to: type=gha, mode=max
+
+      - name: Run tests
+        run: docker run --rm ${{ steps.builder.outputs.imageid }} test

.github/workflows/release.yml

+12-3
@@ -24,14 +24,23 @@ jobs:
       - name: Install release dependencies
         run: |
           python -m pip install --upgrade pip
-          pip install setuptools wheel twine
+          pip install setuptools wheel twine build
 
-      - name: Build and publish package
+      - name: Build pypgstac release
+        run: |
+          pushd src/pypgstac
+          rm -rf dist
+          python -m build --sdist --wheel
+          popd
+
+      - name: Publish pypgstac release
         env:
           TWINE_USERNAME: ${{ secrets.PYPI_STACUTILS_USERNAME }}
           TWINE_PASSWORD: ${{ secrets.PYPI_STACUTILS_PASSWORD }}
         run: |
-          scripts/cipublish
+          pushd src/pypgstac
+          twine upload dist/*
+          popd
 
       - name: Tag Release
         uses: "marvinpinto/[email protected]"

.pre-commit-config.yaml

+61
@@ -0,0 +1,61 @@
+# See https://pre-commit.com for more information
+# See https://pre-commit.com/hooks.html for more hooks
+repos:
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v4.4.0
+    hooks:
+      - id: trailing-whitespace
+      - id: check-yaml
+      - id: check-added-large-files
+      - id: check-toml
+      - id: detect-aws-credentials
+        args: [--allow-missing-credential]
+      - id: detect-private-key
+      - id: check-json
+      - id: mixed-line-ending
+      - id: check-merge-conflict
+
+  - repo: https://github.com/charliermarsh/ruff-pre-commit
+    rev: 'v0.0.231'
+    hooks:
+      - id: ruff
+        files: pypgstac\/.*\.py$
+
+  - repo: local
+    hooks:
+      - id: sql
+        name: sql
+        entry: scripts/test
+        args: [--basicsql, --pgtap]
+        language: script
+        pass_filenames: false
+        verbose: true
+        fail_fast: true
+        files: sql\/.*\.sql$
+      - id: formatting
+        name: formatting
+        entry: scripts/test
+        args: [--formatting]
+        language: script
+        pass_filenames: false
+        verbose: true
+        fail_fast: true
+        always_run: true
+      - id: pypgstac
+        name: pypgstac
+        entry: scripts/test
+        args: [--pypgstac]
+        language: script
+        pass_filenames: false
+        verbose: true
+        fail_fast: true
+        files: pypgstac\/.*\.py$
+      - id: migrations
+        name: migrations
+        entry: scripts/test
+        args: [--migrations]
+        language: script
+        pass_filenames: false
+        verbose: true
+        fail_fast: true
+        files: migrations\/.*\.sql$
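
To make the effect of the local hooks in this config easier to review, here is a minimal Python sketch of which `scripts/test` flags a given set of changed files would trigger. The hook table is transcribed from the config above; the function name `flags_for_changes` is hypothetical, and the sketch assumes each `files` pattern is applied as an unanchored regex search against the repository-relative path. It ignores `fail_fast`, so it lists every hook that would be selected rather than stopping at the first failure.

```python
import re

# Transcribed from the local hooks in .pre-commit-config.yaml:
# (hook id, `files` pattern or None, always_run, args passed to scripts/test)
LOCAL_HOOKS = [
    ("sql", r"sql\/.*\.sql$", False, ["--basicsql", "--pgtap"]),
    ("formatting", None, True, ["--formatting"]),
    ("pypgstac", r"pypgstac\/.*\.py$", False, ["--pypgstac"]),
    ("migrations", r"migrations\/.*\.sql$", False, ["--migrations"]),
]


def flags_for_changes(changed_files):
    """Return the scripts/test flags selected for the given changed paths.

    Hypothetical helper for illustration only; pre-commit itself does the
    real file matching.
    """
    flags = []
    for _hook_id, pattern, always_run, args in LOCAL_HOOKS:
        matched = pattern is not None and any(
            re.search(pattern, path) for path in changed_files
        )
        if always_run or matched:
            flags.extend(args)
    return flags


if __name__ == "__main__":
    print(flags_for_changes(["src/pgstac/sql/001_core.sql"]))
    print(flags_for_changes(["src/pypgstac/pypgstac/load.py"]))
```

Under these assumptions, a change that only touches the README would still run the formatting hook, because it is the only one marked `always_run`.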

CHANGELOG.md

+21
@@ -4,6 +4,27 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](http://keepachangelog.com/)
 and this project adheres to [Semantic Versioning](http://semver.org/).
 
+## [v0.7.0]
+
+### Added
+- Reorganize code base to create clearer separation between pgstac sql code and pypgstac.
+- Move Python tooling to use hatch with all python project configuration in pyproject.toml
+- Rework testing framework to not rely on pypgstac or migrations. This allows to run tests on any code updates without creating a version first. If a new version has been staged, the tests will still run through all incremental migrations to make sure they pass as well.
+- Add pre-commit to run formatting as well as the tests appropriate for which files have changed.
+- Add a query queue to allow for deferred processing of steps that do not change the ability to get results, but enhance performance. The query queue allows to use pg_cron or similar to run tasks that are placed in the queue.
+- Modify triggers to allow the use of the query queue for building indexes, adding constraints that are used solely for constraint exclusion, and updating partition and collection spatial and temporal extents. The use of the queue is controlled by the new configuration parameter "use_queue" which can be set as the pgstac.use_queue GUC or by setting in the pgstac_settings table.
+- Reorganize how partitions are created and updated to maintain more metadata about partition extents and better tie the constraints to the actual temporal extent of a partition.
+- Add "partitions" view that shows stats about number of records, the partition range, constraint ranges, actual date range and spatial extent of each partition.
+- Add ability to automatically update the extent object on a collection using the partition metadata via triggers. This is controlled by the new configuration parameter "update_collection_extent" which can be set as the pgstac.update_collection_extent GUC or by setting in the pgstac_settings table. This can be combined with "use_queue" to defer the processing.
+- Add many new tests.
+- Migrations now make sure that all objects in the pgstac schema are owned by the pgstac_admin role. Functions marked as "SECURITY DEFINER" have been moved to the lower level functions responsible for creating/altering partitions and adding records to the search/search_wheres tables. This should open the door for approaches to using Row Level Security.
+- Allow pypgstac loader to load data on pgstac databases that have the same major version even if minor version differs. [162] (https://github.com/stac-utils/pgstac/issues/162) Cherry picked from https://github.com/stac-utils/pgstac/pull/164.
+
+### Fixed
+- Allow empty strings in datetime intervals
+- Set search_path and application_name upon connection rather than as kwargs for compatibility with RDS [156] (https://github.com/stac-utils/pgstac/issues/156)
+
+
 ## [v0.6.13]
 
 ### Fixed
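
The query queue described in the changelog entries above can be pictured with a small in-memory model. This is only a sketch of the idea in Python, not PgSTAC's SQL implementation: the class `QueryQueue` and its methods are hypothetical names, and the only behaviour taken from the changelog is that a `use_queue` setting decides whether a maintenance step runs immediately or is deferred for a scheduled job (the role pg_cron or similar would play) to pick up later.

```python
from collections import deque


class QueryQueue:
    """Hypothetical in-memory stand-in for a deferred query queue."""

    def __init__(self, use_queue=False):
        # Mirrors the "use_queue" configuration parameter from the changelog.
        self.use_queue = use_queue
        self.pending = deque()
        self.executed = []

    def submit(self, query):
        """Defer the query when use_queue is on, otherwise run it now."""
        if self.use_queue:
            self.pending.append(query)
        else:
            self._run(query)

    def run_queued(self):
        """Drain the queue in FIFO order and return how many queries ran."""
        count = 0
        while self.pending:
            self._run(self.pending.popleft())
            count += 1
        return count

    def _run(self, query):
        # A real implementation would execute SQL; this sketch only records it.
        self.executed.append(query)


if __name__ == "__main__":
    queue = QueryQueue(use_queue=True)
    queue.submit("build index on partition")   # placeholder text, not real SQL
    queue.submit("update collection extent")   # placeholder text, not real SQL
    print(queue.executed)      # nothing has run yet
    print(queue.run_queued())  # number of deferred queries now executed
```

The point of the design is that deferred steps do not affect whether searches return results, only how fast they run, so delaying them until a scheduled drain is safe.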

CONTRIBUTING.md

+31-13
@@ -55,24 +55,42 @@ scripts/stageversion 0.2.8
 This will create a base migration for the new version and will create incremental migrations between any existing base migrations. The incremental migrations that are automatically generated by this script will have the extension ".staged" on the file. You must manually review (and make any modifications necessary) this file and remove the ".staged" extension to enable the migration.
 
 ### Making Changes to SQL
-All changes to SQL should only be made in the `/sql` directory. SQL Files will be run in alphabetical order.
+All changes to SQL should only be made in the `/src/pgstac/sql` directory. SQL Files will be run in alphabetical order.
 
 ### Adding Tests
-PGStac uses PGTap to test SQL. Tests can be found in tests/pgtap.sql and are run using `scripts/test`
+PGStac tests can be written using PGTap or basic SQL output comparisons. Additional testing is available using PyTest in the PyPgSTAC module. Tests can be run using the `scripts/test` command.
+
+PGTap tests can be written using [PGTap](https://pgtap.org/) syntax. Tests should be added to the `/src/pgstac/tests/pgtap` directory. Any new sql files added to this directory must be added to `/src/pgstac/tests/pgtap.sql`.
+
+The Basic SQL tests will run any file ending in '.sql' in the `/src/pgstac/tests/basic` directory and will compare the exact results to the corresponding '.sql.out' file.
+
+PyPgSTAC tests are located in `/src/pypgstac/tests`.
+
+All tests can be found in tests/pgtap.sql and are run using `scripts/test`
+
+Individual tests can be run with any combination of the following flags "--formatting --basicsql --pgtap --migrations --pypgstac". If pre-commit is installed, tests will be run on commit based on which files have changed.
+
+
+### To make a PR
+1) Make any changes.
+2) Make sure there are tests if appropriate.
+3) Update Changelog using "### Unreleased" as the version.
+4) Make any changes necessary to the docs.
+5) Ensure all tests pass (pre-commit will take care of this if installed and the tests will also run on CI)
+6) Create PR against the "main" branch.
+
 
 
 ### Release Process
-1) Make sure all your code is added and committed
-2) Create a PR against the main branch
-3) Once the PR has been merged, start the release process.
-4) Upate the version in `pypgstac/pypgstac/version.py`
-5) Use `scripts/stageversion VERSION` as documented in migrations section above making sure to rename any files ending in ".staged" in the migrations section
-6) Add details for release to the CHANGELOG
-7) Add/Commit any changes
-8) Run tests `scripts/test`
-9) Create a git tag `git tag v0.2.8` using new version number
-10) Push the git tag `git push origin v0.2.8`
-11) The CI process will push pypgstac to PyPi, create a docker image on ghcr.io, and create a release on github.
+1) Run "scripts/stageversion VERSION" (where version is the next version using semantic versioning ie 0.7.0
+2) Check the incremental migration created in the /src/pgstac/migrations file with the .staged extension to make sure that the generated SQL looks appropriate.
+3) Run the tests against the incremental migrations "scripts/test --migrations"
+4) Move any "Unreleased" changes in the CHANGELOG.md to the new version.
+5) Open a PR for the version change.
+6) Once the PR has been merged, start the release process.
+7) Create a git tag `git tag v0.2.8` using new version number
+8) Push the git tag `git push origin v0.2.8`
+9) The CI process will push pypgstac to PyPi, create a docker image on ghcr.io, and create a release on github.
 
 
 ### Get Involved
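
The "Basic SQL tests" rule added to CONTRIBUTING.md (run every `.sql` file in a directory and compare the exact output with the matching `.sql.out` file) can be sketched in Python as follows. This is an illustration under stated assumptions, not the logic of `scripts/test`: the function `check_basic_tests` is a hypothetical name, and `run_sql` is a caller-supplied stand-in for whatever actually executes the SQL and captures its output.

```python
from pathlib import Path


def check_basic_tests(test_dir, run_sql):
    """Return the names of .sql files whose output differs from .sql.out.

    `run_sql` is a hypothetical callable: it receives the SQL text and
    returns the captured output as a string. A missing .sql.out file
    counts as a failure.
    """
    failures = []
    for sql_file in sorted(Path(test_dir).glob("*.sql")):
        # "foo.sql" pairs with "foo.sql.out" in the same directory.
        expected_file = sql_file.with_name(sql_file.name + ".out")
        expected = expected_file.read_text() if expected_file.exists() else None
        actual = run_sql(sql_file.read_text())
        if actual != expected:  # exact comparison, no whitespace normalization
            failures.append(sql_file.name)
    return failures
```

Because the comparison is exact, any change in whitespace or row ordering in the output shows up as a failure, which is why the expected `.sql.out` files need to be regenerated whenever a test's output legitimately changes.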

Dockerfile

+30-44
@@ -1,55 +1,41 @@
-FROM postgres:13 as pg
-
-LABEL maintainer="David Bitner"
-
+FROM postgres:15-bullseye as pg
+ENV PGSTACDOCKER=1
 ENV POSTGIS_MAJOR 3
-ENV PGUSER postgres
-ENV PGDATABASE postgres
-ENV PGHOST localhost
-ENV \
-    PYTHONUNBUFFERED=1 \
-    PYTHONFAULTHANDLER=1 \
-    PYTHONDONTWRITEBYTECODE=1 \
-    PIP_NO_CACHE_DIR=off \
-    PIP_DISABLE_PIP_VERSION_CHECK=on \
-    PIP_DEFAULT_TIMEOUT=100
-
-RUN \
-    apt-get update \
+ENV POSTGIS_VERSION 3.3.2+dfsg-1.pgdg110+1
+ENV PYTHONPATH=/opt/src/pypgstac:/opt/python:${PYTHONPATH}
+ENV PATH=/opt/bin:${PATH}
+ENV PYTHONWRITEBYTECODE=1
+ENV PYTHONBUFFERED=1
+
+RUN set -ex \
+    && apt-get update \
     && apt-get install -y --no-install-recommends \
-        gnupg \
-        apt-transport-https \
-        debian-archive-keyring \
-        software-properties-common \
+        ca-certificates \
+        python3 python-is-python3 python3-pip \
+        postgresql-$PG_MAJOR-postgis-$POSTGIS_MAJOR=$POSTGIS_VERSION \
+        postgresql-$PG_MAJOR-postgis-$POSTGIS_MAJOR-scripts \
         postgresql-$PG_MAJOR-pgtap \
         postgresql-$PG_MAJOR-partman \
-        postgresql-$PG_MAJOR-postgis-$POSTGIS_MAJOR \
-        postgresql-$PG_MAJOR-postgis-$POSTGIS_MAJOR-scripts \
-        build-essential \
-        python3 \
-        python3-pip \
-        python3-setuptools \
-    && pip3 install -U pip setuptools packaging \
-    && pip3 install -U psycopg2-binary \
-    && pip3 install -U psycopg[binary] \
-    && pip3 install -U migra[pg] \
     && apt-get remove -y apt-transport-https \
-    && apt-get -y autoremove \
-    && rm -rf /var/lib/apt/lists/*
+    && apt-get clean && apt-get -y autoremove \
+    && rm -rf /var/lib/apt/lists/* \
+    && mkdir -p /opt/src/pypgstac/pypgstac \
+    && touch /opt/src/pypgstac/pypgstac/__init__.py \
+    && touch /opt/src/pypgstac/README.md \
+    && echo '__version__ = "0.0.0"' > /opt/src/pypgstac/pypgstac/version.py
 
-EXPOSE 5432
+COPY ./src/pypgstac/pyproject.toml /opt/src/pypgstac/pyproject.toml
 
-RUN mkdir -p /docker-entrypoint-initdb.d
-RUN echo "#!/bin/bash \n unset PGHOST \n pypgstac migrate" >/docker-entrypoint-initdb.d/initpgstac.sh && chmod +x /docker-entrypoint-initdb.d/initpgstac.sh
-
-RUN mkdir -p /opt/src/pypgstac
-
-WORKDIR /opt/src/pypgstac
-
-COPY pypgstac /opt/src/pypgstac
+RUN \
+    pip3 install --upgrade pip \
+    && pip3 install /opt/src/pypgstac[dev,test,psycopg]
 
-RUN pip3 install -e /opt/src/pypgstac[psycopg]
+COPY ./src /opt/src
+COPY ./scripts/bin /opt/bin
 
-ENV PYTHONPATH=/opt/src/pypgstac:${PYTHONPATH}
+RUN \
+    echo "initpgstac" > /docker-entrypoint-initdb.d/999_initpgstac.sh \
+    && chmod +x /docker-entrypoint-initdb.d/999_initpgstac.sh \
+    && chmod +x /opt/bin/*
 
 WORKDIR /opt/src

Dockerfile.dev

-25
This file was deleted.

README.md

+11-5
@@ -24,7 +24,11 @@
 
 ---
 
-**PgSTAC** is a set of SQL function and schema to build highly performant database for Spatio-Temporal Asset Catalog (STAC). The project also provide **pypgstac** python module to help with the database migration and documents ingestion (collections and items).
+**PgSTAC** is a set of SQL function and schema to build highly performant database for Spatio-Temporal Asset Catalog ([STAC](https://stacspec.org)). The project also provide **pypgstac** python module to help with the database migration and documents ingestion (collections and items).
+
+PgSTAC provides functionality for STAC Filters and CQL2 search along with utilities to help manage indexing and partitioning of STAC Collections and Items.
+
+PgSTAC is used in production to scale to hundreds of millions of STAC items. PgSTAC implements core data models and functions to provide a STAC API from a PostgreSQL database. As PgSTAC is fully within the database, it does not provide an HTTP facing API. The (Stac FastAPI)[https://github.com/stac-utils/stac-fastapi] PgSTAC backend and (Franklin)[https://github.com/azavea/franklin] can be used to expose a PgSTAC catalog. It is also possible to integrate PgSTAC with any other language that has PostgreSQL drivers.
 
 PgSTAC Documentation: https://stac-utils.github.io/pgstac/pgstac
 
@@ -36,10 +40,12 @@ pyPgSTAC Documentation: https://stac-utils.github.io/pgstac/pypgstac
 
 ```
 /
-├── pypgstac/ - pyPgSTAC python module
-├── scripts/ - scripts to set up the environment
-├── sql/ - PgSTAC SQL code
-└── test/ - test suite
+├── src/pypgstac - pyPgSTAC python module
+├── src/pypgstac/tests/ - pyPgSTAC tests
+├── scripts/ - scripts to set up the environment, create migrations, and run tests
+├── src/pgstac/sql/ - PgSTAC SQL code
+├── src/pgstac/migrations/ - Migrations for incremental upgrades
+└── src/pgstac/tests/ - test suite
 ```
 
 ## Contribution & Development
