Tooling to enable generic keyset pagination across sources by expressing the keyset in CQL2. #720

Draft
wants to merge 2 commits into
base: main

Collaborator

@bitner bitner commented Apr 29, 2025

PROOF OF CONCEPT, NOT FOR MERGING

This exposes a next_page_cql2(Item) function on the Items struct that uses Items.sortby to generate a CQL2 Expression. When that Expression is appended to the existing CQL2 Expression, the query starts fetching data beginning with the passed-in Item.
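
To make the idea concrete, here is a minimal sketch of how a keyset predicate could be derived from the sort keys. It is not the code in this PR: `Key`, `Direction`, and `keyset_filter` are hypothetical names, the output is plain CQL2 text rather than a cql2-rs Expression, and it assumes the sort keys together identify a single Item (for example, `id` as the final tiebreaker) and that none of the values are null.

```rust
/// Sort direction for one sortby field.
#[derive(Clone, Copy)]
enum Direction {
    Asc,
    Desc,
}

/// One sort key plus the boundary Item's value for it.
/// `literal` is assumed to be already rendered as a CQL2 text literal.
struct Key<'a> {
    field: &'a str,
    direction: Direction,
    literal: &'a str,
}

/// Builds a CQL2 text predicate that selects rows at or after the boundary Item.
/// Returns None when there are no sort keys to page on.
fn keyset_filter(keys: &[Key<'_>]) -> Option<String> {
    if keys.is_empty() {
        return None;
    }
    let last = keys.len() - 1;
    let mut branches = Vec::new();
    for (i, key) in keys.iter().enumerate() {
        // All earlier keys must tie with the boundary Item.
        let mut terms: Vec<String> = keys[..i]
            .iter()
            .map(|k| format!("{} = {}", k.field, k.literal))
            .collect();
        // Only the last key is inclusive, so the boundary Item itself is returned.
        let op = match (key.direction, i == last) {
            (Direction::Asc, false) => ">",
            (Direction::Asc, true) => ">=",
            (Direction::Desc, false) => "<",
            (Direction::Desc, true) => "<=",
        };
        terms.push(format!("{} {} {}", key.field, op, key.literal));
        branches.push(format!("({})", terms.join(" AND ")));
    }
    Some(branches.join(" OR "))
}
```

The result would be wrapped in parentheses and AND-ed onto whatever filter the request already carries.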

Anticipated workflow would be that the Client would try to fetch one more row than the requested limit. If this extra row exists, it will create a next link that has the additional expression applied. If the Client began with a next link being passed to it, that would then be returned as the prev link.
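
The "fetch one extra row" step could look roughly like the sketch below. `Page` and `split_page` are hypothetical names used only for illustration; the caller is assumed to have queried the backend with `limit + 1`.

```rust
/// One page of results plus the row that starts the next page, if any.
struct Page<T> {
    items: Vec<T>,
    next_start: Option<T>,
}

/// Splits rows fetched with `limit + 1` into the page to return and the
/// boundary row. A boundary row means a next link should be generated.
fn split_page<T>(mut rows: Vec<T>, limit: usize) -> Page<T> {
    let next_start = if rows.len() > limit {
        // Drop anything past the extra row, then peel the extra row off.
        rows.truncate(limit + 1);
        rows.pop()
    } else {
        None
    };
    Page { items: rows, next_start }
}
```

When `next_start` is present, it is the Item that would be passed to next_page_cql2 to build the next link.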

I think if we architect this right in the Client that this could allow keyset based pagination for any STAC API regardless of backend.

As we look at using rustac as the primary entry point for creating an API for pgstac / stac-geoparquet, this lets us add in this functionality in a reusable way.

Additionally, as we chatted about before, I think we should move all the logic for parsing the other parameters (items, collections, datetime, ...) so that those just get mapped into CQL2 expressions. Then we can centralize the logic for either "solving" or converting into SQL expressions in cql2-rs. Basically, I'm proposing that anything that filters data should get converted to CQL2.
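
As a rough illustration of that mapping, the sketch below folds a few search parameters into a single CQL2 text filter. The `Search` struct, its field names, and the helper functions are hypothetical and do not reflect the rustac or cql2-rs APIs. The datetime handling is simplified to a closed interval on the `datetime` property, and string escaping by doubling single quotes is an assumption about the CQL2 text encoding.

```rust
/// Hypothetical, simplified search parameters.
#[derive(Default)]
struct Search {
    ids: Vec<String>,
    collections: Vec<String>,
    /// Closed interval as (start, end) RFC 3339 strings.
    datetime: Option<(String, String)>,
    /// A CQL2 text filter the client already supplied.
    filter: Option<String>,
}

/// Renders a string as a single-quoted literal, doubling embedded quotes.
fn quote(s: &str) -> String {
    format!("'{}'", s.replace('\'', "''"))
}

/// Renders `property IN (...)`, or None when there is nothing to match.
fn in_list(property: &str, values: &[String]) -> Option<String> {
    if values.is_empty() {
        return None;
    }
    let literals: Vec<String> = values.iter().map(|v| quote(v)).collect();
    Some(format!("{} IN ({})", property, literals.join(", ")))
}

/// Combines every filtering parameter into one AND-ed CQL2 text expression.
fn to_cql2(search: &Search) -> Option<String> {
    let mut parts = Vec::new();
    if let Some(p) = in_list("id", &search.ids) {
        parts.push(p);
    }
    if let Some(p) = in_list("collection", &search.collections) {
        parts.push(p);
    }
    if let Some((start, end)) = &search.datetime {
        parts.push(format!(
            "datetime >= TIMESTAMP({}) AND datetime <= TIMESTAMP({})",
            quote(start),
            quote(end)
        ));
    }
    if let Some(f) = &search.filter {
        parts.push(f.clone());
    }
    if parts.is_empty() {
        return None;
    }
    // Parenthesize each part so operator precedence cannot leak between them.
    let wrapped: Vec<String> = parts.iter().map(|p| format!("({})", p)).collect();
    Some(wrapped.join(" AND "))
}
```

With everything expressed this way, a backend only needs to understand one filter expression, and the pagination predicate becomes just one more AND-ed term.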

Member

@gadomski gadomski left a comment

Core concept seems fine on first glance. One bit of weirdness with the "fetch one extra" model is that I fetch to arrow record batches in stac-duckdb, so peeling off the last item feels a little weird? ... I guess it's fine if we're returning JSON, but if we add geoparquet responses that breaks down a bit?

/// A trait for performing JSON-based operations.
///
/// This trait defines methods for checking JSON operations, combining JSON values,
/// and creating JSON filters for various conditions.
pub trait JsonOps {

Since this is JSON+cql2, should it live in cql2-rs?

Collaborator Author

bitner commented Apr 29, 2025 via email

2 participants