feat(core): optional distributed object cache for query results#1378
scottbuscemi wants to merge 6 commits into
Add an opt-in read-through cache that sits beneath the per-request cache and above the database, so content and chrome (settings, menus, taxonomies) reads can be served from a fast key/value store instead of hitting D1/SQLite on every request.

- New ObjectCache abstraction (interface + descriptor + virtual module + per-isolate backend), mirroring the storage adapter pattern. Off by default: when unconfigured, cachedQuery is a transparent passthrough.
- Backends: in-isolate memory (emdash/object-cache/memory via memoryCache() from emdash/astro) and Cloudflare KV (@emdash-cms/cloudflare/cache/kv via kvCache()).
- JSON codec preserves Date instances; content entries are snapshotted (dropping the .edit proxy, capturing the CURSOR_RAW_VALUES symbol) and rebuilt on read.
- Epoch-based invalidation at the repository chokepoint (content, seo, byline, taxonomy, menu) and settings; content reads fold in shared bylines/taxonomies epochs so author/term renames invalidate correctly.
- Auth/preview/edit-mode and isolated DBs always bypass.

Existing sites are unaffected until they opt in.
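To illustrate the "JSON codec preserves Date instances" point, here is a minimal sketch of one way such a codec could work. The `$date` tag and the `encode`/`decode` names are assumptions made for this example; the PR's actual codec may differ.

```typescript
// Hypothetical names throughout; this is not necessarily how emdash's codec is written.
const DATE_TAG = "$date";

// Serialize to JSON, tagging Date instances so they survive the round-trip.
function encode(value: unknown): string {
  return JSON.stringify(value, function (this: Record<string, unknown>, key: string, v: unknown) {
    // JSON.stringify calls Date.prototype.toJSON before the replacer runs,
    // so read the raw value from the holder object (`this`) rather than `v`.
    const raw = this[key];
    return raw instanceof Date ? { [DATE_TAG]: raw.toISOString() } : v;
  });
}

// Parse JSON and rebuild any tagged value into a Date instance.
function decode<T>(json: string): T {
  return JSON.parse(json, (_key, v) => {
    const isTagged =
      v !== null &&
      typeof v === "object" &&
      typeof v[DATE_TAG] === "string" &&
      Object.keys(v).length === 1;
    return isTagged ? new Date(v[DATE_TAG]) : v;
  }) as T;
}
```

A plain `JSON.parse(JSON.stringify(entry))` would return date fields as ISO strings, which is why a cached entry needs some form of tagging to come back with real `Date` objects.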
A KV read that stalls without resolving or rejecting (cold cross-region read, or one queued behind the Workers connection limit) could hang the isolate: getEpoch cached the never-settling promise and every later cached read on that namespace reused it, poisoning the isolate until it recycled.

- Race every backend read against a timeout (default 2000ms, configurable via the `timeout` option on kvCache/objectCache, 0 disables). A timed-out read degrades to a cache miss; the database stays the source of truth.
- Apply the timeout in the KV backend (get/set/delete) and in the core read path (getEpoch + cachedQuery value read), so any backend that stalls self-heals once the bounded read settles.
- Also switch the cache debug-log gate from process.env to import.meta.env.DEV (repo convention).

Adds regression tests: a never-settling backend resolves via load() instead of hanging, the namespace self-heals afterward, and the KV backend rejects a stalled get/set.
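The timeout race described above can be sketched roughly as follows. `boundedRead`, `readThrough`, and the `MISS` sentinel are hypothetical names for illustration, and this simplified version resolves a timed-out read to a miss everywhere, whereas the commit says the KV backend itself rejects a stalled call.

```typescript
// Hypothetical helper names; a sketch of the technique, not emdash's implementation.
const MISS = Symbol("miss");

// Resolve with the read's result, or with MISS if it has not settled within `ms`.
// A timeout of 0 disables the bound, mirroring the option described above.
function boundedRead<T>(read: Promise<T>, ms = 2000): Promise<T | typeof MISS> {
  if (ms === 0) return read;
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<typeof MISS>((resolve) => {
    timer = setTimeout(() => resolve(MISS), ms);
  });
  // Whichever settles first wins; clear the timer so it does not linger after a fast read.
  return Promise.race([read, timeout]).finally(() => clearTimeout(timer));
}

// Read-through: a stalled, failed, or empty cache read falls back to the loader (the database).
async function readThrough<T>(
  cacheGet: () => Promise<T | undefined>,
  load: () => Promise<T>,
  ms = 2000,
): Promise<T> {
  const cached = await boundedRead(cacheGet(), ms).catch((): typeof MISS => MISS);
  if (cached !== MISS && cached !== undefined) return cached as T;
  return load();
}
```

Because the bounded promise always settles, anything that memoizes it (such as an in-memory epoch cache) holds a value that resolves rather than a promise that hangs forever.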
Public renders that read an entry's terms still hit D1 on every request even with the object cache on: getEmDashEntry caches the entry (and bakes in byline/term hydration), but templates that call getEntryTerms / getTermsForEntries / getTerm directly fell through to D1, because only the taxonomy *definitions* (getTaxonomyDefs) and full term *lists* (getTaxonomyTerms) were wrapped. On a warm content-cache hit, hydration (which used to prime the request cache for getEntryTerms) doesn't run, so those direct calls query the database: a cache-busted load that should be served entirely from KV still pays D1 round-trips.

Wrap the per-entry/term taxonomy reads in cachedQuery:

- getEntryTerms, getTermsForEntries: namespaced under [content:<collection>, taxonomies]; assignments bump taxonomies and content writes bump content:<collection>, so they invalidate correctly. getEntryTerms keeps its requestCached wrapper so hydration priming still short-circuits within a request.
- getTerm: namespaced under taxonomies (count is TTL-bounded).
- getTermsForEntries returns a Map (not JSON-serializable): cache it as an array of [entryId, terms] pairs and rebuild the Map on read. Large id batches (which come from collection hydration, already served by the content cache) bypass the object cache to stay under KV's key-size limit.

getEntriesByTerm already delegates to the cached getEmDashCollection, and getAllTermsForEntries only runs behind a content-cache miss, so neither needs separate wrapping.

Test: with a configured backend, getEntryTerms and getTermsForEntries serve the second read from KV with D1 made unavailable, and the Map round-trips correctly.
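The Map handling mentioned above can be shown with a small sketch. The `Term` shape and the helper names are assumptions for this example only.

```typescript
// Hypothetical term shape and helper names, for illustration only.
type Term = { id: string; label: string };

// A Map serializes to "{}" with JSON.stringify, so store it as [key, value] pairs.
function mapToPairs<K, V>(map: Map<K, V>): [K, V][] {
  return Array.from(map.entries());
}

// Rebuild the Map from the stored pairs on read.
function pairsToMap<K, V>(pairs: [K, V][]): Map<K, V> {
  return new Map(pairs);
}

const termsByEntry = new Map<string, Term[]>([
  ["post-1", [{ id: "t1", label: "News" }]],
  ["post-2", []],
]);

// What would be written to the cache, and what a later read would rebuild.
const storedPairs = JSON.stringify(mapToPairs(termsByEntry));
const rebuiltTerms = pairsToMap(JSON.parse(storedPairs) as [string, Term[]][]);
```

Storing pairs rather than a plain object keeps insertion order and keeps entries with empty term lists distinguishable from entries that were never requested.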
…-trip
The read path did two sequential KV round-trips per cached query: read the
namespace epoch(s) to build the key, then read the value. On a cold isolate
(epochs not yet cached in-memory) a page making several cached reads paid
that doubled latency on each one.
Make the value key epoch-independent and store the namespace epochs inside
the value envelope ({ e: epochs, v: value }). A read now fetches the value
and all epochs concurrently (Promise.all) and treats it as a HIT only when
every stored epoch still matches the current one — one round-trip instead of
two. Invalidation is unchanged from the caller's view (bump the epoch; the
next read sees a mismatch and reloads), but a stale value is now overwritten
in place under its stable key rather than orphaned under a dead epoch-keyed
name — so KV no longer accumulates orphaned generations between TTL sweeps.
Note this parallelizes the epoch/value reads *within* each cached query;
ordering across a template's awaits is still the template's concern (use
Promise.all for independent reads).
Existing object-cache, content, taxonomy, and edge-cache tests pass
unchanged (behavior is identical: hit after first load, reload after
invalidation, multi-namespace busting, timeout-to-miss).
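
A minimal sketch of the envelope read path described in this commit, under stated assumptions: the in-memory `store`, the `epoch:`/`value:` key prefixes, and the `bumpEpoch` helper are hypothetical stand-ins, and the sketch omits the timeout, TTL, and in-isolate epoch caching that the real code has.

```typescript
// Hypothetical in-memory stand-in for a key/value backend; all names are illustrative.
type Envelope<T> = { e: Record<string, number>; v: T };

const store = new Map<string, string>();
const kvGet = async (key: string): Promise<string | undefined> => store.get(key);
const kvSet = async (key: string, value: string): Promise<void> => {
  store.set(key, value);
};

async function getEpoch(ns: string): Promise<number> {
  return Number((await kvGet(`epoch:${ns}`)) ?? 0);
}

// Invalidation: a write bumps the namespace epoch; no value keys are enumerated.
async function bumpEpoch(ns: string): Promise<void> {
  await kvSet(`epoch:${ns}`, String((await getEpoch(ns)) + 1));
}

async function cachedQuery<T>(key: string, namespaces: string[], load: () => Promise<T>): Promise<T> {
  // The value and every namespace epoch are requested concurrently.
  const [raw, pairs] = await Promise.all([
    kvGet(`value:${key}`),
    Promise.all(namespaces.map(async (ns) => [ns, await getEpoch(ns)] as const)),
  ]);
  const current: Record<string, number> = {};
  pairs.forEach(([ns, epoch]) => {
    current[ns] = epoch;
  });

  if (raw !== undefined) {
    const envelope = JSON.parse(raw) as Envelope<T>;
    // HIT only when every stored epoch still matches the current one.
    if (namespaces.every((ns) => envelope.e[ns] === current[ns])) return envelope.v;
  }

  // Miss or stale: reload and overwrite in place under the same stable key.
  const value = await load();
  const fresh: Envelope<T> = { e: current, v: value };
  await kvSet(`value:${key}`, JSON.stringify(fresh));
  return value;
}
```

Because the value key never includes an epoch, a reload after invalidation replaces the old envelope instead of leaving it behind under a dead key.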
…ache
These were the last per-request D1 reads on a public post render. The <Comments> component server-renders two reads on every page, even with content/taxonomy reads already served from KV:

- getCollectionInfo (the commentsEnabled / supports / fields lookup), and
- getComments (approved comments), when comments are enabled.

Wrap both in cachedQuery:

- getCollectionInfo → `schema` namespace, busted by invalidateUrlPatternCache (every schema-mutation path already routes through it, so editing a collection's settings/fields invalidates it).
- getComments → `comments` namespace, busted by any CommentRepository write (create / status change / delete), so a new or moderated comment shows without waiting for TTL.

With this, a warm-isolate logged-out post render makes no D1 query: the whole render is served from KV.

Tests: getCollectionInfo and getComments serve the second read with D1 unavailable, and reload after a schema change / comment write respectively.
🦋 Changeset detected
Latest commit: 18164d6
The changes in this PR will be included in the next version bump.
This PR includes changesets to release 14 packages
PR template validation failed
Please fix the following issues by editing your PR description:
See CONTRIBUTING.md for the full contribution policy.
Scope check
This PR changes 2,636 lines across 39 files. Large PRs are harder to review and more likely to be closed without review. If this scope is intentional, no action needed. A maintainer will review it. If not, please consider splitting this into smaller PRs. See CONTRIBUTING.md for contribution guidelines.
Deploying with

| Status | Name | Latest Commit | Updated (UTC) |
|---|---|---|---|
| ✅ Deployment successful! | docs | 18164d6 | Jun 08 2026, 05:10 AM |
Deploying with

| Status | Name | Latest Commit | Updated (UTC) |
|---|---|---|---|
| ❌ Deployment failed | emdash-playground | 18164d6 | Jun 08 2026, 05:14 AM |
@emdash-cms/admin
@emdash-cms/auth
@emdash-cms/auth-atproto
@emdash-cms/blocks
@emdash-cms/cloudflare
@emdash-cms/contentful-to-portable-text
emdash
create-emdash
@emdash-cms/gutenberg-to-portable-text
@emdash-cms/plugin-cli
@emdash-cms/plugin-types
@emdash-cms/registry-client
@emdash-cms/registry-lexicons
@emdash-cms/sandbox-workerd
@emdash-cms/x402
@emdash-cms/plugin-ai-moderation
@emdash-cms/plugin-atproto
@emdash-cms/plugin-audit-log
@emdash-cms/plugin-color
@emdash-cms/plugin-embeds
@emdash-cms/plugin-field-kit
@emdash-cms/plugin-forms
@emdash-cms/plugin-webhook-notifier
commit:
Deploying with

| Status | Name | Latest Commit | Updated (UTC) |
|---|---|---|---|
| ✅ Deployment successful! | emdash-demo-cache | 18164d6 | Jun 08 2026, 05:12 AM |
Overlapping PRs
This PR modifies files that are also changed by other open PRs:
This may cause merge conflicts or duplicated work. A maintainer will coordinate.
What does this PR do?
Adds an optional distributed object cache for query results. Content reads (`getEmDashCollection`, `getEmDashEntry`, `resolveEmDashPath`) and chrome reads (site settings, menus, taxonomies, per-entry terms, collection info, public comments) can be served from a fast key/value store instead of hitting the database on every request. It sits beneath the per-request cache and above the database, reducing read pressure on D1/SQLite, especially on Cloudflare, where KV absorbs far more requests than D1.

The cache is off by default and fully opt-in. Configure a backend in `astro.config.mjs`.

Invalidation is epoch-based and automatic: content, byline, taxonomy, menu, and settings writes bump a per-namespace version, which immediately invalidates stale entries (no key enumeration). Value and epoch are fetched in one parallel round-trip via a stable-key envelope, and a stale value is overwritten in place, so no orphaned keys accumulate. Authenticated, preview, and visual-edit requests always bypass the cache, so editors see live content immediately; anonymous visitors may see content up to `revalidate` ms stale (default 1s, configurable). Backend reads are bounded by a timeout so a slow or unavailable cache cannot hang a render.

Existing sites are unaffected until they opt in.
Closes #
Type of change
Checklist
- `pnpm typecheck` passes
- `pnpm lint` passes
- `pnpm test` passes (or targeted tests for my change): object-cache unit/content/comments/schema/entry-terms + cloudflare kv-timeout suites
- `pnpm format` has been run
AI-generated code disclosure
Screenshots / test output