Batch-backfill 104 unfilled Seeker stub research entries #20253

@rjwalters

Description

Problem

104 entries in `src/data/research/problems/*.json` are stubs — either `title === "[Problem Title]"` or `problemStatement.formal` contains the literal `[LaTeX formulation of the theorem/conjecture]` placeholder. PR #20251 hides these from `research-listings.json` so the public gallery stops surfacing them, but the data is still on disk and would render placeholder content if anyone deep-links to the slug.
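The two stub conditions above can be written as a small predicate. This is a minimal sketch and not the repository's actual `isUnfilledStub`: the `ProblemEntry` shape is an assumption that models only the fields named in this issue, and the real check may cover more cases.

```typescript
// Hypothetical entry shape: only the fields mentioned in this issue.
interface ProblemEntry {
  title?: string;
  problemStatement?: { formal?: string; plain?: string };
}

// Placeholder strings quoted in the problem description.
const TITLE_PLACEHOLDER = "[Problem Title]";
const FORMAL_PLACEHOLDER = "[LaTeX formulation of the theorem/conjecture]";

// Assumed re-implementation of the heuristic: an entry is a stub when the
// title is exactly the placeholder, or the formal statement contains the
// LaTeX placeholder anywhere in its text.
function isUnfilledStub(entry: ProblemEntry): boolean {
  const formal = entry.problemStatement?.formal ?? "";
  return entry.title === TITLE_PLACEHOLDER || formal.includes(FORMAL_PLACEHOLDER);
}
```

Note that an entry with a missing `title` is not flagged by this sketch; whether the real heuristic treats that as a stub is not stated here.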

Inventory (from PR #20251 build summary)

`Listings: 1626 (skipped 104 unfilled Seeker stubs)`

Sample slugs filtered today: erdos-1023-oq-01, bezout-identity-oq-02-oq-02, bounded-prime-gaps-oq-01, brouwer-fixed-point-oq-01, buffons-needle-oq-01-oq-01, cayley-hamilton-minpoly-oq-01, cevas-theorem-oq-02-oq-01, derangements-oq-02, greens-theorem-oq-01, buffons-needle-oq-02. Full set printed by running `npx tsx scripts/research/build.ts`.

Three tiers of stubs based on what's available to backfill from:

  1. Has substantial knowledge.md (>1KB) and selection-report.md — e.g. `brouwer-fixed-point-oq-04-oq-04`, `buffons-needle-oq-01-oq-01`, `divisibility-rules-oq-02`, `lovasz-local-lemma-oq-02`. Backfill from those sources, mirroring the manual approach in PR #19979 (fix(research): backfill law-of-cosines-oq-06 stub + remove misnamed duplicate).
  2. Has selection-report.md only — synthesize `problem.md` from the selection report's rationale and target.
  3. Has only the minimal-template knowledge.md (~340 bytes) — Seeker created the entry but no Researcher ever worked on it. These are honestly empty; the right move may be to delete the directory and `src/data/research/problems/.json` rather than fabricate content.
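The three tiers can be expressed as a pure function over file sizes. This is a sketch under assumptions: `StubSources`, `classifyStub`, and the `"unclassified"` result are hypothetical names, and the 1024-byte cutoff is one reading of ">1KB". A stub with a substantial knowledge.md but no selection report is not covered by the tier list, so the sketch flags it for manual review instead of routing it to deletion.

```typescript
type Tier = 1 | 2 | 3 | "unclassified";

// Hypothetical input: byte sizes of the two source files, null when missing.
interface StubSources {
  knowledgeBytes: number | null;       // size of knowledge.md
  selectionReportBytes: number | null; // size of selection-report.md
}

// ">1KB" from the tier list, read here as more than 1024 bytes (assumption).
const SUBSTANTIAL_KNOWLEDGE_BYTES = 1024;

function classifyStub(s: StubSources): Tier {
  const hasReport = (s.selectionReportBytes ?? 0) > 0;
  const hasSubstantialKnowledge = (s.knowledgeBytes ?? 0) > SUBSTANTIAL_KNOWLEDGE_BYTES;

  if (hasSubstantialKnowledge && hasReport) return 1; // backfill from both sources
  if (hasReport) return 2;                            // synthesize from the report
  if (!hasSubstantialKnowledge) return 3;             // template-only: deletion candidate
  return "unclassified";                              // knowledge but no report: review by hand
}
```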

Acceptance criteria

  • All entries currently listed by the `Listings: ... (skipped N unfilled Seeker stubs)` summary either:
    • have a proper `title`, `problemStatement.formal`, `problemStatement.plain`, and reappear in listings; OR
    • are removed entirely (research dir + site JSON + registry entry).
  • The final build summary reports 0 skipped stubs, i.e. `isUnfilledStub()` matches no remaining entry.
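One way to check the criteria above before running the full build is to list every entry that is still missing a required field. This is a sketch only: `ProblemEntry`, `isFilled`, and `remainingStubs` are hypothetical names, and the check for `problemStatement.plain` is limited to "non-empty" because this issue does not name a placeholder for it. Removed entries are simply absent from the input map.

```typescript
// Hypothetical entry shape: only the fields named in the acceptance criteria.
interface ProblemEntry {
  title?: string;
  problemStatement?: { formal?: string; plain?: string };
}

// Placeholder strings quoted in this issue.
const PLACEHOLDERS = [
  "[Problem Title]",
  "[LaTeX formulation of the theorem/conjecture]",
];

// A field counts as filled when it is a non-blank string without a placeholder.
function isFilled(value: string | undefined): boolean {
  if (value === undefined) return false;
  const text = value.trim();
  return text !== "" && !PLACEHOLDERS.some((p) => text.includes(p));
}

// Returns the slugs that would still fail the acceptance criteria.
function remainingStubs(entries: Record<string, ProblemEntry>): string[] {
  return Object.entries(entries)
    .filter(([, e]) =>
      !(isFilled(e.title) &&
        isFilled(e.problemStatement?.formal) &&
        isFilled(e.problemStatement?.plain)))
    .map(([slug]) => slug);
}
```

An empty result from `remainingStubs` corresponds to the "0 skipped stubs" condition, under the assumptions stated above.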

Suggested approach

  • Categorize all 104 via a one-shot script (use the `isUnfilledStub` heuristic and check the sizes of `knowledge.md` / `selection-report.md`).
  • Tier 1 → Researcher/Curator agent batch task, one PR per ~10 entries.
  • Tier 2 → smaller manual or scripted backfill.
  • Tier 3 → delete in a single PR.
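The approach above could be turned into a PR plan once every stub has a tier. The sketch below is illustrative and assumes the categorization step has already produced a slug-to-tier map; `PlannedPr`, `chunk`, and `planPrs` are hypothetical names, and the batch size of 10 is taken from "one PR per ~10 entries".

```typescript
type Tier = 1 | 2 | 3;

interface PlannedPr {
  tier: Tier;
  action: "backfill" | "delete";
  slugs: string[];
}

// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Tier 1: batched backfill PRs. Tier 2: one backfill PR. Tier 3: one deletion PR.
function planPrs(tiers: ReadonlyMap<string, Tier>, batchSize = 10): PlannedPr[] {
  const byTier: Record<Tier, string[]> = { 1: [], 2: [], 3: [] };
  tiers.forEach((tier, slug) => byTier[tier].push(slug));

  const plan: PlannedPr[] = chunk(byTier[1], batchSize).map(
    (slugs): PlannedPr => ({ tier: 1, action: "backfill", slugs }),
  );
  if (byTier[2].length > 0) plan.push({ tier: 2, action: "backfill", slugs: byTier[2] });
  if (byTier[3].length > 0) plan.push({ tier: 3, action: "delete", slugs: byTier[3] });
  return plan;
}
```

Keeping the plan as data makes it easy to review the Tier 3 deletion list by hand before anything is removed.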

Labels

curator-please-prioritize, area:research-data

    Labels

    loom:curated (Enhanced by Curator, awaiting human approval)
