feat: general-review niche, producing ordinary code-level issues (generalFindings) #13

Merged
NeverENG merged 2 commits into main from feat/general-niche-coverage on May 31, 2026

@NeverENG

Owner

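Background

Architecture-level findings are BanGD's differentiating value, but emitting only those leaves a gap: if BanGD does not report the ordinary bugs that any competent general-purpose reviewer would flag (inverted branches, off-by-one errors, null dereferences, swallowed errors, resource leaks, obvious races), it cedes that niche to Copilot.

This PR makes a single review also emit a second class of result, generalFindings (ordinary code-level issues), so that BanGD occupies the general-purpose niche while keeping its specialized architectural depth.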

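Design (deliberately distinct from the architecture level; see DESIGN.md §7 for details)

  • Lighter structure: file/line/severity/category/title/description/suggestion. It skips the four-part format; pointing out the location and an ordinary fix is enough.
  • Quality red lines (written into the system prompt): report only concrete correctness/logic problems that have diff evidence and can be located; do not report style, naming, formatting, "consider adding tests", or similar nits; do not duplicate architecture-level findings; at most about 6 items, and [] when there are none. The goal is to keep BanGD from degrading into a noisy general-purpose linter.
  • Same adversarial review: verifyGeneralFindings reuses the same multi-refuter, strict-majority-veto verification (the refuter prompt is reframed as "is this a real correctness defect").
  • Different delivery: architecture issues are tracked as Issues; ordinary issues are only listed inline in the PR summary comment (they are immediate bug hints that go stale with the diff, so no Issue is needed).
  • Degradable: generalFindings is .default([]) in the schema, so a field omitted by the model does not make the whole review fail to parse.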
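
The lighter shape and the degradation rule above can be sketched without any schema library. This is only an illustration: the seven field names come from the PR, but the severity values, the ReviewResult-style input object, and the helper names isGeneralFinding and parseGeneralFindings are assumptions, and the real schema.ts presumably expresses the same thing through its schema's .default([]).

```typescript
// Hypothetical, dependency-free sketch; the real schema.ts is not shown in this PR.
type Severity = "low" | "medium" | "high"; // assumed values, not confirmed by the PR

interface GeneralFinding {
  file: string;
  line: number;
  severity: Severity;
  category: string;
  title: string;
  description: string;
  suggestion: string;
}

const SEVERITIES: readonly string[] = ["low", "medium", "high"];

// Type guard for a single finding: all seven fields must be present and well-typed.
function isGeneralFinding(value: unknown): value is GeneralFinding {
  if (typeof value !== "object" || value === null) return false;
  const { file, line, severity, category, title, description, suggestion } =
    value as Record<string, unknown>;
  return (
    typeof file === "string" &&
    typeof line === "number" && Number.isInteger(line) && line > 0 &&
    typeof severity === "string" && SEVERITIES.indexOf(severity) !== -1 &&
    typeof category === "string" &&
    typeof title === "string" &&
    typeof description === "string" &&
    typeof suggestion === "string"
  );
}

// A missing generalFindings field degrades to []; a present but malformed one still fails.
function parseGeneralFindings(raw: unknown): GeneralFinding[] {
  const obj = (typeof raw === "object" && raw !== null ? raw : {}) as Record<string, unknown>;
  const field = obj["generalFindings"];
  if (field === undefined) return [];
  if (!Array.isArray(field) || !field.every(isGeneralFinding)) {
    throw new Error("generalFindings is present but malformed");
  }
  return field;
}
```

Only the absent-field case is softened here; a malformed list is still rejected, which matches the stated goal of not letting an omitted field sink the whole review.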

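Changes

  • schema.ts: GeneralFindingSchema + generalFindings + JSON schema kept in sync (guarded by a test)
  • prompt.ts / system-prompt.md: emit a 4th output part + quality red lines
  • verify.ts: generalize to VerifyOutcome<T> / verifyItems<T>; add verifyGeneralFindings
  • review.ts: verify both finding kinds in parallel; ReviewOutcome.droppedGeneralFindings
  • format.ts: render ordinary issues inline; omit the whole section when there are none
  • action.ts / action.yml: general_finding_count output; the dropped count combines both kinds
  • DESIGN.md: §1 table + new §7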
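
As a rough sketch of the verify.ts and review.ts items above: a generic verifier that asks several refuters about each item and drops it only on a strict majority, plus a helper that runs both finding kinds in parallel. Only the names VerifyOutcome<T> and verifyItems<T> come from the PR; the kept/dropped fields, the Refuter type, the verifyBoth helper, and the reading of "strict majority" as "more than half" are assumptions.

```typescript
// Hypothetical shapes; the real verify.ts may differ.
interface VerifyOutcome<T> {
  kept: T[];
  dropped: T[];
}

// A refuter resolves to true when it judges the finding NOT to be a real defect.
type Refuter<T> = (item: T) => Promise<boolean>;

async function verifyItems<T>(items: T[], refuters: Refuter<T>[]): Promise<VerifyOutcome<T>> {
  const outcome: VerifyOutcome<T> = { kept: [], dropped: [] };
  for (const item of items) {
    // Ask all refuters about this item concurrently.
    const votes = await Promise.all(refuters.map((refute) => refute(item)));
    const refutations = votes.filter((vote) => vote).length;
    // Strict majority: a tie (or no refuters at all) keeps the finding.
    if (refutations * 2 > refuters.length) {
      outcome.dropped.push(item);
    } else {
      outcome.kept.push(item);
    }
  }
  return outcome;
}

// Verify architecture and general findings in parallel with the same machinery.
async function verifyBoth<A, G>(
  arch: A[],
  general: G[],
  archRefuters: Refuter<A>[],
  generalRefuters: Refuter<G>[],
): Promise<{ arch: VerifyOutcome<A>; general: VerifyOutcome<G> }> {
  const [archOutcome, generalOutcome] = await Promise.all([
    verifyItems(arch, archRefuters),
    verifyItems(general, generalRefuters),
  ]);
  return { arch: archOutcome, general: generalOutcome };
}
```

Keeping the verifier generic means the only thing that differs between the two kinds is the refuter prompt, which is how the PR describes verifyGeneralFindings.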

Verification

typecheck / lint pass; npm test: 82 passed; Action bundle rebuilt (dist/).

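Known follow-ups (out of scope for this PR)

  • generalFindings is currently zero-shot: there is no few-shot example yet (every architecture dimension has one). This is the highest-leverage follow-up.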

🤖 Generated with Claude Code

NeverENG and others added 2 commits May 31, 2026 10:55
…eralFindings)

Architecture findings are BanGD's differentiator, but emitting only those
cedes ordinary bugs (off-by-one, swallowed errors, nil deref, races) to
Copilot. Add a second result class, generalFindings: concrete diff-evidenced
correctness/logic defects, lightweight (no four-part format), filed inline in the PR
comment (not as tracked issues).

- schema: GeneralFindingSchema + generalFindings (.default([]) for graceful
  degradation), kept in sync with the tool JSON schema by a test
- prompt/system-prompt: 4th output part + quality red-lines (diff-evidenced
  only, no style nits, no dup of architecture findings, <=6, [] when none)
- verify: generalize VerifyOutcome<T>/verifyItems<T>; verifyGeneralFindings
  runs the SAME adversarial majority-refute pass (refuter prompt reframed to
  "is this a real correctness defect")
- review: verify both finding kinds in parallel; ReviewOutcome.droppedGeneralFindings
- format: render general findings inline; omit the section when empty
- action(.yml): general_finding_count output; dropped count covers both kinds
- DESIGN.md §7: why both classes coexist, structural/delivery differences
- tests: rendering, verification, schema-sync, default-omit coverage
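
A minimal sketch of the "render inline; omit the section when empty" behaviour from the format bullet above. The function name renderGeneralFindings, the "### General findings" heading, and the bullet layout are all hypothetical; only the field names and the omit-when-empty rule come from this commit.

```typescript
// Hypothetical subset of the finding shape, just enough for rendering.
interface RenderableFinding {
  file: string;
  line: number;
  severity: string;
  title: string;
  description: string;
  suggestion: string;
}

// Returns a markdown section for the PR summary comment, or "" when there is nothing to show.
function renderGeneralFindings(findings: RenderableFinding[]): string {
  if (findings.length === 0) return ""; // omit the whole section, heading included
  const lines: string[] = ["### General findings", ""];
  for (const f of findings) {
    lines.push(
      `- **${f.title}** (\`${f.file}:${f.line}\`, ${f.severity}): ${f.description} Suggested fix: ${f.suggestion}`,
    );
  }
  return lines.join("\n");
}
```

Returning an empty string lets the caller concatenate sections unconditionally without leaving a dangling heading in the comment.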

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
PR #13 shipped the generalFindings niche described in the system prompt but
with no worked example, while every architecture dimension has one. A wrong-or-
absent few-shot is the single biggest quality lever per CLAUDE.md, so add the
missing exemplar.

- prompts/examples/general-findings.md: a worked example (binary-search off-by-
  one that infinite-loops) showing the 7-field generalFinding shape AND, just as
  important, the red-line — what NOT to report (style/naming nits, "add a test",
  unfounded speculation, dup of an architecture finding), plus the boundary vs
  architecture findings.
- prompt.ts: assembleSystemPrompt takes an always-on generalExample, appended in
  its own block regardless of selected dimensions (generalFindings is requested
  on every review). Architecture examples relabeled "架构级 Few-shot 范例"
  ("architecture-level few-shot examples").
- prompts.ts: PromptTexts.generalExample, loaded unconditionally (not dimension-
  gated); lives in the prompt-cached system block so marginal token cost ~= nil.
- DESIGN.md §7: note the niche now ships with its own few-shot (parity with the
  per-dimension architecture examples).
- tests: new prompt.test.ts (assembleSystemPrompt always-on behavior + user
  prompt parts); prompts.test.ts loads + red-line check; review.test.ts asserts
  the exemplar is present regardless of dimension selection.
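
The always-on behaviour described above might look roughly like the following. Only the names assembleSystemPrompt and generalExample and the "架构级 Few-shot 范例" label come from the commit message; the PromptParts shape, the selectedDimensions parameter, and the "General findings example" heading are assumptions made for illustration.

```typescript
// Hypothetical input shape; the real PromptTexts type is not shown in the commit.
interface PromptParts {
  base: string;
  dimensionExamples: Record<string, string>; // per-dimension architecture examples
  generalExample: string; // loaded unconditionally
}

function assembleSystemPrompt(parts: PromptParts, selectedDimensions: string[]): string {
  const blocks: string[] = [parts.base];

  // Architecture examples are gated on the selected dimensions.
  const archExamples = selectedDimensions
    .map((dimension) => parts.dimensionExamples[dimension])
    .filter((example): example is string => typeof example === "string");
  if (archExamples.length > 0) {
    blocks.push(["## 架构级 Few-shot 范例", ...archExamples].join("\n\n"));
  }

  // The general example gets its own block on every review, whatever was selected.
  blocks.push(["## General findings example", parts.generalExample].join("\n\n"));
  return blocks.join("\n\n");
}
```

With an empty dimension list the architecture block disappears while the general block stays, which is the property the new tests are described as asserting.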

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>