30 changes: 27 additions & 3 deletions README.md
@@ -6,7 +6,7 @@ It is built to outgrow assistant-grade discovery: classical NLP, neural NLP, sem

## Product thesis

-Watson-style systems answer. Holmes investigates.
+Component NLP annotates. Holmes investigates.

Holmes is not a chatbot wrapper, a loose model zoo, or a domain NLP repo. It is the governed language layer above search, evidence, retrieval, casefiles, semantic graphs, tools, models, evals, and agents.

@@ -30,8 +30,32 @@ Holmes is not a chatbot wrapper, a loose model zoo, or a domain NLP repo. It is
5. Neural NLP: transformers, embeddings, rerankers, span extraction, relation extraction, multilingual encoders.
6. Foundation language services: extraction, summarization, generation, translation, RAG answering, long-context analysis, tool planning.
7. Retrieval and knowledge: sparse/dense/hybrid retrieval, vector stores, GraphBrain, semantic-serdes, ontogenesis, Slash Topics, Sherlock Search.
-8. Guardrails and governance: PII checks, source provenance, prompt-injection checks, policy gates, eval gates, factsheets, promotion records.
-9. Agent and tool orchestration: tool contracts, agent identity, sessions, memory, MCP/A2A, execution traces, model routing.
+8. Topic-model training support: topic seeds, topic boundaries, candidate labels, taxonomy candidates, topic-pack generation receipts, and Slash Topics training references.
+9. Guardrails and governance: PII checks, source provenance, prompt-injection checks, policy gates, eval gates, factsheets, promotion records.
+10. Agent and tool orchestration: tool contracts, agent identity, sessions, memory, MCP/A2A, execution traces, model routing.

## Slash Topics training role

Holmes supports Slash Topics by producing governed topic-model training artifacts, not opaque topic labels.

For Slash Topics, Holmes emits or prepares topic seeds, positive/negative/adjacent/ambiguous boundary evidence, candidate labels, topic taxonomy candidates, eval slices, and replayable topic-pack generation receipts. Slash Topics owns `/topic` pack semantics and policy membranes; Holmes owns language evidence, candidate generation, model-training support, and promotion evidence required before a topic model or topic pack can become stable.

## NLP component alignment

Holmes explicitly covers these component families:

- basic primitives;
- advanced primitives;
- rule techniques;
- classical ML;
- neural NLP;
- transformers;
- foundation-language services;
- retrieval and knowledge;
- guardrails and governance;
- agent and tool orchestration.

The alignment contract is documented in [`docs/NLP_COMPONENT_ALIGNMENT.md`](docs/NLP_COMPONENT_ALIGNMENT.md). That document is the lower-layer NLP map for Holmes, nlplab, Sherlock Search, Slash Topics, and the platform runtime.

## Repo role

130 changes: 130 additions & 0 deletions docs/NLP_COMPONENT_ALIGNMENT.md
@@ -0,0 +1,130 @@
# NLP Component Alignment

## Purpose

Holmes needs a disciplined NLP component map so it can support primitive analysis, task models, retrieval, evidence, semantic graph conversion, policy, topic-model training, and agentic investigation without becoming a loose model zoo.

This document records the lower-level NLP families Holmes must cover and how those families map across Holmes, nlplab, Sherlock Search, Slash Topics, and prophet-platform.

## Component map

| Family | Holmes surface | Lab/runtime owner | Sherlock and Slash Topics boundary |
| --- | --- | --- | --- |
| Basic primitives | `language.primitive.v1/Analyze` | `SociOS-Linux/nlplab` prototypes adapters; `prophet-platform` hosts stable services | Sherlock indexes primitive outputs only as pointer-backed evidence; Slash Topics may consume normalized language features as training evidence, not admitted topics |
| Advanced primitives | dependency parsing, semantic role labeling, coreference, morphology extensions | `nlplab` evaluates parser, SRL, and coreference adapters | Sherlock searches over parse, entity, and relation evidence; Slash Topics may use these structures for candidate topic boundaries and topic-feature extraction |
| Rule techniques | rule packs, gazetteers, dictionaries, regular expressions, table/header rules | `nlplab` keeps rule DSL experiments; Holmes promotes validated rule packs | Preserve rule version, policy decision, source, handling tags, evidence refs, and topic-pack training refs |
| Classical ML | CRF, SVM/logistic/maxent, clustering, topic modeling, similarity baselines | `nlplab` benchmarks and calibrates classical models | Sherlock retrieves model outputs with corpus, model, and eval refs; Slash Topics consumes clustering/topic assignments as governed topic-model candidates |
| Neural NLP | sequence/text models and embedding pipelines | `nlplab` handles PyTorch/ONNX experiments and benchmarks | Index spans, classes, and embedding metadata under evidence controls; Slash Topics consumes embeddings only through receipt-backed training packs |
| Transformers | token classification, text classification, relation extraction, embeddings, reranking, translation, summarization, RAG | `nlplab` evaluates candidate models; Mycroft routes by cost, quality, privacy, and latency | Search and rerank evidence packets under policy ceilings; topic-training outputs require corpus, eval, guardrail, and rollback records |
| Task models | entities, numeric entities, PII, sentiment, target sentiment, categories, concepts, keywords, relations, emotion, tone, topic assignments | Holmes exposes stable contract families after eval and promotion | Sherlock indexes outputs with provenance and confidence; Slash Topics receives topic seeds, labels, taxonomies, negative examples, and training receipts |

## Architectural claim

A component NLP library can extract spans, tags, classes, relations, and task predictions. Holmes must do more:

- bind every output to corpus, model, policy, eval, and evidence references;
- route among rule, classical, neural, transformer, and foundation-language paths using explicit cost, latency, quality, and privacy constraints;
- preserve source provenance and rollback metadata;
- convert selected outputs into semantic graph candidates;
- produce governed topic-training inputs for Slash Topics, including topic seeds, candidate labels, topic boundaries, negative examples, evaluation slices, and topic-pack generation receipts;
- support contradiction detection, claim extraction, and casefile assembly;
- keep retrieval, topic training, indexing, and truth promotion as separate surfaces;
- require promotion evidence before a pipeline becomes stable.

The target position is:

> Component NLP annotates. Holmes investigates, governs, retrieves, graphs, trains topic surfaces, reasons, and promotes with evidence.

## Slash Topics training alignment

Slash Topics are governed, signed, replayable scopes for search and knowledge surfaces. Holmes must help Slash Topics train new topic models by emitting evidence-bound training artifacts rather than opaque model outputs.

Holmes-owned outputs for Slash Topics should include:

1. `TopicSeedCandidate` records derived from keywords, concepts, entities, clusters, claims, and evidence spans;
2. `TopicBoundaryEvidence` records that separate positive, negative, adjacent, and ambiguous topic examples;
3. `TopicLabelCandidate` records with source spans, language, confidence, and curator-review status;
4. `TopicTaxonomyCandidate` records mapping broader, narrower, related, excluded, and membrane-scoped topic relations;
5. `SlashTopicTrainingRef` records pointing to corpus snapshots, model versions, rule packs, eval slices, policy decisions, and rollback refs;
6. `TopicPackGenerationReceipt` records for candidate `/topic` pack creation, replay, and promotion.
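
As a rough illustration of the first two record types, the sketch below models them as Python dataclasses. The record names come from the list above; every field name, the `BoundaryKind` enum, and the `missing_boundary_kinds` helper are assumptions made for illustration, not a published Holmes or Slash Topics schema.

```python
from dataclasses import dataclass
from enum import Enum


class BoundaryKind(Enum):
    """The four boundary classes named in the list above."""
    POSITIVE = "positive"
    NEGATIVE = "negative"
    ADJACENT = "adjacent"
    AMBIGUOUS = "ambiguous"


@dataclass(frozen=True)
class TopicSeedCandidate:
    # Field names are hypothetical; only the record name is from the document.
    seed_text: str
    derived_from: str  # e.g. "keyword", "concept", "entity", "cluster", "claim"
    evidence_span_refs: tuple[str, ...]


@dataclass(frozen=True)
class TopicBoundaryEvidence:
    # Field names are hypothetical; only the record name is from the document.
    topic_ref: str
    kind: BoundaryKind
    evidence_span_ref: str


def missing_boundary_kinds(records: list[TopicBoundaryEvidence]) -> set[BoundaryKind]:
    """Return the boundary kinds for which no example has been recorded."""
    seen = {record.kind for record in records}
    return {kind for kind in BoundaryKind if kind not in seen}
```

A helper like `missing_boundary_kinds` could back the requirement that a topic model carries positive, negative, adjacent, and ambiguous examples: an empty result would mean all four kinds are represented.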

Holmes may propose topic-model candidates. Slash Topics owns topic-pack semantics and membranes. Policy Fabric owns admission. Sherlock indexes topic evidence and retrieval behavior. The Canon records accepted topic evidence and source trust.

No Holmes topic model may be promoted without:

- corpus snapshot and split manifest;
- topic taxonomy version;
- positive, negative, adjacent, and ambiguous examples;
- topic-model eval record;
- membrane/policy decision reference;
- training eligibility and redaction check;
- replayable topic-pack generation receipt;
- rollback reference.

## Algorithm selection doctrine

Holmes should not default every task to transformers.

Use rules when the variation space is bounded, latency requirements are strict, labels are unavailable, or policy-sensitive patterns need deterministic inspection.

Use classical ML when training must be fast, features are strong, labels exist, and the workload is CPU-bound or latency-sensitive.

Use neural non-transformer models when higher quality is required but transformer runtime cost is unacceptable.

Use transformers and foundation-language services when multilinguality, semantic abstraction, long-context synthesis, or task quality justifies compute cost and governance overhead.

Use hybrid pipelines when deterministic guards, statistical extraction, retrieval grounding, topic training, and foundation-language synthesis must be composed under one evidence contract.
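
One way to read the doctrine above is as an ordered set of checks. The sketch below is a minimal interpretation, not Holmes routing logic: the `TaskProfile` fields, the priority order, and the `"hybrid"` label are all assumptions, and the other return values reuse the component-family names from `examples/holmes-surface.json`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskProfile:
    # All fields are hypothetical summaries of the criteria in the doctrine.
    bounded_variation: bool = False
    needs_deterministic_inspection: bool = False
    labels_available: bool = True
    strong_features: bool = False
    strict_latency: bool = False
    needs_semantic_abstraction: bool = False  # multilingual, long-context, abstraction
    transformer_cost_acceptable: bool = False
    needs_composed_pipeline: bool = False


def select_family(profile: TaskProfile) -> str:
    """Pick an algorithm family; the check order here is one possible choice."""
    if profile.needs_composed_pipeline:
        return "hybrid"
    if (
        profile.bounded_variation
        or profile.needs_deterministic_inspection
        or not profile.labels_available
    ):
        return "rule-techniques"
    if profile.strong_features and profile.strict_latency:
        return "classical-ml"
    if profile.needs_semantic_abstraction and profile.transformer_cost_acceptable:
        return "transformers"
    # Higher quality is wanted but transformer cost is not justified.
    return "neural-nlp"
```

For example, `select_family(TaskProfile(bounded_variation=True))` selects the rule path, while a profile with labels, strong features, and strict latency selects classical ML. A production router would also need the cost, quality, and privacy inputs the document assigns to Mycroft.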

## Required executable proof

Holmes should not claim runtime superiority until `nlplab` produces benchmark receipts for:

1. primitive quality and speed;
2. entity, relation, and classification metrics;
3. PII and sensitive-context precision/recall;
4. retrieval impact through Sherlock evidence packets;
5. semantic graph conversion fidelity;
6. Slash Topics topic-model training quality, topic-boundary precision/recall, and topic-pack replay fidelity;
7. policy propagation and rollback coverage;
8. cost, latency, and memory profiles across CPU and GPU lanes.
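
Several of the receipts above report precision and recall, for example topic-boundary precision/recall. As a reminder of what those numbers measure, here is a small sketch that scores a set of predicted example IDs against a gold set. The function name and the use of plain string IDs are assumptions; `nlplab` may compute these metrics differently.

```python
def precision_recall(predicted: set[str], gold: set[str]) -> tuple[float, float]:
    """Compute precision and recall of predicted IDs against gold IDs.

    Precision is the share of predictions that are correct; recall is the
    share of gold items that were found. Empty inputs yield 0.0 rather than
    raising a division error.
    """
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

With three predicted boundary examples of which two appear in a gold set of four, precision is 2/3 and recall is 1/2.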

## Required records

The next standards and runtime work should define or import these records:

- `LanguageAnalysisRecord`;
- `PrimitiveSpan`;
- `EntityMention`;
- `RelationMention`;
- `ClassificationDecision`;
- `TopicAssignment`;
- `TopicSeedCandidate`;
- `TopicBoundaryEvidence`;
- `TopicLabelCandidate`;
- `TopicTaxonomyCandidate`;
- `SlashTopicTrainingRef`;
- `TopicPackGenerationReceipt`;
- `SentimentDecision`;
- `KeywordCandidate`;
- `ClaimRecord`;
- `SemanticGraphCandidate`;
- `LanguagePipelineReceipt`;
- `HolmesEvidencePacket`.

## Promotion rule

A Holmes NLP component graduates only when it has:

1. corpus reference;
2. pipeline or model reference;
3. algorithm family declaration;
4. task contract;
5. quality evaluation;
6. latency and footprint measurement;
7. Slash Topics training reference when the output affects topic models or topic packs;
8. guardrail policy result;
9. evidence receipt;
10. promotion record;
11. rollback reference.
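
The conditional in item 7 can be expressed as a small gate. The sketch below is an assumption-laden illustration: the key names follow `requiredPromotionEvidence` in `examples/holmes-surface.json` where the diff shows them, `rollbackRef` is a placeholder because the rollback key is not visible in the diff, and `missing_promotion_evidence` is a hypothetical helper rather than part of `tools/validate_holmes.py`.

```python
# Keys visible in the example surface, plus a placeholder for the rollback
# reference, whose real key name is not shown in the diff.
BASE_EVIDENCE = {
    "corpusRef",
    "pipelineOrModelRef",
    "algorithmFamily",
    "taskContract",
    "evalRecord",
    "latencyFootprintRecord",
    "guardrailPolicy",
    "evidenceReceipt",
    "promotionRecord",
    "rollbackRef",  # hypothetical key name
}
TOPIC_EVIDENCE = {"slashTopicsTrainingRef"}


def missing_promotion_evidence(record: dict, affects_topics: bool) -> list[str]:
    """List required evidence keys that are absent or empty in the record.

    The Slash Topics training reference is required only when the component's
    output affects topic models or topic packs.
    """
    required = BASE_EVIDENCE | (TOPIC_EVIDENCE if affects_topics else set())
    return sorted(key for key in required if not record.get(key))
```

A component would graduate only when this returns an empty list. A record that supplies every base key but no training reference passes when `affects_topics` is false and fails with `["slashTopicsTrainingRef"]` when it is true.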

This keeps local labs, governed platform services, SourceOS clients, Slash Topics, and Sherlock retrieval connected without collapsing those layers into one monolith.
54 changes: 53 additions & 1 deletion examples/holmes-surface.json
@@ -7,7 +7,7 @@
},
"spec": {
"product": "Holmes",
-"tagline": "Watson-style systems answer. Holmes investigates.",
+"tagline": "Component NLP annotates. Holmes investigates.",
"components": [
"sherlock-search",
"221b",
@@ -17,6 +17,51 @@
"the-canon",
"deduction-engine"
],
"componentFamilies": [
"basic-primitives",
"advanced-primitives",
"rule-techniques",
"classical-ml",
"neural-nlp",
"transformers",
"foundation-language-services",
"retrieval-and-knowledge",
"guardrails-and-governance",
"agent-and-tool-orchestration"
],
"nlpTasks": [
"language-identification",
"sentence-segmentation",
"tokenization",
"lemmatization",
"part-of-speech-tagging",
"morphological-features",
"dependency-parsing",
"semantic-role-labeling",
"entity-extraction",
"numeric-entity-extraction",
"pii-extraction",
"coreference-resolution",
"relation-extraction",
"text-classification",
"zero-shot-classification",
"sentiment-classification",
"target-sentiment-extraction",
"keyword-extraction",
"category-classification",
"concept-linking",
"topic-modeling",
"topic-model-training",
"topic-taxonomy-induction",
"topic-pack-generation",
"topical-clustering",
"text-similarity",
"table-header-identification",
"claim-extraction",
"contradiction-detection",
"semantic-graph-conversion",
"evidence-governance"
],
"methodFamilies": [
"language.primitive.v1/Analyze",
"language.entity.v1/Extract",
@@ -27,13 +72,19 @@
"language.translate.v1/Translate",
"language.summarize.v1/Summarize",
"language.rag.v1/Answer",
"language.topic.v1/Propose",
"language.topic.v1/Train",
"language.graph.v1/ToSemanticGraph",
"language.govern.v1/Evaluate"
],
"requiredPromotionEvidence": [
"corpusRef",
"pipelineOrModelRef",
"algorithmFamily",
"taskContract",
"evalRecord",
"latencyFootprintRecord",
"slashTopicsTrainingRef",
"guardrailPolicy",
"evidenceReceipt",
"promotionRecord",
@@ -43,6 +94,7 @@
"standards": "SocioProphet/functional-model-surfaces",
"platform": "SocioProphet/prophet-platform",
"search": "SocioProphet/sherlock-search",
"slashTopics": "SocioProphet/slash-topics",
"lab": "SociOS-Linux/nlplab",
"sourceosCarry": "SourceOS-Linux/sourceos-model-carry"
}
81 changes: 72 additions & 9 deletions tools/validate_holmes.py
@@ -16,10 +16,63 @@
"the-canon",
"deduction-engine",
}
REQUIRED_COMPONENT_FAMILIES = {
"basic-primitives",
"advanced-primitives",
"rule-techniques",
"classical-ml",
"neural-nlp",
"transformers",
"foundation-language-services",
"retrieval-and-knowledge",
"guardrails-and-governance",
"agent-and-tool-orchestration",
}
REQUIRED_NLP_TASKS = {
"language-identification",
"sentence-segmentation",
"tokenization",
"lemmatization",
"part-of-speech-tagging",
"morphological-features",
"dependency-parsing",
"semantic-role-labeling",
"entity-extraction",
"numeric-entity-extraction",
"pii-extraction",
"coreference-resolution",
"relation-extraction",
"text-classification",
"zero-shot-classification",
"sentiment-classification",
"target-sentiment-extraction",
"keyword-extraction",
"category-classification",
"concept-linking",
"topic-modeling",
"topic-model-training",
"topic-taxonomy-induction",
"topic-pack-generation",
"topical-clustering",
"text-similarity",
"table-header-identification",
"claim-extraction",
"contradiction-detection",
"semantic-graph-conversion",
"evidence-governance",
}
REQUIRED_METHOD_FAMILIES = {
"language.topic.v1/Propose",
"language.topic.v1/Train",
}
REQUIRED_EVIDENCE = {
"corpusRef",
"pipelineOrModelRef",
"algorithmFamily",
"taskContract",
"evalRecord",
"latencyFootprintRecord",
"slashTopicsTrainingRef",
"guardrailPolicy",
"evidenceReceipt",
"promotionRecord",
@@ -32,6 +85,14 @@ def fail(message: str) -> int:
    return 1


def require_set(spec: dict, field: str, required: set[str]) -> int | None:
    observed = set(spec.get(field, []))
    missing = required - observed
    if missing:
        return fail(f"missing {field}: {sorted(missing)}")
    return None


def main() -> int:
    if not EXAMPLE.exists():
        return fail("missing examples/holmes-surface.json")
@@ -41,16 +102,18 @@ def main() -> int:
    if data.get("kind") != "HolmesSurface":
        return fail("wrong kind")
    spec = data.get("spec", {})
-    components = set(spec.get("components", []))
-    missing_components = REQUIRED_COMPONENTS - components
-    if missing_components:
-        return fail(f"missing components: {sorted(missing_components)}")
-    evidence = set(spec.get("requiredPromotionEvidence", []))
-    missing_evidence = REQUIRED_EVIDENCE - evidence
-    if missing_evidence:
-        return fail(f"missing promotion evidence: {sorted(missing_evidence)}")
+    for field, required in [
+        ("components", REQUIRED_COMPONENTS),
+        ("componentFamilies", REQUIRED_COMPONENT_FAMILIES),
+        ("nlpTasks", REQUIRED_NLP_TASKS),
+        ("methodFamilies", REQUIRED_METHOD_FAMILIES),
+        ("requiredPromotionEvidence", REQUIRED_EVIDENCE),
+    ]:
+        result = require_set(spec, field, required)
+        if result is not None:
+            return result
    integrations = spec.get("integrations", {})
-    for key in ["standards", "platform", "search", "lab", "sourceosCarry"]:
+    for key in ["standards", "platform", "search", "slashTopics", "lab", "sourceosCarry"]:
        if key not in integrations:
            return fail(f"missing integration: {key}")
    print("OK: Holmes contracts validated")