QWED Version
5.0.0
Python Version
3.11.9
Operating System
Windows
Which engine is affected?
Fact Verifier
Input that caused the bug
from qwed_new.core.fact_verifier import FactVerifier
verifier = FactVerifier(use_llm_fallback=True)
claim = "The treaty was signed in 1842."
context = """
The treaty negotiations began in 1840.
Sources disagree on the final signing year.
Some references mention 1841, others 1843.
"""
result = verifier.verify_fact(
    claim=claim,
    context=context,
    min_confidence=0.95,
    provider="openai",  # any configured provider path that enables fallback
)
print(result)
### Expected behavior
When deterministic fact verification is low-confidence or inconclusive, the engine must fail closed.
Expected behavior:
- low-confidence deterministic analysis must not be upgraded into a final fact verdict by an LLM
- external model output may be attached as advisory reasoning only
- final verdict must remain deterministic, explicit, and non-pass unless proven from deterministic checks
QWED principle:
Risky interpretation may inform analysis, but it must never decide truth.
### Actual behavior
`FactVerifier.verify_fact()` currently allows LLM fallback output to overwrite the deterministic verdict when confidence is below the threshold.
Relevant code in `src/qwed_new/core/fact_verifier.py`:
```python
if confidence < min_confidence and self.use_llm_fallback and provider:
    methods_used.append("llm_fallback")
    llm_result = self._llm_fallback(claim, context, provider)
    if llm_result:
        verdict = llm_result.get("verdict", verdict)
        confidence = max(confidence, llm_result.get("confidence", 0) * 0.8)
        reasoning += f"\n\nLLM Analysis: {llm_result.get('reasoning', '')}"
```
This means:
- deterministic verification can end in one verdict
- an external LLM can replace that verdict
- the final engine output can become model-decided rather than verifier-decided
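The override can be reproduced without a live provider. The sketch below mirrors the merge logic quoted above in a standalone function; `current_fallback_merge`, the canned `llm_result` dict, and the starting `INCONCLUSIVE` label are illustrative assumptions, not part of the QWED API.

```python
def current_fallback_merge(verdict, confidence, reasoning, llm_result, min_confidence):
    """Standalone mirror of the quoted merge logic.

    Assumes use_llm_fallback and provider are both truthy, and replaces the
    real _llm_fallback() call with a canned dict (hypothetical stand-in).
    """
    methods_used = []
    if confidence < min_confidence:
        methods_used.append("llm_fallback")
        if llm_result:
            # The LLM verdict replaces the deterministic one outright.
            verdict = llm_result.get("verdict", verdict)
            confidence = max(confidence, llm_result.get("confidence", 0) * 0.8)
            reasoning += f"\n\nLLM Analysis: {llm_result.get('reasoning', '')}"
    return verdict, confidence, reasoning, methods_used


verdict, confidence, reasoning, methods_used = current_fallback_merge(
    verdict="INCONCLUSIVE",  # assumed deterministic outcome for this input
    confidence=0.40,         # assumed deterministic score, below the threshold
    reasoning="Sources disagree on the signing year.",
    llm_result={"verdict": "SUPPORTED", "confidence": 0.99, "reasoning": "Likely 1842."},
    min_confidence=0.95,
)
print(verdict)            # SUPPORTED
print(confidence < 0.95)  # True
```

Under these assumed inputs the verdict flips to `SUPPORTED` even though the merged confidence (at most 0.8 times the LLM's score) still sits below `min_confidence`, so the threshold gates entry into the fallback path but not the verdict that comes out of it.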
Additional context
Severity
HIGH
Why this violates QWED philosophy
QWED’s core principles are explicit:
- fail-closed by default
- zero trust in LLM outputs
- no heuristic acceptance of results
- verification must be deterministic and explicit
This code path violates those principles because the LLM is not just providing commentary. It is allowed to modify the final fact verdict.
That turns the fallback model into a truth-decider inside a verifier.
Attack / bypass scenario
An attacker can craft:
- low-signal claims
- ambiguous supporting context
- semantically noisy passages that keep deterministic scores below the confidence threshold
Once the engine enters the fallback path, an LLM response can steer the final verdict toward `SUPPORTED` or `REFUTED`, even though the deterministic verifier could not prove that outcome.
This creates a trust-leak from advisory model output into the core verification boundary.
Fix direction
Do not allow LLM fallback to mutate the final verifier verdict.
Required direction:
- keep deterministic verdict ownership inside the deterministic verifier only
- preserve LLM output as advisory metadata or separate analysis
- if deterministic confidence is below threshold, return a non-pass state such as `INCONCLUSIVE` rather than a model-derived final verdict
- ensure `methods_used` distinguishes proof-bearing methods from advisory methods
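One possible shape for this direction is sketched below. It is a minimal illustration, not a patch against `fact_verifier.py`: `FactResult`, `finalize`, the `proof_methods` / `advisory_methods` split, and the `"keyword_overlap"` method name are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FactResult:
    verdict: str
    confidence: float
    proof_methods: list = field(default_factory=list)     # deterministic checks only
    advisory_methods: list = field(default_factory=list)  # e.g. "llm_fallback"
    advisory: Optional[dict] = None                       # raw LLM output, never merged


def finalize(det_verdict, det_confidence, min_confidence, proof_methods, llm_result=None):
    """Fail closed: only deterministic confidence can produce a pass verdict."""
    # Below threshold, the verdict is forced to a non-pass state.
    verdict = det_verdict if det_confidence >= min_confidence else "INCONCLUSIVE"
    result = FactResult(
        verdict=verdict,
        confidence=det_confidence,  # LLM confidence never raises this value
        proof_methods=list(proof_methods),
    )
    if llm_result:
        # The LLM output is kept as metadata and does not touch verdict or confidence.
        result.advisory_methods.append("llm_fallback")
        result.advisory = dict(llm_result)
    return result


result = finalize(
    det_verdict="SUPPORTED",
    det_confidence=0.40,
    min_confidence=0.95,
    proof_methods=["keyword_overlap"],  # hypothetical deterministic method name
    llm_result={"verdict": "SUPPORTED", "confidence": 0.99, "reasoning": "Likely 1842."},
)
print(result.verdict)           # INCONCLUSIVE
print(result.advisory_methods)  # ['llm_fallback']
```

Keeping the LLM output in a separate `advisory` field means callers can still surface the model's reasoning, while anything that reads `verdict`, `confidence`, or `proof_methods` sees only deterministic results.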
Suggested regression tests
- low-confidence deterministic path + LLM fallback -> final verdict must not change
- LLM says `SUPPORTED` while deterministic path is inconclusive -> final status must remain non-pass
- deterministic high-confidence path -> fallback must not run
- fallback reasoning may be attached, but final fact verdict must remain deterministic
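The cases above could be written roughly as follows. This is a sketch against a hypothetical stand-in (`verify_fail_closed`) rather than the real `FactVerifier`, whose fixed interface is not yet defined; `SpyFallback` is likewise an illustrative helper that records whether the fallback was invoked.

```python
class SpyFallback:
    """Callable that returns a canned LLM response and counts invocations."""

    def __init__(self, response):
        self.response = response
        self.calls = 0

    def __call__(self, claim, context):
        self.calls += 1
        return self.response


def verify_fail_closed(det_verdict, det_confidence, min_confidence, fallback,
                       claim="", context=""):
    """Hypothetical stand-in for a fixed verify_fact()."""
    if det_confidence < min_confidence:
        # Fallback output is attached as advisory data only.
        advisory = fallback(claim, context)
        return {"verdict": "INCONCLUSIVE", "confidence": det_confidence,
                "advisory": advisory}
    # High-confidence path: the fallback is never consulted.
    return {"verdict": det_verdict, "confidence": det_confidence, "advisory": None}


def test_llm_supported_cannot_upgrade_inconclusive():
    spy = SpyFallback({"verdict": "SUPPORTED", "confidence": 0.99})
    out = verify_fail_closed("INCONCLUSIVE", 0.40, 0.95, spy)
    assert out["verdict"] == "INCONCLUSIVE"          # verdict unchanged
    assert out["confidence"] == 0.40                 # confidence not inflated
    assert out["advisory"]["verdict"] == "SUPPORTED"  # reasoning still attached


def test_high_confidence_skips_fallback():
    spy = SpyFallback({"verdict": "REFUTED", "confidence": 0.99})
    out = verify_fail_closed("SUPPORTED", 0.99, 0.95, spy)
    assert out["verdict"] == "SUPPORTED"
    assert spy.calls == 0  # fallback must not run
```

In the real test suite the spy would presumably replace `_llm_fallback` through a mock or monkeypatch, so the tests exercise `FactVerifier.verify_fact()` itself rather than a stand-in.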
QWED Principle To Preserve
An LLM may explain uncertainty, but it must never resolve truth on behalf of the verifier.