[Bug]: FactVerifier allows LLM fallback to overwrite deterministic verdicts #133

@rahuldass19

Description

QWED Version

5.0.0

Python Version

3.11.9

Operating System

Windows

Which engine is affected?

Fact Verifier

Input that caused the bug

```python
from qwed_new.core.fact_verifier import FactVerifier

verifier = FactVerifier(use_llm_fallback=True)

claim = "The treaty was signed in 1842."
context = """
The treaty negotiations began in 1840.
Sources disagree on the final signing year.
Some references mention 1841, others 1843.
"""

result = verifier.verify_fact(
    claim=claim,
    context=context,
    min_confidence=0.95,
    provider="openai",  # any configured provider path that enables fallback
)

print(result)
```

### Expected behavior

When deterministic fact verification is low-confidence or inconclusive, the engine must fail closed.

Expected behavior:
- low-confidence deterministic analysis must not be upgraded into a final fact verdict by an LLM
- external model output may be attached as advisory reasoning only
- final verdict must remain deterministic, explicit, and non-pass unless proven from deterministic checks

QWED principle:
Risky interpretation may inform analysis, but it must never decide truth.


### Actual behavior

`FactVerifier.verify_fact()` currently allows LLM fallback output to overwrite the deterministic verdict when confidence is below the threshold.

Relevant code in `src/qwed_new/core/fact_verifier.py`:

```python
if confidence < min_confidence and self.use_llm_fallback and provider:
    methods_used.append("llm_fallback")
    llm_result = self._llm_fallback(claim, context, provider)
    if llm_result:
        verdict = llm_result.get("verdict", verdict)
        confidence = max(confidence, llm_result.get("confidence", 0) * 0.8)
        reasoning += f"\n\nLLM Analysis: {llm_result.get('reasoning', '')}"
```

This means:

  • deterministic verification can end in one verdict
  • an external LLM can replace that verdict
  • the final engine output can become model-decided rather than verifier-decided
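
To make the effect concrete, the merge step can be replayed in isolation with stubbed values. This is only an illustration: the starting verdict `INCONCLUSIVE`, the deterministic confidence of 0.40, and the shape of `llm_result` are assumptions inferred from the `.get()` calls in the snippet, not values taken from a real QWED run.

```python
# Standalone replay of the merge logic quoted in the issue; no QWED import needed.
# All input values are hypothetical and chosen only for illustration.
min_confidence = 0.95

# Assumed outcome of the deterministic pass: low confidence, nothing proven.
verdict = "INCONCLUSIVE"
confidence = 0.40
reasoning = "Context does not state a signing year of 1842."

# Assumed fallback payload (keys inferred from the .get() calls in the snippet).
llm_result = {
    "verdict": "SUPPORTED",
    "confidence": 0.99,
    "reasoning": "1842 seems plausible.",
}

if confidence < min_confidence and llm_result:
    verdict = llm_result.get("verdict", verdict)
    confidence = max(confidence, llm_result.get("confidence", 0) * 0.8)
    reasoning += f"\n\nLLM Analysis: {llm_result.get('reasoning', '')}"

print(verdict)                       # SUPPORTED
print(round(confidence, 3))          # 0.792
print(confidence >= min_confidence)  # False
```

Under these assumptions the verdict flips to SUPPORTED even though the merged confidence (0.792) is still below the requested 0.95. Because of the 0.8 scaling, an LLM confidence in [0, 1] can contribute at most 0.8, so with `min_confidence=0.95` the verdict is overwritten without the threshold ever being met by the fallback.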

Additional context

Severity

HIGH

Why this violates QWED philosophy

QWED’s core principles are explicit:

  • fail-closed by default
  • zero trust in LLM outputs
  • no heuristic acceptance of results
  • verification must be deterministic and explicit

This code path violates those principles because the LLM is not just providing commentary. It is allowed to modify the final fact verdict.

That turns the fallback model into a truth-decider inside a verifier.

Attack / bypass scenario

An attacker can craft:

  • low-signal claims
  • ambiguous supporting context
  • semantically noisy passages that keep deterministic scores below the confidence threshold

Once the engine enters the fallback path, an LLM response can steer the final verdict toward SUPPORTED or REFUTED, even though the deterministic verifier could not prove that outcome.

This leaks trust from advisory model output across the core verification boundary.

Fix direction

Do not allow LLM fallback to mutate the final verifier verdict.

Required direction:

  1. keep deterministic verdict ownership inside the deterministic verifier only
  2. preserve LLM output as advisory metadata or separate analysis
  3. if deterministic confidence is below threshold, return a non-pass state such as `INCONCLUSIVE` rather than a model-derived final verdict
  4. ensure `methods_used` distinguishes proof-bearing methods from advisory methods
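
One possible shape for this direction is sketched below. It is not the actual QWED implementation: `FactResult`, `resolve`, and the `proof_methods` / `advisory_methods` split are hypothetical names introduced here, and `INCONCLUSIVE` is assumed to be an acceptable non-pass state.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class FactResult:
    """Hypothetical result type; field names are illustrative only."""
    verdict: str
    confidence: float
    reasoning: str
    proof_methods: list[str] = field(default_factory=list)
    advisory_methods: list[str] = field(default_factory=list)
    advisory: Optional[dict] = None  # raw LLM output, never merged into verdict


def resolve(
    det_verdict: str,
    det_confidence: float,
    det_reasoning: str,
    min_confidence: float,
    llm_fallback: Optional[Callable[[], Optional[dict]]] = None,
) -> FactResult:
    """Fail closed: only deterministic evidence can set the verdict."""
    result = FactResult(
        verdict=det_verdict,
        confidence=det_confidence,
        reasoning=det_reasoning,
        proof_methods=["deterministic"],
    )
    if det_confidence >= min_confidence:
        return result  # proven deterministically; the fallback never runs

    # Below threshold: the outcome is non-pass regardless of what the LLM says.
    result.verdict = "INCONCLUSIVE"
    if llm_fallback is not None:
        advisory = llm_fallback()
        if advisory:
            result.advisory_methods.append("llm_fallback")
            result.advisory = advisory  # attached as metadata only
    return result
```

In this sketch the deterministic confidence is reported unchanged and the LLM payload is stored in a separate field, so a caller can display the model's reasoning without it influencing `verdict` or `confidence`.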

Suggested regression tests

  • low-confidence deterministic path + LLM fallback -> final verdict must not change
  • LLM says SUPPORTED while deterministic path is inconclusive -> final status must remain non-pass
  • deterministic high-confidence path -> fallback must not run
  • fallback reasoning may be attached, but final fact verdict must remain deterministic
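
As a rough illustration, the four cases could be written in pytest style as follows. The `finalize` function is a hypothetical stand-in for the fixed merge step and `FakeLLM` is a stub; real tests would presumably exercise `FactVerifier.verify_fact()` with `_llm_fallback` patched, whose result shape is not shown in this issue.

```python
def finalize(det_verdict, det_confidence, min_confidence, llm_fallback=None):
    """Hypothetical stand-in for the fixed merge step, not QWED's API."""
    if det_confidence >= min_confidence:
        return {"verdict": det_verdict, "advisory": None}
    advisory = llm_fallback() if llm_fallback is not None else None
    return {"verdict": "INCONCLUSIVE", "advisory": advisory}


class FakeLLM:
    """Stub fallback that counts calls and always claims SUPPORTED."""

    def __init__(self):
        self.calls = 0

    def __call__(self):
        self.calls += 1
        return {"verdict": "SUPPORTED", "confidence": 0.99, "reasoning": "stub"}


def test_fallback_does_not_change_verdict():
    without = finalize("INCONCLUSIVE", 0.4, 0.95)
    with_llm = finalize("INCONCLUSIVE", 0.4, 0.95, FakeLLM())
    assert with_llm["verdict"] == without["verdict"]


def test_llm_supported_stays_non_pass():
    result = finalize("INCONCLUSIVE", 0.4, 0.95, FakeLLM())
    assert result["verdict"] not in {"SUPPORTED", "REFUTED"}


def test_high_confidence_skips_fallback():
    llm = FakeLLM()
    result = finalize("SUPPORTED", 0.99, 0.95, llm)
    assert result["verdict"] == "SUPPORTED"
    assert llm.calls == 0  # fallback must not run


def test_reasoning_attached_as_advisory_only():
    result = finalize("REFUTED", 0.5, 0.95, FakeLLM())
    assert result["verdict"] == "INCONCLUSIVE"
    assert result["advisory"]["reasoning"] == "stub"
```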

QWED Principle To Preserve

An LLM may explain uncertainty, but it must never resolve truth on behalf of the verifier.
