On-device Gemma 4 contextual PHI redaction. A small, focused TypeScript toolkit that uses Google's Gemma 4 to catch the protected-health-information shapes that regex and named-entity recognition systematically miss: inline diagnoses in clinical prose, medication mentions, treatment narratives, indirect health context, sensitive social data, and genetic references. Dual path: Ollama when the local daemon is available, WebLLM with the gemma-4-E2B-it-q4f16_1-MLC build in the browser otherwise. Never on a server.
This is the open-source pipeline that powers the contextual-PHI layer of Bounds Pro, a closed-source PDF redaction workspace. The toolkit on its own is enough to reproduce that layer end to end on your own documents.
The HIPAA Safe Harbor de-identification standard at 45 CFR 164.514(b)(2) lists eighteen identifier categories. The first seventeen are concrete and mostly structured (phone numbers, social-security numbers, medical-record numbers, dates of birth), and the long-standing rule-based redactors handle them. Identifier #18 is "any other unique identifying number, characteristic, or code", and the surrounding clinical narrative is where it lives: a sentence that names a diagnosis without a label, a paragraph that mentions a medication in passing, an aside about a "therapist" or "insulin pump" that re-identifies the patient when triangulated with the rest of the document.
Existing PDF redaction tools force a choice no healthcare reviewer should have to make: send the document to a cloud API and trust its privacy posture, or use a regex-only desktop tool that demonstrably misses everything contextual. This toolkit's argument is that a small, capable on-device model (Gemma 4 E2B at int4 quantisation, ~1.5 GB on disk, running in the user's own browser via WebGPU) closes the gap without ever shipping document bytes off-device.
bounds-gemma exports a small surface area centred on a single async call:
import { startGemmaJob, getGemmaBackend } from 'bounds-gemma/pipeline/GemmaWorker'
// Probe which backend is reachable (Ollama localhost first, WebLLM fallback,
// unavailable if neither works). Cached after first probe.
const backend = await getGemmaBackend()
// Run a page's extracted text through Gemma. Returns the contextual-PHI
// detections the regex and NER layers would have missed.
const detections = await startGemmaJob({
  text: pageText,
  pageIndex: 0,
})

Each detection is a { text, type, confidence, ruleId, reason } object. Confidence has a healthcare-only floor of 0.75; below that, the detection is silently dropped. The text field is verified to be a byte-identical substring of the input page (with NFC Unicode normalisation), so model hallucinations and paraphrases never reach the consumer.
Three guardrails make this safe for healthcare paraphrase tasks:
- In-corpus verification. Every Gemma-emitted span must be a byte-identical substring of the input page text after Unicode NFC normalisation. Model hallucinations and paraphrases are dropped silently before they ever reach the review surface.
- Confidence floor of 0.75. Tuned specifically for healthcare; below it, candidates are omitted. This is a single constant in gemmaParse.ts and is easy to lower for non-clinical use cases.
- Default-off in the consumer UI. Every Gemma detection arrives with enabled: false. The downstream reviewer must opt in per item. Surface-level acceptance is never automatic.
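The three guardrails can be pictured as one filtering pass over the model's candidates. The following is a minimal sketch, not the toolkit's actual code: the names Candidate, ReviewItem, CONFIDENCE_FLOOR, and applyGuardrails are hypothetical, and only the field names, the 0.75 floor, the NFC substring check, and the enabled: false default come from the description above.

```typescript
// Hypothetical types; field names follow the detection object described above.
interface Candidate {
  text: string
  type: string
  confidence: number
  ruleId: string
  reason: string
}

interface ReviewItem extends Candidate {
  enabled: boolean
}

// Assumed constant name; the 0.75 value is the healthcare floor from the text.
const CONFIDENCE_FLOOR = 0.75

function applyGuardrails(
  pageText: string,
  candidates: Candidate[],
  floor: number = CONFIDENCE_FLOOR,
): ReviewItem[] {
  // Normalise the page once so composed and decomposed forms compare equal.
  const page = pageText.normalize('NFC')
  const kept: ReviewItem[] = []
  for (const c of candidates) {
    const span = c.text.normalize('NFC')
    // Guardrail 1: the span must occur verbatim in the normalised page text.
    if (span.length === 0 || !page.includes(span)) continue
    // Guardrail 2: drop anything below the floor (this also rejects NaN).
    if (!(c.confidence >= floor)) continue
    // Guardrail 3: every surviving item starts disabled; the reviewer opts in.
    kept.push({ ...c, text: span, enabled: false })
  }
  return kept
}
```

Lowering the floor for a non-clinical use case would then be a matter of passing a different third argument, under the assumptions of this sketch.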
ollama pull gemma4:e2b
ollama serve

Then point any consumer at http://localhost:11434/api/chat. The toolkit probes this URL at start-up; if reachable, it routes all subsequent calls there. Sub-second latency per chunk on consumer hardware, zero model-CDN traffic, fully offline after the model pull.
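For readers who want to see what a call to that endpoint looks like, here is a minimal sketch of a non-streaming request to Ollama's /api/chat. The helper names buildChatBody and askOllama are hypothetical, and the system prompt is a placeholder for illustration, not the prompt the toolkit uses. The body fields (model, messages, stream, format, options) are standard Ollama chat parameters.

```typescript
const OLLAMA_CHAT_URL = 'http://localhost:11434/api/chat'

interface ChatMessage {
  role: 'system' | 'user' | 'assistant'
  content: string
}

interface OllamaChatBody {
  model: string
  messages: ChatMessage[]
  stream: boolean
  format: 'json'
  options: { temperature: number }
}

// Builds the JSON body for one page of text. The prompt wording is a placeholder.
function buildChatBody(pageText: string, model = 'gemma4:e2b'): OllamaChatBody {
  return {
    model,
    messages: [
      { role: 'system', content: 'Return contextual PHI spans as a JSON array.' },
      { role: 'user', content: pageText },
    ],
    stream: false, // one complete response instead of a token stream
    format: 'json', // ask Ollama to constrain the output to valid JSON
    options: { temperature: 0 },
  }
}

// Sends the request and returns the raw model text for later parsing.
async function askOllama(pageText: string): Promise<string> {
  const res = await fetch(OLLAMA_CHAT_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildChatBody(pageText)),
  })
  if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`)
  const data = (await res.json()) as { message?: { content?: string } }
  return data.message?.content ?? ''
}
```

The returned string is still untrusted model output at this point; it would need parsing and the in-corpus check before anything reaches a reviewer.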
The toolkit ships a WebLLM browser path that activates automatically when no Ollama daemon is reachable. It dynamic-imports @mlc-ai/web-llm, loads the gemma-4-E2B-it-q4f16_1-MLC build (one-time ~1.5 GB download, cached in IndexedDB), and runs the same contextual PHI detection in-tab. The toolkit will not silently fall back to an older Gemma family.
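The fallback order (Ollama first, WebLLM second, unavailable otherwise, cached after the first probe) can be sketched as a small decision function plus a memoised wrapper. This is an illustration under assumptions, not the toolkit's implementation: Backend, Probes, pickBackend, and resolveBackend are hypothetical names, and the probes are injected so the sketch stays independent of any particular environment.

```typescript
type Backend = 'ollama' | 'webllm' | 'unavailable'

// Hypothetical probe hooks supplied by the caller.
interface Probes {
  ollamaReachable: () => Promise<boolean>
  webgpuAvailable: () => boolean
}

// Pure decision: the local daemon wins, the browser runtime is the fallback.
function pickBackend(ollamaUp: boolean, hasWebGPU: boolean): Backend {
  if (ollamaUp) return 'ollama'
  if (hasWebGPU) return 'webllm'
  return 'unavailable'
}

let cached: Promise<Backend> | undefined

// Runs the probes once and reuses the same promise for every later call.
function resolveBackend(probes: Probes): Promise<Backend> {
  if (!cached) {
    cached = probes
      .ollamaReachable()
      .catch(() => false) // a failed probe counts as "not reachable"
      .then(up => pickBackend(up, probes.webgpuAvailable()))
  }
  return cached
}
```

In a browser, webgpuAvailable could plausibly be implemented as a check for 'gpu' in navigator, and ollamaReachable as a short-timeout fetch against the local daemon; both are assumptions about wiring rather than documented behaviour of this package.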
Serving headers required for the WebLLM path:
Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: require-corp
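How these headers get set depends on the server in use. As a minimal sketch, the helper below applies them to anything with a setHeader method, which includes the response object of Node's built-in http server. ISOLATION_HEADERS, HeaderSink, and applyIsolationHeaders are hypothetical names introduced for illustration.

```typescript
// The two headers listed above, as a reusable map.
const ISOLATION_HEADERS: Record<string, string> = {
  'Cross-Origin-Opener-Policy': 'same-origin',
  'Cross-Origin-Embedder-Policy': 'require-corp',
}

// Minimal structural type; Node's http.ServerResponse satisfies it.
interface HeaderSink {
  setHeader(name: string, value: string): void
}

// Copies both isolation headers onto the outgoing response.
function applyIsolationHeaders(res: HeaderSink): void {
  for (const name of Object.keys(ISOLATION_HEADERS)) {
    res.setHeader(name, ISOLATION_HEADERS[name])
  }
}

// Example wiring with Node's http module (not executed here):
//   http.createServer((req, res) => {
//     applyIsolationHeaders(res)
//     /* serve the static file */
//   })
```

Once the page is served with both headers, the browser global crossOriginIsolated should report true, which is a quick way to confirm the configuration from the console.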
npm install bounds-gemma
# Optional, only when MLC ships a Gemma 4 WebLLM build:
npm install @mlc-ai/web-llm

The package has no required runtime dependencies. @mlc-ai/web-llm is a peer dependency you only pull in if you use the browser fallback. Ollama is a separate install (brew install ollama or equivalent).
git clone https://github.com/Aqta-ai/bounds-gemma.git
cd bounds-gemma
npm install
ollama pull gemma4:e2b
ollama serve &   # runs in the background; alternatively, run it in another terminal
npm run example:ollama

The example runs a sample clinical-note paragraph through Gemma 4 and prints the contextual-PHI detections.
npm install
npm test

16 unit tests in src/__tests__/gemmaParse.test.ts cover the parser, validator, in-corpus check, NFC normalisation, fence-stripping, malformed-JSON handling, and confidence-floor enforcement. They run in <1 second with no model required.
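To give a sense of what fence-stripping and malformed-JSON handling involve, here is a minimal sketch of a defensive parser for raw model output. It is not the code in gemmaParse.ts: stripFences, isRawDetection, and parseModelOutput are hypothetical names, and the sketch assumes the model is asked to return a JSON array of detection objects and that unparseable output should yield an empty list.

```typescript
interface RawDetection {
  text: string
  type: string
  confidence: number
  ruleId: string
  reason: string
}

// Removes a surrounding Markdown code fence, with or without a language tag.
// \u0060 is the backtick character, written as an escape for readability.
function stripFences(raw: string): string {
  const trimmed = raw.trim()
  const m = trimmed.match(/^\u0060{3}[\w-]*\s*([\s\S]*?)\s*\u0060{3}$/)
  return m ? m[1].trim() : trimmed
}

// Structural check: every expected field is present with the right type.
function isRawDetection(v: unknown): v is RawDetection {
  if (typeof v !== 'object' || v === null) return false
  const d = v as Record<string, unknown>
  return (
    typeof d.text === 'string' &&
    typeof d.type === 'string' &&
    typeof d.confidence === 'number' &&
    Number.isFinite(d.confidence) &&
    typeof d.ruleId === 'string' &&
    typeof d.reason === 'string'
  )
}

// Never throws: malformed JSON or a non-array payload yields an empty list.
function parseModelOutput(raw: string): RawDetection[] {
  let parsed: unknown
  try {
    parsed = JSON.parse(stripFences(raw))
  } catch {
    return []
  }
  if (!Array.isArray(parsed)) return []
  return parsed.filter(isRawDetection)
}
```

Because a parser of this shape is a pure function of a string, it can be unit-tested without a model, which is consistent with the test suite running in under a second.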
- It does not handle structured PHI (phone numbers, SSNs, dates, MRNs, addresses). Those are the regex and NER layers' job; combine this toolkit with a regex PII detector for full Safe Harbor coverage.
- It does not draw bounding boxes on PDFs. That is the consumer's job; the toolkit returns text spans and lets the consumer resolve them to PDF coordinates.
- It does not run an auditor. The cross-check pattern in the closed-source Bounds Pro pairs Gemma 4 26B with Gemma 4 31B as paraphraser plus auditor; the on-device toolkit ships only the paraphraser side and relies on verbatim-wins-ties as the safety floor.
- It does not phone home. No analytics, no telemetry, no model-CDN ping. Verify with the Network tab.
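Since resolving text spans to PDF coordinates is left to the consumer, a first step is usually to find where each detected span occurs in the page text. The sketch below is one way to do that and is not part of the toolkit: SpanOffset and findSpanOffsets are hypothetical names, and the offsets are UTF-16 indices into the NFC-normalised page text, so the consumer's text-to-glyph mapping would need to be built from the same normalised string.

```typescript
interface SpanOffset {
  start: number // inclusive index into the NFC-normalised page text
  end: number // exclusive index
}

// Returns every non-overlapping occurrence of the span, left to right.
function findSpanOffsets(pageText: string, span: string): SpanOffset[] {
  const page = pageText.normalize('NFC')
  const needle = span.normalize('NFC')
  const hits: SpanOffset[] = []
  if (needle.length === 0) return hits
  let from = 0
  while (true) {
    const i = page.indexOf(needle, from)
    if (i === -1) break
    hits.push({ start: i, end: i + needle.length })
    from = i + needle.length // continue searching after this match
  }
  return hits
}
```

Returning every occurrence rather than only the first lets a reviewer redact a repeated phrase consistently across the page.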
Released under Apache-2.0 (see LICENSE). The Gemma family and the Gemma Prohibited Use Policy are governed by their own terms; using this toolkit means you accept Google's terms for Gemma 4 as well. The HIPAA Safe Harbor identifier list is in the public domain.
- Google DeepMind for Gemma 4 and the open weights.
- MLC LLM and WebLLM for the browser runtime.
- Ollama for the local-first inference daemon.
- The Centers for Medicare and Medicaid Services for the public-domain HIPAA Safe Harbor specification.