KlarText

KlarText turns dense German or English text into easy-to-understand language and can read it aloud. It’s built for people who get overwhelmed by long sentences, legal/bureaucratic phrasing, or technical language.

Important: KlarText produces “easy language” / plain-language simplifications. It is not certified “Leichte Sprache” and does not guarantee legal/medical accuracy.

🎯 Current Status

Phase 0: ✅ Complete (January 10, 2026)

Component	Status	Details
API Core	✅ Working	`/v1/simplify` endpoint with Groq LLM
PDF Extraction	✅ Working	`/v1/ingest/pdf` with PyMuPDF
Gradio Demo	✅ Working	Direct LLM & Via API modes
Prompt Templates	✅ Working	German & English from `prompts/templates/`
Quality Scoring	✅ Working	LIX, sentence length, readability
Swagger Docs	✅ Available	`http://localhost:8000/docs`
Batch Endpoint	📝 Defined	Logic pending
TTS	📝 Defined	Logic pending
URL Extraction	📝 Defined	Logic pending
Frontend (Next.js)	⏳ Planned	Will replace Gradio

Quick Start:

# First time setup:
cd services/api
cp env.example .env
# Edit .env and add your GROQ_API_KEY

# Terminal 1: API
cd services/api && uvicorn app.main:app --reload --port 8000

# Terminal 2: Demo
cd demo && python app.py

# Open: http://localhost:7860

See docs/phase_0_testing_guide.md for detailed setup instructions.

What it does (MVP)

Paste text → get an easy-to-read version (DE/EN)
Upload a PDF → extract text → simplify it
Optional Text-to-Speech (TTS) for the simplified text
No login required (MVP)

Stretch (optional):

Paste a URL → extract main article text → simplify
Chrome extension: simplify selected text or the current page

Architecture

flowchart LR
  U[User] -->|Paste text<br/>Upload PDF<br/>Enter URL| WEB[Web UI<br/>Next.js]
  WEB -->|REST| API[API<br/>FastAPI]
  API --> INGEST[Ingestion]
  INGEST --> CHUNK[Chunking]
  CHUNK --> LLM[LLM Adapter]
  LLM --> SIMP[Simplified Text]
  SIMP -->|optional| TTS[TTS Provider]
  API --> WEB
  TTS --> API

Components:

Ingestion — Extract text from PDF, pasted text, or URL
Chunking — Split long texts, ensure no hallucinations
LLM Adapter — Configurable provider (OpenAI, Azure, Groq, Google)
TTS Provider — Text-to-speech for audio output

Tech stack (recommended)

Frontend: Next.js (React), TypeScript, Tailwind (or vanilla CSS), accessible UI patterns
Backend: Python + FastAPI + Uvicorn
LLM: configurable provider via an adapter layer
PDF extraction: PyMuPDF or pdfplumber
URL extraction (stretch): trafilatura or readability-lxml
Rate limiting / caching (optional): Redis
Storage (optional): Postgres (metrics/logging) + local file storage in dev

API overview

Base path: /v1

POST /v1/simplify
- Input: { text, target_lang: "de"|"en", level: "very_easy"|"easy"|"medium" }
- Output: { simplified_text, warnings?: string[] }
POST /v1/ingest/pdf
- Input: multipart/form-data with file
- Output: { extracted_text, pages, warnings?: string[] }
POST /v1/ingest/url (stretch)
- Input: { url }
- Output: { extracted_text, title?, warnings?: string[] }
POST /v1/tts
- Input: { text, lang: "de"|"en" }
- Output: { audio_url | audio_base64 }
GET /healthz

Swagger / OpenAPI docs

Swagger UI: http://localhost:8000/docs (when API is running)
ReDoc: http://localhost:8000/redoc
OpenAPI JSON: http://localhost:8000/openapi.json

Quick start (local dev)

1) Prereqs

Docker + Docker Compose

2) Configure env

Copy the example env file:

cp .env.example .env

Then set your LLM/TTS provider keys (see comments inside .env.example).

3) Run

docker compose up --build

4) Open

Repo layout

apps/web – Web UI
services/api – API service
apps/extension – Chrome extension (optional track)
prompts – prompt templates + eval fixtures
docs – architecture notes + screenshots

Design reference

Lovable prototype: https://lovable.dev/projects/7d82e6e7-3919-4389-9b79-30543806c5e0

License

Usage & Licensing Notice (Non-Commercial) This project, including all code, models, datasets, documentation, and related materials, is provided for personal, educational, and non-commercial research use only.

Allowed Personal use Academic and non-commercial research Forking and modifying for non-commercial purposes Contributing via pull requests Not Allowed Any commercial use Selling or monetizing the software, models, datasets, or outputs Using the project in products or services that generate revenue Use within commercial organizations without permission Attribution If you share or publish derivative work, you must provide credit to:

Based on work from the KlarText Team (2025).

Commercial Licensing To request commercial usage rights, contact a repository administrator.

Name		Name	Last commit message	Last commit date
Latest commit History 124 Commits
accessible-word-craft-main		accessible-word-craft-main
apps		apps
data		data
demo		demo
docs		docs
models		models
notebooks		notebooks
outputs		outputs
presentation		presentation
prompts		prompts
scripts		scripts
services/api		services/api
.cursorrules		.cursorrules
.env.example		.env.example
.gitignore		.gitignore
BUILD_PROMPT.md		BUILD_PROMPT.md
LICENSE		LICENSE
README.md		README.md
Requirements.txt		Requirements.txt
accessible-word-craft-main.zip		accessible-word-craft-main.zip
agent.md		agent.md
docker-compose.yml		docker-compose.yml
hybrid_arch.md		hybrid_arch.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

KlarText

🎯 Current Status

What it does (MVP)

Architecture

Tech stack (recommended)

API overview

Swagger / OpenAPI docs

Quick start (local dev)

1) Prereqs

2) Configure env

3) Run

4) Open

Repo layout

Design reference

License

About

Uh oh!

Releases

Packages

Contributors 6

Uh oh!

Languages

License

EasySprache/klartext

Folders and files

Latest commit

History

Repository files navigation

KlarText

🎯 Current Status

What it does (MVP)

Architecture

Tech stack (recommended)

API overview

Swagger / OpenAPI docs

Quick start (local dev)

1) Prereqs

2) Configure env

3) Run

4) Open

Repo layout

Design reference

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 6

Uh oh!

Languages

Packages