fix(desktop): repair inline math rendering for LLM output by lightfront · Pull Request #3666 · esengine/DeepSeek-Reasonix

lightfront · 2026-06-09T10:18:12Z

Fix: Inline math rendering for LLM output

Problem

LaTeX source was being rendered as raw text (with red KaTeX errors) in several common scenarios:

- Block math $$ glued to prose without blank lines
- Single digits ($1$, $2$) and multi-digit numbers rejected as currency
- Single uppercase letters ($S$, $A$) rejected as English words
- One-sided comparisons (< B, A <) not recognized as math
- $ symbols inside code blocks causing KaTeX parse errors
- Malformed single-line code blocks not protected
- Prose currency like "cost $5 and $6" had dollar signs converted to HTML entities

Solution

1. Block math blank-line repair (`mathNormalize.ts` Step 2.5)

When $$ appears after prose (letter or punctuation) without a blank line, insert \n\n before it. This satisfies CommonMark's requirement that block math be separated from surrounding content.

Example:

# Before (broken)
decomposes as$$
\mathbf{6}\otimes...

# After (fixed)
decomposes as

$$
\mathbf{6}\otimes...
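
A minimal sketch of how such a repair could look. The function name, the set of characters treated as "prose", and the opener/closer toggle are assumptions for illustration; the actual Step 2.5 in `mathNormalize.ts` may be written differently.

```typescript
// Hypothetical helper: the real Step 2.5 in mathNormalize.ts may differ.
// Characters that count as "prose" before a glued `$$` (an assumed set):
// ASCII letters, CJK ideographs, and common sentence punctuation.
const PROSE_BEFORE_MATH = /[A-Za-z\u4e00-\u9fff.,:;!?\u3002\uff0c\uff1a\uff1b]/;

function separateGluedBlockMath(text: string): string {
  let insideBlock = false; // true between an opening `$$` and its closing `$$`
  // Capture the (optional) character immediately before each `$$`.
  return text.replace(/([\s\S]?)\$\$/g, (whole: string, prev: string) => {
    const isOpener = !insideBlock;
    insideBlock = !insideBlock;
    // Only an opener glued to prose gets a blank line inserted before it.
    if (isOpener && PROSE_BEFORE_MATH.test(prev)) {
      return prev + "\n\n$$";
    }
    return whole;
  });
}
```

The toggle keeps a closing `$$` that directly follows a letter (for example `x$$`) from being split away from its formula. It assumes the `$$` markers in the text are balanced.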

2. Improved math classifier (`mathClassify.ts`)

Added recognition for common minimal-LaTeX patterns:

Pure numbers and letters:

$1$ , $2$ , $42$ , $2.5$ → math (counts, indices, values)
$S$ , $A$ , $I$ → math (set/algebra/group names)

Number combinations:

Comma-separated: A, B, 1, 2, 3, (A, B) → math (tuples, pairs)
With variables: $2.5x$ , $3y^2$ → math (implicit multiplication)
With LaTeX: $10\%$ , $5\cdot3$ → math (math operators)

One-sided comparisons:

< B, <= 0, A <, B <= → math (implicit operand)

Example:

"$1$ 和 $2$ 之间有无穷多有理数" → math (1 and 2)
"$S$ 非空" → math (set S)
"把 $(A, B)$ 整体" → math (ordered pair)
"A 的每个元素 $< B$" → math (comparison)
"$2.5x + 3$" → math (number with variable)

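The rules above can be approximated with a handful of regular expressions. This is only a sketch: the function name `looksLikeMinimalMath` and every pattern in it are assumptions, not the code in `mathClassify.ts`.

```typescript
// Hypothetical classifier for the content between two `$` delimiters.
// Returns true when the content matches one of the minimal-LaTeX patterns.
function looksLikeMinimalMath(content: string): boolean {
  const s = content.trim();
  if (s === "") return false;
  // Pure numbers: 1, 42, 2.5
  if (/^\d+(\.\d+)?$/.test(s)) return true;
  // Single uppercase letter: S, A, I
  if (/^[A-Z]$/.test(s)) return true;
  // Number glued to a LaTeX command or a one-letter variable: 10\%, 5\cdot3, 2.5x, 3y^2
  if (/^\d+(\.\d+)?(\\[A-Za-z%]+|[A-Za-z](?![A-Za-z]))/.test(s)) return true;
  // Comma-separated tokens, optionally parenthesised: A, B / 1, 2, 3 / (A, B)
  if (/^\(?\s*[A-Za-z0-9\\]+(\s*,\s*[A-Za-z0-9\\]+)+\s*\)?$/.test(s)) return true;
  // One-sided comparison with the operand on the right: < B, <= 0
  if (/^(<=?|>=?)\s*[A-Za-z0-9\\]\S*$/.test(s)) return true;
  // One-sided comparison with the operand on the left: A <, B <=
  if (/^[A-Za-z0-9\\]\S*\s*(<=?|>=?)$/.test(s)) return true;
  return false;
}
```

For instance, `looksLikeMinimalMath("(A, B)")` and `looksLikeMinimalMath("< B")` return `true`, while the prose fragment `"5 and "` (what sits between the dollar signs in "cost $5 and $6") returns `false`.
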
3. Code block `$` protection (`mathNormalize.ts`)

Escape $ to &#36; inside code blocks and inline code to prevent KaTeX from attempting to parse them as math. The restoration step does NOT unescape &#36; back to $, keeping the HTML entities in the final output.

Example:

// Input
`r = r.replace(/\$\$/, ...)`

// After normalization
`r = r.replace(/&#36;&#36;/, ...)`

// Final HTML renders as
r = r.replace(/\$\$/, ...)
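
As a rough illustration of the escaping step, assuming a simple regex is enough to find code segments (the real `protectMarkdownCode` is not shown in this PR description and likely handles more cases), a hypothetical `escapeDollarsInCode` could look like this:

```typescript
// Hypothetical sketch: escape `$` inside fenced blocks and inline code spans.
// The segment-matching regex is an assumption, not the project's own.
function escapeDollarsInCode(markdown: string): string {
  // Try fenced blocks (```...```) first, then single-backtick inline code.
  return markdown.replace(/```[\s\S]*?```|`[^`\n]+`/g, (segment: string) =>
    // Inside a code segment every `$` becomes the HTML entity `&#36;`.
    segment.replace(/\$/g, "&#36;")
  );
}
```

Dollar signs outside code are left alone, so real math such as `$x$` in the surrounding prose still reaches the classifier.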

4. Lenient fenced code detection (`mathNormalize.ts`)

Removed the requirement that ``` must appear after a newline. Now accepts ``` anywhere in the text, handling malformed code blocks that are all on one line.
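
The difference between the two detection rules can be illustrated with two patterns. Both regexes and the `countFences` helper are assumptions made for this sketch, not the expressions used in `fencedCodeEnd`.

```typescript
// Illustrative patterns only; the project's actual detection code may differ.
const STRICT_FENCE = /(^|\n)```/g; // previous rule: a fence must start a line
const LENIENT_FENCE = /```/g;      // new rule: a fence may appear anywhere

// Count how many fence markers a given rule finds in the text.
function countFences(text: string, pattern: RegExp): number {
  const matches = text.match(pattern);
  return matches === null ? 0 : matches.length;
}
```

On a one-line paste such as "Docs: ```js r.replace(/x/, '$') ``` done", the strict rule finds no fences at all, so the `$` inside the code is exposed to the math pass; the lenient rule finds both markers.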

5. Single-line code block fix (`mathNormalize.ts`)

When the entire document is on one line (no newlines), treat ``` as a simple toggle: first occurrence is opening, second is closing. This handles pasted documentation where code blocks are inline.
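
A sketch of the toggle idea, under the assumption that splitting on the fence marker is sufficient for single-line input. The `Segment` type and `splitSingleLineFences` are hypothetical names.

```typescript
// Hypothetical types and helper illustrating the open/close toggle.
interface Segment {
  text: string;
  isCode: boolean;
}

function splitSingleLineFences(line: string): Segment[] {
  // Splitting on the marker alternates prose, code, prose, code, ...
  const parts = line.split("```");
  const segments: Segment[] = [];
  for (let i = 0; i < parts.length; i++) {
    const part = parts[i];
    if (part === "") continue; // skip empty pieces around adjacent markers
    // Odd-indexed parts sit between an opening and a closing marker.
    segments.push({ text: part, isCode: i % 2 === 1 });
  }
  return segments;
}
```

With an unmatched opening marker, everything after it is treated as code in this sketch; whether the real implementation does the same is not stated here.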

6. Preserve prose currency (`mathNormalize.ts` Step 5)

Changed the regex to non-greedy matching and removed HTML entity conversion for non-math pairs. Now "These two apples cost $5 and $6" preserves its dollar signs unchanged instead of converting them to &#36;5 and &#36;6.

Before:

Input:  "These two apples cost $5 and $6"
Output: "These two apples cost &#36;5 and &#36;6"

After:

Input:  "These two apples cost $5 and $6"
Output: "These two apples cost $5 and $6" (unchanged)
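
The before/after behaviour can be reproduced with a small sketch. The function `rewriteDollarPairs`, its `escapeRejected` switch, and the stub classifier are all assumptions for illustration; the real Step 5 regex and classifier are not reproduced here.

```typescript
// Hypothetical sketch of Step 5: walk `$...$` pairs and decide what to do
// with pairs the classifier rejects.
function rewriteDollarPairs(
  text: string,
  isMath: (content: string) => boolean,
  escapeRejected: boolean
): string {
  // Non-greedy: each match ends at the nearest closing `$` on the same line.
  return text.replace(/\$([^$\n]+?)\$/g, (match: string, content: string) => {
    if (isMath(content)) return match; // accepted math is left as-is
    // Old behaviour: wrap rejected pairs in entities. New: return the match unchanged.
    return escapeRejected ? "&#36;" + content + "&#36;" : match;
  });
}

// Stub classifier (assumption): anything with an operator or backslash is math.
const stubIsMath = (content: string): boolean => /[=+^\\]/.test(content);
```

Calling `rewriteDollarPairs(text, stubIsMath, true)` on the currency sentence yields the entity-laden "Before" output, while passing `false` returns the sentence untouched.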

Changes

Files modified

desktop/frontend/src/components/mathNormalize.ts
- Added Step 2.5: blank-line insertion before glued $$
- Modified protectMarkdownCode: escape $ to &#36; in code segments
- Modified restoration: do NOT unescape &#36; back to $
- Modified fencedCodeEnd: removed newline requirement, added single-line toggle logic
- Modified Step 5: non-greedy regex, preserve original text for non-math pairs
desktop/frontend/src/components/mathClassify.ts
- Accept all pure numbers (single, multi-digit, decimals, percentages) as math
- Accept numbers with variables ( $2.5x$ ) or LaTeX ( $10\%$ ) as math
- Accept comma-separated lists as math
- Accept one-sided comparisons as math
- Accept single uppercase letters as math
desktop/frontend/src/__tests__/math-golden.test.ts
- Added "minimal LaTeX patterns (regression)" test section
- Added tests for single digits, uppercase letters, comma-separated lists, one-sided comparisons
- Added tests for single-line code blocks with $ symbols
- Updated currency tests to reflect new behavior (dollars preserved)

Test coverage

All 122 tests passing (108 math-golden + 8 text-size + 6 provider-model-refresh)
Coverage includes real-world examples from user-reported issues
Currency handling now preserves dollar signs in prose

Trade-offs and limitations

Orphan `$$` (model-side issue)

Problem: Model outputs $$... without closing $$, causing the parser to swallow everything until the next $$.

Why not fixed: Every attempt to rescue orphan $$ from the renderer side made the output worse (whole prose paragraphs wrapped in $…$).

Right fix: Upstream—better LLM prompting or post-generation lint.
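
One possible shape for a post-generation lint is a parity check on `$$` markers. This is a sketch of an idea, not something this PR implements; `hasOrphanBlockMath` is a hypothetical name, and the check deliberately ignores `$$` that may appear inside code blocks.

```typescript
// Hypothetical lint: an odd number of `$$` markers means one block was
// opened and never closed (or closed and never opened).
function hasOrphanBlockMath(text: string): boolean {
  const matches = text.match(/\$\$/g);
  const count = matches === null ? 0 : matches.length;
  return count % 2 === 1;
}
```

A caller could use the result to request a regeneration or to flag the message, rather than trying to repair the formula in the renderer.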

Testing

All existing tests pass, plus new regression tests covering:

- Pure numbers: $1$, $42$, $2.5$ → math
- Numbers with variables: $2.5x$ → math
- Numbers with LaTeX: $10\%$ → math
- Comma-separated lists: A, B, (A, B) → math
- One-sided comparisons: < B, A < → math
- Single uppercase letters: $S$, $A$ → math
- Single-line code blocks with $ → protected
- Prose currency: "cost $5 and $6" → dollars preserved

Run tests:

cd desktop/frontend
npm test

Real-world examples

These examples from user reports now render correctly:

### Chinese math text
$1$ 和 $2$ 之间有无穷多有理数
→ Both 1 and 2 render as math

$S$ 非空
→ S renders as math (set name)

把 $(A, B)$ 整体当作一个新对象
→ (A, B) renders as math (ordered pair)

A 的每个元素 $< B$ 的每个元素
→ < B renders as math (comparison)

### Math with numbers
$2.5x + 3$
→ Entire expression renders as math

$10\%$ increase
→ 10\% renders as math

$42$ elements
→ 42 renders as math

### Currency in prose
costs $5 and $6
→ Dollar signs preserved unchanged

costs $5$ today
→ $5$ renders as math (pure number)

price is $10.50$
→ $10.50$ renders as math (decimal)

### Code blocks
```javascript
r = r.replace(/\$\$/, ...);
```

→ No KaTeX errors, $ symbols protected

Four targeted fixes to the math-pipeline pre-pass that resolve cases where the rendered chat output showed LaTeX source as raw text:

1. mathNormalize.ts (Step 2.5): when the model writes block math with the opening $$ glued to prose on the same line ('…decomposes as$$\n\mathbf{6}…'), CommonMark requires a blank line before the $$. remark-math otherwise creates an empty math node and the formula leaks out as literal text. Insert \n\n before any $$ preceded by a letter or end-of-sentence punctuation. The freshly-rewritten \] → $$ from step 2 is not affected.
2. mathClassify.ts: classify single digits ($1$, $2$) as math, since they are commonly used as set / sequence indices. Multi-digit numbers, decimals, and percentages stay literal (still currency / percentage). This is a deliberate behavior change documented in the comment.
3. mathClassify.ts: allow comma-separated tokens ('A, B', '1, 2, 3', '\\alpha, \\beta', '(A, B)') as math. These are typical of ordered-pair / tuple / enumeration notation. Currency and env-var usage never looks like this.
4. mathClassify.ts: allow single uppercase letters as math. In non-English math prose (Chinese / Japanese / Korean textbooks) single capital letters are extremely common as set / algebra / group / vector-space names, and the closing-dollar form $X$ is essentially never written for English words like I/A/V by hand.

Test changes: 4 existing currency/acronym assertions updated to reflect the new behavior, 13 new regression tests covering all four fixes including the user's specific cases ('$1$ 和 $2$' and '$S$ 非空 / $S$ 有上界'). 98 math-golden tests pass, 112/112 across all suites, typecheck clean.

Orphan $$ (model wrote display math but forgot the closing $$) is documented as not-fixed-from-the-renderer: every attempt to rescue the orphan from the renderer side made the output worse, so the fix for that case is on the LLM side (post-generation lint or stricter system prompt).

The classifier rules are language-agnostic, not specific to CJK text. Updated test section name and descriptions to reflect that patterns like single digits, comma-separated tokens, and one-sided operators apply universally across languages. Chinese text in test cases remains as real user examples, but the rules themselves are not CJK-specific.

Add defensive escaping for code blocks containing $ characters. When protecting code (inline `...` or fenced ```...```), replace $ with &#36; (HTML entity). On restoration, unescape back to $. This prevents KaTeX from attempting to parse math delimiters that appear in code examples, regex patterns, or template literals.

Fixes: Pasted documentation about the math pipeline itself no longer shows red KaTeX error text.

Tests: 3 new cases added, 106/106 passing.

Remove the requirement that ``` must appear after a newline. This handles cases where documentation is pasted on a single line with embedded code blocks containing $ symbols.

Previously: ``` markers were only recognized after \n
Now: ``` markers are recognized anywhere

This prevents KaTeX errors (red text) when processing malformed code blocks that contain $ in regex patterns, template literals, or other code examples. All 120 tests pass.

Enhancements to inline math detection:

- Reject pure numbers (1, 2.5, 10) as currency/percentages
- Accept numbers with variables (2.5x, 3y^2) as math
- Accept numbers with LaTeX escapes (10\%) as math
- Fix single-line code block detection to protect $ in malformed markdown

This better matches real-world usage where 'costs $5' is currency but '$2.5x + 3$' is clearly a mathematical expression. All 122 tests pass (108 math-golden + 8 text-size + 6 provider-model-refresh).

Previously, the Step 5 regex would greedily match '$5 and $' as a single math expression with content '5 and ', then convert it to '&#36;5 and &#36;' because the classifier correctly identified it as non-math. This was visually correct but had two problems:

1. The greedy match would consume the closing dollar that belonged to the next currency token, causing cascade replacements.
2. Prose currency like 'These two apples cost $5 and $6' would have its dollar signs converted to HTML entities, which works but is unnecessary noise in the rendered output.

Changes:

- Step 5 regex now uses non-greedy matching (+?) so that each match ends at the nearest closing $
- When the classifier rejects a match, the original text is preserved unchanged (return _m) instead of being wrapped in HTML entities
- This keeps dollar signs visible in prose while still preventing them from being parsed as math

All 122 tests pass.

lightfront requested review from SivanCola and esengine as code owners June 9, 2026 10:18

github-actions Bot added the desktop (Wails desktop app, desktop/**) and v2 (Go rewrite (1.x), main-v2 branch, active development) labels Jun 9, 2026

lightfront force-pushed the fix/inline-math-rendering branch from 5c0c8c4 to a8c5f5a Compare June 9, 2026 11:04

lightfront force-pushed the fix/inline-math-rendering branch from a8c5f5a to 5532392 Compare June 9, 2026 11:07

light-front-theory added 5 commits June 9, 2026 19:12

github-actions Bot added the agent (Core agent loop: internal/agent, internal/control) and provider (Model providers & selection: internal/provider) labels Jun 9, 2026

lightfront force-pushed the fix/inline-math-rendering branch from 7843e3b to 83dda9b Compare June 9, 2026 14:52

lightfront mentioned this pull request Jun 10, 2026

fix(desktop): convert \slashed to \not for KaTeX compatibility #3750

Merged

lightfront wants to merge 6 commits into esengine:main-v2 from lightfront:fix/inline-math-rendering

2 participants
