Nexus stopword list is English-only

The Nexus stage's fallback token pruning uses an English stopword list. For multilingual content (common in international agent deployments), this means:

- Non-English text gets almost zero compression from Nexus
- Mixed-language content is compressed inconsistently, since only the English portions are pruned

Could the stopword list be extended to cover the top 10 languages, or could Nexus use a language-detection heuristic from Cortex to load the right list?
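To make the second option concrete, here is a rough sketch of per-language stopword lists selected by a detection step. Everything in it is an assumption for illustration: `STOPWORDS`, `detect_language`, and `prune_stopwords` are hypothetical names, not existing Nexus or Cortex APIs, the word lists are truncated to ten entries each, and the stopword-overlap detector is only a stand-in for whatever heuristic Cortex actually exposes.

```python
# Hypothetical, heavily truncated stopword lists keyed by language code.
# Real lists would be much longer and would cover more languages.
STOPWORDS = {
    "en": frozenset({"the", "a", "an", "and", "of", "to", "in", "is", "that", "it"}),
    "es": frozenset({"el", "la", "los", "las", "de", "que", "y", "en", "un", "es"}),
    "de": frozenset({"der", "die", "das", "und", "ist", "nicht", "ein", "zu", "in", "den"}),
}


def detect_language(tokens, default="en"):
    """Guess the language by counting stopword hits per list.

    Stand-in for a Cortex language-detection heuristic (assumed, not a real API).
    Falls back to `default` when no list matches any token.
    """
    lowered = [t.lower() for t in tokens]
    scores = {
        lang: sum(t in words for t in lowered)
        for lang, words in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default


def prune_stopwords(text, lang=None):
    """Drop stopwords from whitespace-separated text.

    If `lang` is not given, it is guessed with detect_language().
    Unknown language codes fall back to the English list.
    """
    tokens = text.split()
    if lang is None:
        lang = detect_language(tokens)
    words = STOPWORDS.get(lang, STOPWORDS["en"])
    return " ".join(t for t in tokens if t.lower() not in words)


print(prune_stopwords("the cat is in the house"))
print(prune_stopwords("el gato come en la casa"))
```

Detecting once per document would still leave mixed-language content unevenly pruned; running the same detection per sentence or per chunk is one way that could be addressed. The sketch also splits on whitespace only, so tokens with attached punctuation are not matched.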

Nexus stopword list is English-only #104
