[Feature]: Improve knowledge base chunking strategy (overlap + boundary-aware splitting)

### Problem / motivation

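The KB ingest endpoint (`POST /kb/ingest`) uses a fixed 800-character chunker with no overlap and no awareness of sentence or paragraph boundaries. Chunks can therefore end mid-sentence or mid-word, which degrades `to_tsvector` full-text search relevance: PostgreSQL tokenizes and stems the truncated fragments, and a phrase that straddles a chunk boundary is not fully contained in either chunk.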
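To make the failure mode concrete, here is a minimal sketch of a fixed-offset chunker. `fixed_chunks` is a hypothetical stand-in, not the actual ingest code, and the chunk size is shrunk from 800 to 20 so the effect is visible on one short string.

```python
def fixed_chunks(text: str, size: int = 800) -> list[str]:
    """Hypothetical stand-in for the current behavior:
    hard character offsets, no overlap, no boundary awareness."""
    return [text[i:i + size] for i in range(0, len(text), size)]


text = "Isolate the host immediately. Then rotate credentials."
chunks = fixed_chunks(text, size=20)
for chunk in chunks:
    print(repr(chunk))
# 'Isolate the host imm'
# 'ediately. Then rotat'
# 'e credentials.'
```

The words "immediately" and "rotate" are each cut in two, so neither word exists as a whole token in any chunk.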

### Proposed solution

- Add a configurable chunk overlap (e.g., 100–200 chars) so context spans chunk boundaries.
- Split on paragraph/sentence boundaries instead of hard character offsets (recursive chunking).
- Future: consider adding pgvector or leveraging the existing Qdrant instance for semantic vector search on KB documents (the `services/threatintel` service already uses Qdrant + `BAAI/bge-small-en-v1.5`).
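A rough sketch of how the first two points could fit together, offered as a starting point rather than a proposed implementation. All names here (`SEPARATORS`, `split_units`, `chunk_text`) are hypothetical. The 800-character default mirrors the current chunk size, and the 150-character overlap is simply a value inside the suggested 100–200 range. The sketch assumes English-style sentence punctuation and joins pieces with single spaces, so original paragraph breaks are not preserved inside a chunk.

```python
import re

# Boundaries tried in order: blank-line paragraphs, single newlines,
# sentence ends, then any whitespace (word boundaries).
SEPARATORS = [r"\n\s*\n", r"\n", r"(?<=[.!?])\s+", r"\s+"]


def split_units(text: str, max_chars: int, level: int = 0) -> list[str]:
    """Recursively split text into pieces of at most max_chars characters,
    preferring the coarsest boundary that makes a piece fit."""
    text = text.strip()
    if not text:
        return []
    if len(text) <= max_chars:
        return [text]
    if level == len(SEPARATORS):
        # No boundary left to try: fall back to hard character offsets.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
    units: list[str] = []
    for piece in re.split(SEPARATORS[level], text):
        units.extend(split_units(piece, max_chars, level + 1))
    return units


def chunk_text(text: str, max_chars: int = 800, overlap: int = 150) -> list[str]:
    """Greedily pack units into chunks of at most max_chars characters.
    Each new chunk starts with trailing whole units of the previous chunk,
    as long as those units fit within `overlap` characters."""
    if not 0 <= overlap <= max_chars - 2:
        raise ValueError("overlap must be between 0 and max_chars - 2")
    # Leave room for the carried-over tail plus one joining space.
    units = split_units(text, max_chars - overlap - 1)
    chunks: list[str] = []
    current: list[str] = []
    for unit in units:
        if current and len(" ".join(current + [unit])) > max_chars:
            chunks.append(" ".join(current))
            # Collect the overlap tail from the end of the finished chunk.
            tail: list[str] = []
            for prev in reversed(current):
                if len(" ".join([prev] + tail)) > overlap:
                    break
                tail.insert(0, prev)
            current = tail
        current.append(unit)
    if current:
        chunks.append(" ".join(current))
    return chunks


sample = "Alpha one. Beta two. Gamma three. Delta four."
for chunk in chunk_text(sample, max_chars=25, overlap=12):
    print(repr(chunk))
# 'Alpha one. Beta two.'
# 'Beta two. Gamma three.'
# 'Gamma three. Delta four.'
```

One design choice worth discussing: the overlap here is made of whole units (usually sentences) rather than a raw character window, so the overlapping text never starts mid-word. The trade-off is that a chunk gets no overlap at all when the preceding unit is longer than `overlap`.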

### Alternatives considered

_No response_

### Component area

Other

### Checklist

- [x] I have searched existing issues and this is not a duplicate
- [x] This feature aligns with the AiSOC roadmap or is a reasonable addition

[Feature]: Improve knowledge base chunking strategy (overlap + boundary-aware splitting) #277
