feat: add video input support and related functionality for multimodal models by szw0407 · Pull Request #103 · netdur/llama_cpp_dart

szw0407 · 2026-05-28T10:33:03Z

Done by Claude Haiku from GitHub Copilot.

I am still testing whether it works. I hope to open it once I am sure it is correct.

Video support, done the way libmtmd actually allows: a "video" is a sequence of image frames.

libmtmd has no video decoder (only stb_image for stills and miniaudio for audio), so frame extraction is the caller's job. LlamaMedia.videoFrames(frames) wraps pre-extracted frame bytes as one image media item each, to pair with one marker per frame (EngineChat.addUser inserts the markers automatically).

example/probes/video_describe.dart shows the full flow: ffmpeg extracts N downscaled frames, videoFrames wraps them, and they go through the normal multimodal chat path. Verified against SmolVLM2-256M-Video-Instruct on a real clip: it produces a coherent description ("A woman is taking a selfie in a store...").

This is the correct alternative to PR #103, whose premise (libmtmd auto-decodes video) does not hold: routing raw mp4 bytes through the image path returns null. No MediaKind.video, no dead video-token params.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
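For readers who want to picture the caller-side frame extraction that the commit message describes, here is a minimal Dart sketch. It is not the contents of example/probes/video_describe.dart, which this thread does not show. The helper names `ffmpegFrameArgs` and `extractFrames`, the 1 fps sampling rate, the 384-pixel width, and the 8-frame cap are all assumptions chosen for illustration. It uses only `dart:io` and assumes an `ffmpeg` binary is on the PATH.

```dart
import 'dart:io';
import 'dart:typed_data';

/// Hypothetical helper: builds ffmpeg arguments that sample one frame per
/// second from [input], downscale each frame to [width] pixels wide (height
/// follows the aspect ratio, rounded to an even number), and stop after
/// [maxFrames] JPEG files written into [outDir].
List<String> ffmpegFrameArgs(String input, String outDir,
    {int maxFrames = 8, int width = 384}) {
  return [
    '-hide_banner',
    '-loglevel', 'error',
    '-i', input,
    '-vf', 'fps=1,scale=$width:-2',
    '-frames:v', '$maxFrames',
    '$outDir/frame_%03d.jpg',
  ];
}

/// Hypothetical helper: runs ffmpeg and returns the extracted frames as raw
/// JPEG bytes, in playback order. The temporary directory is always removed.
Future<List<Uint8List>> extractFrames(String input,
    {int maxFrames = 8, int width = 384}) async {
  final dir = await Directory.systemTemp.createTemp('frames_');
  try {
    final args =
        ffmpegFrameArgs(input, dir.path, maxFrames: maxFrames, width: width);
    final result = await Process.run('ffmpeg', args);
    if (result.exitCode != 0) {
      throw ProcessException(
          'ffmpeg', args, '${result.stderr}', result.exitCode);
    }
    // Zero-padded file names sort lexicographically in frame order.
    final files = dir.listSync().whereType<File>().toList()
      ..sort((a, b) => a.path.compareTo(b.path));
    return [for (final f in files) await f.readAsBytes()];
  } finally {
    await dir.delete(recursive: true);
  }
}

Future<void> main(List<String> args) async {
  if (args.isEmpty) {
    stderr.writeln('usage: dart run extract_frames.dart <video-file>');
    exitCode = 64;
    return;
  }
  final frames = await extractFrames(args.first);
  stdout.writeln('extracted ${frames.length} frames');
  // Per the commit message, these byte lists would next be wrapped with
  // LlamaMedia.videoFrames(frames) and sent through EngineChat.addUser.
  // Their exact signatures are not shown here, so the calls are omitted.
}
```

Keeping the argument builder separate from the process call makes the sampling policy easy to inspect or change without running ffmpeg. Fewer or smaller frames reduce the number of image tokens the model has to process.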

feat: add video input support and related functionality for multimodal models (ede8b92)

szw0407 mentioned this pull request May 28, 2026

Support Video Pipeline #102

Open

szw0407 wants to merge 1 commit into netdur:main from szw0407:main
