Official code implementation for paper "SAM2S: Segment Anything in Surgical Videos via Semantic Long-term Tracking"

jinlab-imvr/SAM2S

SAM2S: Segment Anything in Surgical Videos via Semantic Long-term Tracking

Project Page | arXiv Paper

Official implementation of SAM2S, a foundation model that enhances SAM2 for surgical interactive Video Object Segmentation (iVOS) through semantic long-term tracking and domain-specific adaptations. The source code is coming soon.

Haofeng Liu, Ziyue Wang, Sudhanshu Mishra, Mingqi Gao, Guanyi Qin, Chang Han Low, Alex Y. W. Kong, Yueming Jin

Overview

Surgical video segmentation is crucial for computer-assisted surgery, enabling precise localization and tracking of instruments and tissues. While SAM2 provides prompt-based flexibility, it faces significant challenges in surgical scenarios due to domain gaps and limited long-term tracking capabilities.

SAM2S addresses these limitations through:

  1. SA-SV Benchmark Construction: Large-scale surgical iVOS dataset spanning diverse procedures (cholecystectomy, colonoscopy, nephrectomy, prostatectomy, etc.) with instance-level spatio-temporal annotations
  2. Enhanced Long-term Tracking: DiveMem employs hybrid temporal sampling during training and diversity-based frame selection during inference
  3. Semantic Understanding: TSL leverages semantic categories of surgical instruments while preserving class-agnostic generalization
  4. Multi-source Robustness: ARL handles annotation ambiguities through uniform label softening
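
Since the source code has not been released yet, the following is only a minimal sketch of two ideas named in the list above, not the SAM2S implementation. It assumes that "uniform label softening" behaves like standard label smoothing on a binary mask, and that "diversity-based frame selection" can be approximated by greedy farthest-point selection over per-frame feature vectors under cosine distance. The function names `soften_labels` and `select_diverse_frames`, the `eps` parameter, and the choice of cosine distance are all hypothetical and chosen for illustration.

```python
import math


def soften_labels(mask, eps=0.1):
    """Uniform label smoothing for a binary mask (assumed form, not from the paper).

    Each hard label y in {0, 1} becomes y * (1 - eps) + eps / 2, so foreground
    targets move slightly below 1 and background targets slightly above 0.
    """
    if not 0.0 <= eps <= 1.0:
        raise ValueError("eps must lie in [0, 1]")
    return [[y * (1.0 - eps) + 0.5 * eps for y in row] for row in mask]


def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def select_diverse_frames(features, k):
    """Greedy farthest-point selection of up to k frame indices (hypothetical).

    Starts from frame 0 (e.g. the prompted frame), then repeatedly adds the
    frame whose distance to its nearest already-selected frame is largest.
    """
    if k <= 0 or not features:
        return []
    selected = [0]
    # Distance from every frame to its nearest selected frame so far.
    min_dist = [cosine_distance(f, features[0]) for f in features]
    while len(selected) < min(k, len(features)):
        candidates = [i for i in range(len(features)) if i not in selected]
        best = max(candidates, key=lambda i: min_dist[i])
        selected.append(best)
        for i, f in enumerate(features):
            min_dist[i] = min(min_dist[i], cosine_distance(f, features[best]))
    return selected


# Toy example: four 2-D "frame features"; frame 1 is nearly a duplicate of frame 0.
toy_features = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.7, 0.7]]
memory_frames = select_diverse_frames(toy_features, k=3)
soft_mask = soften_labels([[1, 0], [0, 1]], eps=0.1)
```

In this toy example the near-duplicate frame is skipped in favor of frames that differ more from those already selected, which is the intuition behind keeping a diverse memory for long-term tracking. The actual selection criterion and softening scheme used by SAM2S may differ.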

Figure: Overview of SAM2S

Figure: SAM2S framework architecture

Citation

@article{liu2025sam2s,
  title={SAM2S: Segment Anything in Surgical Videos via Semantic Long-term Tracking},
  author={Liu, Haofeng and Wang, Ziyue and Mishra, Sudhanshu and Gao, Mingqi and Qin, Guanyi and Low, Chang Han and Kong, Alex Y. W. and Jin, Yueming},
  journal={arXiv preprint arXiv:2511.16618},
  year={2025}
}
