# FedST: Spatio-Temporal Representation Decoupling and Enhancement for Federated Instrument Segmentation in Surgical Videos

Official implementation of the paper "Spatio-Temporal Representation Decoupling and Enhancement for Federated Instrument Segmentation in Surgical Videos" (arXiv:2506.23759).

In this paper, we propose a novel personalized federated learning (FL) scheme, **Spatio-Temporal Representation Decoupling and Enhancement (FedST)**, which leverages surgical domain knowledge during both local-site and global-server training to boost segmentation.

Ensure the following dependencies are installed:
- Python ≥ 3.8
- PyTorch ≥ 1.12
- CUDA ≥ 11.3

Create the conda environment and install the Python requirements:

    conda create -n pfedsis python=3.9
    conda activate pfedsis
    conda install pytorch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 pytorch-cuda=11.8 -c pytorch -c nvidia
    cd FedST
    pip install -r requirement.txt

- Download: Download the `FL_Dataset.zip` file from the Files tab.
- Unzip: Unzip the file to your project directory.

⚠️ Note on Data Format: The original codebase relies on `.npy` files for accelerated data loading. However, to optimize download speeds and storage on Hugging Face, the dataset is provided in `.png` format.

To resolve this, please choose one of the following:
- Convert the data: Pre-process the `.png` images back into `.npy` format.
- Update the loader: Modify `dataloaders/robotics_dataloader.py` to load `.png` files directly instead of `.npy`.
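
The conversion option could look roughly like the sketch below. It is not part of the official codebase: the helper names `read_png` and `convert_png_to_npy` are hypothetical, and it assumes NumPy and Pillow are available in the environment. The README does not state the array layout, dtype, or file naming that the loader expects, so compare the output against `dataloaders/robotics_dataloader.py` before training.

```python
from pathlib import Path

import numpy as np
from PIL import Image


def read_png(path):
    """Decode one PNG into a NumPy array (hypothetical helper)."""
    with Image.open(path) as img:
        # np.array keeps whatever mode the PNG stores (grayscale, RGB, palette indices).
        return np.array(img)


def convert_png_to_npy(root, overwrite=False):
    """Save a .npy file next to every .png under root and return how many were written."""
    converted = 0
    for png_path in sorted(Path(root).rglob("*.png")):
        npy_path = png_path.with_suffix(".npy")
        if npy_path.exists() and not overwrite:
            continue  # already converted by an earlier run
        np.save(npy_path, read_png(png_path))
        converted += 1
    return converted
```

Calling `convert_png_to_npy(<dataset_dir>)` on the unzipped dataset directory writes the arrays in place. For the loader option, a function like `read_png` could be called where the loader currently reads `.npy` files, at the cost of decoding each PNG at load time.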

Example command for the unzip step:

    unzip FL_Dataset.zip

## 🚀 Training and Evaluation
You can start the full training with:
```bash
python3 final_method.py --exp <save_path> --max_epoch 300 --dataset robotics
```

`--exp` specifies the output directory to save checkpoints and logs.


This research was supported by collaborative efforts within the iMVR Lab and multiple surgical data initiatives. We thank the contributors of the public surgical video datasets that made this benchmark possible.

For any inquiries regarding this project, please contact Zheng Fang.
