Medical image data is often stored in DICOM format, which holds images of various modalities (such as CT and MRI) together with metadata tags. Because some tags (such as Patient Name) contain protected health information (PHI), this information should be removed before the data is released publicly. This Python code uses pydicom to load, modify, and save DICOM files, and can be used to anonymize a DICOM dataset organized in a dataset/patient/exam/series/dicom hierarchy. This code is used to anonymize the Duke Abdominal Dataset published in the following paper:
Z. Zhu et al., "3D Pyramid Pooling Network for Abdominal MRI Series Classification,"
in IEEE Transactions on Pattern Analysis and Machine Intelligence, doi: 10.1109/TPAMI.2020.3033990.
Please cite this paper if you find this code useful. For those who would like to use the DicomAnonymizer module in RSNA MIRC Clinical Trials Processor (CTP), an example can be found here.
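As a rough illustration of the load/modify/save cycle described above, the sketch below blanks a few PHI-bearing tags in a single file. The keyword list and the function names (`blank_phi_tags`, `anonymize_file`) are assumptions made for this example and are not taken from the repository; a real deidentification pass would need to cover many more tags. Only `pydicom.dcmread` and `Dataset.save_as` are actual pydicom calls.

```python
# Hypothetical, deliberately short list of PHI keywords (illustration only).
PHI_KEYWORDS = ["PatientName", "PatientID", "PatientBirthDate"]


def blank_phi_tags(ds, keywords=PHI_KEYWORDS):
    """Set each listed keyword to an empty string if the dataset has it.

    Returns the list of keywords that were actually blanked.
    """
    changed = []
    for keyword in keywords:
        if keyword in ds:  # pydicom datasets support keyword membership tests
            setattr(ds, keyword, "")
            changed.append(keyword)
    return changed


def anonymize_file(src_path, dst_path):
    """Load one DICOM file, blank the listed tags, and save a copy."""
    import pydicom  # third-party dependency, imported only when needed

    ds = pydicom.dcmread(src_path)
    blank_phi_tags(ds)
    ds.save_as(dst_path)
```

Calling `anonymize_file("in.dcm", "out.dcm")` would write a copy with the listed tags emptied and the pixel data untouched; the file names here are placeholders.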
Dicom Anonymizer depends on the following libraries:
- Pydicom
- tqdm
We created a toy dataset in dataset/mabaoguo that contains two patients: Jack and Pony. Jack has 2 exams while Pony has 1 exam. Each exam contains several series, and each series contains several DICOM files. After running the following script:
python demo.py
A new folder named mabaoguo_deidentified will be created directly under dataset, with the same structure as the original dataset. The code is commented as thoroughly as possible; please refer to it for more details. Both the original and the anonymized DICOM files should be viewed in a DICOM viewer such as Osirix.
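The way the output folder mirrors the input hierarchy can be sketched as follows. This is a minimal sketch under stated assumptions, not the code in demo.py: the helper names (`mirrored_paths`, `copy_structure`) and the `process_file` callback are hypothetical, and only the `_deidentified` suffix comes from the description above.

```python
from pathlib import Path


def mirrored_paths(src_root, suffix="_deidentified"):
    """Yield (source_file, destination_file) pairs.

    The destination root sits next to the source root and carries the
    given suffix; everything below the root keeps its relative path.
    """
    src_root = Path(src_root)
    dst_root = src_root.with_name(src_root.name + suffix)
    for src_file in sorted(src_root.rglob("*")):
        if src_file.is_file():
            yield src_file, dst_root / src_file.relative_to(src_root)


def copy_structure(src_root, process_file):
    """Apply process_file(src, dst) to every file, creating folders as needed."""
    for src, dst in mirrored_paths(src_root):
        dst.parent.mkdir(parents=True, exist_ok=True)
        process_file(src, dst)
```

Here `process_file` stands for whatever per-file step is wanted, for example a function that reads a DICOM file with pydicom, edits its tags, and saves it to the destination path.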
Before running the script, the image and tags viewed in Osirix:
After the deidentification, the image and tags viewed in Osirix:
- Zhe Zhu
- Maciej Mazurowski
- Mustafa Bashir
- Brandon Konkel
Special thanks to Brandon Konkel for testing the code. Please contact Zhe Zhu ([email protected]) if you have any questions about this code.
@ARTICLE{9242262,
author={Z. {Zhu} and A. {Mittendorf} and E. {Shropshire} and B. {Allen} and C. {Miller} and M. R. {Bashir} and M. A. {Mazurowski}},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
title={3D Pyramid Pooling Network for Abdominal MRI Series Classification},
year={2020},
volume={},
number={},
pages={1-1},
doi={10.1109/TPAMI.2020.3033990}}
We take NO responsibility/liability for how you choose to use any of the source code available here; the use of this code is your responsibility. By using any of the files available in this repository, you understand that you are AGREEING TO USE AT YOUR OWN RISK.