Inefficient Handling of Images with "Bad" Suffixes #1292

Open
souravraha opened this issue Sep 10, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@souravraha

Description

This builds upon #1060.

While running the dataset converter for ADNI, it appears that images with unsupported suffixes such as ADC, real, and imaginary are being generated, only to be removed later. This behavior seems wasteful. A more efficient approach might be to proactively identify and skip these files before processing, rather than generating and deleting them.

Logs

2024-09-10 12:27:23,972:INFO:[fMRI] Processing subject 127_S_4210 in session m120
2024-09-10 12:27:24,025:WARNING:There already exist images : [PosixPath('/DATA1/souraviai/my_dsets_ckpts/adni/preprocessed/bids/sub-ADNI127S4210/ses-M120/func/sub-ADNI127S4210_ses-M120_task-rest_bold.nii.gz'), PosixPath('/DATA1/souraviai/my_dsets_ckpts/adni/preprocessed/bids/sub-ADNI127S4210/ses-M120/func/sub-ADNI127S4210_ses-M120_task-rest_bold.json')]. The parameter 'mod_to_update' is set to False so that they cannot be overwritten.
2024-09-10 12:27:24,026:INFO:[fMRI] Processing subject 127_S_4210 in session m132
2024-09-10 12:27:24,079:WARNING:There already exist images : [PosixPath('/DATA1/souraviai/my_dsets_ckpts/adni/preprocessed/bids/sub-ADNI127S4210/ses-M132/func/sub-ADNI127S4210_ses-M132_task-rest_bold.json'), PosixPath('/DATA1/souraviai/my_dsets_ckpts/adni/preprocessed/bids/sub-ADNI127S4210/ses-M132/func/sub-ADNI127S4210_ses-M132_task-rest_bold.nii.gz')]. The parameter 'mod_to_update' is set to False so that they cannot be overwritten.
2024-09-10 12:27:24,079:INFO:[fMRI] Processing subject 031_S_4218 in session bl
2024-09-10 12:28:05,162:WARNING:Image with bad suffix ADC, real, imaginary was generated by dcm2nix. These are not supported by Clinica so the image will NOT be converted.
2024-09-10 12:28:05,162:INFO:Removing image /DATA1/souraviai/my_dsets_ckpts/adni/preprocessed/bids/sub-ADNI031S4005/ses-M000/func/sub-ADNI031S4005_ses-M000_task-rest_bold_real.json
2024-09-10 12:28:05,162:WARNING:Image with bad suffix ADC, real, imaginary was generated by dcm2nix. These are not supported by Clinica so the image will NOT be converted.
2024-09-10 12:28:05,163:INFO:Removing image /DATA1/souraviai/my_dsets_ckpts/adni/preprocessed/bids/sub-ADNI031S4005/ses-M000/func/sub-ADNI031S4005_ses-M000_task-rest_bold_real.nii.gz
2024-09-10 12:28:05,170:WARNING:[fMRI] Conversion with dcm2niix failed for subject 031_S_4005 and session ses-M000
/DATA1/souraviai/clinica/clinica/utils/stream.py:103: UserWarning: [fMRI] Conversion with dcm2niix failed for subject 031_S_4005 and session ses-M000
  warnings.warn(message, warning_type)

Question

Is the improved command creating images with "bad" suffixes (such as ADC, real, and imaginary), and then deleting them? This seems inefficient. Can the code be optimized to identify these unsupported images ahead of time and avoid generating them in the first place?

Expected Behavior

  • The pipeline should skip generating files with unsupported suffixes, rather than creating and then deleting them.

Suggested Improvement

  • Modify the pipeline logic to proactively check for unsupported suffixes and bypass the generation of these files, thereby reducing unnecessary processing and I/O.
@souravraha souravraha added the bug Something isn't working label Sep 10, 2024
@AliceJoubert
Contributor

AliceJoubert commented Sep 10, 2024

Hello,

Thanks for reporting!

To convert DICOM files to NIfTI, the ADNI-to-BIDS converter uses dcm2niix, an external package that is responsible for generating these suffixes. To prevent the converter from producing these files, we would need to check whether dcm2niix can actually convert them before running it. In terms of efficiency, that check would amount to the same thing as what is currently done, i.e. generating the files and deleting them.
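
As a rough illustration of the generate-then-delete step described above, here is a minimal sketch. It is not Clinica's actual implementation: the function names are hypothetical, and the only things taken from this thread are the three suffixes (ADC, real, imaginary) and the `<name>_<suffix>.<ext>` naming pattern visible in the logs. The file names in the demo are made up.

```python
import tempfile
from pathlib import Path

# Suffixes named in the warning message in the logs above.
BAD_SUFFIXES = ("ADC", "real", "imaginary")


def find_bad_suffix_files(output_dir, bad_suffixes=BAD_SUFFIXES):
    """Return files whose name, minus extensions, ends with _<bad suffix>."""
    bad = []
    for path in sorted(output_dir.iterdir()):
        stem = path.name.split(".", 1)[0]  # strips ".nii.gz" or ".json"
        if any(stem.endswith("_" + suffix) for suffix in bad_suffixes):
            bad.append(path)
    return bad


def remove_bad_suffix_files(output_dir):
    """Delete the unsupported outputs and return the paths that were removed."""
    removed = find_bad_suffix_files(output_dir)
    for path in removed:
        path.unlink()
    return removed


# Demo on a temporary directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    for name in (
        "sub-01_task-rest_bold.nii.gz",
        "sub-01_task-rest_bold_real.nii.gz",
        "sub-01_task-rest_bold_real.json",
    ):
        (out / name).touch()
    removed_names = sorted(p.name for p in remove_bad_suffix_files(out))
    left_names = sorted(p.name for p in out.iterdir())

print(removed_names)
print(left_names)
```

The check only looks at file names that already exist on disk, which is why it can only run after the conversion has produced its outputs.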

Instead, we have a list of subjects/sessions per modality that are known to cause errors and that should be skipped by our converter. For example, for fieldmaps:

# Exceptions
# ==========
conversion_errors = [
    # Multiple images
    ("029_S_2395", "m72"),
    # Real/Imaginary
    ("002_S_1261", "m60"),
    ("002_S_1261", "m72"),
    ("002_S_1261", "m84"),
    ("002_S_1261", "m96"),
    ("006_S_4485", "bl"),
    ("006_S_4485", "m03"),
    ("006_S_4485", "m06"),
    ("006_S_4485", "m12"),
    ("006_S_4485", "m24"),
    ("006_S_4485", "m48"),
    # Unrecognized BIDSCase
    ("006_S_4485", "m78"),
    ("009_S_4388", "m03"),
    ("009_S_4388", "m06"),
    ("009_S_4388", "m12"),
    ("009_S_4388", "m24"),
    ("009_S_4388", "m48"),
    ("023_S_4115", "bl"),
    ("023_S_4115", "m03"),
    ("023_S_4115", "m06"),
    ("023_S_4115", "m12"),
    ("023_S_4115", "m24"),
    ("023_S_4115", "m48"),
    ("123_S_4127", "bl"),
    ("123_S_4127", "m12"),
    ("123_S_4127", "m24"),
    ("123_S_4127", "m36"),
    # Missing EchoTime Keys
    ("006_S_4485", "m90"),
    ("036_S_6088", "m12"),
    ("123_S_4127", "m84"),
    # Missing DICOMS
    ("023_S_4115", "m126"),
    ("177_S_6448", "m24"),
]

These are systematically removed from the converter's list of subjects/sessions to convert and are not considered at all.
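
To make the filtering step concrete, here is a minimal sketch of how such a list could be applied. The helper `drop_known_errors` and the `to_convert` list are hypothetical and written only for illustration. The converter's real code may work differently, for example on a table of images rather than a list of tuples.

```python
# Short excerpt of the exception list shown above.
conversion_errors = [
    ("029_S_2395", "m72"),
    ("002_S_1261", "m60"),
    ("006_S_4485", "bl"),
]


def drop_known_errors(candidates, errors):
    """Keep only the (subject, session) pairs that are not listed in `errors`."""
    blocked = set(errors)  # set lookup avoids rescanning the list for each pair
    return [pair for pair in candidates if pair not in blocked]


# Illustrative candidates; only the first one is in the exception list.
to_convert = [
    ("002_S_1261", "m60"),
    ("002_S_1261", "bl"),
    ("127_S_4210", "m120"),
]

remaining = drop_known_errors(to_convert, conversion_errors)
print(remaining)
```

Because the pairs are dropped before any conversion starts, dcm2niix is never run for them and no files are written for those sessions.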

Feel free to open a PR if you think of another solution or if you want to report a session causing problems for a specific subject!

Best,
