Question: overlapping labels in CT data #440

Open
zivy opened this issue Mar 5, 2025 · 0 comments

Hello @wasserth,

Thanks for the great tool and for sharing the CT data on Zenodo.

From an analysis of the segmentation data, version 2.0.1, it appears that all of the individual CT datasets have overlapping labels. In many cases large regions are identified as multiple structures. Looking at the TotalSegmentator code, combine_masks_to_multilabel_file, this is implicitly resolved by keeping the label from the last segmentation file read. This appears to be an arbitrary decision, as it could just as well have been the first segmentation label file, or any other file in between.

Is there something that I am missing which motivated the decision?
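
To make the concern concrete, here is a toy sketch of a "last file wins" merge. It is not the actual combine_masks_to_multilabel_file implementation; the function name combine_last_wins, the label IDs, and the 1D masks are all made up for illustration. It only shows that the label assigned to an overlapping voxel depends on the order in which the masks are processed.

```python
import numpy as np


def combine_last_wins(labeled_masks):
    """Merge binary masks into one label map (hypothetical helper).

    labeled_masks is a list of (label_id, mask) pairs. Masks are written
    in list order, so a later mask overwrites an earlier one wherever
    the two overlap.
    """
    combined = np.zeros(labeled_masks[0][1].shape, dtype=np.uint8)
    for label_id, mask in labeled_masks:
        combined[mask > 0] = label_id
    return combined


# Two toy 1D "structures" that overlap at index 2.
mask_a = np.array([1, 1, 1, 0, 0], dtype=np.uint8)
mask_b = np.array([0, 0, 1, 1, 0], dtype=np.uint8)

a_then_b = combine_last_wins([(1, mask_a), (2, mask_b)])
b_then_a = combine_last_wins([(2, mask_b), (1, mask_a)])

print(a_then_b)  # overlap voxel gets label 2
print(b_then_a)  # overlap voxel gets label 1
```

The same two masks produce different label maps, and the only difference is the processing order.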

The code I used to analyze label overlaps is below. It is not efficient (no parallel processing), so it takes hours on a laptop to analyze the whole dataset:

import pathlib
import numpy as np
import tempfile
import pandas as pd
import nibabel as nib
import SimpleITK as sitk


def read_nifti(file_path):
    """
    Read a NIfTI file as a SimpleITK image.

    In some NIfTI files the qform and sform are both set but the sform is
    not orthonormal.
    See https://github.com/SimpleITK/SimpleITK/issues/1452, the
    original ITK change https://github.com/InsightSoftwareConsortium/ITK/pull/1868,
    and the resolution https://github.com/InsightSoftwareConsortium/ITK/commit/6278c4171f697649da5b3136214cfbb8e37aab00
    There are still cases where this is an issue. This function first tries a
    regular read. If the issue is encountered, the image is read using nibabel,
    which ignores the inconsistency, and written to a temporary file using the
    qform information, which is guaranteed to be orthonormal. The temporary
    image is then read. This is a bit of a hack; the right thing to do would be
    to extract the information from the nibabel image and create a SimpleITK
    image in memory.
    """
    try:
        image = sitk.ReadImage(file_path, imageIO="NiftiImageIO")
    except RuntimeError as e:
        if (
            "ITK ERROR: ITK only supports orthonormal direction cosines.  No orthonormal definition found!"
            in str(e)
        ):
            with tempfile.TemporaryDirectory() as tmpdirname:
                image = nib.load(file_path)
                file_path = pathlib.Path(tmpdirname) / pathlib.Path(file_path).name
                nib.save(
                    nib.Nifti1Image(image.get_fdata(), image.get_qform()), file_path
                )
                image = sitk.ReadImage(file_path, imageIO="NiftiImageIO")
        else:
            raise
    return image


def overlap_analysis(output_dir, nifti_segmentation_file_paths):
    res = {}
    # Read the segmentations as SimpleITK images and get a numpy array view to each. This
    # saves time, no memory copy, but we need to keep the SimpleITK images so that the
    # views remain valid.
    sitk_segmentations = [
        read_nifti(file_name) for file_name in nifti_segmentation_file_paths
    ]
    segmentations = [
        sitk.GetArrayViewFromImage(sitk_seg) for sitk_seg in sitk_segmentations
    ]
    overlapping_label_locations = sum(segmentations) > 1
    # if this set of labels has overlaps, record the issue
    if np.sum(overlapping_label_locations) > 0:
        seg_overlap_image = sitk.GetImageFromArray(
            overlapping_label_locations.astype(np.uint8)
        )
        seg_overlap_image.CopyInformation(sitk_segmentations[0])
        sitk.WriteImage(
            seg_overlap_image,
            pathlib.Path(output_dir) / "regions_with_overlapping_labels.nii.gz",
        )
        res["dir"] = output_dir
        files_with_overlap = []
        for seg_file_name, seg in zip(nifti_segmentation_file_paths, segmentations):
            seg_overlap = np.multiply(overlapping_label_locations, seg)
            if np.sum(seg_overlap) > 0:
                files_with_overlap.append(seg_file_name.name)
        res["files"] = files_with_overlap
    return res


root_dir = pathlib.Path("Totalsegmentator/")
segmentation_directories = [d / "segmentations" for d in root_dir.glob("s*")]

res = [
    overlap_analysis(
        pathlib.Path(segmentations_dir).parent, list(segmentations_dir.glob("*.nii.gz"))
    )
    for segmentations_dir in segmentation_directories
]
df = pd.DataFrame(res, columns=["dir", "files"])
df.to_csv(root_dir / "overlapping_labels.csv", index=False)
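
The script above records which files take part in any overlap, but not which structures overlap with which. A possible extension is sketched below, assuming binary masks that share the same shape. The helper name pairwise_overlaps and the toy arrays are hypothetical and not part of the analysis above; the structure names are placeholders.

```python
import itertools

import numpy as np


def pairwise_overlaps(named_masks):
    """Count shared voxels for every pair of masks (hypothetical helper).

    named_masks maps a structure name to a binary array. Returns a dict
    {(name_a, name_b): voxel_count} containing only the pairs that overlap.
    """
    overlaps = {}
    for (name_a, mask_a), (name_b, mask_b) in itertools.combinations(
        named_masks.items(), 2
    ):
        count = int(np.count_nonzero(np.logical_and(mask_a > 0, mask_b > 0)))
        if count > 0:
            overlaps[(name_a, name_b)] = count
    return overlaps


# Toy 2D masks; names and values are placeholders, not real data.
toy_masks = {
    "liver": np.array([[1, 1, 0], [1, 1, 0]], dtype=np.uint8),
    "stomach": np.array([[0, 1, 1], [0, 0, 0]], dtype=np.uint8),
    "spleen": np.array([[0, 0, 0], [0, 0, 1]], dtype=np.uint8),
}
result = pairwise_overlaps(toy_masks)
print(result)
```

Comparing every pair on full volumes is expensive when there are many structures, so in practice it would probably make sense to restrict the comparison to the voxels already flagged in overlapping_label_locations.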