Question: overlapping labels in CT data #440

zivy · 2025-03-05T14:21:08Z

Thanks for the great tool and for sharing the CT data on Zenodo.

From an analysis of the segmentation data, version 2.0.1, it appears that every individual CT dataset has overlapping labels. In many cases large regions are identified as multiple structures. Looking at the TotalSegmentator code, combine_masks_to_multilabel_file resolves this implicitly by keeping the label from the last segmentation label file read. This appears to be an arbitrary decision, as it could equally have been the first segmentation label file, or any other file in between.
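To make the order dependence concrete, here is a toy NumPy sketch of a "last file wins" merge. It is not the TotalSegmentator implementation: combine_last_wins and the two tiny masks are hypothetical, made up for illustration. The only assumption carried over is that each binary mask overwrites whatever label was already written at its voxels.

```python
import numpy as np


def combine_last_wins(masks):
    """Merge binary masks into one label image; later masks overwrite earlier ones.

    Hypothetical helper for illustration only, not TotalSegmentator code.
    Mask number i (0-based) is written as label i + 1.
    """
    combined = np.zeros(masks[0].shape, dtype=np.uint8)
    for label, mask in enumerate(masks, start=1):
        combined[mask > 0] = label
    return combined


# Two made-up 1D "structures" that overlap at indices 2 and 3.
structure_a = np.array([1, 1, 1, 1, 0, 0], dtype=np.uint8)
structure_b = np.array([0, 0, 1, 1, 1, 1], dtype=np.uint8)

a_then_b = combine_last_wins([structure_a, structure_b])
b_then_a = combine_last_wins([structure_b, structure_a])

# Voxels assigned to structure A, using A's label number in each ordering.
voxels_a_when_a_first = int(np.sum(a_then_b == 1))
voxels_a_when_a_last = int(np.sum(b_then_a == 2))
print(voxels_a_when_a_first, voxels_a_when_a_last)  # prints "2 4"
```

In this toy case the same structure ends up with 2 or 4 voxels depending only on the order in which the masks are merged.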

Is there something that I am missing which motivated the decision?

The code I used to analyze label overlaps is below. It is not efficient (no parallel processing), so it takes hours on a laptop to analyze the whole dataset:

import pathlib
import numpy as np
import tempfile
import pandas as pd
import nibabel as nib
import SimpleITK as sitk


def read_nifti(file_path):
    """
    In some nifti files the qform and sform are both set but sform is
    not orthonormal.
    See https://github.com/SimpleITK/SimpleITK/issues/1452 and
    original ITK change https://github.com/InsightSoftwareConsortium/ITK/pull/1868
    and resolution https://github.com/InsightSoftwareConsortium/ITK/commit/6278c4171f697649da5b3136214cfbb8e37aab00
    There are still cases where this is an issue. This function first tries a regular read. If the issue
    is encountered, the image is read using nibabel, which ignores the inconsistency, and written to a
    tempfile using the qform information, which is guaranteed to be orthonormal. The temp image is then
    read. This is a bit of a hack, as the right thing to do would be to extract the information from the
    nibabel image and create a SimpleITK image in memory.
    """
    try:
        image = sitk.ReadImage(file_path, imageIO="NiftiImageIO")
    except RuntimeError as e:
        if (
            "ITK ERROR: ITK only supports orthonormal direction cosines.  No orthonormal definition found!"
            in str(e)
        ):
            with tempfile.TemporaryDirectory() as tmpdirname:
                image = nib.load(file_path)
                file_path = pathlib.Path(tmpdirname) / pathlib.Path(file_path).name
                nib.save(
                    nib.Nifti1Image(image.get_fdata(), image.get_qform()), file_path
                )
                image = sitk.ReadImage(file_path, imageIO="NiftiImageIO")
        else:
            raise
    return image


def overlap_analysis(output_dir, nifti_segmentation_file_paths):
    res = {}
    # Read the segmentations as SimpleITK images and get a numpy array view to each. This
    # saves time, no memory copy, but we need to keep the SimpleITK images so that the
    # views remain valid.
    sitk_segmentations = [
        read_nifti(file_name) for file_name in nifti_segmentation_file_paths
    ]
    segmentations = [
        sitk.GetArrayViewFromImage(sitk_seg) for sitk_seg in sitk_segmentations
    ]
    overlapping_label_locations = sum(segmentations) > 1
    # if this set of labels has overlaps, record the issue
    if np.sum(overlapping_label_locations) > 0:
        seg_overlap_image = sitk.GetImageFromArray(
            overlapping_label_locations.astype(np.uint8)
        )
        seg_overlap_image.CopyInformation(sitk_segmentations[0])
        sitk.WriteImage(
            seg_overlap_image,
            pathlib.Path(output_dir) / "regions_with_overlapping_labels.nii.gz",
        )
        res["dir"] = output_dir
        files_with_overlap = []
        for seg_file_name, seg in zip(nifti_segmentation_file_paths, segmentations):
            seg_overlap = np.multiply(overlapping_label_locations, seg)
            if np.sum(seg_overlap) > 0:
                files_with_overlap.append(seg_file_name.name)
        res["files"] = files_with_overlap
    return res


root_dir = pathlib.Path("Totalsegmentator/")
segmentation_directories = [d / "segmentations" for d in root_dir.glob("s*")]

res = [
    overlap_analysis(
        pathlib.Path(segmentations_dir).parent, list(segmentations_dir.glob("*.nii.gz"))
    )
    for segmentations_dir in segmentation_directories
]
df = pd.DataFrame(res, columns=["dir", "files"])
df.to_csv(root_dir / "overlapping_labels.csv", index=False)
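
The read_nifti docstring notes that the cleaner route would be to build the SimpleITK image in memory instead of going through a temporary file. A minimal sketch of the geometry part of that conversion, using only NumPy, might look like the following. ras_affine_to_itk_geometry is a hypothetical helper, not part of nibabel or SimpleITK. It assumes a 3D image, a 4x4 voxel-to-world affine in nibabel's RAS+ convention (for example the one returned by get_qform()), and ITK's LPS convention, so the first two world axes are negated.

```python
import numpy as np


def ras_affine_to_itk_geometry(affine):
    """Split a 4x4 RAS+ voxel-to-world affine into ITK-style LPS geometry.

    Hypothetical helper for illustration. Returns (spacing, origin, direction),
    with direction flattened row-major as SimpleITK's SetDirection expects.
    Assumes non-zero spacing along every axis.
    """
    affine = np.asarray(affine, dtype=float)
    linear = affine[:3, :3]
    # Column i is the world-space step for one voxel along axis i.
    spacing = np.linalg.norm(linear, axis=0)
    direction_ras = linear / spacing
    # RAS -> LPS: negate the first two world coordinates.
    ras_to_lps = np.diag([-1.0, -1.0, 1.0])
    direction_lps = ras_to_lps @ direction_ras
    origin_lps = ras_to_lps @ affine[:3, 3]
    return tuple(spacing), tuple(origin_lps), tuple(direction_lps.ravel())


# Made-up affine: 2 x 3 x 4 mm voxels, axis aligned, translated in RAS space.
example_affine = np.array(
    [
        [2.0, 0.0, 0.0, 10.0],
        [0.0, 3.0, 0.0, 20.0],
        [0.0, 0.0, 4.0, 30.0],
        [0.0, 0.0, 0.0, 1.0],
    ]
)
spacing, origin, direction = ras_affine_to_itk_geometry(example_affine)
print(spacing, origin)
```

With these values, the image itself could presumably be created by passing the nibabel voxel data, transposed from (i, j, k) to (k, j, i) order, to sitk.GetImageFromArray and then calling SetSpacing, SetOrigin and SetDirection. I have not run that last step against the dataset, so treat it as an outline.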
