An analysis of the segmentation data (version 2.0.1) suggests that every individual CT dataset contains overlapping labels. In many cases, large regions are identified as belonging to multiple structures. In the TotalSegmentator code, combine_masks_to_multilabel_file resolves this implicitly: each voxel ends up with the label from the last segmentation label file read. This appears to be an arbitrary decision, since the label could equally have come from the first segmentation label file, or from any file in between.
Is there something that I am missing which motivated the decision?
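To make the order dependence concrete, here is a minimal sketch using two toy 1D masks. The helper `combine_last_wins` is hypothetical and written only for illustration. It assumes masks are applied in sequence with later ones overwriting earlier ones, as described above, and it is not the TotalSegmentator implementation.

```python
import numpy as np


def combine_last_wins(masks):
    """Combine binary masks into one label map; later masks overwrite earlier ones.

    Hypothetical helper for illustration only, not TotalSegmentator's code.
    The mask at position i (0-based) in the list receives label i + 1.
    """
    label_map = np.zeros(masks[0].shape, dtype=np.uint8)
    for label, mask in enumerate(masks, start=1):
        # Overlapping voxels are silently reassigned to the current label.
        label_map[mask > 0] = label
    return label_map


# Two toy 1D "structures" that overlap on voxels 2 and 3.
structure_a = np.array([1, 1, 1, 1, 0, 0], dtype=np.uint8)
structure_b = np.array([0, 0, 1, 1, 1, 1], dtype=np.uint8)

a_then_b = combine_last_wins([structure_a, structure_b])
b_then_a = combine_last_wins([structure_b, structure_a])

# Number of voxels assigned to structure_a under each read order.
voxels_a_when_read_first = int(np.sum(a_then_b == 1))
voxels_a_when_read_last = int(np.sum(b_then_a == 2))
print(voxels_a_when_read_first, voxels_a_when_read_last)
```

In this toy case, the structure read first loses its overlapping voxels to the one read later, so its apparent size depends only on the read order.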
The code used to analyze label overlaps is below. It is not efficient (no parallel processing), so analyzing the whole dataset takes hours on a laptop:
import pathlib
import numpy as np
import tempfile
import pandas as pd
import nibabel as nib
import SimpleITK as sitk


def read_nifti(file_path):
    """
    In some nifti files the qform and sform are both set but sform is
    not orthonormal.
    See https://github.com/SimpleITK/SimpleITK/issues/1452 and
    original ITK change https://github.com/InsightSoftwareConsortium/ITK/pull/1868
    and resolution https://github.com/InsightSoftwareConsortium/ITK/commit/6278c4171f697649da5b3136214cfbb8e37aab00
    There are still cases when this is an issue. This function tries to read. If this issue is encountered
    the image is read using nibabel which ignores the inconsistency and written to a tempfile using
    the qform information which is guaranteed to be orthonormal. The temp image is then read. This is
    a bit of a hack as the right thing to do would be to extract the information from the nibabel image
    and create a SimpleITK image in memory.
    """
    try:
        image = sitk.ReadImage(file_path, imageIO="NiftiImageIO")
    except RuntimeError as e:
        if (
            "ITK ERROR: ITK only supports orthonormal direction cosines. No orthonormal definition found!"
            in str(e)
        ):
            with tempfile.TemporaryDirectory() as tmpdirname:
                image = nib.load(file_path)
                file_path = pathlib.Path(tmpdirname) / pathlib.Path(file_path).name
                nib.save(
                    nib.Nifti1Image(image.get_fdata(), image.get_qform()), file_path
                )
                image = sitk.ReadImage(file_path, imageIO="NiftiImageIO")
        else:
            raise
    return image


def overlap_analysis(output_dir, nifti_segmentation_file_paths):
    res = {}
    # Read the segmentations as SimpleITK images and get a numpy array view to each. This
    # saves time, no memory copy, but we need to keep the SimpleITK images so that the
    # views remain valid.
    sitk_segmentations = [
        read_nifti(file_name) for file_name in nifti_segmentation_file_paths
    ]
    segmentations = [
        sitk.GetArrayViewFromImage(sitk_seg) for sitk_seg in sitk_segmentations
    ]
    overlapping_label_locations = sum(segmentations) > 1
    # if this set of labels has overlaps, record the issue
    if np.sum(overlapping_label_locations) > 0:
        seg_overlap_image = sitk.GetImageFromArray(
            overlapping_label_locations.astype(np.uint8)
        )
        seg_overlap_image.CopyInformation(sitk_segmentations[0])
        sitk.WriteImage(
            seg_overlap_image,
            pathlib.Path(output_dir) / "regions_with_overlapping_labels.nii.gz",
        )
        res["dir"] = output_dir
        files_with_overlap = []
        for seg_file_name, seg in zip(nifti_segmentation_file_paths, segmentations):
            seg_overlap = np.multiply(overlapping_label_locations, seg)
            if np.sum(seg_overlap) > 0:
                files_with_overlap.append(seg_file_name.name)
        res["files"] = files_with_overlap
    return res


root_dir = pathlib.Path("Totalsegmentator/")
segmentation_directories = [d / "segmentations" for d in root_dir.glob("s*")]
res = [
    overlap_analysis(
        pathlib.Path(segmentations_dir).parent, list(segmentations_dir.glob("*.nii.gz"))
    )
    for segmentations_dir in segmentation_directories
]
df = pd.DataFrame(res, columns=["dir", "files"])
df.to_csv(root_dir / "overlapping_labels.csv", index=False)
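The script above keeps every mask of a subject in memory and relies on the masks holding only 0 and 1. A lower-memory variant could accumulate a per-voxel count one mask at a time, as in this rough sketch. The helper name `count_labels_per_voxel` is hypothetical, and the toy arrays stand in for real masks. With real data, the iterable could yield `sitk.GetArrayFromImage(read_nifti(path))` for each file, which copies the data so the SimpleITK image does not need to be kept alive.

```python
import numpy as np


def count_labels_per_voxel(masks):
    """Return, for each voxel, how many of the given masks are nonzero there.

    Hypothetical helper for illustration. `masks` is any iterable of equally
    shaped arrays (a generator works, so only one mask is held at a time).
    Returns None if the iterable is empty.
    """
    counts = None
    for mask in masks:
        # Binarize so masks with foreground values other than 1 count once.
        binary = np.asarray(mask) > 0
        if counts is None:
            counts = np.zeros(binary.shape, dtype=np.uint16)
        counts += binary
    return counts


# Toy stand-ins for three segmentation masks of the same subject.
toy_masks = (
    np.array(row, dtype=np.uint8)
    for row in ([1, 1, 0, 0], [0, 1, 1, 0], [0, 1, 0, 0])
)
counts = count_labels_per_voxel(toy_masks)
overlap = counts > 1  # voxels claimed by more than one structure
print(counts, overlap)
```

The count image also shows how many structures claim each voxel, which a plain boolean overlap mask does not.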
Hello @wasserth,
Thanks for the great tool and for sharing the CT data on Zenodo.