Problem with labels #1

Open
bardiakzzzz opened this issue Dec 1, 2020 · 9 comments

Comments

@bardiakzzzz

Hi, thanks for sharing the dataset.
I have a problem reading the labels.
The labels in Slice_level_labels.npy and index.csv do not match the DICOM files.
Could you please help me?

@ShahinSHH
Owner

Hi Bardia,
Slice_level_labels.npy contains the labels in the order that is mentioned in the index.csv file.
The first 55 labeled cases are COVID-19 cases. However, they are not necessarily the first 55 patients. For example, labeled cases 50-54 correspond to P058, P104, P109, P129, and P142. You can find this mapping in the index.csv file.
I hope this answers your question. However, if you find other inconsistencies between the labels and the DICOM files, please give me the case number for further review.
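
A minimal sketch of the lookup described above: read index.csv, then use it to find which row of the label array belongs to a given case. The column names `Case` and `Index` and the sample rows are assumptions made for illustration; the real index.csv may use different headers and values.

```python
import csv
import io


def build_label_lookup(csv_text, case_col="Case", index_col="Index"):
    """Map each case name (e.g. 'P058') to its row number in the label array.

    case_col and index_col are hypothetical header names; adjust them to
    whatever the real index.csv uses.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[case_col].strip(): int(row[index_col]) for row in reader}


def labels_for_case(labels, lookup, case):
    """Return the label row for one case, going through the index mapping."""
    return labels[lookup[case]]


# Made-up rows for illustration only, not the contents of the real file.
sample_csv = "Case,Index\nP001,0\nP058,50\nP104,51\n"
lookup = build_label_lookup(sample_csv)

# Stand-in for the loaded label array: one list of slice labels per labeled case.
fake_labels = {0: [0, 0, 1], 50: [1, 1, 0], 51: [0, 1, 1]}
print(lookup["P058"], labels_for_case(fake_labels, lookup, "P058"))
```

With the real files, the label array would typically come from `numpy.load("Slice_level_labels.npy")`, possibly with `allow_pickle=True` if the file stores rows of different lengths. The point of the sketch is that the row is selected through index.csv, not through the patient number.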

@ShahinSHH
Owner

Please also note the following from the readme file. I have seen many people make this mistake, and I guess you are making it too.
NOTE: The correct order of slices in a CT scan doesn't necessarily follow the order of the Slice-IDs. You need to sort slices based on the "slice location" parameter provided in the DICOM files when you are reading the data.

If you're using the pydicom library or a similar one, you need to sort the slices by the "slice location" parameter, which indicates the Z-axis location of each slice. The parameter is available once you read a .dcm file with pydicom or a similar library. You can store the values and sort the slices by them.
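
The sorting step described above could look roughly like this. It is a sketch, not the dataset authors' code: the function names are made up, and the sort direction is left as a parameter (later in this thread the owner says descending order was used for this dataset). `pydicom.dcmread` and the `SliceLocation` attribute are the standard pydicom names.

```python
from pathlib import Path
from types import SimpleNamespace


def sort_slices(datasets, descending=True):
    """Return the slices ordered by SliceLocation, compared as numbers."""
    return sorted(datasets, key=lambda ds: float(ds.SliceLocation), reverse=descending)


def load_sorted_case(case_dir, descending=True):
    """Read every .dcm file in case_dir and return the slices sorted by SliceLocation."""
    import pydicom  # imported here so sort_slices works without pydicom installed

    datasets = [pydicom.dcmread(path) for path in Path(case_dir).glob("*.dcm")]
    return sort_slices(datasets, descending=descending)


# Stand-in objects instead of real DICOM files, in scrambled order.
demo = [SimpleNamespace(SliceLocation=v) for v in ("-10.5", 3.0, -120.0)]
print([float(ds.SliceLocation) for ds in sort_slices(demo)])
```

After sorting, position `i` in the returned list is the slice that label `i` of that case would refer to, assuming the labels were produced in the same order.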

@bardiakzzzz
Author

Thanks for replying. It helped a lot.

@faajabbari

Hi, thanks for sharing this great dataset.
I still have a problem with the labels. I converted the DICOM files to PNG and printed the labels, and they did not seem to match. Could you help me find the problem?
First, for each case, I read all the DICOM files with pydicom (it rotates the images 90 degrees), saved them in PNG format, and named them after the 'SliceLocation' parameter. Then I sorted them by the given name.

  1. Should I sort them in ascending or descending order? (I tried both, and both failed.)
  2. You gave an example in which labeled cases 50-54 are related to P058, P104, P109, P129, P142. Have you updated index.csv? I find that labeled cases 50-54 are related to P049-P053.

If necessary, I will gladly share my snippet.
Thanks in advance.
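
One possible pitfall with the file-naming approach described in this comment, offered as a guess since the snippet was not shared: if the PNG names are sorted as strings, the result is not the numeric SliceLocation order. The sketch below uses made-up file names to show the difference; `sort_by_numeric_stem` is a hypothetical helper.

```python
from pathlib import Path


def sort_by_numeric_stem(filenames, descending=True):
    """Sort file names by the number encoded in the name (extension removed)."""
    return sorted(filenames, key=lambda name: float(Path(name).stem), reverse=descending)


# Made-up names of the form '<SliceLocation>.png'.
names = ["20.0.png", "-90.0.png", "5.5.png", "100.0.png", "-10.0.png"]

string_order = sorted(names)  # plain string comparison
numeric_ascending = sort_by_numeric_stem(names, descending=False)
numeric_descending = sort_by_numeric_stem(names)

print(string_order)
print(numeric_ascending)
print(numeric_descending)
```

Neither the ascending nor the descending string sort reproduces the numeric order here, which would be consistent with both attempts failing. Sorting on the float value avoids the problem.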

ShahinSHH reopened this on Jun 1, 2021
@ShahinSHH
Owner

Hello, and thanks for your message.
Sorry to hear that you still have problems matching the labels and slices.

For your questions:
Q1: You should sort the slices from the upper part of the lung all the way to the bottom. Using the "Slice Location" value, I sorted them in descending order for this dataset.

Q2: Yes, the index file has been updated. One difference is in the names of the cases you mentioned; the other is that one labeled COVID-19 case was removed in the new version. This dataset and its data description have been published in Nature Scientific Data, and the index and label files have been updated accordingly. Please use the updated files. Also, for more information on the dataset, you can refer here.

Also, I suggest that you use DICOM viewer software to view each case and check the labels against the slices there. You can use MicroDicom for Windows and miele-lxiv for Mac. They are free.

If none of this helps, please share some sample slices here with their corresponding slice locations and labels, based on what you have on your side, so I can better identify the source of your error.

@faajabbari

faajabbari commented Jun 3, 2021

Thanks for answering my questions.

Checking the labels in a DICOM viewer was a great idea. As my OS is Ubuntu, I used an online viewer. I assume DICOM viewers sort the images themselves, so no further sorting is needed.
I tested P006. According to index.csv, its index is 5,
so to get its labels I read Slice_level_labels[5].
For example, for the 70th slice, Slice_level_labels[5][70] = 1, which shows the slice is infected,
and the infection location would be lobe_level_labels[5][70] = [0. 0. 1. 1. 0.], which means the infection is located in the RLL and RML.
The image I got in the online viewer is:

[Image: Screenshot from 2021-06-03 14-17-57]

It seems like the lobe-level label doesn't match (I checked that the slices are sorted from neck to stomach).

I would be grateful if you could help me with the problem.
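
The label reading in this comment can be sketched as a small decoder that turns a lobe-level vector into lobe names. The lobe order in `LOBE_NAMES` is an assumption, chosen only so that it agrees with the reading given above ([0. 0. 1. 1. 0.] meaning RLL and RML); the dataset's readme should be checked for the actual order.

```python
# ASSUMED order of the five entries in a lobe-level label vector.
# Only positions 2 and 3 (RLL, RML) are implied by the example in this thread.
LOBE_NAMES = ("LLL", "LUL", "RLL", "RML", "RUL")


def involved_lobes(lobe_vector, lobe_names=LOBE_NAMES):
    """Return the names of the lobes whose flag is set to 1 in the vector."""
    return [name for name, flag in zip(lobe_names, lobe_vector) if flag == 1]


# The vector quoted above for lobe_level_labels[5][70].
example_vector = [0.0, 0.0, 1.0, 1.0, 0.0]
print(involved_lobes(example_vector))
```

Note that Python indexing is zero-based, so `Slice_level_labels[5][70]` addresses the entry after the first 70 slices of the sorted case; whether that is the slice a viewer shows as number 70 depends on how the viewer counts.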

@ShahinSHH
Owner

From what you shared, I can see that everything looks fine. In this slice, RLL and RML are involved.

@faajabbari

Thank you for your immediate response.

I thought the largest lesion was in the left lower lobe. So was my interpretation wrong, or should the image be flipped?

[Image: tt]

@joyivan

joyivan commented Oct 5, 2022

I have a question:
Label index 43 is missing from Index.csv, but there is still an index 43 in the Slice-level-labels-updated-1.npy file. Which case does entry 43 in the .npy file correspond to?
