[WIP] layout redone #134
Hey @nbeliy, the "No assumptions on modalities are used" part duplicates the "schema-less" approach of #124. Also, I would STRONGLY avoid storing the names of the JSON sidecars in the layout, because the metadata for a given file can be spread across several JSON files at different levels of the BIDS folder hierarchy (inheritance principle).
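To illustrate the inheritance point, here is a minimal MATLAB sketch. It assumes three hypothetical sidecar contents, ordered from the dataset root down to the file's own folder; the `levels` variable and all values are made up for illustration and this is not how bids-matlab implements it. Values from sidecars closer to the data file override those from higher levels, so the final metadata cannot be attributed to a single JSON file.

```matlab
% Hypothetical decoded sidecars, ordered from dataset root to file level.
root_level    = struct('RepetitionTime', 2, 'TaskName', 'rest');
subject_level = struct('RepetitionTime', 1.5);
file_level    = struct('EchoTime', 0.03);
levels = {root_level, subject_level, file_level};

% Merge top-down: later (more specific) levels overwrite earlier ones.
meta = struct();
for i = 1:numel(levels)
    fields = fieldnames(levels{i});
    for j = 1:numel(fields)
        meta.(fields{j}) = levels{i}.(fields{j});
    end
end

disp(meta);
```

Here `RepetitionTime` ends up coming from the subject level, `TaskName` from the root and `EchoTime` from the file level, which is why one stored sidecar name would be misleading.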
In the schema-less approach, you need to separate modality and entity fields from the predefined ones. You can't forbid the user to use modality
I still don't see the point of running all of parse_dwi, parse_func instead of just loading the layout and running a validator on top.
Wait, I am not sure what you mean. This is the behavior of the schema-less approach:
use_schema = false();
BIDS = bids.layout(fullfile(pth_bids_example, ...
'ds000001-fmriprep'), use_schema);
This can then return all modalities, including those that do not exist in the BIDS specification (e.g. figures):
bids.query(BIDS, 'modalities')
ans =
1×3 cell array
'anat' 'figures' 'func'
Entities that do not exist in the BIDS schema at the moment are also listed in the layout:
disp(BIDS.subjects(1).func(1))
ans =
struct with fields:
filename: 'sub-10_task-balloonanalogrisktask_run-1_AROMAnoiseICs.csv'
suffix: 'AROMAnoiseICs'
ext: '.csv'
sub: '10'
task: 'balloonanalogrisktask'
run: '1'
desc: ''
content: []
meta: []
from: ''
mode: ''
to: ''
res: ''
space: ''
hemi: ''
I am trying to work out which test case you mentioned would break the layout, but I don't see what you mean. Could you give me a concrete example?
Imagine that someone has a file
Ha yeah, in that case the file would actually just be skipped. But then that changes the basic assumption we've had from the beginning of this discussion, that files should at least follow this pattern:
I am not so keen on going in that direction: I think this is the minimal amount of file formatting that there should be in the subject folders.
I meant: if one of the entities (or modalities) coincides with a predefined field name, it will pass the regexp but overwrite the corresponding field, which makes it unsafe. I agree it will be a rare occasion. A possible fix (without creating a sub-structure) is to make the predefined fields start with '_' (which will never appear in an entity/modality name).
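The collision can be reproduced with a small sketch. Everything here is an assumption for illustration: the filename is made up (it is not the one from the comment above), the parsing regexp is a simplified stand-in for the real one, and the `entity` field name is borrowed from the PR description. A flat struct lets an entity key named `ext` overwrite the predefined `ext` field, while a separate sub-structure keeps the two apart. The last line checks the `'_'` prefix idea with `isvarname`; MATLAB field names must begin with a letter, just like variable names.

```matlab
% Made-up filename whose entity key "ext" matches a predefined field name.
filename = 'sub-01_ext-foo_bold.nii';
[~, basename, ext] = fileparts(filename);
parts = strsplit(basename, '_');

% Flat layout: predefined fields and entities share one struct.
flat = struct('filename', filename, 'suffix', parts{end}, 'ext', ext);

% Nested layout: entities live in their own sub-structure.
nested = flat;
nested.entity = struct();

for i = 1:numel(parts) - 1
    % Simplified key-value pattern (an assumption, not the bids-matlab regexp).
    tok = regexp(parts{i}, '^([a-zA-Z]+)-([a-zA-Z0-9]+)$', 'tokens', 'once');
    if isempty(tok)
        continue
    end
    flat.(tok{1}) = tok{2};           % silently overwrites flat.ext
    nested.entity.(tok{1}) = tok{2};  % cannot touch nested.ext
end

disp(flat.ext);           % the extension has been replaced by 'foo'
disp(nested.ext);         % still '.nii'
disp(isvarname('_ext'));  % false: a leading underscore is not allowed
```

The nested version costs one extra level of indexing but removes the need to reserve or rename any field.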
Aaaahhhh I see. Sorry. It took me a while.
If I understand you correctly, then I am afraid matlab won't be happy about this:
|
Then with '_' at the end?:
|
Yuck. No. Let's go with sub-structures for entities: I am always afraid of deeply nested sub-structures, but sometimes you can't avoid them. Just to make sure we are on the same page, you are suggesting something like this?
Entity substructure
|
sprintf(['^%s.*' pattern '$'], ...
subject.name));

pattern);
You are still going to need to have something like this here because some of the "files" for meg are actually folders.
if strcmp(modality, 'meg') && ~isempty(d)
for i = 1:size(d, 1)
file_list{end + 1, 1} = d(i, :);
end
end
Another possibility, probably simpler, is to disallow entities and modalities with the same name as predefined fields (with a warning). I agree to drop this pull request (in any case I can't provide quality MATLAB code) and move to your implementation for derivatives.
Well, you just never know: so I went ahead and adapted the other PR to align better with your suggestion. I also adapted the query along the way. I was also thinking that getting the 2-step process you were suggesting to give the same outcome as the other PR would imply going:
So I think I prefer only indexing stuff once, even if it means carrying around the schema during parsing. But you are correct that some of the code
Quick question: what was the idea or the use case for adding those 2 fields?
These are just small 'useful for later' switches, based on an object-oriented approach. In my mind the layout is used as a structure passed to several BIDS-related tools or even a pipeline, and for such tools it may be useful to know whether a file is compressed or whether it is a tabular file. Practical example: Christophe's
The tab is in the same spirit: tabular data files should be treated differently from images (at least the sidecar json files have different info and structure), so testing whether a given data file is tabular is useful. The
OK, that's what I thought, but I just wanted to make sure.
Yup, I sort of ran into the same problem, so it could make sense to either have it in the structure or have query be able to filter files based on their zipped status. PS: @ChristophePhillips I used SPM for years and learned only recently that it can handle zipped files fine: bids-standard/bids-specification#136 (comment)
Yup, same here: support either in the structure or via query seems to make sense.
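A rough sketch of what such flags and a filter on them could look like. The filenames are made up, and treating "tabular" as "ends with .tsv, optionally gzipped" is an assumption rather than the definition used in this PR; the variable names `is_gz` and `is_tab` are hypothetical.

```matlab
% Made-up list of files, as a layout might index them.
files = {'sub-01_T1w.nii.gz', ...
         'sub-01_task-rest_bold.nii', ...
         'sub-01_task-rest_events.tsv'};

% Flag compressed files: name ends with .gz
is_gz = cellfun(@(f) ~isempty(regexp(f, '\.gz$', 'once')), files);

% Flag tabular files: assumed here to mean .tsv or .tsv.gz
is_tab = cellfun(@(f) ~isempty(regexp(f, '\.tsv(\.gz)?$', 'once')), files);

% Filtering on the flags, as a query option could do.
zipped  = files(is_gz);
tabular = files(is_tab);

disp(zipped);
disp(tabular);
```

Storing the flags in the structure makes the check a simple field lookup for downstream tools, while computing them at query time keeps the layout smaller; either way the test itself is a one-line pattern match.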
@all-contributors please add @nbeliy for code, ideas |
I've put up a pull request to add @nbeliy! 🎉 |
Will close this one and wrap the other PR so it can at least live "happily" in the |
Main changes:
- modalities substructure
- basename -- filename without suffix
- gz -- true if file is compressed (ends with .gz)
- tab -- true if file is tabular
- intended -- array of files that use the current file
- metafile -- full path to the corresponding json file
- entity -- substructure containing the parsed entities of the file
Things to do:
- depends field, which contains the paths to all files needed for a given one