Consider collections in on_strings for parameters accepting multiple datasets #19817
I think there's a general problem with trying to understand the history based on the provenance in the dataset names. While this improves the situation when consuming a collection, the resulting code is not performant. The long-term plan is to enable the display of a provenance graph, making use of the request that actually generated a given job. Perhaps it is better to wait for that?
This might be true and I'm looking forward to any improvements. But at the moment the fact is that this is the state of the art. For data inputs with
We have been talking about this for 6 years already: #7467 (comment). Also, with such a graph we may still want to display a name for the nodes of the graph.
This might be improved.
4cb043f
to
4b74cb5
Compare
There's a difference between suggesting what can be done and having a plan and a PR that actually do it. The reason nothing has moved so far is that this is a complex task. However, parts of this have been implemented (see the invocation graph, as well as the inputs/outputs filter button), and others are in progress (the request state to track this type of metadata correctly).
Click on info, see the inputs? This is much more accurate anyway than this approximation. If the input is a large mapped-over collection, you still don't know what the input was with this change.
621a290
to
cffe3dd
Compare
lib/galaxy/tools/actions/__init__.py
Outdated
    if getattr(dataset_collection, "hid", None):
        collection_names.append(str(dataset_collection.hid))

    for input_name in reversed(inp_data):
The sort order should only be determined by the order in which the parameters are defined. It might just make more sense to sort by HID?
Just to be sure: reversed() was already used before my PR. I'm not sure whether that order is equal to the order of the parameter definitions.
I'm fine with sorting by HID. We should also do this for the collections.
I think this is the last open point. Should we remove the reversing here, and sort collections and datasets? This would also allow us to shorten names even more, e.g. 1-5 instead of 1, 2, 3, 4, 5.
The second-to-last point is whether we should stick with "data" in the dataset names, and use the shorter "list" instead of "collection".
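The "1-5 instead of 1, 2, 3, 4, 5" idea could look roughly like the sketch below. It is only an illustration: `collapse_hids` is a hypothetical helper name, not a function from the Galaxy code base, and it assumes HIDs are plain integers that are sorted and de-duplicated before runs are collapsed.

```python
from typing import Iterable, List


def collapse_hids(hids: Iterable[int]) -> List[str]:
    """Sort HIDs and collapse consecutive runs into "start-end" strings.

    Hypothetical helper for illustration only; not part of Galaxy.
    """
    runs: List[List[int]] = []
    for hid in sorted(set(hids)):
        if runs and hid == runs[-1][1] + 1:
            # Extend the current run of consecutive HIDs.
            runs[-1][1] = hid
        else:
            # Start a new run.
            runs.append([hid, hid])
    return [str(start) if start == end else f"{start}-{end}" for start, end in runs]


print(collapse_hids([5, 1, 2, 3, 4]))
print(collapse_hids([1, 2, 4, 7, 8, 9]))
```

Note that in this sketch a run of only two HIDs is also written as a range ("1-2"); whether that reads better than "1, 2" is a design choice left open here.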
"List" is just wrong though, unless you want to special-case the list?
It seems to make sense to just list the inputs based on their position in the history. Who would expect them to follow the tool form?
It looks better now. If you can make sure the tests pass, I think this is fine.
6a9984f
to
2a7440c
Compare
output "data 1, 2, and 3" instead of "data 1, data 2, and data 3"
Co-authored-by: Marius van den Beek <[email protected]>
2a7440c
to
6b7f1d2
Compare
6b7f1d2
to
6fad473
Compare
instead of dataset and collection
0aa8818
to
631b252
Compare
631b252
to
b4620f1
Compare
b4620f1
to
da0a8da
Compare
Hrm. Can't get the typing right :( Suggestions?
I pushed a commit to fix that.
or input of dataset collections to data_collection inputs or data inputs with multiple="true" the dataset name (on_string) lists the HIDs of the datasets in the input collections. This makes it more difficult than necessary to understand the history. It would be better to list the input collections.
This PR does 2 things:
on_string from "data 1, data 2, and data 3" to "dataset 1, 2, and 3" and analogous for collections.
Discussion:
dataset: we could stick with data
collection: we could use list
.. to me it seems that the terms are used interchangeably .. ultimately we should decide for one
TODO:
Fixes #7467
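A minimal sketch of the behaviour described above, under stated assumptions: `Dataset`, `Collection`, `join_hids`, and `build_on_string` are hypothetical stand-ins written for this illustration, not Galaxy's model classes or the code in lib/galaxy/tools/actions/__init__.py. The sketch names an input collection by its own HID instead of the HIDs of its elements, sorts by HID, and writes the label only once per group.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Union


@dataclass
class Dataset:
    """Hypothetical stand-in for a history dataset."""
    hid: int


@dataclass
class Collection:
    """Hypothetical stand-in for a history dataset collection."""
    hid: int
    elements: List[Dataset] = field(default_factory=list)


def join_hids(label: str, hids: Iterable[int]) -> str:
    """Format e.g. label="dataset", hids=[3, 1, 2] as "dataset 1, 2, and 3"."""
    numbers = [str(hid) for hid in sorted(set(hids))]
    if not numbers:
        return ""
    if len(numbers) == 1:
        joined = numbers[0]
    elif len(numbers) == 2:
        joined = f"{numbers[0]} and {numbers[1]}"
    else:
        joined = ", ".join(numbers[:-1]) + f", and {numbers[-1]}"
    return f"{label} {joined}"


def build_on_string(
    inputs: Iterable[Union[Dataset, Collection]],
    dataset_label: str = "dataset",
    collection_label: str = "collection",
) -> str:
    """Name collections by their own HID, not by the HIDs of their elements."""
    inputs = list(inputs)
    dataset_hids = [i.hid for i in inputs if isinstance(i, Dataset)]
    collection_hids = [i.hid for i in inputs if isinstance(i, Collection)]
    parts = [
        join_hids(dataset_label, dataset_hids),
        join_hids(collection_label, collection_hids),
    ]
    return " and ".join(part for part in parts if part)


coll = Collection(hid=4, elements=[Dataset(1), Dataset(2), Dataset(3)])
print(build_on_string([coll]))
print(build_on_string([Dataset(3), Dataset(1), Dataset(2)]))
```

The labels are parameters because the wording ("dataset" vs. "data", "collection" vs. "list") is exactly what the discussion leaves open.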
Before (using the identifier_multiple test tool for multiple data input):
Screencast.from.02.12.2024.13.22.58.webm
After:
Screencast.from.02.12.2024.13.14.30.webm
How to test the changes?
(Select all options that apply)