Review the internal data structures in class Query #158

Open · 27 commits · base: develop

Commits:
- ca98b69: Don't call _get_subst() when the result is not needed (RKrahl, Aug 23, 2024)
- c67d44a: Add helper classes OrderItem and ConditionItem to represent the items (RKrahl, Aug 23, 2024)
- 811de80: Internally represent order items as a list rather than as an OrderedDict (RKrahl, Aug 23, 2024)
- 4dec2ed: Move some code from the init methods of OrderItem and ConditionItem to (RKrahl, Aug 23, 2024)
- 9e27157: Internally represent conditions items as a list rather than as a dict (RKrahl, Aug 23, 2024)
- 4d58e3d: Typo in comment (RKrahl, Aug 23, 2024)
- 397c453: Fix formal string representation operator Query.__repr__(), Ref. #94 (RKrahl, Aug 24, 2024)
- c6c0c52: Minor documentation tweaks (RKrahl, Aug 24, 2024)
- f5014cb: Update the source code creating queries to use the new format of the (RKrahl, Aug 26, 2024)
- 2fc5f82: Emit a deprecation warning when passing a mapping in the conditions (RKrahl, Aug 26, 2024)
- e47ae77: Update the test_06_query.py to use the new format of the conditions (RKrahl, Aug 26, 2024)
- ada4100: Update all tests to use the new format of the conditions argument (RKrahl, Aug 27, 2024)
- 50a1063: Add checks to verify that evaluating the formal string representation (RKrahl, Aug 27, 2024)
- 383c975: Add tests for the legacy use of passing a mapping in the conditions (RKrahl, Aug 28, 2024)
- cfc9099: Update example scripts to use the new format of the conditions (RKrahl, Aug 28, 2024)
- 60024f9: Update the tutorial section "Searching for objects in the ICAT server" (RKrahl, Aug 28, 2024)
- 38decc0: Add a change note to the documentation of Query.setOrder() (RKrahl, Aug 29, 2024)
- 93e1a12: Update the tutorial section "Upload and download files to and from (RKrahl, Aug 29, 2024)
- f325880: Fix overly long lines in tutorial examples (RKrahl, Aug 29, 2024)
- 587718a: Add a test using a query where one attribute appears more than once (RKrahl, Aug 29, 2024)
- 5ed6692: Fixup cfc9099: still missed one occurrence of a mapping in the (RKrahl, Aug 29, 2024)
- e9a545f: Remove an outdated entry in the known issues documentation page (RKrahl, Aug 29, 2024)
- b358006: Start the development for the next major release (RKrahl, Aug 30, 2024)
- 8e1e872: Review the order of methods in class Query to be consistent with the (RKrahl, Aug 30, 2024)
- 5c7e114: Merge branch 'v2_0' into query (RKrahl, Aug 30, 2024)
- 6db1195: Update changelog (RKrahl, Aug 30, 2024)
- 4025818: Add a comment in test_09_deprecations.py that testing the legacy use (RKrahl, Aug 30, 2024)
40 changes: 40 additions & 0 deletions CHANGES.rst
@@ -2,6 +2,46 @@ Changelog
=========


.. _changes-2_0_0:

2.0.0 (not yet released)
~~~~~~~~~~~~~~~~~~~~~~~~

Modified features
-----------------

+ `#158`_: Review the internal data structures in class
:class:`icat.query.Query`. As a consequence, change the format of
the `conditions` argument to :class:`icat.query.Query` and
:meth:`icat.query.Query.addConditions` from a mapping to a list of
tuples. The legacy format is still supported, but deprecated.
Furthermore drop the restriction that any attribute may only appear
once in the `order` argument to :class:`icat.query.Query` and
:meth:`icat.query.Query.setOrder`.

Incompatible changes and deprecations
-------------------------------------

+ `#158`_: Deprecate passing a mapping in the `conditions` argument to
:class:`icat.query.Query` and :meth:`icat.query.Query.addConditions`.
In calling code, change::

query = Query(client, "Datafile", conditions={
"dataset.name": "= 'e208945'",
"datafileCreateTime": [">= '2012-01-01'", "< '2013-01-01'" ],
})

to::

query = Query(client, "Datafile", conditions=[
("dataset.name", "= 'e208945'"),
("datafileCreateTime", ">= '2012-01-01'"),
("datafileCreateTime", "< '2013-01-01'" ),
])

.. _#158: https://github.com/icatproject/python-icat/pull/158


.. _changes-1_4_0:

1.4.0 (2024-08-30)
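The deprecation described in the changelog implies a mechanical translation from the legacy mapping to the new list of tuples: a string value becomes one tuple, a list value becomes one tuple per condition. A minimal sketch of that translation in plain Python — the function name `conditions_to_tuples` is hypothetical and not part of python-icat; the actual compatibility shim lives inside `Query` itself:

```python
def conditions_to_tuples(mapping):
    """Convert a legacy conditions mapping to the new list-of-tuples
    format.  A value may be a single condition string or a list of
    condition strings; the latter yields one tuple per condition."""
    items = []
    for attr, conds in mapping.items():
        if isinstance(conds, str):
            items.append((attr, conds))
        else:
            for cond in conds:
                items.append((attr, cond))
    return items

legacy = {
    "dataset.name": "= 'e208945'",
    "datafileCreateTime": [">= '2012-01-01'", "< '2013-01-01'"],
}
print(conditions_to_tuples(legacy))
```

Since Python 3.7 dicts preserve insertion order, so the translation is deterministic; the new format additionally allows the same attribute to appear any number of times, which a mapping cannot express directly.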
60 changes: 31 additions & 29 deletions doc/examples/add-job.py
@@ -42,8 +42,10 @@ def makeparam(t, pdata):
param = client.new(t)
initobj(param, pdata)
ptdata = data['parameter_types'][pdata['type']]
query = ("ParameterType [name='%s' AND units='%s']"
% (ptdata['name'], ptdata['units']))
query = Query(client, "ParameterType", conditions=[
("name", "= '%s'" % ptdata['name']),
("units", "= '%s'" % ptdata['units']),
])
param.type = client.assertedSearch(query)[0]
return param

@@ -78,20 +80,20 @@ def makeparam(t, pdata):
initobj(inputcollection, jobdata['input'])

for ds in jobdata['input']['datasets']:
query = Query(client, "Dataset", conditions={
"name":"= '%s'" % ds['name'],
"investigation.name":"= '%s'" % ds['investigation']
})
query = Query(client, "Dataset", conditions=[
("name", "= '%s'" % ds['name']),
("investigation.name", "= '%s'" % ds['investigation']),
])
dataset = client.assertedSearch(query)[0]
dcs = client.new("DataCollectionDataset", dataset=dataset)
inputcollection.dataCollectionDatasets.append(dcs)

for df in jobdata['input']['datafiles']:
query = Query(client, "Datafile", conditions={
"name":"= '%s'" % df['name'],
"dataset.name":"= '%s'" % df['dataset'],
"dataset.investigation.name":"= '%s'" % df['investigation']
})
query = Query(client, "Datafile", conditions=[
("name", "= '%s'" % df['name']),
("dataset.name", "= '%s'" % df['dataset']),
("dataset.investigation.name", "= '%s'" % df['investigation']),
])
datafile = client.assertedSearch(query)[0]
dcf = client.new("DataCollectionDatafile", datafile=datafile)
inputcollection.dataCollectionDatafiles.append(dcf)
@@ -112,13 +114,13 @@ def makeparam(t, pdata):
initobj(outputcollection, jobdata['output'])

for ds in jobdata['output']['datasets']:
query = Query(client, "Investigation", conditions={
"name":"= '%s'" % ds['investigation']
})
query = Query(client, "Investigation", conditions=[
("name", "= '%s'" % ds['investigation']),
])
investigation = client.assertedSearch(query)[0]
query = Query(client, "DatasetType", conditions={
"name":"= '%s'" % data['dataset_types'][ds['type']]['name']
})
query = Query(client, "DatasetType", conditions=[
("name", "= '%s'" % data['dataset_types'][ds['type']]['name'])
])
dataset_type = client.assertedSearch(query)[0]
print("Dataset: creating '%s' ..." % ds['name'])
dataset = client.new("Dataset")
Expand All @@ -131,10 +133,10 @@ def makeparam(t, pdata):

for df in ds['datafiles']:
dff = data['datafile_formats'][df['format']]
query = Query(client, "DatafileFormat", conditions={
"name":"= '%s'" % dff['name'],
"version":"= '%s'" % dff['version'],
})
query = Query(client, "DatafileFormat", conditions=[
("name", "= '%s'" % dff['name']),
("version", "= '%s'" % dff['version']),
])
datafile_format = client.assertedSearch(query)[0]
print("Datafile: creating '%s' ..." % df['name'])
datafile = client.new("Datafile")
@@ -157,16 +159,16 @@ def makeparam(t, pdata):
outputcollection.dataCollectionDatasets.append(dcs)

for df in jobdata['output']['datafiles']:
query = Query(client, "Dataset", conditions={
"name":"= '%s'" % df['dataset'],
"investigation.name":"= '%s'" % df['investigation']
})
query = Query(client, "Dataset", conditions=[
("name", "= '%s'" % df['dataset']),
("investigation.name", "= '%s'" % df['investigation']),
])
dataset = client.assertedSearch(query)[0]
dff = data['datafile_formats'][df['format']]
query = Query(client, "DatafileFormat", conditions={
"name":"= '%s'" % dff['name'],
"version":"= '%s'" % dff['version'],
})
query = Query(client, "DatafileFormat", conditions=[
("name", "= '%s'" % dff['name']),
("version", "= '%s'" % dff['version']),
])
datafile_format = client.assertedSearch(query)[0]
print("Datafile: creating '%s' ..." % df['name'])
datafile = client.new("Datafile")
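The hunks in add-job.py repeat the pattern of interpolating a value into an `= '...'` condition string. Where that gets verbose, a tiny helper can build the tuples; `eq()` below is a hypothetical convenience sketched here, not part of python-icat, and note that it performs no quoting or escaping of the value:

```python
def eq(attribute, value):
    """Build an (attribute, "= '<value>'") condition tuple in the new
    list-of-tuples conditions format.  Hypothetical helper; does not
    escape quotes inside the value."""
    return (attribute, "= '%s'" % value)

# Equivalent to the hand-written tuples in the diff above:
conds = [eq("name", "e208945"), eq("investigation.name", "12100409-ST")]
print(conds)
```

Keeping the helper trivial preserves the readability advantage of the new format: each tuple is one attribute paired with one condition, and repeated attributes simply become repeated tuples.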
6 changes: 3 additions & 3 deletions doc/examples/create-datafile.py
@@ -56,13 +56,13 @@
raise RuntimeError("datafile %s not found" % df_path)

query = Query(client, "DatasetType",
conditions={ "name": "= '%s'" % conf.dst_name })
conditions=[ ("name", "= '%s'" % conf.dst_name) ])
dst = client.assertedSearch(query)[0]
query = Query(client, "DatafileFormat",
conditions={ "name": "= '%s'" % conf.dff_name })
conditions=[ ("name", "= '%s'" % conf.dff_name) ])
dff = client.assertedSearch(query)[0]
query = Query(client, "Investigation",
conditions={ "name": "= '%s'" % conf.investigation })
conditions=[ ("name", "= '%s'" % conf.investigation) ])
investigation = client.assertedSearch(query)[0]

fstats = df_path.stat()
14 changes: 8 additions & 6 deletions doc/examples/create-investigation.py
@@ -107,8 +107,10 @@ def getUser(client, attrs):
ip = client.new("InvestigationParameter")
initobj(ip, pdata)
ptdata = data['parameter_types'][pdata['type']]
query = ("ParameterType [name='%s' AND units='%s']"
% (ptdata['name'], ptdata['units']))
query = Query(client, "ParameterType", conditions=[
("name", "= '%s'" % ptdata['name']),
("units", "= '%s'" % ptdata['units']),
])
ip.type = client.assertedSearch(query)[0]
investigation.parameters.append(ip)
if 'shifts' in investigationdata:
@@ -123,10 +125,10 @@ def getUser(client, attrs):
sd = investigation.startDate or investigation.endDate
ed = investigation.endDate or investigation.startDate
if sd and ed:
query = Query(client, "FacilityCycle", conditions={
"startDate": "<= '%s'" % parse_attr_string(ed, "Date"),
"endDate": "> '%s'" % parse_attr_string(sd, "Date"),
})
query = Query(client, "FacilityCycle", conditions=[
("startDate", "<= '%s'" % parse_attr_string(ed, "Date")),
("endDate", "> '%s'" % parse_attr_string(sd, "Date")),
])
for fc in client.search(query):
ifc = client.new("InvestigationFacilityCycle", facilityCycle=fc)
investigation.investigationFacilityCycles.append(ifc)
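The FacilityCycle query in create-investigation.py pairs two conditions, `startDate <= ed` and `endDate > sd`, which is the usual half-open interval-overlap test between a cycle's date range and the investigation's. As a plain-Python predicate (a sketch mirroring the query logic, not code from the script):

```python
from datetime import date

def overlaps(cycle_start, cycle_end, inv_start, inv_end):
    """Half-open interval overlap test matching the query conditions:
    cycle.startDate <= inv_end AND cycle.endDate > inv_start."""
    return cycle_start <= inv_end and cycle_end > inv_start

# A cycle running Jan-Jul 2012 overlaps an investigation in June 2012:
print(overlaps(date(2012, 1, 1), date(2012, 7, 1),
               date(2012, 6, 1), date(2012, 6, 30)))
```

The half-open convention (inclusive start, exclusive end) ensures that an investigation falling exactly on a cycle boundary matches exactly one cycle.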
119 changes: 69 additions & 50 deletions doc/examples/dumpinvestigation.py
@@ -75,20 +75,32 @@ def mergesearch(sexps):
# instruments related to the investigations. These are independent
# searches, but the results are likely to overlap. So we need to
# search and merge results first. Similar situation for ParameterType.
usersearch = [("User <-> InvestigationUser <-> Investigation [id=%d]"),
("User <-> UserGroup <-> Grouping <-> InvestigationGroup "
"<-> Investigation [id=%d]"),
("User <-> InstrumentScientist <-> Instrument "
"<-> InvestigationInstrument <-> Investigation [id=%d]")]
ptsearch = [("ParameterType INCLUDE Facility, PermissibleStringValue "
"<-> InvestigationParameter <-> Investigation [id=%d]"),
("ParameterType INCLUDE Facility, PermissibleStringValue "
"<-> SampleParameter <-> Sample <-> Investigation [id=%d]"),
("ParameterType INCLUDE Facility, PermissibleStringValue "
"<-> DatasetParameter <-> Dataset <-> Investigation [id=%d]"),
("ParameterType INCLUDE Facility, PermissibleStringValue "
"<-> DatafileParameter <-> Datafile <-> Dataset "
"<-> Investigation [id=%d]"), ]
usersearch = [
Query(client, "User", conditions=[
("investigationUsers.investigation.id", "= %d")
]),
Query(client, "User", conditions=[
("userGroups.grouping.investigationGroups.investigation.id", "= %d")
]),
Query(client, "User", conditions=[
("instrumentScientists.instrument.investigationInstruments."
"investigation.id", "= %d")
]),
]
ptsearch = [
Query(client, "ParameterType", conditions=[
("investigationParameters.investigation.id", "= %d")
], includes=["facility", "permissibleStringValues"]),
Query(client, "ParameterType", conditions=[
("sampleParameters.sample.investigation.id", "= %d")
], includes=["facility", "permissibleStringValues"]),
Query(client, "ParameterType", conditions=[
("datasetParameters.dataset.investigation.id", "= %d")
], includes=["facility", "permissibleStringValues"]),
Query(client, "ParameterType", conditions=[
("datafileParameters.datafile.dataset.investigation.id", "= %d")
], includes=["facility", "permissibleStringValues"]),
]

# The set of objects to be included in the Investigation.
inv_includes = { "facility", "type.facility", "investigationInstruments",
@@ -103,42 +115,49 @@ def mergesearch(sexps):
# list: either queries expressed as Query objects, or queries
# expressed as string expressions, or lists of objects. In the first
two cases, the search results will be written, in the last case, the
# objects are written as provided. We assume that there is only one
# relevant facility, e.g. that all objects related to the
investigation are related to the same facility. We may thus omit
# the facility from the ORDER BY clauses.
authtypes = [mergesearch([s % invid for s in usersearch]),
("Grouping ORDER BY name INCLUDE UserGroup, User "
"<-> InvestigationGroup <-> Investigation [id=%d]" % invid)]
statictypes = [("Facility ORDER BY name"),
("Instrument ORDER BY name "
"INCLUDE Facility, InstrumentScientist, User "
"<-> InvestigationInstrument <-> Investigation [id=%d]"
% invid),
(mergesearch([s % invid for s in ptsearch])),
("InvestigationType ORDER BY name INCLUDE Facility "
"<-> Investigation [id=%d]" % invid),
("SampleType ORDER BY name, molecularFormula INCLUDE Facility "
"<-> Sample <-> Investigation [id=%d]" % invid),
("DatasetType ORDER BY name INCLUDE Facility "
"<-> Dataset <-> Investigation [id=%d]" % invid),
("DatafileFormat ORDER BY name, version INCLUDE Facility "
"<-> Datafile <-> Dataset <-> Investigation [id=%d]" % invid)]
investtypes = [Query(client, "Investigation",
conditions={"id":"in (%d)" % invid},
includes=inv_includes),
Query(client, "Sample", order=["name"],
conditions={"investigation.id":"= %d" % invid},
includes={"investigation", "type.facility",
"parameters", "parameters.type.facility"}),
Query(client, "Dataset", order=["name"],
conditions={"investigation.id":"= %d" % invid},
includes={"investigation", "type.facility", "sample",
"parameters", "parameters.type.facility"}),
Query(client, "Datafile", order=["dataset.name", "name"],
conditions={"dataset.investigation.id":"= %d" % invid},
includes={"dataset", "datafileFormat.facility",
"parameters", "parameters.type.facility"})]
# objects are written as provided.
authtypes = [
mergesearch([str(s) % invid for s in usersearch]),
Query(client, "Grouping", conditions=[
("investigationGroups.investigation.id", "= %d" % invid)
], order=["name"], includes=["userGroups.user"])
]
statictypes = [
Query(client, "Facility", order=True),
Query(client, "Instrument", conditions=[
("investigationInstruments.investigation.id", "= %d" % invid)
], order=True, includes=["facility", "instrumentScientists.user"]),
mergesearch([str(s) % invid for s in ptsearch]),
Query(client, "InvestigationType", conditions=[
("investigations.id", "= %d" % invid)
], order=True, includes=["facility"]),
Query(client, "SampleType", conditions=[
("samples.investigation.id", "= %d" % invid)
], order=True, includes=["facility"]),
Query(client, "DatasetType", conditions=[
("datasets.investigation.id", "= %d" % invid)
], order=True, includes=["facility"]),
Query(client, "DatafileFormat", conditions=[
("datafiles.dataset.investigation.id", "= %d" % invid)
], order=True, includes=["facility"]),
]
investtypes = [
Query(client, "Investigation",
conditions=[("id", "in (%d)" % invid)],
includes=inv_includes),
Query(client, "Sample", order=["name"],
conditions=[("investigation.id", "= %d" % invid)],
includes={"investigation", "type.facility",
"parameters", "parameters.type.facility"}),
Query(client, "Dataset", order=["name"],
conditions=[("investigation.id", "= %d" % invid)],
includes={"investigation", "type.facility", "sample",
"parameters", "parameters.type.facility"}),
Query(client, "Datafile", order=["dataset.name", "name"],
conditions=[("dataset.investigation.id", "= %d" % invid)],
includes={"dataset", "datafileFormat.facility",
"parameters", "parameters.type.facility"})
]

with open_dumpfile(client, conf.file, conf.format, 'w') as dumpfile:
dumpfile.writedata(authtypes)
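The comment in dumpinvestigation.py notes that the user and parameter-type searches are independent but likely to overlap, so their results must be merged before writing. The `mergesearch()` helper itself is not shown in this diff; a plausible sketch of such a merge, deduplicating by entity id while preserving first-seen order (plain tuples stand in for entity objects here):

```python
def merge_results(result_lists):
    """Merge several search result lists, dropping duplicates by id
    while keeping first-seen order.  Sketch of what a mergesearch()
    style helper might do; entities are (id, name) tuples here."""
    seen = set()
    merged = []
    for results in result_lists:
        for obj in results:
            if obj[0] not in seen:
                seen.add(obj[0])
                merged.append(obj)
    return merged

print(merge_results([[(1, "alice"), (2, "bob")],
                     [(2, "bob"), (3, "carol")]]))
```

Deduplicating by id rather than by object identity matters because the same entity fetched by two different queries yields two distinct client-side objects.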
6 changes: 3 additions & 3 deletions doc/examples/dumprules.py
@@ -36,7 +36,7 @@

groups = set()
query = Query(client, "Rule",
conditions={"grouping": "IS NOT NULL"},
conditions=[("grouping", "IS NOT NULL")],
includes={"grouping.userGroups.user"})
for r in client.search(query):
groups.add(r.grouping)
@@ -45,9 +45,9 @@
sorted(groups, key=icat.entity.Entity.__sortkey__),
Query(client, "PublicStep"),
Query(client, "Rule", order=["what", "id"],
conditions={"grouping": "IS NULL"}),
conditions=[("grouping", "IS NULL")]),
Query(client, "Rule", order=["grouping.name", "what", "id"],
conditions={"grouping": "IS NOT NULL"},
conditions=[("grouping", "IS NOT NULL")],
includes={"grouping"}),
]

6 changes: 3 additions & 3 deletions doc/examples/ingest.py
@@ -83,9 +83,9 @@
client, conf = config.getconfig()
client.login(conf.auth, conf.credentials)

query = Query(client, "Investigation", conditions={
"name": "= '%s'" % conf.investigation
})
query = Query(client, "Investigation", conditions=[
("name", "= '%s'" % conf.investigation),
])
investigation = client.assertedSearch(query)[0]

