How can I check if an array field contains a certain value OR the array field is null? #9757
-
I have a set of documents. Some documents are versioned; others are not. A single document can belong to multiple versions. In my document database, relevant versions are stored in a JSON field. For example, import ibis
ibis.options.interactive = True
ddb = ibis.duckdb.connect("mydb.duckdb")
schema = ibis.schema(
{
"id": "int64",
"json_data": "json",
}
)
ddb.drop_table("mytable", force=True)
ddb.create_table("mytable", schema=schema)
data = [
{"id": 1, "json_data": {"versions": ["v1", "v2"]}},
{"id": 2, "json_data": {"versions": ["v3"]}},
{"id": 3, "json_data": {}},
]
ddb.insert("mytable", data)
t = ddb.table("mytable")
t.select(
t.id,
t.json_data["versions"],
)
I can filter for documents where t.filter(
t.json_data["versions"].cast("array<str>").contains("v2")
| t.json_data["versions"].cast("array<str>").contains("v3")
)
and I can filter for documents that are "unversioned" (thus being relevant to all versions): t.filter(
t.json_data["versions"] == None
)
But how do I get all documents that are relevant to a specific version, or are unversioned (and thus still relevant)? t.filter(
t.json_data["versions"].cast("array<str>").contains("v1")
| t.json_data["versions"] == None
)
(And, am I thinking about method chaining in the right way for JSON-based arrays, or am I missing some fundamentally better approach?) Thanks in advance! |
Beta Was this translation helpful? Give feedback.
Replies: 1 comment 2 replies
-
You're missing the The (t.json_data["versions"].cast("array<str>").contains("v1") | t.json_data["versions"]) == None You'll need to parenthesize any binary operators in the specific way that you want to evaluate them when any of the bitwise operators ( In this case that'd be: t.json_data["versions"].cast("array<str>").contains("v1") | (t.json_data["versions"] == None) Running that in IPython:
Alternatively, you can use the t.json_data["versions"].cast("array<str>").contains("v1") | t.json_data["versions"].isnull() This is:
|
Beta Was this translation helpful? Give feedback.
You're missing the
()
around thet.json_data["versions"] == None
.The
|
operator binds tighter than==
, so the parentheiszation (the way the Python interpreter sees it) of what you've written is actuallyYou'll need to parenthesize any binary operators in the specific way that you want to evaluate them when any of the bitwise operators (
|
,&
, and^
) are in an expression.In this case that'd be:
Running that in IPython: