@KangOl KangOl commented Oct 10, 2025

The usage of str.format to inject the parallel filter used to explode queries is not robust to the presence of other curly braces. Examples:

  1. JSON strings (typically to leverage their mapping capabilities):

See 79f3d71, where a query had to be modified to accommodate that.

  2. Hardcoded sets of curly braces:
>>> "UPDATE t SET c = '{usage as literal characters}' WHERE {parallel_filter}".format(parallel_filter="…")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'usage as literal characters'

Which can be (inelegantly) solved by adding even more braces, relying on the brace-escaping rule of str.format:

>>> "UPDATE t SET c = '{{usage as literal characters}}' WHERE {parallel_filter}".format(parallel_filter="…")
"UPDATE t SET c = '{usage as literal characters}' WHERE …"

  3. Hardcoded single unpaired curly braces (AFAICT no way to solve this):
>>> "UPDATE t SET c = 'this is an open curly brace = {' WHERE {parallel_filter}".format(parallel_filter="…")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: unexpected '{' in field name
>>> "UPDATE t SET c = 'this is a close brace = }' WHERE {parallel_filter}".format(parallel_filter="…")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Single '}' encountered in format string

To circumvent this, we now use a dedicated Formatter that only handles the {parallel_filter} placeholder. It still collapses doubled curly braces (see point 2 above), so it stays backward compatible with existing queries.

This doesn't solve the single unpaired curly brace case, but that is rare enough to be handled by other means.
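
The behavior described above can be illustrated with a small sketch. This is not the PR's `_ExplodeFormatter` (its code is not shown in this conversation); `ParallelFilterFormatter` and `fmt` are hypothetical names, and the approach assumes that overriding `string.Formatter.parse` is an acceptable way to turn every field other than `parallel_filter` back into literal text.

```python
import string


class ParallelFilterFormatter(string.Formatter):
    """Hypothetical sketch: substitute only the {parallel_filter} field."""

    def parse(self, format_string):
        for literal, field, spec, conv in super().parse(format_string):
            if field is None or field == "parallel_filter":
                # Plain text (doubled braces already collapsed by the parser)
                # or the one placeholder we do want to substitute.
                yield literal, field, spec, conv
                continue
            # Any other replacement field is rebuilt and emitted as literal text.
            # Lossy corner case: an empty spec ("{a:}") comes back as "{a}".
            rebuilt = "{" + field
            if conv:
                rebuilt += "!" + conv
            if spec:
                rebuilt += ":" + spec
            yield literal + rebuilt + "}", None, None, None


# Hypothetical helper name, bound once for convenience.
fmt = ParallelFilterFormatter().format

query = "UPDATE t SET c = '{usage as literal characters}' WHERE {parallel_filter}"
print(fmt(query, parallel_filter="id BETWEEN 1 AND 10"))
```

In this sketch, unpaired braces still raise `ValueError`, because the standard parser rejects them before the override sees any field; that matches the limitation stated above.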


Alternative to #142

@Pirols Pirols left a comment

IMO the unbalanced braces issue is the least of our problems here and we can do with a backwards compatible implementation that does not tackle that.

    ]
)
def test_ExplodeFormatter(self, value, expected):
    formatted = util.pg._ExplodeFormatter().format(value, parallel_filter="…")
Contributor

Suggested change
formatted = util.pg._ExplodeFormatter().format(value, parallel_filter="…")
formatted = util.pg._ExplodeFormatter().format(value, parallel_filter="...")

np: I can imagine someone trying to add a test case being confused by the unicode character here.

Contributor Author

It was copied from the commit message stolen from #142

Contributor

It did look familiar. :p
For what it's worth, I have since removed the vim abbrev rule that replaced triple dots with that one.

KangOl commented Oct 10, 2025

upgradeci retry with always only crm

@KangOl KangOl force-pushed the master-explode-formatter-chs branch from e1f7f15 to 0a2a432 Compare October 10, 2025 13:17
src/util/pg.py Outdated
sep_kw = " AND " if re.search(r"\sWHERE\s", query, re.M | re.I) else " WHERE "
query += sep_kw + "{parallel_filter}"

fmt = _ExplodeFormatter().format
Contributor

Why don't we put this in the global scope, right after the definition of _ExplodeFormatter? It can be reused in explode_query for example.

Contributor Author

Let's try. I added a fixup commit to see if the final result adds up.

Contributor Author

> Why don't we put this in the global scope, right after the definition of _ExplodeFormatter? It can be reused in explode_query for example.

So, do I squash this commit with the _explode_format variable?
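
For readers following this thread, the module-level binding under discussion could look roughly like the sketch below. It makes several assumptions: `_StandInFormatter` is a hypothetical placeholder for the PR's `_ExplodeFormatter` (not reproduced here), and `append_parallel_filter` and `explode_one` are invented helper names. Only the `sep_kw` rule is copied from the hunk above.

```python
import re
import string


class _StandInFormatter(string.Formatter):
    """Hypothetical stand-in; the PR's _ExplodeFormatter is not reproduced here."""


# Bound once at import time, so every helper in the module can share it.
_explode_format = _StandInFormatter().format


def append_parallel_filter(query):
    # Same joining rule as the hunk above: AND if the query already has a WHERE.
    sep_kw = " AND " if re.search(r"\sWHERE\s", query, re.M | re.I) else " WHERE "
    return query + sep_kw + "{parallel_filter}"


def explode_one(query, parallel_filter):
    # Both this helper and any other caller reuse the same bound method.
    return _explode_format(append_parallel_filter(query), parallel_filter=parallel_filter)


print(explode_one("UPDATE t SET c = 1", "id BETWEEN 1 AND 10"))
```

Binding the method once avoids creating a formatter instance on every call; whether that is worth a separate commit is the question raised above.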

@aj-fuentes aj-fuentes left a comment

Good for me.

…` placeholder

The usage of `str.format` to inject the parallel filter used to explode
queries is not robust to the presence of other curly braces. Examples:

1. `JSON` strings (typically to leverage their mapping capabilities):

See 79f3d71, where a query had to be
modified to accommodate that.

2. Hardcoded sets of curly braces:

```python
>>> "UPDATE t SET c = '{usage as literal characters}' WHERE {parallel_filter}".format(parallel_filter="…")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'usage as literal characters'
```

Which can be (inelegantly) solved by adding even more braces, relying on
the brace-escaping rule of `str.format`:

```python
>>> "UPDATE t SET c = '{{usage as literal characters}}' WHERE {parallel_filter}".format(parallel_filter="…")
"UPDATE t SET c = '{usage as literal characters}' WHERE …"
```

3. Hardcoded single unpaired curly braces (AFAICT no way to solve this):

```python
>>> "UPDATE t SET c = 'this is an open curly brace = {' WHERE {parallel_filter}".format(parallel_filter="…")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: unexpected '{' in field name
```

```python
>>> "UPDATE t SET c = 'this is a close brace = }' WHERE {parallel_filter}".format(parallel_filter="…")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Single '}' encountered in format string
```

To circumvent this, we now use a dedicated Formatter that only handles
the `{parallel_filter}` placeholder. It still collapses doubled curly
braces (see point 2 above), so it stays backward compatible with
existing queries.

This doesn't solve the single unpaired curly brace case, but that is
rare enough to be handled by other means.
@KangOl KangOl force-pushed the master-explode-formatter-chs branch from efc6c9e to 4eacbc8 Compare October 13, 2025 08:44
KangOl commented Oct 13, 2025

@robodoo r+
