Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Escape table names #73

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from
Draft

Conversation

NPCRUS
Copy link
Contributor

@NPCRUS NPCRUS commented Mar 8, 2025

This PR here is somewhat a beginning of a conversation about how do we want to make proper sql identifiers when it comes to schemas, tables and columns, particularly for my use case I want this in order to use sql reserved words for table names. I'll try to split this into 2 logical parts:

How do we know how to escape depending on db variant?
A SQL 1999 rfc says that double quotes should be used in order to escape an identifier, nevertheless:

  • in mysql it is backticks
  • in h2 when you escape you suddenly loose the ability to refer to a table ignoring it's definition case: defining a table T, SELECT * from t will work, SELECT * from "t" will not

Therefore it felt logical to make an escaping mechanism based on Dialect. I didn't want to propagate DialectConfig together with Context down the call chain, it seemed to me kinda redundant, so I made DialectConfig part of Context. That's how we can access escaping mechanism in various places as long as you have Context

How do we decide which table to escape?
With this PR I want to stick with tables, otherwise it can become too big. I see three options:

  1. escape all tables
  2. escape based on list of reserved words per database variant
  3. let user decide to escape

First options we already discarded as one that will change too much tests. As for a second option - I looked into lists of reserved words, and they are very different based on database, also it will not be fun to maintain them, that's why most of the libs resolve to escape all tables. Third option is most simple for now, since it will allow users to opt-in when they want/bump into this issue, also it will allow us to test mechanism first without changing all 1k+ tests, and perhaps we can start with option 3 and then eventually move to option 1. Therefore in this draft PR I made a flag in Table to opt in for escaping table name.

Optional:
At the moment DialectConfig is kinda part of the Dialect, and I don't feel entirely good about including whole Dialect in Context. I kinda want to move DialectConfig into case class and make a separate implicit for it as a part of Dialect so that when we need to construct Context only DialectConfig case class will be picked up

It is a draft PR yet, so obviously I haven't written tests for all cases, only for a simple FROM expression. This will follow when we come to consensus about the implementation.
What do you guys think?
fixes #53

- propogate dialect config into context
- use it to escape table name
@NPCRUS NPCRUS force-pushed the table-name-escaping branch from 6d4b1e0 to bb189df Compare March 10, 2025 06:42
@NPCRUS
Copy link
Contributor Author

NPCRUS commented Mar 21, 2025

@lihaoyi ping

@lihaoyi
Copy link
Member

lihaoyi commented Mar 21, 2025

I think it sounds reasonable to me, but TBH this is probably beyond my expertise. I wonder if anyone else has experience with the intricacies of escaping names in SQL and can chime in? Maybe @aboisvert if you've had to do that using Scalasql at work?

@aboisvert
Copy link
Contributor

It's not something we've run into yet (I guess we've been lucky in our naming choices). I don't have better insight, and I'm supportive of the proposed approach ("let user decide to escape").

We should anticipate that the same situation will occur with column names as well. Even if we don't add support for this right away, it would be good if the design is sympathetic to this future enhancement.

@lihaoyi
Copy link
Member

lihaoyi commented Mar 21, 2025

Sounds good. Happy to merge it once @NPCRUS is happy with the PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Semi/automatic escaping for reserved words as tables or columns
3 participants