Skip to content

Conversation

altmannmarcelo
Copy link
Contributor

This commit adds support for C-style comments supported by MySQL. It parses and consumes the optional version number after the ! character.

@altmannmarcelo altmannmarcelo force-pushed the c_style_comment branch 3 times, most recently from 5089fd5 to 4123881 Compare September 17, 2025 15:26
This commit adds support for C-style comments supported by MySQL.
It parses and consumes the optional version number after the `!`
character and leading whitespace.
}
#[test]
fn tokenize_multiline_comment_with_c_style_comment() {
let sql = String::from("0/*! word */1");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at their docs, I'm wondering if/how we support these examples?

SELECT /*! STRAIGHT_JOIN */ col1 FROM table1,table2
/*!50110 KEY_BLOCK_SIZE=1024 */
SELECT /*! BKA(t1) */ FROM T

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iffyio - Parsing the c_style comment unblocks sqlparser to not discard those as if they were a normal comment. Support for each hint will have to be added in a case by case bases. For example #2033 - MySQL adds a c-style comment if you run SHOW CREATE TABLE:

mysql> SHOW CREATE TABLE b;
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                    |
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| b     | CREATE TABLE `b` (
  `ID` int DEFAULT NULL,
  `b` char(1) DEFAULT NULL /*!80023 INVISIBLE */
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci |
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0,008 sec)

Without the current patch, the invisible keyword will be discarded.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah so to clarify I'm rather wondering regarding the parser behavior for hints that aren't singe words e.g. /*!50110 KEY_BLOCK_SIZE=1024 */ - can we demonstrate the behavior with test cases for such scenarios?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iffyio thanks flagging this. I have fixed the issue and now we properly return individual tokens inside a C-style hint comment.

Documented the C-style comments with an example.
}
#[test]
fn tokenize_multiline_comment_with_c_style_comment() {
let sql = String::from("0/*! word */1");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah so to clarify I'm rather wondering regarding the parser behavior for hints that aren't singe words e.g. /*!50110 KEY_BLOCK_SIZE=1024 */ - can we demonstrate the behavior with test cases for such scenarios?

Added the pending tokens structure to properly return all tokens
inside a c-style hint comment.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants