Improved Character Range and Special Sequence Support #527

Open
cdwmhcc opened this issue Mar 29, 2025 · 2 comments

cdwmhcc commented Mar 29, 2025

🆒 Character Range Issues

| Description | Current | Generated Pattern | Expected Pattern |
| --- | --- | --- | --- |
| Number Range | `charIn('1-9')` | `/[1\-9]/` | `/[1-9]/` |
| Alternatives | `charIn('123456789')` | `/[123456789]/` | `/[1-9]/` |
| Ideal API | `charIn('1-9')` | n/a | `/[1-9]/` |
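
To make the difference in the first row concrete, here is a small sketch using plain regex literals only (no library calls). The sample strings are arbitrary and chosen just for illustration:

```typescript
// Plain JavaScript/TypeScript regex semantics, independent of any library.
const escapedHyphen = /[1\-9]/; // matches "1", "-" or "9" only
const digitRange = /[1-9]/;     // matches any digit from "1" to "9"

const samples = ["1", "5", "9", "-"];

// Record which samples each pattern accepts.
const escapedMatches = samples.filter((s) => escapedHyphen.test(s));
const rangeMatches = samples.filter((s) => digitRange.test(s));

console.log(escapedMatches); // ["1", "9", "-"]
console.log(rangeMatches);   // ["1", "5", "9"]
```

The escaped form rejects "5" and accepts "-", which is the opposite of what a range is meant to do.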

Whitespace Character Class Issues

| Description | Current | Generated Pattern | Expected Pattern |
| --- | --- | --- | --- |
| Escaped \s in String | `charIn('abc\\s')` | `/[abc\\s]/` | `/[abc\s]/` |
| Alternatives | `charIn('abc').or(whitespace)` | `/(?:[abc]\|\s)/` | `/[abc\s]/` |
| Ideal API Option 1 | `charIn('abc\\s')` | n/a | `/[abc\s]/` |
| Ideal API Option 2 | `` charIn(`abc${whitespace}`) `` | n/a | `/[abc\s]/` |
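
A quick check of what the three patterns in this table actually match, again with plain regex literals and no library calls. The input list is an arbitrary illustration:

```typescript
const literalBackslash = /[abc\\s]/; // "a", "b", "c", a backslash, or the letter "s"
const whitespaceClass = /[abc\s]/;   // "a", "b", "c" or any whitespace character
const alternation = /(?:[abc]|\s)/;  // same matches as whitespaceClass, just more verbose

const inputs = ["a", " ", "\t", "s", "\\"];

// Return the inputs a given pattern accepts.
const accepted = (re: RegExp) => inputs.filter((s) => re.test(s));

console.log(accepted(literalBackslash)); // ["a", "s", "\\"]
console.log(accepted(whitespaceClass));  // ["a", " ", "\t"]
console.log(accepted(alternation));      // ["a", " ", "\t"]
```

So the escaped-string form is a functional bug (it never matches whitespace), while the alternation form is correct but longer than it needs to be.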

Complex Lookbehind or Lookahead Structure Issues

| Description | Current | Generated Pattern | Expected Pattern |
| --- | --- | --- | --- |
| Lookbehind | `exactly('').after(anyOf(exactly('').at.lineStart(), charIn('-_(:')))` | `/(?<=(?:^\|[\-_(:]))/` | `/(?<=(?:^\|[-_(:]))/` |
| Ideal API | `after(anyOf(lineStart, charIn('-_(:')))` | n/a | `/(?<=(?:^\|[-_(:]))/` |
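
For reference, in standard JavaScript regex semantics an escaped hyphen and a leading unescaped hyphen inside a character class both mean a literal "-", so the two lookbehind patterns should behave identically and the difference is about output readability. A minimal sketch, with an arbitrary test string:

```typescript
// Both sources are passed to the RegExp constructor, so backslashes are doubled.
const currentOutput = new RegExp("(?<=(?:^|[\\-_(:]))");
const expectedOutput = new RegExp("(?<=(?:^|[-_(:]))");

const text = "a-b_c(d:e";

// Splitting on the zero-width lookbehind cuts the text right after each delimiter.
const currentParts = text.split(currentOutput);
const expectedParts = text.split(expectedOutput);

console.log(currentParts);  // ["a-", "b_", "c(", "d:", "e"]
console.log(expectedParts); // ["a-", "b_", "c(", "d:", "e"]
```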

ℹ️ Additional info

  1. Character Range Interpretation:

    • The library interprets '1-9' literally as the characters "1", "-", and "9" instead of the range from 1 to 9
    • As a result, a range has to be enumerated character by character, e.g. charIn('123456789')
  2. Escaped Character Handling:

    • An escape sequence such as \\s inside the string is emitted as a literal backslash followed by "s" instead of the \s whitespace class
    • Combining regular characters with a special class via .or() produces an alternation, (?:[abc]|\s), instead of a single character class

Suggested Improvements

  1. Implement proper character range parsing in charIn(): a hyphen (-) between two characters should create a range
  2. Support proper escape sequence handling in character classes
  3. Introduce more concise helper functions for common patterns (e.g., lineStart, after)
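
A rough sketch of what suggestion 1 could look like. `toCharClass` and `escapeLiteral` are hypothetical names made up for this illustration; this is not the library's implementation and not the approach taken in any existing PR. It assumes a hyphen between two characters in ascending order means a range, and it does not handle special sequences such as \s:

```typescript
// Hypothetical helper for illustration only; not an existing library function.
function escapeLiteral(ch: string, isFirst: boolean): string {
  // A hyphen is only left unescaped at the very start of the class.
  if (ch === "-") return isFirst ? "-" : "\\-";
  // Backslash, closing bracket and caret are escaped to keep their literal meaning.
  return /[\\\]^]/.test(ch) ? "\\" + ch : ch;
}

// Hypothetical: turn a spec like "a-z0-9_" into a character class source string.
function toCharClass(spec: string): string {
  let body = "";
  let i = 0;
  while (i < spec.length) {
    const start = spec.charAt(i);
    const end = spec.charAt(i + 2); // "" when there is no such character
    const isRange = spec.charAt(i + 1) === "-" && end !== "" && start <= end;
    if (isRange) {
      body += escapeLiteral(start, false) + "-" + escapeLiteral(end, false);
      i += 3;
    } else {
      body += escapeLiteral(start, body === "");
      i += 1;
    }
  }
  return "[" + body + "]";
}

console.log(toCharClass("1-9"));     // [1-9]
console.log(toCharClass("a-z0-9_")); // [a-z0-9_]
console.log(toCharClass("-_(:"));    // [-_(:]
```

A leading hyphen stays literal and unescaped, which would also give the shorter `[-_(:]` output from the lookbehind table. Whether an existing caller relies on '1-9' meaning three literal characters is a compatibility question this sketch ignores.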

cdwmhcc added the enhancement label on Mar 29, 2025

@danielroe (Member)

what do you think of the implementation in #399?

cdwmhcc (Author) commented Apr 2, 2025

> what do you think of the implementation in #399?

Thanks for pointing me to PR #399. I initially misunderstood issue #397, thinking it was language-specific. After taking a closer look, I see that it addresses the same Character Range Issues I mentioned in my issue. Having reviewed PR #399, I can confirm that its implementation would indeed solve the Character Range Issues portion of my issue. I appreciate you making this connection.
