Commit 78991b7

Merge pull request #74 from yoeunes/dev
Improves regex parser with CLI tool, highlighting, and more
2 parents fdca058 + 9f5e203 commit 78991b7

72 files changed (+2591, -335 lines)


.aiassistant/rules/ProjectRules.md

Lines changed: 0 additions & 23 deletions
This file was deleted.

.aiexclude

Lines changed: 0 additions & 4 deletions
This file was deleted.

.junie/guidelines.md

Lines changed: 0 additions & 32 deletions
This file was deleted.

AGENTS.md

Lines changed: 0 additions & 19 deletions
This file was deleted.

GEMINI.md

Lines changed: 0 additions & 19 deletions
This file was deleted.

README.md

Lines changed: 68 additions & 2 deletions
@@ -23,8 +23,10 @@
 - [Advanced Usage](#advanced-usage)
 - [Parsing bare patterns vs PCRE strings](#parsing-bare-patterns-vs-pcre-strings)
 - [Working with the AST](#working-with-the-ast)
-- [Writing a custom AST visitor](#writing-a-custom-ast-visitor)
-- [Optimizing and recompiling patterns](#optimizing-and-recompiling-patterns)
+- [Writing a custom AST visitor](#writing-a-custom-ast-visitor)
+- [Optimizing and recompiling patterns](#optimizing-and-recompiling-patterns)
+- [Auto-Modernize Legacy Patterns](#auto-modernize-legacy-patterns)
+- [Syntax Highlighting](#syntax-highlighting)
 - [ReDoS Analysis](#redos-analysis)
 - [What is ReDoS?](#what-is-redos)
 - [How RegexParser detects it](#how-regexparser-detects-it)
@@ -321,6 +323,58 @@ This makes it easy to implement automated refactorings (via Rector) or style rul
 
 ---
 
+## ✨ Auto-Modernize Legacy Patterns
+
+Clean up messy or legacy regexes automatically:
+
+```php
+use RegexParser\Regex;
+
+$regex = Regex::create();
+$modern = $regex->modernize('/[0-9]+\-[a-z]+\@(?:gmail)\.com/');
+
+echo $modern; // Outputs: /\d+-[a-z]+@gmail\.com/
+```
+
+**What it does:**
+- Converts `[0-9]` → `\d`, `[a-zA-Z0-9_]` → `\w`, `[\t\n\r\f\v]` → `\s`
+- Removes unnecessary escaping (e.g., `\@` → `@`)
+- Modernizes backrefs (`\1` → `\g{1}`)
+- Preserves exact behavior — no functional changes
+
+Perfect for refactoring legacy codebases or cleaning up generated patterns.
+
+---
+
+## 🎨 Syntax Highlighting
+
+Make complex regexes readable with automatic syntax highlighting:
+
+```php
+use RegexParser\Regex;
+
+$regex = Regex::create();
+
+// For console output
+echo $regex->highlightCli('/^[0-9]+(\w+)$/');
+// Outputs: ^[0-9]+(\w+)$ with ANSI colors
+
+// For web display
+echo $regex->highlightHtml('/^[0-9]+(\w+)$/');
+// Outputs: <span class="regex-anchor">^</span>[<span class="regex-type">\d</span>]+(<span class="regex-type">\w</span>+)$
+```
+
+**Color Scheme:**
+- **Meta-characters** (`(`, `)`, `|`, `[`, `]`): Blue - Structure
+- **Quantifiers** (`*`, `+`, `?`, `{...}`): Yellow - Repetition
+- **Escapes/Types** (`\d`, `\w`, `\n`): Green - Special chars
+- **Anchors/Assertions** (`^`, `$`, `\b`): Magenta - Boundaries
+- **Literals**: Default - Plain text
+
+HTML output uses `<span class="regex-*">` classes for easy styling.
+
+---
+
 ## ReDoS Analysis
 
 ### What is ReDoS?
@@ -519,6 +573,18 @@ If you maintain custom visitors, plan to adjust them when new nodes appear. Brea
 
 ---
 
+## Known Limitations
+
+While this library supports a comprehensive set of PCRE2 features, some highly specific or experimental features may not be fully supported yet. For example:
+
+- Certain Perl-specific verbs not yet standardized in PCRE2.
+- Advanced Unicode features beyond basic properties and escapes.
+- Experimental or platform-specific extensions.
+
+If you encounter an unsupported feature, please [open an issue](https://github.com/yoeunes/regex-parser/issues) with a test case.
+
+---
+
 ## Contributing
 
 Contributions are welcome! Areas where help is especially useful:

TODO.md

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+# TODO Items for Future Releases
+
+## SampleGeneratorNodeVisitor
+- Implement proper Unicode name to character conversion (line 642)
+
+## ValidatorNodeVisitor
+- Validate that the Unicode name is valid. For now, assume it's correct. (line 569)
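
Note on the two TODO items above: both could plausibly be handled by PHP's `intl` extension, since `IntlChar::charFromName()` returns `null` for an unknown name and a code point otherwise. The sketch below is an assumption about one possible approach, not code from this commit; `unicodeNameToChar()` is a hypothetical helper name and it requires `ext-intl` to be installed.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of RegexParser): resolve a Unicode
 * character name to its UTF-8 encoded character.
 *
 * Returns null when the name is unknown, so the same call can serve
 * as a validity check. Requires the intl extension.
 */
function unicodeNameToChar(string $name): ?string
{
    // charFromName() yields the code point as an int, or null if no match.
    $codePoint = IntlChar::charFromName($name);

    // chr() turns the code point into a UTF-8 string.
    return null === $codePoint ? null : IntlChar::chr($codePoint);
}

var_dump(unicodeNameToChar('LATIN SMALL LETTER A')); // string(1) "a"
var_dump(unicodeNameToChar('NOT A REAL NAME'));      // NULL
```

A validator could treat a `null` result as an invalid name, while a sample generator could emit the returned string directly. Whether a hard dependency on `ext-intl` is acceptable for this library is a separate design question.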
