PCRE2 10.45-10.47 contains a known regex character class compilation issue #20341

@ranvis

Description

The current PHP 8.5 builds (RC3) bundle PCRE2 10.46, which contains a known issue in character class handling (PCRE2Project/pcre2#833).
This may lead to incorrect matches or non-matches for certain patterns containing a bracketed character class [...].
Affected PCRE2 versions are 10.45 through 10.47 and dev-master.
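
To see whether a given build falls in the version range above, the bundled PCRE2 version can be read from PHP's `PCRE_VERSION` constant. The following is only a minimal sketch: the helper name `pcre_version_in_reported_range` is hypothetical, it assumes the constant starts with "major.minor" (it typically looks like "10.42 2022-12-11"), and it does not detect dev-master builds or distribution builds with a backported fix.

```php
<?php
// Hypothetical helper (not part of PHP): reports whether a PCRE version
// string falls in the 10.45-10.47 range named in this report.
function pcre_version_in_reported_range(string $pcreVersion): bool
{
    // Assumption: the string begins with "major.minor", e.g. "10.46 ...".
    if (!preg_match('/^(\d+)\.(\d+)/', $pcreVersion, $m)) {
        return false; // Unrecognized format: make no claim.
    }
    $major = (int) $m[1];
    $minor = (int) $m[2];

    return $major === 10 && $minor >= 45 && $minor <= 47;
}

// PCRE_VERSION is defined by PHP's pcre extension.
echo PCRE_VERSION, ': ',
    pcre_version_in_reported_range(PCRE_VERSION) ? 'in reported range' : 'outside reported range',
    "\n";
```

A version check like this only narrows things down; running the reproduction itself is the more reliable test.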

The following code: https://3v4l.org/iZ91q

$patterns = [
    '/[\x{ff}\x{100}\x{8000}\x{8002}\x{8004}\x{8006}]/u' => "\u{100}",
    '/[\x{ff}\x{100}\x{8000}\x{8002}\x{8004}\x{8006}\x{8008}]/u' => "\u{100}",
    '/[\x{ff}\x{101}\x{8000}\x{8002}\x{8004}\x{8006}\x{8008}]/u' => "\u{101}",
];

foreach ($patterns as $pattern => $str) {
    if (preg_match($pattern, $str, $m)) {
        echo "0: ", json_encode($m[0]), "\n";
    } else {
        echo "No match.\n";
    }
}

Resulted in this output:

0: "\u0100"
No match.
0: "\u0101"

But I expected this output instead:

0: "\u0100"
0: "\u0100"
0: "\u0101"
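
The difference between the two outputs above can be turned into a small runtime probe. This is a sketch only: `pcre_class_bug_reproduces` is a hypothetical name, and the probe checks nothing beyond the second pattern from this report, so a `false` result does not prove that other character classes are unaffected.

```php
<?php
// Hypothetical helper (not part of PHP): true when the second pattern from
// this report fails to match "\u{100}" on the running build.
function pcre_class_bug_reproduces(): bool
{
    $pattern = '/[\x{ff}\x{100}\x{8000}\x{8002}\x{8004}\x{8006}\x{8008}]/u';

    // U+0100 is listed in the class, so a correct engine returns 1.
    // Anything else (0 or false) counts as reproducing the problem.
    return preg_match($pattern, "\u{100}") !== 1;
}

echo pcre_class_bug_reproduces()
    ? "Character class issue reproduces on this build.\n"
    : "Pattern matched as expected.\n";
```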

PHP Version

PHP 8.5.0RC3 (cli) (built: Oct 21 2025 21:38:34) (NTS Visual C++ 2022 x64)
Copyright (c) The PHP Group
Zend Engine v4.5.0RC3, Copyright (c) Zend Technologies
    with Zend OPcache v8.5.0RC3, Copyright (c), by Zend Technologies

Operating System

Windows; also reproduced with pcre2test -32 on Linux (PCRE2 10.47)
