You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
* Atomically load the lowered program (#610)
Since we're atomically initializing the compiled program in
`Regex.Program`, we need to pair that with an atomic load.
Resolves#609.
* Add tests for line start/end word boundary diffs (#616)
The `default` and `simple` word boundaries have different behaviors
at the start and end of strings/lines. These tests validate that we
have the correct behavior implemented. Related to issue #613.
* Add tweaks for Android
* Fix documentation typo (#615)
* Fix abstract for Regex.dotMatchesNewlines(_:). (#614)
The old version looks like it was accidentally duplicated from
anchorsMatchLineEndings(_:) just below it.
* Remove `RegexConsumer` and fix its dependencies (#617)
* Remove `RegexConsumer` and fix its dependencies
This eliminates the RegexConsumer type and rewrites its users to call
through to other, existing functionality on Regex or in the Algorithms
implementations. RegexConsumer doesn't take account of the dual
subranges required for matching, so it can produce results that are
inconsistent with matches(of:) and ranges(of:), which were rewritten
earlier.
rdar://102841216
* Remove remaining from-end algorithm methods
This removes methods that are left over from when we were considering
from-end algorithms. These aren't tested and may not have the correct
semantics, so it's safer to remove them entirely.
* Improve StringProcessing and RegexBuilder documentation (#611)
This includes documentation improvements for core types/methods,
RegexBuilder types along with their generated variadic initializers,
and adds some curation. It also includes tests of the documentation
code samples.
* Set availability for inverted character class test (#621)
This feature depends on running with a Swift 5.7 stdlib, and fails
when that isn't available.
* Add type annotations in RegexBuilder tests
These changes work around a change to the way result builders are
compiled that removes the ability for result builder closure outputs
to affect the overload resolution elsewhere in an expression.
Workarounds for rdar://104881395 and rdar://104645543
* Workaround for fileprivate array issue
A recent compiler change results in fileprivate arrays sometimes
not keeping their buffers around long enough. This change avoids that
issue by removing the fileprivate annotations from the affected type.
* Fix an issue where named character classes weren't getting converted in the result builder. <rdar://104480703>
* Stop at end of search string in TwoWaySearcher (#631)
When searching for a substring that doesn't exist, it was possible
for TwoWaySearcher to advance beyond the end of the search string,
causing a crash. This change adds a `limitedBy:` parameter to that
index movement, avoiding the invalid movement.
Fixes rdar://105154010
* Correct misspelling in DSL renderer (#627)
vertial -> vertical
rdar://104602317
* Fix output type mismatch with RegexBuilder (#626)
Some regex literals (and presumably other `Regex` instances) lose
their output type information when used in a RegexBuilder closure due
to the way the concatenating builder calls are overloaded. In
particular, any output type with labeled tuples or where the sum of
tuple components in the accumulated and new output types is greater
than 10 will be ignored.
Regex internals don't make this distinction, however, so there ends up
being a mismatch between what a `Regex.Match` instance tries to
produce and the output type of the outermost regex. For example, this
code results in a crash, because `regex` is a `Regex<Substring>`
but the match tries to produce a `(Substring, number: Substring)`:
let regex = Regex {
ZeroOrMore(.whitespace)
/:(?<number>\d+):/
ZeroOrMore(.whitespace)
}
let match = try regex.wholeMatch(in: " :21: ")
print(match!.output)
To fix this, we add a new `ignoreCapturesInTypedOutput` DSLTree node
to mark situations where the output type is discarded. This status
is propagated through the capture list into the match's storage,
which lets us produce the correct output type. Note that we can't just
drop the capture groups when building the compiled program because
(1) different parts of the regex might reference the capture group
and (2) all capture groups are available if a developer converts the
output to `AnyRegexOutput`.
let anyOutput = AnyRegexOutput(match)
// anyOutput[1] == "21"
// anyOutput["number"] == Optional("21")
Fixes#625. rdar://104823356
Note: Linux seems to crash on different tests when the two customTest
overloads have `internal` visibility or are called. Switching one of the
functions to be generic over a RegexComponent works around the issue.
* Revert "Merge pull request #628 from apple/result_builder_changes_workaround"
This reverts commit 7e059b7, reversing
changes made to 3ca8b13.
* Use `some` syntax in variadics
This supports a type checker fix after the change in how result
builder closure parameters are type-checked.
* Type checker workaround: adjust test
* Further refactor to work around type checker regression
* Align availability macro with OS versions (#641)
* Speed up general character class matching (#642)
Short-circuit Character.isASCII checks inside built in character class matching.
Also, make benchmark try a few more times before giving up.
* Test for \s matching CRLF when scalar matching (#648)
* General ascii fast paths for character classes (#644)
General ASCII fast-paths for builtin character classes
* Remove the unsupported `anyScalar` case (#650)
We decided not to support the `anyScalar` character class, which would
match a single Unicode scalar regardless of matching mode. However,
its representation was still included in the various character class
types in the regex engine, leading to unreachable code and unclear
requirements when changing or adding new code. This change removes
that representation where possible.
The `DSLTree.Atom.CharacterClass` enum is left unchanged, since it
is marked `@_spi(RegexBuilder) public`. Any use of that enum case
is handled with a `fatalError("Unsupported")`, and it isn't produced
on any code path.
---------
Co-authored-by: Nate Cook <[email protected]>
Co-authored-by: Butta <[email protected]>
Co-authored-by: Ole Begemann <[email protected]>
Co-authored-by: Alex Martini <[email protected]>
Co-authored-by: Alejandro Alonso <[email protected]>
Co-authored-by: David Ewing <[email protected]>
Co-authored-by: Dave Ewing <[email protected]>
In the engine nomenclature, a quick-check results in a yes/no/maybe while a thorough check always results in a definite answer.
8
+
9
+
The nature of quick checks and fast paths is that they bifurcate testing coverage. One easy way to prevent this in simple cases is to assert that a definite quick result matches the thorough result.
10
+
11
+
One example of this pattern is matching against a builtin character class. The engine has a `_matchBuiltinCC`
12
+
13
+
```swift
14
+
func_matchBuiltinCC(...) -> Input.Index? {
15
+
// Calls _quickMatchBuiltinCC, if that gives a definite result
16
+
// asserts that it is the same as the result of
17
+
// _thoroughMatchBuiltinCC and returns it. Otherwise returns the
The thorough check is never inlined, as it is a lot of cold code. Note that quick and thorough functions should be pure, that is they shouldn't update processor state.
0 commit comments