Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fast path skipping grapheme counting #2817

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

Add fast path skipping grapheme counting #2817

wants to merge 3 commits into from

Conversation

gaearon
Copy link
Contributor

@gaearon gaearon commented Sep 14, 2024

Not a huge deal but since we visit this codepath for every single validated string, seems like we'd want to avoid any unnecessary computation here, especially on low-end devices.

I don't know how encodings and grapheme work very well so this needs a close look.

Commits

Before

This was showing up in a profile for switching tabs with 6x throttling. (Note this is a DEV profile so in PROD its % would be higher compared to other things around it.)

Screenshot 2024-09-14 at 16 23 43

After

It doesn't show up.

Screenshot 2024-09-14 at 16 30 34

Copy link
Collaborator

@pfrazee pfrazee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah good call

Copy link
Collaborator

@bnewbold bnewbold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

behavior and overall approach looks good to me! this particular set of optimizations is on a relatively hot path and improves performance in the common case and feels worthwhile.

we'll probably want to do the same thing in the go implementation, and can copy these test cases (maybe make them "shared interop test vectors") when we do.

@@ -197,9 +197,14 @@ export function string(
}
}

// Lazily calculated and reused between checks.
let cachedUtf8Len: number | undefined
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "caching" logic could maybe be more readable like:

if (typeof def.maxLength ==='number' || typeof def.minLength ==='number') {
    const len = utf8Len(value)
    if (typeof def.maxLength ==='number') {
        if (len > def.maxLength) {
            ....
        }
    }
    if (typeof def.minLength ==='number') {
        ...
    }
}

and similar for the grapheme value.

but not sure what idiomatic typescript style would be here. not a big deal either way.

`${path} must not be longer than ${def.maxGraphemes} graphemes`,
),
if (value.length <= def.maxGraphemes) {
// If the JavaScript string length is within the maximum limit,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be worth mentioning here (and below) that JavaScript string length is UTF-16, just to be super redundant about that, because it is confusing to many devs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants