String splitting algorithms could use optional "nesting characters" #107
Comments
It seems your comment got cut off. Is this used outside CSS parsers? Because inside CSS, you need to handle all kinds of other CSS rarities too, such as escapes, and then you might as well invoke the CSS parser to be sure.
Comment was cut off and I finished editing immediately, but apparently not before you checked the thread. Check again. ^_^
srcset parser has something like this, but without a stack (it uses a dumb state machine).
Looks like srcset only looks for parens to account for possible future CSS functions? Right now the only valid descriptors are 1w, 1x, and 1h. If that's the case, then the algo is broken - it'll misparse at times once that starts being allowed. It needs to track nesting, and you're making my point for me. ^_^
If srcset needs to be compatible with CSS it also needs to handle escapes and should just be defined by the CSS parser, I think.
It's not for CSS but for future descriptors like integrity(). The algorithm is intentionally "simple" and CSS compat is not a goal.
I think I'll need this at least for |
When splitting strings, it's reasonably common to only want to split on "top-level" instances of the split chars, and have "nesting" characters, like parens, within which you don't look for the splitting characters. For example, splitting a string on commas, but the string can contain functions with comma-separated arguments.
Most of the strings I work with get parsed by CSS, which has a "split by top-level comma" algo already, so I don't have a concrete use for this in Infra just yet, but I use that algorithm commonly enough that I'd bet other people would benefit from having something like it available, at least for parens.
You'd need a list of start/end string pairs and a stack of the start strings seen so far. The stack gets popped when the end string matching its topmost entry is seen, and splitting only triggers while the stack is empty.
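A minimal sketch of that stack-based approach, in Python, might look like the following. The function name `split_top_level`, its parameters, and its handling of malformed input are assumptions made for illustration, not anything defined by Infra: all strings are assumed non-empty, an end string with nothing open is treated as ordinary text, and an unclosed start string suppresses splitting for the rest of the input.

```python
def split_top_level(text, separator=",", pairs=(("(", ")"),)):
    """Split text on separator, skipping separators inside nesting pairs.

    Hypothetical helper: `pairs` is a sequence of (start, end) strings,
    all assumed to be non-empty.
    """
    end_for = dict(pairs)  # start string -> matching end string
    stack = []             # end strings still awaited, innermost last
    pieces = []
    piece_start = 0
    i = 0
    while i < len(text):
        # Only the end string of the innermost open pair can close it.
        if stack and text.startswith(stack[-1], i):
            i += len(stack.pop())
            continue
        # A start string opens a new nesting level.
        start = next((s for s in end_for if text.startswith(s, i)), None)
        if start is not None:
            stack.append(end_for[start])
            i += len(start)
            continue
        # Split only at the top level, i.e. when nothing is open.
        if not stack and text.startswith(separator, i):
            pieces.append(text[piece_start:i])
            i += len(separator)
            piece_start = i
            continue
        i += 1
    pieces.append(text[piece_start:])
    return pieces


print(split_top_level("a, f(b, c), d"))
```

The stack stores the awaited end strings rather than the start strings, which is equivalent and makes the "topmost end string" check a single comparison. Like the srcset case discussed above, this sketch does no escape handling, so it is not a substitute for the CSS parser.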