
fixes for more edge cases in raw tokenizer #43

Open
mdelah wants to merge 1 commit into muktihari:master from mdelah:raw-token-edge-cases

@mdelah mdelah commented Mar 24, 2025

This PR closes out the remaining issues from #35.

As discussed:

  • Left and right angle brackets that appear after <!-- and are not part of the closing --> are treated as part of the comment, and
  • Right angle brackets are valid in attribute values and no longer cause RawToken to split the token.

I also realized there was a similar issue with <? ... ?> tags, so I made some fixes there too.

I had to rearrange RawToken a bit to get this to work efficiently. It now calls a new function, findTokenEnd, to locate the closing >; findTokenEnd returns -1 if that > is not within the current buffer. findTokenEnd skips past any > that occurs within comments or quoted values, and it calls itself recursively to skip past nested tags (such as may appear inside <!DOCTYPE [ ... ] >).
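
For readers who want a concrete picture of the scanning rules described above, here is a minimal sketch. It is not the code from this PR: the signature `findTokenEnd(buf []byte) int`, the assumption that `buf[0]` is the opening `<`, and the exact handling of each case are illustrative guesses based only on the description.

```go
package main

import (
	"bytes"
	"fmt"
)

// findTokenEnd returns the index of the '>' that closes the token starting
// at buf[0] (assumed to be '<'), or -1 if that '>' is not in buf.
// Hypothetical sketch; the PR's real function may differ in signature and detail.
func findTokenEnd(buf []byte) int {
	switch {
	case bytes.HasPrefix(buf, []byte("<!--")):
		// Comment: only "-->" ends it, so stray '<' and '>' are skipped.
		if i := bytes.Index(buf[4:], []byte("-->")); i >= 0 {
			return 4 + i + 2
		}
		return -1
	case bytes.HasPrefix(buf, []byte("<?")):
		// Processing instruction: only "?>" ends it.
		if i := bytes.Index(buf[2:], []byte("?>")); i >= 0 {
			return 2 + i + 1
		}
		return -1
	}
	for i := 1; i < len(buf); i++ {
		switch buf[i] {
		case '>':
			return i
		case '"', '\'':
			// Quoted value: jump to the matching quote, ignoring any '>' inside.
			j := bytes.IndexByte(buf[i+1:], buf[i])
			if j < 0 {
				return -1
			}
			i += 1 + j
		case '<':
			// Nested tag (e.g. inside a DOCTYPE internal subset): recurse.
			j := findTokenEnd(buf[i:])
			if j < 0 {
				return -1
			}
			i += j
		}
	}
	return -1
}

func main() {
	for _, s := range []string{
		`<a href="x>y">text`,
		`<!-- a > b -->`,
		`<?pi a > b ?>`,
		`<!DOCTYPE a [<!ENTITY b "c>d">]>`,
		`<a href="x>`, // closing '>' not yet in the buffer
	} {
		fmt.Printf("%q -> %d\n", s, findTokenEnd([]byte(s)))
	}
}
```

A -1 result signals the caller that more input has to be read before the token can be completed, which matches the "not within the current buffer" case in the description.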

I took the opportunity to replace some of the byte-by-byte iteration with bytes.IndexByte, which gives a moderate performance boost in some of the benchmarks.

@codecov-commenter

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.54%. Comparing base (5d45a12) to head (e9c4ca0).

Additional details and impacted files
@@            Coverage Diff             @@
##           master      #43      +/-   ##
==========================================
- Coverage   99.00%   98.54%   -0.47%     
==========================================
  Files           2        2              
  Lines         302      343      +41     
==========================================
+ Hits          299      338      +39     
- Misses          2        4       +2     
  Partials        1        1              
