create a mechanism for substring replacement within HTML elements #535

Open
aspiers opened this issue Feb 26, 2025 · 0 comments
Labels: enhancement (New feature or request), non-address labelling (For labelling of things other than addresses), performance

aspiers commented Feb 26, 2025

Ideally rolod0x would be able to detect and label addresses even when they're substrings within an HTML element surrounded by other text, e.g.

<span class="wallet-address">Your wallet address is 0x976EA74026E726554dB657fA54763abd0C3a0aa9</span>

However, as previously mentioned in #531 (comment), rolod0x can so far only detect labelling opportunities when the entire contents of an HTML element are recognised in the pre-computed mapping (modulo leading/trailing whitespace), e.g.

<span class="wallet-address">  0x976EA74026E726554dB657fA54763abd0C3a0aa9</span>

(Actually it can also recognise addresses in href attributes, but that's a special case.)
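
The exact-match behaviour described above can be sketched roughly as follows. This is only an illustration, not rolod0x's actual code: `LabelMap`, `buildMapping` and `labelForWholeText` are hypothetical names, and the case-insensitive key normalisation is an assumption.

```typescript
// Hypothetical sketch: names and data shapes are assumptions, not rolod0x's code.
type LabelMap = Map<string, string>;

// Build the lookup table once. Keys are lower-cased (an assumption) so that
// checksum casing does not affect matching.
function buildMapping(entries: Array<[string, string]>): LabelMap {
  const mapping: LabelMap = new Map();
  for (const [address, label] of entries) {
    mapping.set(address.toLowerCase(), label);
  }
  return mapping;
}

// Return a label only when the *entire* trimmed text is a known address.
// This is a single hash lookup per element, which is why it is cheap.
function labelForWholeText(text: string, mapping: LabelMap): string | undefined {
  return mapping.get(text.trim().toLowerCase());
}

const mapping = buildMapping([
  ["0x976EA74026E726554dB657fA54763abd0C3a0aa9", "my wallet"],
]);

// Whole-element content (with leading whitespace) is recognised ...
const exact = labelForWholeText(
  "  0x976EA74026E726554dB657fA54763abd0C3a0aa9",
  mapping,
);

// ... but the same address embedded in other text is not.
const embedded = labelForWholeText(
  "Your wallet address is 0x976EA74026E726554dB657fA54763abd0C3a0aa9",
  mapping,
);
```

Here `exact` is `"my wallet"` while `embedded` is `undefined`, which is the limitation this issue is about.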

This limited approach to matching was a deliberate design choice to keep the extension as performant as possible. It rests on the assumption that more complex approaches to address detection, such as compiling and applying gigantic regular expressions, would be far too costly to run against every HTML element of a page. Fortunately, it does catch the vast majority of cases successfully.

However, it does undesirably exclude some cases, e.g.

So ideally we could implement a mechanism for substring replacement without harming performance. Possible solutions:

  • Run benchmarks to test whether the above assumption is overly pessimistic.
  • Find an efficient address recognition mechanism. Maybe compiling a huge regexp once (similarly to how we currently build the mapping) and then applying it to each element is actually sufficiently quick?
  • Accept that there's no performant way to detect addresses within substrings, and come up with a more expensive replacement mechanism that is only used in specific circumstances, e.g. on particular sites and/or HTML elements, as proposed in "allow inclusions/exclusions per site / page / page elements" (#17).
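
As a starting point for the first two ideas above, here is a minimal sketch of two substring-matching strategies that could be benchmarked against each other. All names are hypothetical and none of this is taken from the rolod0x codebase. Option 1 is the "huge regexp compiled once" idea from the list. Option 2 is a different technique not mentioned above: scan with one small generic address pattern, then look each candidate up in the existing mapping. Both assume the mapping is keyed by lower-cased 20-byte hex addresses.

```typescript
// Hypothetical sketch; assumes labels are keyed by lower-cased hex addresses.
type LabelMap = Map<string, string>;

// Option 1: compile one alternation regexp from every known address.
// Keys are plain hex, so no escaping is needed. Callers should skip this
// when the mapping is empty, since an empty alternation matches "".
function compileMappingRegExp(mapping: LabelMap): RegExp {
  const alternatives = Array.from(mapping.keys()).join("|");
  return new RegExp("\\b(?:" + alternatives + ")\\b", "gi");
}

// Option 2: one small generic pattern; the mapping decides what is known.
// The \b anchors stop it matching a prefix of a longer hex string.
const GENERIC_ADDRESS = /\b0x[0-9a-fA-F]{40}\b/g;

// Replace every recognised address in `text` with its label,
// leaving unknown candidates untouched.
function labelSubstrings(text: string, pattern: RegExp, mapping: LabelMap): string {
  return text.replace(pattern, (match) => mapping.get(match.toLowerCase()) ?? match);
}

// Crude timing helper for comparing the options on representative page text.
function timeMs(run: () => void, iterations: number): number {
  const start = Date.now();
  for (let i = 0; i < iterations; i++) run();
  return Date.now() - start;
}

const known: LabelMap = new Map([
  ["0x976ea74026e726554db657fa54763abd0c3a0aa9", "my wallet"],
]);
const sentence = "Your wallet address is 0x976EA74026E726554dB657fA54763abd0C3a0aa9";
const unknownText = "Sent to 0x" + "0".repeat(39) + "1";

const viaBigRegExp = labelSubstrings(sentence, compileMappingRegExp(known), known);
const viaGeneric = labelSubstrings(sentence, GENERIC_ADDRESS, known);
const untouched = labelSubstrings(unknownText, GENERIC_ADDRESS, known);

// Example comparison; absolute numbers depend on the engine and mapping size.
const elapsed = timeMs(() => labelSubstrings(sentence, GENERIC_ADDRESS, known), 1000);
```

Both strategies turn the example sentence into "Your wallet address is my wallet". Option 2 keeps the regexp size constant regardless of how many addresses are in the mapping, but it only works for entries with a fixed, recognisable shape, so it would not cover non-address labels. Whether either is fast enough to apply to every text node is exactly what a benchmark would need to establish.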