create a mechanism for substring replacement within HTML elements #535

aspiers · 2025-02-26T13:43:26Z

Ideally rolod0x would be able to detect and label addresses even when they're substrings within an HTML element surrounded by other text, e.g.

<span class="wallet-address">Your wallet address is 0x976EA74026E726554dB657fA54763abd0C3a0aa9</span>

However, as previously mentioned in #531 (comment), rolod0x can so far only detect labelling opportunities when the entire contents of an HTML element are recognised in the pre-computed mapping (modulo leading/trailing whitespace), e.g.

<span class="wallet-address">  0x976EA74026E726554dB657fA54763abd0C3a0aa9</span>

(Actually it can also recognise addresses in href attributes, but that's a special case.)
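
To make the current behaviour concrete, here is a minimal sketch of whole-text matching against a pre-computed mapping. All names here (`LabelMap`, `buildLabelMap`, `lookupWholeText`) are hypothetical and not taken from the rolod0x codebase, and lower-casing the keys is an assumption made for illustration.

```typescript
// Hypothetical names throughout; this is not rolod0x's actual code.
type LabelMap = Map<string, string>;

// Build the lookup once. Keys are lower-cased (an assumption) so that
// matching does not depend on checksum casing.
function buildLabelMap(entries: Array<[string, string]>): LabelMap {
  const map: LabelMap = new Map();
  for (const [address, label] of entries) {
    map.set(address.toLowerCase(), label);
  }
  return map;
}

// Returns a label only if the whole text, after trimming whitespace,
// is a known address. Any surrounding text makes the lookup miss.
function lookupWholeText(text: string, labels: LabelMap): string | undefined {
  return labels.get(text.trim().toLowerCase());
}

const labels = buildLabelMap([
  ["0x976EA74026E726554dB657fA54763abd0C3a0aa9", "my wallet"],
]);

const exact = lookupWholeText("  0x976EA74026E726554dB657fA54763abd0C3a0aa9", labels);
const embedded = lookupWholeText(
  "Your wallet address is 0x976EA74026E726554dB657fA54763abd0C3a0aa9",
  labels,
);
```

With this approach each element costs one trim and one map lookup, which is why it is cheap, and also why the second call finds nothing.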

This limited approach to matching was a deliberate design choice to keep the extension as performant as possible. It rests on the assumption that more complex approaches to address detection, such as compiling and using gigantic regular expressions, would be far too costly to apply to every HTML element of a page. Fortunately, it does catch the vast majority of cases successfully.

However, it does undesirably exclude some cases, e.g.

Sometimes addresses are prefixed with eth: or similar, e.g. according to ERC-3770. https://app.safe.global and similar sites do this, although luckily they tend to wrap the address in a separate HTML element from the prefix, so in practice this is rarely an issue.
Support labelling of function selectors, event topics etc. #28
Addresses in bytes calldata are not labeled #531
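
As an illustration of the first case above, a prefixed address could be split into its chain short name and address before the lookup. This is only a sketch: `parsePrefixedAddress` and `PrefixedAddress` are hypothetical names, and the character set allowed for the short name is an assumption rather than something taken from the ERC-3770 text.

```typescript
// Hypothetical helper; not part of rolod0x.
interface PrefixedAddress {
  shortName?: string; // chain short name such as "eth", if present
  address: string; // the 0x-prefixed, 40-hex-digit address
}

// Assumption: short names consist of letters, digits, "_" and "-".
const PREFIXED = /^(?:([A-Za-z0-9_-]+):)?(0x[0-9a-fA-F]{40})$/;

// Accepts "eth:0x..." or a bare "0x...", ignoring surrounding whitespace.
// Returns null when the text is anything else.
function parsePrefixedAddress(text: string): PrefixedAddress | null {
  const m = PREFIXED.exec(text.trim());
  if (!m) return null;
  return { shortName: m[1], address: m[2] };
}

const prefixed = parsePrefixedAddress("eth:0x976EA74026E726554dB657fA54763abd0C3a0aa9");
const bare = parsePrefixedAddress("0x976EA74026E726554dB657fA54763abd0C3a0aa9");
```

The extracted `address` could then be passed to the existing whole-text lookup. This only helps when the prefix and address are the element's entire contents, so it does not address the general substring problem.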

So ideally we could implement a mechanism for substring replacement without harming performance. Possible solutions:

Run benchmarks to test whether the above assumption is overly pessimistic.
Find an efficient address recognition mechanism. Maybe compiling a huge regexp once (similarly to how we currently build the mapping) and then applying it to each element is actually sufficiently quick?
Accept that there's no performant way to detect within substrings, so come up with a more expensive replacement mechanism but only use it in specific circumstances, e.g. on particular sites and/or HTML elements as per allow inclusions/exclusions per site / page / page elements #17.
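
The first two options could be explored with something like the sketch below: compile one alternation over all known addresses once, apply it to text with `String.prototype.replace`, and time repeated runs. The names (`compileMatcher`, `labelSubstrings`, `timeIt`) are hypothetical, the mapping is assumed to have lower-cased keys, and nothing here is taken from the rolod0x codebase. Whether this is fast enough on real pages is exactly the open question, so no timings are claimed.

```typescript
// Hypothetical names throughout; a sketch, not rolod0x's implementation.
type LabelMap = Map<string, string>;

// Escape regex metacharacters so each key is matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Compile a single case-insensitive alternation over all keys, once.
// The \b anchors stop a match inside a longer run of word characters.
function compileMatcher(labels: LabelMap): RegExp | null {
  if (labels.size === 0) return null;
  const alternatives = [...labels.keys()].map(escapeRegExp).join("|");
  return new RegExp(`\\b(?:${alternatives})\\b`, "gi");
}

// Replace every known address inside the text with its label.
// Assumes the map keys are lower-cased.
function labelSubstrings(text: string, matcher: RegExp, labels: LabelMap): string {
  return text.replace(matcher, (match) => labels.get(match.toLowerCase()) ?? match);
}

// Coarse timing helper: runs fn `iterations` times, returns elapsed ms.
function timeIt(fn: () => void, iterations: number): number {
  const start = Date.now();
  for (let i = 0; i < iterations; i++) fn();
  return Date.now() - start;
}

const addressLabels: LabelMap = new Map([
  ["0x976ea74026e726554db657fa54763abd0c3a0aa9", "my wallet"],
]);
const matcher = compileMatcher(addressLabels);
const input = "Your wallet address is 0x976EA74026E726554dB657fA54763abd0C3a0aa9";
const output = matcher ? labelSubstrings(input, matcher, addressLabels) : input;

// Example benchmark call; a real test would use a realistic mapping size
// and text taken from real pages.
const elapsedMs = matcher
  ? timeIt(() => labelSubstrings(input, matcher, addressLabels), 10000)
  : 0;
```

For the third option, the same `labelSubstrings` call could be gated behind a per-site or per-element check so that the more expensive path only runs where it has been enabled.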

aspiers added the enhancement (New feature or request) label Feb 26, 2025

aspiers self-assigned this Feb 26, 2025

github-project-automation bot added this to rolod0x releases Feb 26, 2025

github-project-automation bot moved this to Backlog in rolod0x releases Feb 26, 2025

This was referenced Feb 26, 2025

allow inclusions/exclusions per site / page / page elements #17

Open

Addresses in bytes calldata are not labeled #531

Open

aspiers added the non-address labelling (For labelling of things other than addresses) and performance labels Feb 26, 2025
