Skip to content

Conversation

onerandomusername
Copy link
Contributor

@onerandomusername onerandomusername commented Oct 13, 2025

Add URL normalization checks using yarl in code snippets.

closes #3315

Thanks to @decorator-factory for the idea.


tested on @pydis-bot and deployed on @monty-python-bot

Copy link
Member

@jb3 jb3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, should we add & pin yarl as we are now explicitly using it? (I know it's present as a dependency of aiohttp but we should probably pin it ourselves).

Copy link
Member

@jb3 jb3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solid, thanks!

Copy link
Contributor

@wookie184 wookie184 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is some potential for false positives here as there is nothing inherently wrong with an unnormalised URL. I don't think it's really an issue because I can only think of examples that you would have to craft on purpose, e.g. if you percentage encode part of the URL (e.g. if you had a file called # or ? :D) and decide to do it in lowercase %3f, yarl will normalise to uppercase %3F.

This is definitely better than allowing arbitrary requests to be made, and I can't really think of a neater approach.

Ideally we would probably have a helper we can use everywhere when substituting stuff into URLs (possibly a nice use case for PEP 750...). Even this ends up being a bit difficult to do perfectly though, as some URL parameters need to be more permissive than others, and it's not always clear when to quote vs when to just reject something.

@onerandomusername
Copy link
Contributor Author

Let me check that before you merge

@onerandomusername onerandomusername marked this pull request as draft October 21, 2025 19:34
@onerandomusername
Copy link
Contributor Author

Tested on this branch, seems to work:
image

Meanwhile, on main:
image

raw url tested: https://github.com/onerandomusername/ghre-tester/blob/9be3025d879813db5f6524338002539ad0264e45/README.md%3Fplain%3D1#L1

@onerandomusername onerandomusername marked this pull request as ready for review October 21, 2025 19:45
@wookie184
Copy link
Contributor

Never mind I just misread the code, I see no issues with this

@wookie184 wookie184 merged commit db15f63 into python-discord:main Oct 21, 2025
5 checks passed
@onerandomusername onerandomusername deleted the patch-1 branch October 21, 2025 19:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Unsafe URL construction in the CodeSnippets cog

3 participants