perf: Avoid unnecessary URL processing while parsing links (#13132)
There are three optimizations in this commit, in descending order of
impact:
- If the file URL in the "project detail" response is already absolute,
then avoid calling urljoin(), as it's expensive (mostly because it
calls urlparse() on both of its URL arguments) and has no effect in
that case. While it'd be more correct to check whether the file URL
has a scheme, we'd need to parse the URL, which is what we're trying
to avoid in the first place. By simply checking whether the URL starts
with http[s]://, we can avoid slow urljoin() calls for PyPI responses.
- Replacing urllib.parse.urlparse() with urllib.parse.urlsplit() in
_ensure_quoted_url(). The URL parsing functions are equivalent for our
needs[^1]. However, urlsplit() is faster, and we achieve better cache
utilization of its internal cache if we call it directly[^2].
- Calculating the Link.path property in advance, as it's accessed very
frequently (it's on a hot path).
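
As a rough illustration of the first and third points, here is a minimal sketch. The names `resolve_file_url` and `LinkSketch` are hypothetical and are not pip's actual code; the sketch also assumes the path is percent-decoded with unquote(), which may differ from the real implementation.

```python
from urllib.parse import unquote, urljoin, urlsplit


def resolve_file_url(page_url: str, file_url: str) -> str:
    """Return an absolute URL for file_url (hypothetical helper)."""
    # Cheap prefix check: absolute http(s) URLs are returned as-is,
    # skipping urljoin() and the URL parsing it performs.
    if file_url.startswith(("http://", "https://")):
        return file_url
    # Anything else (relative URLs, other schemes) takes the slow path.
    return urljoin(page_url, file_url)


class LinkSketch:
    """Hypothetical stand-in for a link object with an eagerly computed path."""

    __slots__ = ("_parsed_url", "_path")

    def __init__(self, url: str) -> None:
        self._parsed_url = urlsplit(url)
        # Compute once at construction instead of on every property access.
        self._path = unquote(self._parsed_url.path)

    @property
    def path(self) -> str:
        return self._path
```

The prefix check trades strict correctness for speed: a URL with another scheme still goes through urljoin(), so only the common http(s) case gets the fast path.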
[^1]: we don't care about URL parameters AFAIK (which are different from
the query component!)
[^2]: urlparse() calls urlsplit() internally, but it passes the authority
parameter (unlike any of our calls) so it bypasses the cache.
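
To make footnote 1 concrete, the following sketch compares the two functions on a made-up URL (the URL itself is only an example). It shows that they differ only in how a trailing ";params" suffix on the last path segment is handled.

```python
from urllib.parse import urlparse, urlsplit

# Example URL with a ";params" suffix, a query and a fragment.
url = "https://example.com/packages/foo-1.0.tar.gz;params?q=1#sha256=abc"

split = urlsplit(url)
parsed = urlparse(url)

# Scheme, netloc, query and fragment are identical in both results.
assert split.scheme == parsed.scheme == "https"
assert split.netloc == parsed.netloc == "example.com"
assert split.query == parsed.query == "q=1"
assert split.fragment == parsed.fragment == "sha256=abc"

# Only urlparse() splits the parameters off the path.
assert parsed.params == "params"
assert parsed.path == "/packages/foo-1.0.tar.gz"
assert split.path == "/packages/foo-1.0.tar.gz;params"
```

For URLs without a ";" in the last path segment, the path components match as well, which is why the two functions are interchangeable when parameters are irrelevant.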
Co-authored-by: Stéphane Bidoul <[email protected]>