`rdf_load/[1,2]' changes the request IRI the user supplies #20

Open
wouterbeek opened this issue Jan 24, 2016 · 3 comments

@wouterbeek
Contributor

rdf_load/[1,2] performs IRI normalization before sending an HTTP request. IRI normalization introduces unnecessary percent escaping that is not supported by all servers, occasionally resulting in unsuccessful requests.

Reproducible case:

?- [library(semweb/rdf_db)].
?- [library(semweb/rdf_http_plugin)].
?- rdf_load('http://dbpedia.org/resource/Category:Politics').
% Parsed "http://dbpedia.org/resource/Category%3APolitics" in 0.00 sec; 0 triples
true.

If you visit http://dbpedia.org/resource/Category:Politics then you see that there are triples there.

@wouterbeek wouterbeek added the bug label Jan 24, 2016
@JanWielemaker
Member

Great. In a previous round, we decided that ":" must be escaped to prevent relative URIs from being read as absolute ones. The above makes it really hard to tell when you can or must escape. rdf_load escapes so that it can process the unescaped IRIs in the triples ...

@wouterbeek
Contributor Author

I'm not clear on the benefit of escaping ":" in places where this is not required. The only benefit that I can think of is processing speed, since the syntax for relative IRIs is recognizably different from the one for absolute IRIs.

@JanWielemaker
Member

It is rather odd. RFC 3986 indeed allows ":" in a path segment. However, if you have a relative URL, a ":" in the first path segment makes it ambiguous (it can also be read as an absolute URL). This problem was raised by Samer a while ago and led to the decision to escape the ":". Looking at JavaScript, we get

> encodeURIComponent("aap:noot")
"aap%3Anoot"
> encodeURI("http://www.example.com/aap:noot")
"http://www.example.com/aap:noot"

I'm a little lost :(
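
The ambiguity described above can be made concrete with the standard `URL` constructor available in browsers and Node.js. This is only a sketch: `resolve` is a hypothetical helper and the base `http://www.example.com/dir/` is an assumed example, neither comes from rdf_load. It suggests that only a relative reference whose first segment contains ":" is affected, and that RFC 3986 (section 4.2) offers a leading "./" as an alternative to percent-escaping.

```javascript
// Hypothetical helper (not part of any library): resolve a reference
// against a base IRI using the standard URL API and return it as a string.
function resolve(reference, base) {
  return new URL(reference, base).href;
}

// Assumed example base, chosen only for illustration.
const base = "http://www.example.com/dir/";

// The text before ":" is read as a scheme, so the base is ignored.
console.log(resolve("aap:noot", base));   // aap:noot

// Escaping keeps the reference relative, but the request path now has %3A.
console.log(resolve("aap%3Anoot", base)); // http://www.example.com/dir/aap%3Anoot

// A leading "./" keeps the reference relative and leaves ":" untouched.
console.log(resolve("./aap:noot", base)); // http://www.example.com/dir/aap:noot

// An absolute IRI, such as the one passed to rdf_load in this issue,
// is unambiguous: the ":" in its path is kept as-is.
console.log(resolve("http://dbpedia.org/resource/Category:Politics", base));
```

If this reading is right, escaping ":" would only be needed when serializing a relative reference, not when requesting an absolute IRI that the user supplied.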

@wouterbeek wouterbeek self-assigned this May 1, 2016