Additional guidance/clarification on RP ID and origin validation #2059

Open
zacknewman opened this issue Apr 21, 2024 · 0 comments

zacknewman commented Apr 21, 2024

The RP ID is required to be a valid domain string, which is the string representation of a valid domain. The definition of a valid domain cites issue 245, which raises the following points:

  1. ASCII case insensitivity.
  2. _, possibly among other ASCII code points, should be allowed.

The algorithm for determining a valid domain does not require the original domain input to match the output of step 1 or step 3.

Currently, the only statement on origin validation is "Validation MAY be performed by exact string matching or any other method as needed".

It would be nice if some guidance were provided on origin validation where the RP ID and origin disagree on case alone or, in the more complicated case, disagree on the syntactic representation of a domain whose semantics are equivalent according to IDNA.
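
To make the "case alone" situation concrete, here is a minimal sketch using only the Rust standard library. Both function names are hypothetical, and the sketch assumes the RP ID is compared for equality against the host portion of the origin, which is a simplification. It shows that ASCII case-insensitive matching covers the simple case but says nothing about IDNA-equivalent spellings.

```rust
/// Hypothetical helper: exact string matching, one of the methods the
/// current text explicitly permits.
fn matches_exactly(rp_id: &str, origin_host: &str) -> bool {
    rp_id == origin_host
}

/// Hypothetical helper: tolerates differences in ASCII case only.
/// Non-ASCII code points are still compared byte for byte, so this is
/// NOT an IDNA-aware comparison.
fn matches_ignoring_ascii_case(rp_id: &str, origin_host: &str) -> bool {
    rp_id.eq_ignore_ascii_case(origin_host)
}
```

With these, "example.com" and "ExaMple.com" match only under the second function, while "λ.example.com" matches neither "Λ.example.com" nor "xn--wxa.example.com" under either function.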

For example, as it stands now, any domain containing an _ is not a valid domain because applying the domain-to-ascii algorithm results in failure. This goes against point 2 raised in issue 245.

Additionally, the domains below are all valid and semantically equivalent according to IDNA, but they are syntactically different:

  1. λ.example.com
  2. Λ.ExaMple.com
  3. xn--wxa.example.com
  4. Xn--Wxa.ExAmple.com

Is there any recommendation on requiring both RP IDs and origins to not only be a valid domain string but, more strictly, to match exactly the result returned from step 3 (e.g., only item 1 above is valid)? Or a relaxed recommendation that would allow all four items above and require them to be treated as the same domain? Or a recommendation that only domains with A-labels be allowed and that they match exactly the result returned from step 1 (e.g., only item 3 above is valid)?
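
The three options could be sketched roughly as follows. Everything here is hypothetical and for illustration only: the Canonical type stands in for the step 1 and step 3 outputs, which are assumed to come from some IDNA implementation, and Policy, is_accepted and same_domain are names I made up.

```rust
/// Hypothetical container for the outputs of the valid-domain algorithm
/// for one input; assumed to be produced by an IDNA implementation.
struct Canonical<'a> {
    /// Output of step 1: the form containing only A-labels.
    ascii: &'a str,
    /// Output of step 3: the form that item 1 above is written in.
    unicode: &'a str,
}

/// The three possible recommendations asked about above.
#[derive(Clone, Copy)]
enum Policy {
    /// Input must equal the step 3 output exactly.
    ExactUnicode,
    /// Any input for which the algorithm succeeded is allowed.
    AnyEquivalent,
    /// Input must equal the step 1 output exactly.
    ExactAscii,
}

/// Whether `input` is acceptable under `policy`, given its canonical forms.
fn is_accepted(policy: Policy, input: &str, canonical: &Canonical<'_>) -> bool {
    match policy {
        Policy::ExactUnicode => input == canonical.unicode,
        Policy::AnyEquivalent => true,
        Policy::ExactAscii => input == canonical.ascii,
    }
}

/// Under the relaxed policy, two inputs would be treated as the same
/// domain when their step 1 outputs agree.
fn same_domain(a: &Canonical<'_>, b: &Canonical<'_>) -> bool {
    a.ascii == b.ascii
}
```

For the four items above, the canonical forms would be ascii = "xn--wxa.example.com" and unicode = "λ.example.com", so ExactUnicode accepts only item 1, ExactAscii accepts only item 3, and AnyEquivalent accepts all four.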

Example code in Rust using the idna crate:

use std::io::{self, Error, StdoutLock, Write as _};

fn main() -> Result<(), Error> {
    let mut stdout = io::stdout().lock();
    // The first four inputs are the IDNA-equivalent spellings listed above;
    // the last one contains an underscore.
    for input in [
        "λ.example.com",
        "Λ.ExaMple.com",
        "xn--wxa.example.com",
        "Xn--Wxa.ExAmple.com",
        "www_ww.example.com",
    ] {
        idna_transform(&mut stdout, input)?;
    }
    Ok(())
}

// Prints the input followed by either its form with only A-labels or the
// error returned by the strict domain-to-ascii algorithm.
fn idna_transform(stdout: &mut StdoutLock<'_>, input: &str) -> Result<(), Error> {
    write!(stdout, "original input: {input}, ")?;
    match idna::domain_to_ascii_strict(input) {
        Err(err) => writeln!(stdout, "domain-to-ascii algorithm fails on input: {err}"),
        Ok(ascii) => writeln!(stdout, "canonical domain with only A-labels: {ascii}"),
    }
}

Output from the above program:

original input: λ.example.com, canonical domain with only A-labels: xn--wxa.example.com
original input: Λ.ExaMple.com, canonical domain with only A-labels: xn--wxa.example.com
original input: xn--wxa.example.com, canonical domain with only A-labels: xn--wxa.example.com
original input: Xn--Wxa.ExAmple.com, canonical domain with only A-labels: xn--wxa.example.com
original input: www_ww.example.com, domain-to-ascii algorithm fails on input: Errors