diff --git a/source b/source
index ee8ae95e76f..2c17252a3b2 100644
--- a/source
+++ b/source
@@ -45494,32 +45494,33 @@ interface HTMLInputElement : HTMLElement {

A valid e-mail address is a string that matches the email production of the following ABNF, the character set for which is Unicode. This ABNF implements the - extensions described in RFC 1123.

- -
email         = 1*( atext / "." ) "@" label *( "." label )
-label         = let-dig [ [ ldh-str ] let-dig ]  ; limited to a length of 63 characters by RFC 1034 section 3.5
-atext         = < as defined in RFC 5322 section 3.2.3 >
-let-dig       = < as defined in RFC 1034 section 3.5 >
-ldh-str       = < as defined in RFC 1034 section 3.5 >
- - + extensions described in RFC 1123 and includes support for internationalized email addresses as + described in RFC 6531. +

+ +
email      = localpart "@" domain
+localpart  = 1*( utext / "." )
+utext      = ALPHA / DIGIT / "!" /                    ; unreserved printable ASCII
+                 "#" / "$" / "%" / "&" / "'" / "*" /  ; as defined in RFC5322 section 3.2.3
+                 "+" / "-" / "/" / "=" / "?" / "^" /
+                 "_" / "`" / "{" / "|" / "}" / "~" /
+                 %x80-D7FF / %xE000-10FFFF            ; or any non-ASCII Unicode
+domain     = < a "valid host string", see URL section 3.4 >
+ +

This definition supports internationalized email addresses ("SMTPUTF8"), including + non-ASCII values in both the localpart (the mailbox name or "left hand side") and domain portions + of the address. The domain must be a valid + host string. Because of the details for encoding non-ASCII domain names, it's not possible to + describe the domain portion of an address in a simple regular expression. The number and range of + Unicode characters permitted are interdependent and somewhat variable. The URL spec, + Section 3.5 describes how the domain is + validated.
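As a rough illustration of the localpart half of this grammar, here is a minimal Python sketch. The names `is_utext`, `is_valid_localpart`, and `split_address` are hypothetical and not part of the patch. The sketch splits at the first "@" (safe because `utext` does not include "@") and returns the domain unchecked; validating it as a valid host string is left to a URL Standard host parser and is not attempted here.

```python
import string

# ASCII characters allowed by the utext production (atext from RFC 5322).
ASCII_UTEXT = set(string.ascii_letters + string.digits + "!#$%&'*+-/=?^_`{|}~")


def is_utext(ch: str) -> bool:
    """True if ch matches utext: listed ASCII, or non-ASCII excluding surrogates."""
    cp = ord(ch)
    return ch in ASCII_UTEXT or 0x80 <= cp <= 0xD7FF or 0xE000 <= cp <= 0x10FFFF


def is_valid_localpart(localpart: str) -> bool:
    """localpart = 1*( utext / "." ): non-empty, every character utext or "."."""
    return len(localpart) > 0 and all(
        ch == "." or is_utext(ch) for ch in localpart
    )


def split_address(address: str):
    """Return (localpart, domain) if the localpart is valid, else None.

    Assumption: the domain is only checked for being non-empty; the caller
    must still validate it as a valid host string.
    """
    localpart, sep, domain = address.partition("@")
    if not sep or not domain or not is_valid_localpart(localpart):
        return None
    return localpart, domain
```

A caller would pass the returned domain to whatever host validation it has available; this sketch deliberately makes no claim about the domain.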

This requirement is a willful violation of RFC 5322, which defines a syntax for e-mail addresses that is simultaneously too strict (before the "@" character), too vague (after the "@" character), and too lax (allowing comments, whitespace characters, and quoted strings in manners unfamiliar to most users) to be of practical use here.

-
- -

The following JavaScript- and Perl-compatible regular expression is an implementation of the - above definition.

- -
/^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/
- - - -
-

A valid e-mail address list is a set of comma-separated tokens, where each token is itself a valid e-mail address. To obtain the list of tokens from a valid e-mail address list, an implementation must [RFC6350]

vCard Format Specification, S. Perreault. IETF.
+
[RFC6531]
+
SMTP Extension for Internationalized Email, J. Yao, M. Mao. IETF.
+
[RFC6596]
The Canonical Link Relation, M. Ohye, J. Kupke. IETF.