From 70925237a88d9802bfe7224fe9c78b146af615be Mon Sep 17 00:00:00 2001
From: Anne van Kesteren
Date: Mon, 3 Apr 2017 12:40:46 +0200
Subject: [PATCH] Editorial: use noncharacter and control from Infra

See https://github.com/whatwg/infra/pull/114 for the change to Infra.
---
 source | 50 ++++++++++++++++++++------------------------------
 1 file changed, 20 insertions(+), 30 deletions(-)

diff --git a/source b/source
index 6a3a99a0ef6..67ec587ddbb 100644
--- a/source
+++ b/source
@@ -1960,9 +1960,9 @@ a.setAttribute('href', 'https://example.com/'); // change the content attribute
 different than its previous value; setting an attribute to a value it already has does not change it.

-  The term empty, when used for an attribute value, Text node, or
-  string, means that the length of the text is zero (i.e. not even containing spaces or control
-  characters).
+  The term empty, when used for an attribute value, Text node,
+  or string, means that the length of the text is zero (i.e., not even containing controls or U+0020 SPACE).

 An element's child text content is the concatenation of the data of all the Text nodes that are children of the
@@ -2369,9 +2369,11 @@ a.setAttribute('href', 'https://example.com/'); // change the content attribute
 character
   • surrogate
   • scalar value
+  • noncharacter
   • JavaScript string length
   • string length
   • ASCII whitespace
+  • control
   • ASCII digit
   • ASCII upper hex digit
   • ASCII lower hex digit
@@ -4129,9 +4131,6 @@ a.setAttribute('href', 'https://example.com/'); // change the content attribute

    This is not to be confused with the "White_Space" value (abbreviated "WS") of the "Bidi_Class" property in the Unicode.txt data file.

-  The control characters are those whose Unicode "General_Category" property has the
-  value "Cc" in the Unicode UnicodeData.txt data file.
-
 Some of the micro-parsers described below follow the pattern of having an input
@@ -10532,9 +10531,8 @@ console.assert(image.height === 200);
 whitespace).

 Text nodes and attribute values must consist of scalar
-  values, must not contain U+0000 characters, must not contain permanently undefined
-  characters (noncharacters), and must not contain control characters other than
-  ASCII whitespace.
+  values, excluding noncharacters, and controls other than ASCII whitespace.
-  U+000E to U+001F,
-  U+007F to U+009F, U+FDD0 to U+FDEF, and
-  characters U+000B, U+FFFE, U+FFFF, U+1FFFE, U+1FFFF, U+2FFFE, U+2FFFF, U+3FFFE, U+3FFFF, U+4FFFE,
-  U+4FFFF, U+5FFFE, U+5FFFF, U+6FFFE, U+6FFFF, U+7FFFE, U+7FFFF, U+8FFFE, U+8FFFF, U+9FFFE, U+9FFFF,
-  U+AFFFE, U+AFFFF, U+BFFFE, U+BFFFF, U+CFFFE, U+CFFFF, U+DFFFE, U+DFFFF, U+EFFFE, U+EFFFF, U+FFFFE,
-  U+FFFFF, U+10FFFE, and U+10FFFF are parse errors. These are all
-  control characters or permanently undefined characters (noncharacters).
-
-  Any character that is a not a scalar value, i.e. any isolated
-  surrogate, is a parse error. (These can only find their way into the input stream via
-  script APIs such as document.write().)
+  Any occurrences of surrogates, noncharacters, or controls other than
+  ASCII whitespace are parse errors.
+
+  Isolated surrogates can only find their way into the input stream via script APIs
+  such as document.write().
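
Reviewer note: the new wording replaces the explicit list of code points with three Infra terms. The sketch below shows how that rule could be checked. It is not part of the patch or of the specification. All function names are hypothetical, and the ranges are an assumption based on my reading of the Infra Standard definitions of surrogate, noncharacter, control, and ASCII whitespace.

```javascript
// Hypothetical helpers for illustration only; none of these names come from the patch.
// Ranges are assumed from the Infra Standard definitions.

function isSurrogate(cp) {
  // U+D800 to U+DFFF, inclusive.
  return cp >= 0xD800 && cp <= 0xDFFF;
}

function isNoncharacter(cp) {
  // U+FDD0 to U+FDEF, plus the last two code points of every plane
  // (U+FFFE, U+FFFF, U+1FFFE, U+1FFFF, ..., U+10FFFE, U+10FFFF).
  return (cp >= 0xFDD0 && cp <= 0xFDEF) || (cp & 0xFFFE) === 0xFFFE;
}

function isControl(cp) {
  // C0 controls (U+0000 to U+001F) or U+007F to U+009F.
  return cp <= 0x1F || (cp >= 0x7F && cp <= 0x9F);
}

function isAsciiWhitespace(cp) {
  // TAB, LF, FF, CR, SPACE.
  return cp === 0x09 || cp === 0x0A || cp === 0x0C || cp === 0x0D || cp === 0x20;
}

function isInputStreamParseError(cp) {
  // "surrogates, noncharacters, or controls other than ASCII whitespace"
  return isSurrogate(cp) || isNoncharacter(cp) ||
    (isControl(cp) && !isAsciiWhitespace(cp));
}

// Collect every code point in a JavaScript string that the rule would flag.
function findParseErrors(input) {
  const flagged = [];
  for (let i = 0; i < input.length; ) {
    // codePointAt returns the lone code unit for an isolated surrogate.
    const cp = input.codePointAt(i);
    if (isInputStreamParseError(cp)) flagged.push(cp);
    i += cp > 0xFFFF ? 2 : 1;
  }
  return flagged;
}
```

For example, `findParseErrors("a\u000Bb\uD800c\uFFFE \n")` flags U+000B, U+D800, and U+FFFE, while the space and line feed pass because they are ASCII whitespace.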

    U+000D CARRIAGE RETURN (CR) characters and U+000A LINE FEED (LF) characters are treated specially. Any LF character that immediately follows a CR character must be ignored, and all CR