Add interchange data model description + JSON Schema definition #393
Changes from all commits
6bdc19a
4802d9b
655be51
a83ccb3
53c2aef
@@ -0,0 +1,181 @@
# DRAFT MessageFormat 2.0 Data Model

To work with messages defined in syntaxes other than that of MessageFormat 2,
an equivalent data model representation is also defined.
Implementations MAY provide interfaces which allow
for MessageFormat 2 syntax to be parsed into this representation,
for this representation to be serialized into MessageFormat 2 syntax
or any other syntax,
for messages presented in this representation to be formatted,
or for other operations to be performed on or with messages in this representation.

Implementations are not required to use this data model for their internal representation of messages.

To ensure compatibility across all platforms,
this interchange data model is defined in terms of JSON-compatible values
using TypeScript syntax for their definition.

## Messages

A `SelectMessage` corresponds to a syntax message that includes _selectors_.
A message without _selectors_ and with a single _pattern_ is represented by a `PatternMessage`.

```ts
type Message = PatternMessage | SelectMessage

interface PatternMessage {
  type: 'message'
  declarations: Declaration[]
  pattern: Pattern
}

interface SelectMessage {
  type: 'select'
  declarations: Declaration[]
  selectors: Expression[]
  variants: Variant[]
  // Review comment: Currently, the ICU4J [...]
  // This is less convenient to implement in ICU4C than it is to return a list of [...]
  // I like what you have here ([...]
  // Whether it ends up being a list or a map, mostly I just wanted to highlight
  // that the ICU4J and ICU4C implementations should match what's defined here.
}
```

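As a rough illustration of these shapes, the sketch below builds a `PatternMessage` by hand and checks that it survives a JSON round trip. The interfaces are reduced copies of the ones in this document, keeping only what the example needs. `TextElement` stands in for `Text` and is renamed purely to avoid a clash with the DOM's global `Text` type in some TypeScript setups. The message content is made up.

```ts
// Reduced copies of interfaces from this document, for illustration only.
interface TextElement {
  type: 'text'
  value: string
}

interface VariableRef {
  type: 'variable'
  name: string
}

interface Expression {
  type: 'expression'
  body: VariableRef
}

interface Pattern {
  body: Array<TextElement | Expression>
}

interface PatternMessage {
  type: 'message'
  declarations: never[] // this example has no declarations
  pattern: Pattern
}

// The text "Hello, ", then the variable $name, then the text "!".
const hello: PatternMessage = {
  type: 'message',
  declarations: [],
  pattern: {
    body: [
      { type: 'text', value: 'Hello, ' },
      { type: 'expression', body: { type: 'variable', name: 'name' } },
      { type: 'text', value: '!' }
    ]
  }
}

// Only JSON-compatible values are used, so serializing and parsing
// should give back an equivalent structure.
const roundTripped: PatternMessage = JSON.parse(JSON.stringify(hello))
```
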
Each message _declaration_ is represented by a `Declaration`,
which connects the `name` of the left-hand side _variable_
with its right-hand side `value`.
The `name` does not include the initial `$` of the _variable_.

```ts
interface Declaration {
  name: string
  value: Expression
}
```

In a `SelectMessage`,
the _variants_ are represented as an array of `Variant`,
each holding the `keys` and `value` of one _variant_.
For the `CatchallKey`, a string `value` may be provided to retain an identifier.
This is always `'*'` in MessageFormat 2 syntax, but may vary in other formats.

```ts
interface Variant {
  keys: Array<Literal | CatchallKey>
  value: Pattern
}

interface CatchallKey {
  type: '*'
  value?: string
}
```

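A minimal sketch of how variants and the catch-all key might look as data, assuming a message with a single selector. The interfaces are reduced copies of the ones in this document (patterns hold text only, and `TextElement` stands in for `Text`). The helper `isFallbackVariant` and the example content are hypothetical, and the catch-all `value` of `'other'` assumes a source format that spells its catch-all key that way.

```ts
// Reduced copies of interfaces from this document, for illustration only.
interface Literal {
  type: 'literal'
  quoted: boolean
  value: string
}

interface CatchallKey {
  type: '*'
  value?: string
}

interface TextElement {
  type: 'text'
  value: string
}

interface Pattern {
  body: TextElement[]
}

interface Variant {
  keys: Array<Literal | CatchallKey>
  value: Pattern
}

// Two made-up variants for one selector: the key `one`, and a catch-all.
const variants: Variant[] = [
  {
    keys: [{ type: 'literal', quoted: false, value: 'one' }],
    value: { body: [{ type: 'text', value: 'One item' }] }
  },
  {
    // `value` retains the identifier used for the catch-all in the source.
    keys: [{ type: '*', value: 'other' }],
    value: { body: [{ type: 'text', value: 'Many items' }] }
  }
]

// Hypothetical helper: a variant is a fallback if all of its keys are catch-alls.
function isFallbackVariant(variant: Variant): boolean {
  return variant.keys.every(key => key.type === '*')
}

const fallback = variants.find(isFallbackVariant)
```
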
## Patterns

Each `Pattern` represents a linear sequence, without selectors.
Each element of the sequence MUST have either a `Text` or an `Expression` shape.
`Text` represents literal _text_,
while `Expression` wraps each of the potential _expression_ shapes.
The `value` of `Text` is the "cooked" value (i.e. escape sequences are processed).

Implementations MUST NOT rely on the set of `Expression` `body` values being exhaustive,
as future versions of this specification MAY define additional expressions.
If encountering a `body` with an unrecognised value,
an implementation SHOULD treat it as it would a `Reserved` value.

```ts
interface Pattern {
  body: Array<Text | Expression>
}

interface Text {
  type: 'text'
  value: string
}

interface Expression {
  type: 'expression'
  body: Literal | VariableRef | FunctionRef | Reserved
  // Review comment: [...] I'd like to suggest an alternative way to structure our
  // expressions, to more closely map to our syntax:
  //   expression = "{" [s] ((operand [s annotation]) / annotation) [s] "}"
  // Let's special-case argument-less functions rather than function-less operands.
  // Instead of [...]
  //   type Expression = OperandExpr | FunctionExpr;
  //   interface OperandExpr {
  //     operand: Literal | VariableRef;
  //     annotation?: FunctionExpr;
  //   }
  //   interface FunctionExpr {
  //     name: string;
  //     options: Map<string, Literal | VariableRef>;
  //   }
  // FWIW, this is how I implemented expressions in stasm/message2: [...]
  // (Not blocking this PR on this, but I'd like to discuss this change as a follow-up.)
  // Reply: I will be happy to discuss this further in a follow-on issue or PR.
  // Reply: Filed #436 to continue this.
}
```

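The guidance on unrecognised `body` values can be sketched as a classifier that falls back to the `Reserved` treatment. This is one possible reading, not a required implementation. It assumes that a `body` is recognised by its `type` field, as with the other interfaces here. The names `Body`, `Kind` and `kindOf` are hypothetical, `Body` is deliberately typed loosely so that unknown shapes can be represented, and `TextElement` stands in for `Text`.

```ts
type Kind = 'text' | 'literal' | 'variable' | 'function' | 'reserved'

interface TextElement {
  type: 'text'
  value: string
}

// Loosely typed on purpose: any object with a `type`, known or not.
interface Body {
  type: string
  [field: string]: unknown
}

interface Expression {
  type: 'expression'
  body: Body
}

interface Pattern {
  body: Array<TextElement | Expression>
}

// Classify one pattern element without assuming the set of bodies is closed.
function kindOf(element: TextElement | Expression): Kind {
  if (element.type === 'text') return 'text'
  switch (element.body.type) {
    case 'literal':
      return 'literal'
    case 'variable':
      return 'variable'
    case 'function':
      return 'function'
    default:
      // Covers 'reserved' itself and any value this code does not recognise.
      return 'reserved'
  }
}

const pattern: Pattern = {
  body: [
    { type: 'text', value: 'Total: ' },
    { type: 'expression', body: { type: 'variable', name: 'total' } },
    { type: 'expression', body: { type: 'from-a-future-version' } }
  ]
}

const kinds = pattern.body.map(kindOf)
```
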
## Expressions

The `Literal` and `VariableRef` correspond to the _literal_ and _variable_ syntax rules.
When they are used as the `body` of an `Expression`,
they represent _expression_ values with no _annotation_.

An _unquoted_ value is represented by a `Literal` with `quoted: false`,
while a _quoted_ value would have `quoted: true`.
The `value` of `Literal` is the "cooked" value (i.e. escape sequences are processed).

In a `VariableRef`, the `name` does not include the initial `$` of the _variable_.

```ts
interface Literal {
  type: 'literal'
  quoted: boolean
  value: string
}

interface VariableRef {
  type: 'variable'
  name: string
}
```

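A small sketch of constructing these two shapes. The helpers `variableRef` and `literal` are hypothetical names, not part of the data model, and the inputs are assumed to be tokens that a parser has already isolated.

```ts
interface Literal {
  type: 'literal'
  quoted: boolean
  value: string
}

interface VariableRef {
  type: 'variable'
  name: string
}

// Hypothetical helper: drop the initial `$` when building a VariableRef.
function variableRef(token: string): VariableRef {
  if (token.charAt(0) !== '$' || token.length < 2) {
    throw new Error(`Not a variable: ${token}`)
  }
  return { type: 'variable', name: token.slice(1) }
}

// Hypothetical helper: `value` is expected to be already "cooked".
function literal(value: string, quoted: boolean): Literal {
  return { type: 'literal', quoted, value }
}

const count = variableRef('$count')
const unquotedLiteral = literal('42', false)
const quotedLiteral = literal('two words', true)
```
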
A `FunctionRef` represents an _expression_ with a _function_ _annotation_.
In a `FunctionRef`,
the `kind` corresponds to the starting sigil of a _function_:
`'open'` for `+`, `'close'` for `-`, and `'value'` for `:`.
The `name` does not include this starting sigil.

The optional `operand` is the _literal_ or _variable_
before the _annotation_ in the _expression_, if present.
Each _option_ is represented by an `Option`.

```ts
interface FunctionRef {
  type: 'function'
  kind: 'open' | 'close' | 'value'
  name: string
  operand?: Literal | VariableRef
  options?: Option[]
}

interface Option {
  name: string
  value: Literal | VariableRef
}
```

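The sigil-to-`kind` mapping described above can be sketched as a lookup table. The helper `functionRef` and the table `kindBySigil` are hypothetical, and the input is assumed to be a function token (sigil plus name) that a parser has already isolated.

```ts
interface Literal {
  type: 'literal'
  quoted: boolean
  value: string
}

interface VariableRef {
  type: 'variable'
  name: string
}

interface Option {
  name: string
  value: Literal | VariableRef
}

interface FunctionRef {
  type: 'function'
  kind: 'open' | 'close' | 'value'
  name: string
  operand?: Literal | VariableRef
  options?: Option[]
}

// The three starting sigils named in the text above.
const kindBySigil: { [sigil: string]: FunctionRef['kind'] | undefined } = {
  '+': 'open',
  '-': 'close',
  ':': 'value'
}

// Hypothetical helper: split a token such as ":number" into `kind` and `name`.
function functionRef(
  token: string,
  operand?: Literal | VariableRef,
  options?: Option[]
): FunctionRef {
  const kind = kindBySigil[token.charAt(0)]
  const name = token.slice(1)
  if (kind === undefined || name === '') {
    throw new Error(`Not a function token: ${token}`)
  }
  const ref: FunctionRef = { type: 'function', kind, name }
  // Optional fields are only set when present, keeping the JSON form minimal.
  if (operand !== undefined) ref.operand = operand
  if (options !== undefined) ref.options = options
  return ref
}

const numberRef = functionRef(':number', { type: 'variable', name: 'count' })
```
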
A `Reserved` represents an _expression_ with a _reserved_ _annotation_.
The `sigil` corresponds to the starting sigil of the _reserved_.
The `source` is the "raw" value (i.e. escape sequences are not processed)
and includes the starting `sigil`.

Implementations MUST NOT rely on the set of `sigil` values remaining constant,
as future versions of this specification MAY assign other meanings to such sigils.

If the _expression_ includes a _literal_ or _variable_ before the _annotation_,
it is included as the `operand`.

```ts
interface Reserved {
  type: 'reserved'
  sigil: '!' | '@' | '#' | '%' | '^' | '&' | '*' | '<' | '>' | '/' | '?' | '~'
  source: string
  operand?: Literal | VariableRef
}
```

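A minimal sketch of the relationship between `sigil` and `source`. The helper `reserved` is hypothetical and the annotation text is made up. As a deliberate simplification, `sigil` is typed as a plain `string` here rather than the union above, since the set of sigils is not guaranteed to stay constant.

```ts
interface Literal {
  type: 'literal'
  quoted: boolean
  value: string
}

interface VariableRef {
  type: 'variable'
  name: string
}

interface Reserved {
  type: 'reserved'
  sigil: string // loosened from the union in this document, see the note above
  source: string
  operand?: Literal | VariableRef
}

// Hypothetical helper: `source` is kept exactly as written, sigil included.
function reserved(source: string, operand?: Literal | VariableRef): Reserved {
  if (source === '') {
    throw new Error('A reserved annotation needs a starting sigil')
  }
  const result: Reserved = { type: 'reserved', sigil: source.charAt(0), source }
  if (operand !== undefined) result.operand = operand
  return result
}

// The backslashes stay in `source`: the value is "raw", not "cooked".
const example = reserved('!raw \\{text\\}')
```
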
## Extensions

Implementations MAY extend this data model with additional interfaces,
and MAY add new fields to existing interfaces.
When encountering an unfamiliar field, an implementation MUST ignore it.
For example, an implementation could include a `span` field on all interfaces
encoding the corresponding start and end positions in its source syntax.

In general,
implementations MUST NOT extend the sets of values for any defined field or type
when representing a valid message.
However, when using this data model to represent an invalid message,
an implementation MAY do so.
This is intended to allow for the representation of "junk" or invalid content within messages.
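
One way a consumer might ignore unfamiliar fields is to read only the fields it knows. The sketch below assumes a `span` made of numeric `start` and `end` offsets, which is only one possible encoding. `Span`, `TextWithSpan`, `toPlainText` and the `x-vendor` field are all hypothetical names, and `TextElement` stands in for `Text`.

```ts
// Hypothetical extension: source positions as start and end offsets.
interface Span {
  start: number
  end: number
}

interface TextElement {
  type: 'text'
  value: string
}

// An extended shape: the defined fields plus an optional `span`.
type TextWithSpan = TextElement & { span?: Span }

// Hypothetical helper: copy only the fields defined by the data model,
// so any additional fields on the input are ignored.
function toPlainText(element: TextElement): TextElement {
  return { type: element.type, value: element.value }
}

// Input as it might arrive from another implementation, with one known
// extension (`span`) and one unfamiliar field (`x-vendor`).
const received: TextWithSpan = JSON.parse(
  '{"type":"text","value":"Hi","span":{"start":0,"end":2},"x-vendor":true}'
)

const plain = toPlainText(received)
```
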