68 changes: 68 additions & 0 deletions spec/formatting.md
@@ -18,3 +18,71 @@ the local variable takes precedence.

It is an error for a local variable definition to
refer to a local variable that's defined after it in the message.

## Error Handling

During the formatting of a message,
various errors may be encountered.
These are divided into the following categories:

- **Syntax errors** occur when the syntax representation of a message is invalid.
- **Resolution errors** occur when the runtime value of a part of a message
Collaborator

It is unclear what this means.

Can we say "function resolution"?
There are very few things that can go wrong:

  • local variable definitions => similar to placeholders
  • plain text parts
  • placeholders => have a variable / literal part + function name + bag of options
  • selectors => have again variable(s) / literal part + function name + bag of options

We already have unresolved as a class of errors a bit below.
So, what else can go wrong in the (already parsed) parts above?
I think only function names?

So how about we call this section "Function resolution" instead of "resolution errors"?
"Resolution" is not explained in the spec, and we don't even agree that it is needed. We might agree that it is an implementation detail, in which case there is no need to mention it in the spec.

Collaborator Author

The difference between the "resolution" and "formatting" errors that's proposed here is perhaps best seen by considering how a Placeholder containing an Expression is handled. Let's say that we have something like

{(foo) :message}

or

{(user_age) :global}

where :message and/or :global is a custom Function that uses its argument to look up a value from elsewhere, and that this value is then formatted as a part of the final message.

During the formatting of this, the custom code could then emit two different sorts of errors:

  1. Resolution Error if there's a failure in getting the value that is to be formatted, e.g. if no foo message is available or if the user_age global is not set.
  2. Formatting Error if the found value can't be formatted, e.g. because the foo message includes a variable reference that can't be resolved, or the user_age value turns out to have some unexpected shape.

Would you agree that these are different error categories, and that this categorical split could lead to different error handling in user code?

Member

If user_age isn't set, isn't that an unresolved variable error?

If no foo message is available, that would be an internal error of the format function message, right? Does that mean "resolution error" is really "function internal error"?

In any case, I think I would move this item below some of the others here (perhaps to the bottom of the list), since I find myself thinking that this could also mean unresolved or formatting error when in fact this error would probably only occur later (only when the pattern is syntactically correct and all of the variables and functions have been resolved but there is still a problem).

Or am I still not understanding?

Collaborator Author

Both foo and user_age are literal values here, so the errors are coming from the :message and :global custom functions; from the PoV of the core implementation, there are no variables here to resolve.

The intent here is to allow for a custom formatting function to emit two different kinds of errors: Resolution and Formatting. This is meant to enable something like :message or :global to work from an end-user PoV as much as possible like core features such as $var.

Collaborator

Nitpick: I don't think what I am saying here affects the spec, only the answer above.

I think it is contradictory:

  • Resolution ... failure in getting the value that is to be formatted
  • Formatting ... includes a variable reference that can't be resolved ...

It is unclear what the difference is between the two: "get the value" and "resolve variable reference".
The "publicly visible" operation is probably "I have a variable named foo, I want to get the value"
There is no "reference" except deep in the implementation.

But that implementation detail should not "leak" in the kind of error I am getting.

Collaborator Author

Please note that the syntax example for the above discussion is this placeholder:

{(foo) :message}

The foo here is not a variable reference, it's a literal value that a custom :message function is interpreting as a message identifier.

cannot be determined.

- **Unresolved Variable** errors occur when a variable reference cannot be resolved.

- **Selection errors** cover failures encountered during selection.

  - **Selector errors** are failures in the matching of a key to a specific selector.
  - **Missing Fallback** errors occur when no Variant is selected
    due to the message not including a Variant with only catch-all keys.

- **Formatting errors** occur during the formatting of a resolved value,
for example when encountering a value with an unsupported type
or an internally inconsistent set of options.

During selection, an expression handler must only emit Resolution and Selection errors.
During formatting, an expression handler must only emit Resolution and Formatting errors.
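
As a rough illustration of the categories and the per-phase restriction above, here is a minimal Python sketch. All class names, the `Phase` enum, and the `is_allowed` helper are hypothetical and not defined by this spec. Treating Unresolved Variable as a kind of Resolution error is also an assumption of the sketch.

```python
from enum import Enum


class Phase(Enum):
    SELECTION = "selection"
    FORMATTING = "formatting"


class MessageError(Exception):
    """Hypothetical base class for every error category."""


class MessageSyntaxError(MessageError):
    pass


class ResolutionError(MessageError):
    pass


class UnresolvedVariableError(ResolutionError):
    # Assumption: modelled here as a sub-category of Resolution errors.
    pass


class SelectionError(MessageError):
    pass


class SelectorError(SelectionError):
    pass


class MissingFallbackError(SelectionError):
    pass


class FormattingError(MessageError):
    pass


# Error categories an expression handler may emit in each phase.
ALLOWED = {
    Phase.SELECTION: (ResolutionError, SelectionError),
    Phase.FORMATTING: (ResolutionError, FormattingError),
}


def is_allowed(phase, error):
    """Return True if `error` belongs to a category permitted in `phase`."""
    return isinstance(error, ALLOWED[phase])
```

With this shape, a caller could reject or re-wrap a Formatting error that a handler raises while it is being used as a selector.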

In all cases, when encountering an error,
a message formatter must provide some representation of the message.
An informative error or errors must also be separately provided.
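
One possible way to satisfy both requirements is to return the formatted text and the collected errors side by side. The sketch below is only an assumption about API shape; `FormatResult` and `format_greeting` are hypothetical names, and the hard-coded greeting stands in for a real message.

```python
from dataclasses import dataclass, field


@dataclass
class FormatResult:
    text: str                                   # always present, even on error
    errors: list = field(default_factory=list)  # reported separately from text


def format_greeting(scope):
    """Format a fixed 'Hello, {$name}!' message against `scope`."""
    if "name" in scope:
        return FormatResult(text=f"Hello, {scope['name']}!")
    # The variable is missing: still produce some representation of the
    # message, and report the problem alongside it.
    return FormatResult(
        text="Hello, {$name}!",
        errors=["Unresolved Variable: name"],
    )
```

A caller can then choose to display `text` regardless, and log or surface `errors` as appropriate.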

When an error occurs in the syntax or resolution of an Expression or MarkupStart Option,
the Expression or MarkupStart in question is processed as if the option was not defined.
This may allow for the fallback handling described below to be avoided,
though an error must still be emitted.
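
The option handling above can be sketched as follows. This is a simplified illustration, not a reference implementation: `resolve_value` and `resolve_options` are hypothetical helpers, option values are plain strings, and a leading `$` is assumed to mark a variable reference.

```python
def resolve_value(raw, scope):
    """Resolve a raw option value: '$name' looks up a variable, else literal."""
    if raw.startswith("$"):
        name = raw[1:]
        if name not in scope:
            raise LookupError(f"unresolved variable: {name}")
        return scope[name]
    return raw


def resolve_options(raw_options, scope, errors):
    """Resolve every option, dropping any option that fails to resolve.

    A failed option is omitted from the result, as if it had never been
    defined, but the failure is still recorded in `errors`.
    """
    resolved = {}
    for name, raw in raw_options.items():
        try:
            resolved[name] = resolve_value(raw, scope)
        except LookupError as err:
            errors.append((name, err))
    return resolved
```

Because the broken option is simply absent, the expression can often still be formatted normally, which is why the fallback representation may be avoided.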

When an error occurs within a Selector,
the selector must not match any VariantKey other than the catch-all `*`
and a Selector error is emitted.
When selection fails to match any Variant,
an empty string is used as the formatted string representation of the message
and a Missing Fallback error is emitted.
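
The selection fallback rules can be illustrated with a small Python sketch. It makes several simplifying assumptions that are not part of this spec: selector values are plain strings compared by equality, a failed selector is represented by `None`, and the first matching variant wins. The function and constant names are hypothetical.

```python
CATCH_ALL = "*"


def select_variant(selector_values, variants, errors):
    """Pick the first variant whose keys all match.

    selector_values: one resolved string per selector, or None if that
        selector hit an error.
    variants: list of (keys, pattern) pairs, in message order.
    errors: list that collects error descriptions.
    """
    for value in selector_values:
        if value is None:
            errors.append("Selector error")

    for keys, pattern in variants:
        # A failed selector (None) can only be matched by the catch-all key.
        if all(
            key == CATCH_ALL or (value is not None and key == value)
            for key, value in zip(keys, selector_values)
        ):
            return pattern

    # No variant matched: empty string plus a Missing Fallback error.
    errors.append("Missing Fallback error")
    return ""
```

For example, with the variants `one` and `*`, a failed selector yields the `*` pattern together with a Selector error, while a message that lacks the `*` variant yields an empty string and a Missing Fallback error.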

When an error occurs in a Placeholder that is being formatted,
the fallback string representation of the Placeholder
always starts with U+007B LEFT CURLY BRACKET `{`
and ends with U+007D RIGHT CURLY BRACKET `}`.
Comment on lines +210 to +211
Member

We might have a bidi consideration here? If the string being formatted is in Arabic, we might emit an FSI before the { and a PDI after the }. Format patterns use lots of neutrals (such as $ and :) and look like gibberish in a bidi context.

Collaborator Author

I think this should be covered by the text in PR #315? The intent there is to define a prefix and a suffix to a part's formatted string representation, which here would be something like {$foo}.

Between the brackets, the following contents are used:

- Expression with Literal Operand: U+0028 LEFT PARENTHESIS `(`
followed by the value of the Literal,
and then by U+0029 RIGHT PARENTHESIS `)`
- Expression with Variable Operand: U+0024 DOLLAR SIGN `$`
followed by the Variable Name of the Operand
- Expression with no Operand: U+003A COLON `:` followed by the Expression Name
- Markup start: U+002B PLUS SIGN `+` followed by the MarkupStart Name
- Markup end: U+002D HYPHEN-MINUS `-` followed by the MarkupEnd Name
- Otherwise: Three U+003F QUESTION MARK `?` characters, i.e. `???`

For example, the formatted string representation of the expression `{$foo :bar}`
would be `{$foo}` if the variable could not be resolved.
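
The rules above for the placeholder fallback can be sketched in Python. The `kind` strings and the function signature are hypothetical; they only stand in for however an implementation represents its parsed placeholders.

```python
def placeholder_fallback(kind, operand=None, name=None):
    """Build the fallback string for a placeholder that failed to format.

    kind: one of "literal", "variable", "function", "markup-start",
        "markup-end"; anything else falls through to "???".
    operand: the Literal value or Variable Name, when there is an Operand.
    name: the Expression, MarkupStart or MarkupEnd Name, when relevant.
    """
    if kind == "literal":
        inner = "(" + operand + ")"
    elif kind == "variable":
        inner = "$" + operand
    elif kind == "function":        # Expression with no Operand
        inner = ":" + name
    elif kind == "markup-start":
        inner = "+" + name
    elif kind == "markup-end":
        inner = "-" + name
    else:
        inner = "???"
    return "{" + inner + "}"
```

Note that for `{$foo :bar}` only the operand is kept, so the function name `bar` does not appear in the fallback.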

The formatted string representation of a message with an unrecoverable syntax error
is the concatenation of U+007B LEFT CURLY BRACKET `{`,
a string identifier for the message,
and U+007D RIGHT CURLY BRACKET `}`.
If an identifier is not available,
it is replaced with three U+003F QUESTION MARK `?` characters,
resulting in the string `{???}`.
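
A minimal sketch of this message-level fallback follows. The function name is hypothetical, and treating an empty identifier the same as a missing one is an assumption of the sketch rather than something this spec states.

```python
def message_fallback(message_id=None):
    """Fallback string for a message with an unrecoverable syntax error."""
    # Assumption: an empty identifier counts as "not available".
    identifier = message_id if message_id else "???"
    return "{" + identifier + "}"
```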