Proposal: Use input mapping functions for case selection #101

Closed
eemeli opened this issue Jul 22, 2020 · 2 comments
Labels
data model Issues related to the Interchange Data Model

Comments

Collaborator

eemeli commented Jul 22, 2020

Here's my proposal for how the case selection should happen, following on from #93. As a premise, we have an input value or values of unknown type and some number of cases, each identified by a key. If we allow a key to consist of more than one string part, the key parts have a set order.

First, let us consider selection with just one input. Furthermore, I'm presuming that a formatter function F returning a string may be defined for each input value, and that one of the cases may be defined as the default case. The process ends when a case is selected.

  1. Each key identifying a case is considered as a string k.
  2. A string representation s of the input value v is determined, e.g. "true" for a Boolean true value or "42" for the number 42.
  3. s is compared against each key. If equal, its case is selected.
  4. If a custom formatter F is defined:
    1. The string result c = F(v) of applying the formatter to v is determined.
    2. c is compared against each key. If equal, its case is selected.
  5. Else if v is a number, or the string representation of a number:
    1. Its numeric value n is determined.
    2. The string p is determined as the cardinal plural category for n in the current locale.
    3. p is compared against each key. If equal, its case is selected.
  6. If a default case is defined, it is selected.
  7. An empty string is selected as the case.
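
The steps above can be sketched roughly as follows. This is only an illustration under assumptions: the `Selector` shape, `toNumber`, and `selectCase` are hypothetical names that do not come from the proposal, and `Intl.PluralRules` stands in for "the cardinal plural category for n in the current locale".

```typescript
// Hypothetical shapes for illustration; none of these names come from the proposal.
type Formatter = (value: unknown) => string;

interface Selector {
  cases: Map<string, string>; // key k -> contents of the case
  defaultCase?: string;       // contents of the default case, if one is defined
  formatter?: Formatter;      // custom formatter F, if one is defined
}

// Returns the numeric value of a number or of a string representing a number.
function toNumber(v: unknown): number | undefined {
  if (typeof v === "number") return Number.isFinite(v) ? v : undefined;
  if (typeof v === "string" && v.trim() !== "") {
    const n = Number(v);
    return Number.isFinite(n) ? n : undefined;
  }
  return undefined;
}

function selectCase(sel: Selector, v: unknown, locale: string): string {
  // Steps 2-3: compare the string representation s of v against each key.
  const byString = sel.cases.get(String(v));
  if (byString !== undefined) return byString;

  if (sel.formatter) {
    // Step 4: compare c = F(v) against each key.
    const byFormatter = sel.cases.get(sel.formatter(v));
    if (byFormatter !== undefined) return byFormatter;
  } else {
    // Step 5: for numeric input, compare the cardinal plural category.
    const n = toNumber(v);
    if (n !== undefined) {
      const p = new Intl.PluralRules(locale, { type: "cardinal" }).select(n);
      const byPlural = sel.cases.get(p);
      if (byPlural !== undefined) return byPlural;
    }
  }

  // Steps 6-7: the default case if defined, otherwise an empty string.
  return sel.defaultCase ?? "";
}

const items: Selector = {
  cases: new Map([
    ["0", "no items"],
    ["one", "one item"],
    ["other", "many items"],
  ]),
};
selectCase(items, 0, "en"); // exact match on the key "0"
selectCase(items, 1, "en"); // falls through to the plural category "one"
```

Note that the exact string comparison runs before any formatter or plural logic, which is what lets a key such as "0" take precedence over a plural category.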

If more than one input may be used to select a case, the above process is applied to filter the cases down to one, starting with k as the leftmost key part and v as the value of its corresponding input. This assumes that key parts may also define defaults.
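
One possible reading of this filtering, sketched under assumptions: keys are arrays of key parts matched positionally (the proposal leaves positional versus identifier-based mapping open), a hypothetical `"*"` marks a default key part, and custom formatters are left out for brevity. None of the names below come from the proposal.

```typescript
interface Case {
  key: string[]; // ordered key parts
  value: string; // contents of the case
}

// Hypothetical marker for a default key part; the proposal does not define one.
const DEFAULT_PART = "*";

// Picks the key part that the value v selects among the candidate parts.
function matchPart(parts: Set<string>, v: unknown, locale: string): string | undefined {
  const s = String(v);
  if (parts.has(s)) return s;
  if (typeof v === "number" && Number.isFinite(v)) {
    const p = new Intl.PluralRules(locale, { type: "cardinal" }).select(v);
    if (parts.has(p)) return p;
  }
  return parts.has(DEFAULT_PART) ? DEFAULT_PART : undefined;
}

function selectMulti(cases: Case[], inputs: unknown[], locale: string): string {
  let remaining = cases;
  for (let i = 0; i < inputs.length; i++) {
    // Collect the i-th key part of every case still in play.
    const parts = new Set<string>();
    for (const c of remaining) {
      const part = c.key[i];
      if (part !== undefined) parts.add(part);
    }
    const hit = matchPart(parts, inputs[i], locale);
    if (hit === undefined) return ""; // nothing matches: empty string
    remaining = remaining.filter((c) => c.key[i] === hit);
  }
  return remaining[0]?.value ?? "";
}
```

This sketch filters greedily from the leftmost key part and never backtracks, so an early match can rule out a case that a later default part would otherwise have reached. Whether that is the desired behaviour is a design question the description above does not settle.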

Commentary

The above description is at least intended to be independent of the markup used to formulate it, but it does have dependencies on the data model; the selector needs slots for the formatter and its bag of options. In particular, I'm leaving it open here whether multiple-input selection maps keys to inputs positionally, or according to identifiers. I do have some thoughts about the markup, but that's a separate discussion.

Overall this is pretty close to how Fluent works, but expressed a bit differently. There, a custom formatter e.g. for ordinal plurals would be expressed as NUMBER($input, type: "ordinal"), with the plural category selection based on those options being a bit implicit. I'm not sure that Fluent's markup for defining the formatter is really clear, as it strongly implies that the formatting is happening before the first comparison to each of the keys.
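
As a rough illustration of the difference, the ordinal case could be expressed in this proposal's terms as a formatter F that maps the input straight to a plural category, which is then compared against the keys. The helper names below are hypothetical, and `Intl.PluralRules` is used as a stand-in for the locale's ordinal rules.

```typescript
// Hypothetical helper: builds a formatter F that maps a value to its
// ordinal plural category in the given locale.
function ordinalCategory(locale: string): (value: unknown) => string {
  const rules = new Intl.PluralRules(locale, { type: "ordinal" });
  return (value) => rules.select(Number(value));
}

// Cases keyed by ordinal category, as they might appear in an English message.
const suffixes: Record<string, string> = { one: "st", two: "nd", few: "rd", other: "th" };

const F = ordinalCategory("en");

function ordinal(n: number): string {
  // The key is selected from F(n); the number itself is formatted separately.
  return `${n}${suffixes[F(n)] ?? ""}`;
}

ordinal(1);  // "1st"
ordinal(22); // "22nd"
ordinal(11); // "11th"
```

Here the mapping to a category is explicit and separate from how the number is displayed, which is the distinction the Fluent syntax tends to blur.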

The last step of returning an empty string, rather than considering the lack of a default an error, is deliberate. As I see it, this is analogous to a missing default in a switch statement in other languages. It should be technically allowed, but possibly against good practice. In other words, we should also provide recommended style rules for MF2, which would allow linters to be developed using those recommendations. But it should not always be a compile-time error; an empty string is often the appropriate fallback.

This proposal makes it possible for a single selector to accept e.g. both numeric and boolean input values, such that cases with keys "true", "false", "one", and "other" could coexist next to each other and become matched with appropriate input values. This is intentional, but should probably also be a linter error using the default recommended rules.
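
A linter rule along these lines might look like the sketch below. The rule names `missing-default` and `mixed-key-kinds` and the function `lintSelector` are made up for illustration; no such rule set exists yet.

```typescript
const PLURAL_CATEGORIES = new Set(["zero", "one", "two", "few", "many", "other"]);
const BOOLEAN_KEYS = new Set(["true", "false"]);

// Returns the names of the (hypothetical) recommended rules that a selector violates.
function lintSelector(keys: string[], hasDefault: boolean): string[] {
  const warnings: string[] = [];

  // Allowed at runtime (an empty string is selected), but flagged as a style issue.
  if (!hasDefault) warnings.push("missing-default");

  // Keys that only make sense for different input types sharing one selector.
  const hasPlural = keys.some((k) => PLURAL_CATEGORIES.has(k));
  const hasBoolean = keys.some((k) => BOOLEAN_KEYS.has(k));
  if (hasPlural && hasBoolean) warnings.push("mixed-key-kinds");

  return warnings;
}

lintSelector(["true", "false", "one", "other"], true); // ["mixed-key-kinds"]
lintSelector(["one", "other"], false);                 // ["missing-default"]
```

Both checks are warnings rather than errors in this sketch, in keeping with the idea that such messages are technically valid.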

I did consider including steps allowing for case matching using fractional numeric key values like "4.20", but it's not clear whether that would ever really be useful, and it's not clear whether a value like 4.200001 should match such a key. Leaving them out makes the selector logic much simpler, while technically allowing a custom formatter to provide such functionality.
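
For instance, a custom formatter could round to a fixed number of fraction digits before the comparison. This is a minimal sketch with hypothetical names (`fixedDigits`, `matchKey`); rounding is just one possible answer to the 4.200001 question, not a recommendation.

```typescript
// Hypothetical formatter factory: formats a value with a fixed number of fraction digits.
function fixedDigits(digits: number): (value: unknown) => string {
  return (value) => Number(value).toFixed(digits);
}

// Compares the formatter's result against the keys, as in step 4 of the process.
function matchKey(
  keys: string[],
  value: unknown,
  format: (value: unknown) => string,
): string | undefined {
  const c = format(value);
  return keys.indexOf(c) !== -1 ? c : undefined;
}

const keys = ["4.20", "other"];
matchKey(keys, 4.2, fixedDigits(2));      // "4.20"
matchKey(keys, 4.200001, fixedDigits(2)); // "4.20", because the formatter rounds
matchKey(keys, 4.3, fixedDigits(2));      // undefined
```

With this approach the message author decides, through the choice of formatter, whether 4.200001 counts as a match, and the core selector logic stays unchanged.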

Member

grhoten commented Aug 17, 2020

I'd be curious how this works for Russian, which is probably the hardest to handle. The requested noun form is for a quantity with a number and noun. Here is a table that I had to use for Russian. Notice that ICU's plural rules don't match the requested grammatical number and case inflection of user vocabulary, but it's an important step for selecting the right one.

| Requested Noun Form | Singular | Few | Other |
| --- | --- | --- | --- |
| Nominative | nominative | singular genitive | plural genitive |
| Genitive | genitive | plural genitive | plural genitive |
| Accusative & Animate | genitive | plural genitive | plural genitive |
| Accusative & Inanimate | nominative | singular genitive | plural genitive |
| Dative | dative | plural dative | plural dative |
| Instrumental | instrumental | plural instrumental | plural instrumental |
| Prepositional | prepositional | plural prepositional | plural prepositional |
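
To make the table concrete, here is a sketch of it as a lookup keyed by requested form and plural category. Several things are assumptions: the cell boundaries are read as "case only" meaning singular, ICU's `one`/`few` categories are mapped to the Singular/Few columns with everything else going to Other, and all names are hypothetical.

```typescript
type GrammaticalNumber = "singular" | "plural";
type Column = "singular" | "few" | "other";

interface NounForm {
  number: GrammaticalNumber;
  grammaticalCase: string;
}

const sg = (grammaticalCase: string): NounForm => ({ number: "singular", grammaticalCase });
const pl = (grammaticalCase: string): NounForm => ({ number: "plural", grammaticalCase });

// Rows transcribed from the table; cells without "plural" are read as singular (assumption).
const RUSSIAN_FORMS: Record<string, Record<Column, NounForm>> = {
  "nominative": { singular: sg("nominative"), few: sg("genitive"), other: pl("genitive") },
  "genitive": { singular: sg("genitive"), few: pl("genitive"), other: pl("genitive") },
  "accusative-animate": { singular: sg("genitive"), few: pl("genitive"), other: pl("genitive") },
  "accusative-inanimate": { singular: sg("nominative"), few: sg("genitive"), other: pl("genitive") },
  "dative": { singular: sg("dative"), few: pl("dative"), other: pl("dative") },
  "instrumental": { singular: sg("instrumental"), few: pl("instrumental"), other: pl("instrumental") },
  "prepositional": { singular: sg("prepositional"), few: pl("prepositional"), other: pl("prepositional") },
};

// Assumption: "one" -> Singular, "few" -> Few, any other category -> Other.
function columnFor(category: string): Column {
  if (category === "one") return "singular";
  if (category === "few") return "few";
  return "other";
}

function nounForm(requested: string, category: string): NounForm | undefined {
  const row = RUSSIAN_FORMS[requested];
  return row ? row[columnFor(category)] : undefined;
}

// Convenience wrapper using the runtime's Russian plural rules, if available.
function nounFormForCount(requested: string, n: number): NounForm | undefined {
  return nounForm(requested, new Intl.PluralRules("ru").select(n));
}
```

The lookup shows why the plural category alone is not enough: the selected noun form depends on two inputs, the requested case (plus animacy for the accusative) and the category of the number.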

mihnita added the data model label (Issues related to the Interchange Data Model) and removed the enhancement label on Sep 24, 2020
Collaborator Author

eemeli commented May 2, 2021

Closing in favour of #170.
