Explain how to implement select "best match" with custom functions #271
Comments
To specify, the claim that I have made is that the matching algorithm used by MF1 can be fully supported in MF2. This may be done by controlling the order of variants in the data model representation of an MF1 message using e.g. this sort method.
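A minimal sketch of that claim for a single plural selector, assuming an English-like plural rule and a hypothetical ranking (exact `=N` keys first, then plural categories, then `other`). The helper names `key_rank`, `sort_variants`, and `first_match` are illustrative only and are not taken from any MF1 or MF2 implementation:

```python
from typing import List, Optional, Tuple

Variant = Tuple[str, str]  # (key, pattern)


def plural_category(n: int) -> str:
    # Assumption: English-like rule with only "one" and "other".
    return "one" if n == 1 else "other"


def key_matches(key: str, n: int) -> bool:
    # "=N" matches the exact value; "other" matches anything;
    # any remaining key must equal the plural category of n.
    if key.startswith("="):
        return int(key[1:]) == n
    return key == "other" or key == plural_category(n)


def key_rank(key: str) -> int:
    # Hypothetical ranking: exact keys, then categories, then "other".
    if key.startswith("="):
        return 0
    return 2 if key == "other" else 1


def sort_variants(variants: List[Variant]) -> List[Variant]:
    # sorted() is stable, so variants with equal rank keep their source order.
    return sorted(variants, key=lambda v: key_rank(v[0]))


def first_match(variants: List[Variant], n: int) -> Optional[str]:
    # Plain first-match: walk the list top to bottom, stop at the first hit.
    for key, pattern in variants:
        if key_matches(key, n):
            return pattern
    return None


# Source order as in {foo, plural, one {...} =1 {...} other {...}}
mf1_order = [("one", "ONE"), ("=1", "EXACT"), ("other", "OTHER")]
print(first_match(mf1_order, 1))                 # ONE
print(first_match(sort_variants(mf1_order), 1))  # EXACT
```

Without the sort, first-match returns the `one` variant for the value 1; with the sort applied when building the data model, the same first-match loop returns the `=1` variant.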
Within a first-match selection framework, the list of variants is iterated through starting from the top, until a match succeeds. This means that the "who" in this case is the human that defines the order of those variants, and in this case has put
If this is the case, could you clarify why this should be considered a |
This means it does not support the MF 1 matching algorithm.
I've explained the |
MFv1 selection is "best match", but also is necessarily nested: you can only ever evaluate one item at a time. MFv2 is using a matrix and so might work differently. Looking at your example, @mihnita, if each selector is "greedy", then the evaluation of (restating in a different order for clarity) this:
is
If we rely on the order it could mean that some messages are potentially unreachable. I don't understand @eemeli the part where you say Also, in the case of plural, the "quality" of the match is predetermined by the selector function ( |
With a first-match algorithm, there is no concept of "quality" within the algorithm. There are more detailed explanations of this in the earlier ez-spec and stasm proposals. Taking for instance your example:
Here, the So the order of tests done for this case is:
|
Thanks @eemeli for the explanation, which makes sense. I don't necessarily agree with it, since it requires the developer to arrange the matrix vs. letting the algorithm do it. My discussion of "quality of match" is probably misleading, since what I'm really doing is letting the algorithm order the An argument can be made that letting the algorithm reorder the matrix will lead to subtle bugs due to bad developer behavior. Consider the difference between ordered and unordered matrices:
In "my" world, both of these evaluate the same, but I defy anyone to check completeness of the bottom one at a glance :-). OTOH, I can just add the case |
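To illustrate what "letting the algorithm order the matrix" could look like, here is a rough sketch. It assumes each selector hands the engine a list of acceptable keys, best first (the `preferences` argument, a hypothetical interface), and that candidates are compared column by column from left to right. The variants are rebuilt from the two key columns quoted in the issue description; this is one possible ordering rule, not a specified one:

```python
import itertools
from typing import List, Optional, Sequence, Tuple

Variant = Tuple[Tuple[str, ...], str]  # (keys, pattern)


def select(variants: Sequence[Variant],
           preferences: Sequence[List[str]]) -> Optional[str]:
    """Pick a variant regardless of the order the variants were written in.

    preferences[i] lists the keys that selector i accepts, best first.
    """
    candidates = []
    for keys, pattern in variants:
        if all(k in prefs for k, prefs in zip(keys, preferences)):
            # Rank tuple: position of each key in its selector's preference list.
            rank = tuple(prefs.index(k) for k, prefs in zip(keys, preferences))
            candidates.append((rank, pattern))
    if not candidates:
        return None
    # Tuples compare lexicographically, so the first selector dominates.
    return min(candidates, key=lambda c: c[0])[1]


variants: List[Variant] = [
    (("=1", "fem"), "=1 fem"),
    (("=1", "_"), "=1 _"),
    (("one", "masc"), "one masc"),
    (("_", "masc"), "_ masc"),
    (("_", "_"), "_ _"),
]
# Assumed preferences for itemCount = 1 and a masculine itemGender.
preferences = [["=1", "one", "_"], ["masc", "_"]]

results = {select(p, preferences) for p in itertools.permutations(variants)}
print(results)  # {'=1 _'}
```

Every permutation of the variant list produces the same result, which is the property being argued for. Note that the left-to-right rule picks `=1 _` here, not `one masc`; a different tie-breaking rule would give a different answer.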
+100 to Addison's comment. You don't want developers to order things. Also, if we require human ordering, we can't claim that MF2 can reproduce the MF1 behavior, which ignores the ordering. No matter what, even if we decided to use the exact MF1 selection (which is subpar, and can be improved), there are some implications for the "engine" (whatever implements the syntax + eval, in this case ICU, for a September Tech Preview) Implications:
This is what this bug is about: specify that algorithm. Best or not, but an algorithm, not a human reordering things. Or human reordering, if that is what WG members vote (and / or CLDR/ICU TC?). |
How strongly do we want the "MF1 compatibility" argument to inform our decision here? Separately - would it help if we could allow for plural selection strategy to be defined in a message meta to allow for MF2 to use the optimal strategy and then separately have |
The syntax obviously is completely incompatible. Maybe "operational familiarity" is important, though? When we break with MFv1, we should do it consciously because it adds value. |
I would consider a minimum level of compatibility being reached by being able to parse MF1 syntax into the MF2 data model, and to be able to then have it be formatted the same way as in MF1. This is currently possible, with two caveats:
When leaving out message references from the spec, we lost full two-way compatibility with MF1, i.e. it's not possible to transform a message MF1 syntax -> MF2 data model -> MF1 syntax and be sure to get the same structural representation back, though a message thus transformed will still format the same. |
Then maybe we should first decide what we mean by compatibility. I doubt anyone wants to move back and forth between the MF1 and MF2 syntax. So the way I think of compatibility is that MF2 can do the (good) things that MF1 is able to. I suppose that you expect the same thing when we are talking about Fluent. |
Consensus : It's not a blocker for implementing a formatting library |
My memory from the March discussions with Mark & the TCs is that the formatting library should do first-match, and that other tooling (linters, localization tools) should complain about bad ordering, or should reorder. I think it's a good idea to say in the spec what's a good or bad order, but I don't think that this is a blocker for implementing a formatting library. |
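As a rough idea of what such a linter check could look like, the sketch below flags variants that can never be reached under first-match. It assumes `_` is the wildcard key and uses a deliberately conservative rule: an earlier variant shadows a later one only if, in every column, its key is the wildcard or identical to the later key. The function names are hypothetical, and locale-dependent overlaps (such as `one` also covering `=1` in some languages) are not detected:

```python
from typing import List, Sequence, Tuple

Keys = Tuple[str, ...]


def covers(earlier: Keys, later: Keys) -> bool:
    # The earlier variant matches everything the later one matches when each
    # of its keys is the wildcard or the same key.
    return all(e == "_" or e == l for e, l in zip(earlier, later))


def unreachable(variants: Sequence[Keys]) -> List[Keys]:
    """Return the variants that first-match selection can never pick."""
    shadowed = []
    for j, later in enumerate(variants):
        if any(covers(earlier, later) for earlier in variants[:j]):
            shadowed.append(later)
    return shadowed


good_order = [("=1", "fem"), ("=1", "_"), ("one", "masc"), ("_", "masc"), ("_", "_")]
bad_order = [("_", "masc"), ("one", "masc"), ("_", "_")]

print(unreachable(good_order))  # []
print(unreachable(bad_order))   # [('one', 'masc')]
```

A tool could report the shadowed variants as warnings, or move them above the variant that shadows them.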
I started writing a comparison document in my fork here (fixed): It is not complete and I will probably create a new issue to track the choice between first and best match when it is ready (since this is off-topic to deciding the best match algorithm should we choose to adopt it). |
@aphillips It's outside the nominal scope of this issue and not yet covered by your comparison document, but you may want to look into prior art outside of MF1 to include in the discussion, such as the operation of |
@eemeli Good point. I do cover this to some degree inside the discussion of MF1, but will expand on it. |
I think that taking inspiration from programming languages takes us in the wrong direction, as we are not trying to solve the same problem. A programmer writing a switch statement has a pretty good idea that they want this option to be more important than that other option, so they put it before. But for what we do the order is potentially unknowable when one writes the expression. It might be locale dependent (for example a language might want a genitive form before an accusative one, and another language might not want that). It might be runtime dependent, for example a long and a short form of a message. So you can't sort something like this in advance:
Or because you run on a different OS:
You also can't sort or lint things at build time using a function. TLDR: if I can sort / lint the order with a function at build time, then I can do that 10 times easier and with fewer complications at runtime. Making the whole thing "best match" |
In my mind it is pretty strong. But we need to agree what we mean by that. I think that it applies (and can help) to more than this issue (for example issue #350 "Allow names to start with a digit") |
@mihnita Consider making comments on https://github.com/aphillips/message-format-wg/blob/issue-351/exploration/selection-matching-options.md as the discussion vehicle for this issue Monday. |
@mihnita Could you explain a bit more how the order of the variants in this example could affect the results? Wouldn't |
I think we've resolved this by selecting a variant selection method? |
Closing resolve-candidates per discussion in 2023-07-24 call |
One of the contention points between the 3 proposals was the selection process: best match vs first match.
Implementing MessageFormat v1 behavior compatibility for plural requires best match functionality (so that something like {foo, plural, one {...} =1 {...} other {...}} matches =1 if foo is 1).
Eemeli claimed (more than once) that this can be implemented by the selection function, without involvement from the MF2 spec.
This is true for one selector, but I don't see how it can be done for multiple selectors.
This is very typical when several dimensions are compared.
Imagine a list with 10 algorithms.
It is easy to select the fastest one.
It is easy to select the one that requires the least amount of data / memory.
It is easy to select the one with fewest lines of code (proxy for complexity).
But it is not clear how to select the one with the best combination.
Here is a MF2 example:
Let the arguments to format be { 'itemCount' : 1, 'itemGender' : male }
Best match for the :plural function are:
Best match for the :gender function are:
Selections that match both functions are:
But there is no combination of best :plural and best :gender (which would be =1 masc {...}).
Who can select the best combination (arguably one masc)?
The functions can't do it, because they have no visibility.
They can only see their own "column" of keys (:plural => [ =1, =1, one, _, _ ] and :gender => [ fem, _, masc, masc, _ ]).
Even with more visibility, they are not "equipped" to judge the matching of the other functions (columns).
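The visibility problem can be made concrete with a small sketch. The two key columns are copied from the text above; everything else is an assumption for illustration: the preference lists per function, the helper names, and the rule that a function's "best" rows are those holding its most-preferred key that appears in its own column.

```python
from typing import List, Set

# Key columns as quoted above; row i of each column belongs to variant i.
PLURAL_COLUMN = ["=1", "=1", "one", "_", "_"]
GENDER_COLUMN = ["fem", "_", "masc", "masc", "_"]


def rows_matching(column: List[str], accepted: List[str]) -> Set[int]:
    # Rows whose key this function accepts at all.
    return {i for i, key in enumerate(column) if key in accepted}


def best_rows(column: List[str], preference: List[str]) -> Set[int]:
    # Rows holding the most-preferred key that occurs in this column.
    for wanted in preference:
        rows = {i for i, key in enumerate(column) if key == wanted}
        if rows:
            return rows
    return set()


# Assumed preferences for itemCount = 1 and a masculine itemGender.
plural_pref = ["=1", "one", "_"]
gender_pref = ["masc", "_"]

best_plural = best_rows(PLURAL_COLUMN, plural_pref)
best_gender = best_rows(GENDER_COLUMN, gender_pref)
both_match = (rows_matching(PLURAL_COLUMN, plural_pref)
              & rows_matching(GENDER_COLUMN, gender_pref))

print(sorted(best_plural))                # [0, 1]
print(sorted(best_gender))                # [2, 3]
print(sorted(both_match))                 # [1, 2, 3, 4]
print(sorted(best_plural & best_gender))  # []
```

Each function can rank rows within its own column, but the intersection of the two "best" sets is empty, so something that sees both columns has to choose among rows 1 to 4.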
And the MF2 proper (the "core" implementation that only invokes the functions) cannot do it, because the EZ and Stas proposals argued that it is not needed, and can be done with custom functions. So the TC made that decision.
NOTE: I am not trying to reverse the TC decision, which said to do "first match", listening to the 2 proposals mentioned.
I only invite the proponents of the "first match" algorithm to explain how to implement this kind of best match with these constraints.
Because they claimed it is possible to do by delegating the work to the functions.