Separation of print and spoken forms #33
Comments
Should this be part of message format directly, or through some sort of extension based on the key of a message? For example: 'direction_turn': 'Turn {{direction}} onto {{road_name}}' If you just mark up a small portion of the message, it may not work in many voice cases where you want a simple voice output and a more verbose message, or vice versa. In a recipe use case, when using voice accompanied by a screen, you may want a simple voice output, as the user can read the screen for additional context. In a news use case, you may want a long voice output while the text is simpler, perhaps showing the headline or a generic statement.
I would prefer it to be a part of the message format directly, and its usage should be optional (not explicit). Formatting of variables needs to allow this to be taken into account. You frequently need to use all of the same text and then you just want a small span to be spoken a certain way. This is to avoid copypasta where you may have to copy the same thing over and over only to modify one small span of a sentence. Sometimes the pronunciation is dynamically chosen at runtime. This is a common issue for #34. The pronunciation of the number needs to be in grammatical agreement with the noun. A translator may know the context of the sentence better than a rule-based or machine-learned text to speech system that is trying to pronounce the digits instead of words. If you have the date 1/2/03, how should it be pronounced? SSML may be able to annotate it as a date so that it doesn't sound like a math equation or an address with slashes. Though it's fairly common that it suffers from not knowing if 1/2 is January 2nd or February 1st. If your text to speech (TTS) system is 100% in sync with your regional differences of the printed form, then it "should" work, but my experience is that the supported date formats in CLDR do not match the set of regional dialects of a TTS system.
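The date ambiguity mentioned above is easy to reproduce. This is only a minimal sketch using Python's standard `datetime` module, not anything from MessageFormat or a TTS engine; the two format strings stand in for two regional conventions and are assumptions made for illustration.

```python
from datetime import datetime

# The printed form from the comment above.
raw = "1/2/03"

# Two plausible regional readings of the same printed string (assumed
# conventions, chosen only to illustrate the ambiguity).
as_month_first = datetime.strptime(raw, "%m/%d/%y").date()  # month/day/year
as_day_first = datetime.strptime(raw, "%d/%m/%y").date()    # day/month/year

print(as_month_first.isoformat())  # 2003-01-02
print(as_day_first.isoformat())    # 2003-02-01
```

A TTS system that only receives the string "1/2/03" has to guess between these two readings, which is why carrying the intended spoken form (or at least the field order) alongside the printed form matters.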
What would happen in the case where the VUI and display text differ significantly?
Some of the use cases might be helped by something like If the placeholders have a clear type associated ( For non-formattable stuff we can still "remember" that a text area comes from a parameter, and we can keep the parameter name. So
@grhoten There doesn't seem to be a proposal attached to this. I could see users creating a selector to choose between pattern strings. Or I could see using expression attributes to control the expansion of placeholders (based on modality). Or perhaps some other mechanism. I'm closing this issue as part of general cleanup. If you (or others) feel that a mechanism is still needed and that it should be part of the MF2 specification, please open specific issues or (better) write a design doc using our template. Thanks!
It's fine to close this. MF2 is meant for GUI only and, as far as I've seen, is not easy to adapt for VUI. People will be limited to markup like SSML for the time being. This proposal wasn't trying to use selectors in the way that MF2 has them. Think of SSML as CSS but for a VUI. Most of the time, though, you want the message to be printed and spoken the same way, so you don't want to copy and paste the entire response. You need to annotate and modify specific segments of a message. For example, you may want to pronounce a number a specific way to agree with the unit that you are quantifying. My presentation referenced in this issue covers this topic in more detail.
@grhoten thanks for the update. I disagree that MF2 is meant for GUI only. I think it would be possible to adapt MF2 for VUI. A (MF2-syntax) markup implementation of SSML coupled with a formatToParts implementation could go a long way to addressing what you mention here.
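To make the "markup plus formatToParts" idea concrete, here is a rough sketch of how a list of formatted parts could be rendered either as plain display text or as SSML. It is not the MF2 API and not the demo from this thread: the part dictionaries, the `to_print` and `to_ssml` helpers, and the choice of `say-as` attributes are all hypothetical and only loosely modeled on the shape a formatToParts result might have.

```python
from xml.sax.saxutils import escape, quoteattr

# Hypothetical parts list: text parts plus open/close markup parts.
parts = [
    {"type": "text", "value": "Your order ships on "},
    {"type": "markup", "kind": "open", "name": "say-as",
     "options": {"interpret-as": "date", "format": "mdy"}},
    {"type": "text", "value": "1/2/03"},
    {"type": "markup", "kind": "close", "name": "say-as"},
    {"type": "text", "value": "."},
]

def to_print(parts):
    """Display rendering: keep the text parts, ignore all markup."""
    return "".join(p["value"] for p in parts if p["type"] == "text")

def to_ssml(parts):
    """Voice rendering: turn markup parts into SSML elements."""
    out = ["<speak>"]
    for p in parts:
        if p["type"] == "text":
            out.append(escape(p["value"]))
        elif p["kind"] == "open":
            attrs = "".join(
                f" {key}={quoteattr(value)}"
                for key, value in p.get("options", {}).items()
            )
            out.append(f"<{p['name']}{attrs}>")
        else:  # closing markup part
            out.append(f"</{p['name']}>")
    out.append("</speak>")
    return "".join(out)

print(to_print(parts))
print(to_ssml(parts))
```

The point of the sketch is that one message can feed both modalities: the GUI drops the markup, while the VUI maps it to whatever its TTS engine supports. Which `say-as` values a given engine accepts varies, so the attribute values here should be read as placeholders.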
MF2 is not a perfect solution to the issues you've called out in various places. But it's not so imperfect as to make it unusable either. I think what this particular issue is calling for might be a selector so that one can vary the pattern altogether (rather than sharing patterns, as I demoed above). E.g:
I think that would be out of scope for LDML45, but a custom implementation would be entirely feasible. Think about it for the beta period. More specifically: think about what specific feature additions would be needed to enable VUI support in MF2 and make design proposals for them in the post-45 period. Further, I suspect that, if these are VUI specific, they wouldn't belong in core MF 2.0's default registry. They want to be in a standardized VUI add-on. What form would that take?
Is your feature request related to a problem? Please describe.
There are times that you need to provide guidance on pronunciation of a given word.
Describe the solution you'd like
It's important to have all text be printed and spoken by default, but sometimes you need to substitute the printed form because it's too abbreviated. So you need markup to only print a span of text or to only speak a span of text. Instead of repeating the exact same phrase, it should be possible to mark up individual spans of text. For example, "Turn right onto highway 101" can turn into "Turn right onto highway" followed by a print-only span "101" and a speak-only span "one oh one". Another example is "January" followed by a print-only "1" and a speak-only "first". That date example actually requires CLDR's rule-based number format (RBNF).
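The span idea can be sketched in a few lines. This is only an illustration under stated assumptions: the segment representation, the `BOTH`/`PRINT`/`SPEAK` labels, and the `render` helper are hypothetical and are not MessageFormat syntax or any proposed markup.

```python
from typing import List, Tuple

# Hypothetical modality labels for a span of text.
BOTH, PRINT, SPEAK = "both", "print", "speak"

# A message is a list of (text, modality) segments.
Segment = Tuple[str, str]

def render(segments: List[Segment], modality: str) -> str:
    """Keep segments shared by both forms plus those for the requested modality."""
    return "".join(text for text, mode in segments if mode in (BOTH, modality))

message: List[Segment] = [
    ("Turn right onto highway ", BOTH),  # shared by print and speech
    ("101", PRINT),                      # print-only span
    ("one oh one", SPEAK),               # speak-only span
]

print(render(message, PRINT))  # Turn right onto highway 101
print(render(message, SPEAK))  # Turn right onto highway one oh one
```

Only the differing span is duplicated; the rest of the sentence is written once, which is the copy-and-paste problem this request is trying to avoid.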
Describe why your solution should shape the standard
Without this ability, pronouncing numbers, dates, abbreviations, heteronyms (same spelling but different pronunciation) and so forth becomes inaccurate.
Additional context or examples
SSML can help in this area. Though sometimes you just need to provide more context. For example a message may be abbreviated for the print form due to limited UI space, but it needs a lot more context if you can't see the context of a message.
Let's Come To An Agreement About Our Words