Intent and water don't mix... #397
I think I would use But If you want to reference the children isn't that
Where |
To illustrate why it is nice to have a principled approach that is seeded with a very long list, here is a step-by-step of how I would answer this (using encyclopedic names )
So, the entire expression can be marked in full as either (the author would know which):
Alternatively, if we wanted to do the minimal annotation possible to get Neil's desired narration goal, we may just have to annotate the
<msub intent="molecule($H,$num)">
  <mi arg="H" mathvariant='normal'>H</mi>
  <mn arg="num">2</mn>
</msub>
I am amused that Neil reached for a literal narration string here. Have you considered Edit: that said, water is also an encyclopedic name, so a use of |
Certainly So rather than changing the intent annotation to make sure the structural information is always accurate, we could accept that there is no structural information in some chemical formulas, and they use the lie-to-children method for abstracting away the structural specifics. |
I haven't done chemistry in a long time, but I think your enumerated list shows the danger of being too formal and possibly wrong, as opposed to being sufficiently relaxed that you can still be helpful but can't really be wrong. Using molecule(H,2) for the H2 part of H2O would be wrong, I think. There is only a single molecule involved: H2O. As you note, the usage here is the total count of each element in the molecule: H2O is "some molecule with 2 H and an O"; it is explicitly not "some combination of the molecule H2 with the element O". So I can't see how either You said the same thing in your follow-up comment
Although still I think suggesting using molecule() for the subterms anyway for what I called chemical-sub above (not that I recommend that name either, but it seems better than molecule)
Looks fine for that purpose to me, or perhaps |
Sure, if the word The problem with a generic So the remediator would (sometimes) need to actively keep track of the molecular structures hidden in the syntax and use a different annotation between the counting and molecular cases. I think I disagree with,
since we would also expect people to know the basic language of calculus to annotate integrals, and people to know the basic language of group theory to annotate groups, etc. |
While "H_2" is a molecule consisting of two "H" atoms, the "H_2 O" molecule doesn't have an "H_2" molecule within it, even though it does contain two "H" atoms. To the extent we're leaning semantic, there were two distinct kinds of notations used above. One is "chemical formula", which mainly gives the number of each kind of atom in the compound, but can also preserve structural subgroups, So it would (I think) follow a grammar something like:
that is, And then there are "structural formulas", like the "H-O-H". These are more diagram-like and quickly become 2D (or 3D), although this example is easily written linearly. To the extent they're writable in MathML, they seem easy to account for in intent: they'd look like Presumably an AT system that understands these symbols would pronounce the formula for water as "aitch two oh" (at least, given the right preference settings). How structural formulas get pronounced, however, I have no idea. We really need a chemist involved to clarify the minimal, necessary set of concepts & structures along with expected pronunciations. OTOH, if we don't want to define these concepts and rather just push the above pronunciation, then either an aria approach was suggested above, or else we'd need our own equivalent |
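To make the "count of each kind of atom" reading concrete, here is a minimal Python sketch. The grammar in the comment is an assumption made for illustration only, not the grammar proposed in this comment, and the names `tokenize`, `parse_formula` and `atom_counts` are hypothetical. It shows that a linear formula such as H2O yields totals per element, with no H2 molecule appearing anywhere in the result, while parenthesized subgroups simply multiply their contents.

```python
import re
from collections import Counter

# Hypothetical grammar (an assumption for illustration):
#   formula := unit+
#   unit    := (element | "(" formula ")") count?
#   element := one uppercase letter, then optional lowercase letters
#   count   := digits
TOKEN = re.compile(r"([A-Z][a-z]*)|(\d+)|([()])")


def tokenize(text):
    """Split a linear formula such as 'Zn(OH)2' into tokens."""
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        tokens.append(match.group(0))
        pos = match.end()
    return tokens


def parse_formula(tokens, index=0, depth=0):
    """Return (atom counts, next index) for the formula starting at index."""
    counts = Counter()
    while index < len(tokens):
        token = tokens[index]
        if token == ")":
            if depth == 0:
                raise ValueError("unbalanced ')'")
            break  # the enclosing call consumes the ')'
        if token == "(":
            group, index = parse_formula(tokens, index + 1, depth + 1)
            if index >= len(tokens) or tokens[index] != ")":
                raise ValueError("missing ')'")
            index += 1
        elif token.isdigit():
            raise ValueError("count without a preceding element or group")
        else:
            group = Counter({token: 1})
            index += 1
        # An optional count multiplies the element or group just read.
        multiplier = 1
        if index < len(tokens) and tokens[index].isdigit():
            multiplier = int(tokens[index])
            index += 1
        for element, n in group.items():
            counts[element] += n * multiplier
    return counts, index


def atom_counts(text):
    """Total number of atoms of each element in a linear formula."""
    counts, _ = parse_formula(tokenize(text))
    return dict(counts)
```

Under these assumptions, `atom_counts("H2O")` gives two H and one O, and `atom_counts("(NH4)2S")` gives two N, eight H and one S. Structural formulas such as "H-O-H" are outside what this sketch handles.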
in practice the input may be a latex document using we should be able to make some intent without knowing much more than "the subscript combinations contained here were generated by |
@davidcarlisle LaTeX macros really ought not guide us too far... How about grabbing something from page 12 of the mhchem documentation:
\ce{Zn^2+
<=>[+ 2OH-][+ 2H+]
$\underset{\text{amphoteres Hydroxid}}{\ce{Zn(OH)2 v}}$
<=>[+ 2OH-][+ 2H+]
$\underset{\text{Hydroxozikat}}{\ce{[Zn(OH)4]^2-}}$
}
should that be |
To the extent that "openparen en aitch four closeparen two ess" is all you'll ever want to speak, then "(NH4)2S" is certainly a nice, compact, typeable expression of the compound. If you ever need to go beyond that, I wonder how much complex parsing we would be off-loading to the AT (I haven't yet got a LaTeXML binding for mhchem, 'cause it's mind-bogglingly complicated)
I think you are all overthinking this and trying to mirror the chemistry, which indeed does have more of a Furthermore, it seems very wrong to use a literal for it as that means the subparts for a more complex molecule such as the one @davidcarlisle listed ( Since it appears I'm not missing an easy way to handle this, I think this points to a shortcoming of the |
Exactly, in my opinion we should have a dedicated name, part of Intent Core ( I don't understand the point about "shortcomings" of the current syntax. We know the list of values hasn't been completed. |
I'm confused about one of @NSoiffer's points: "H2O" and "H-O-H" are two very different things; I don't know how the last one should be pronounced, but almost certainly not the same as the first. I'm also confused about what is missing in the intent *syntax*, in that you seem to rule out names for the concepts(?).
Continuing on from the discussion above and at the meeting... Definitely: level 1 === core -- ran out of time to reread and clean up what I wrote above. Having something like On the call, we discussed moving On the call, @brucemiller suggested an addition to core of a
would turn into That's one solution to the problem and not a bad one. My only concern is there might be other special cases. Of course those could be added to core as we come across them, but it seems a bit special-case. Another solution that avoids adding a new (special case) name to core is to expand the syntax some (which is adding a new special case syntax ;-). E.g., if we allow |
As mentioned on the call, a handful of new attributes can probably handle this: element, molecule, isotope, ...
I don't see why we are considering cumbersome non-semantic alternatives.
@NSoiffer if I was writing this expression for the Open coverage, I would write it differently. I expect basic chemical notation to be in Core, also because you have repeatedly expressed a desire to have it in Core. So I wrote my example expecting special AT treatment. The way By auto-assigning my examples to Open, you're only antagonizing me and straw-manning the current intent syntax, which is quite unproductive. Also, this statement:
is false for Intent Open, as proposed and discussed in the last couple of years. Intent Open is still Intent - it is designed for use with the intent grammar, it just has unknown A pass of AT over Open should produce a basic function application readout (as we've discussed on many previous occasions). The narration you've written above with the parens spelled out would instead be what you expect to hear for If you told me we will only ever have chemistry in the Open level, and if we expected no AT would ever add support for it, I would annotate the expression as:
<math intent="molecule(times($count,$H),$O)">
  <msub>
    <mi arg="H" mathvariant='normal'>H</mi>
    <mn arg="count">2</mn>
  </msub>
  <mi arg="O" mathvariant='normal'>O</mi>
</math>
and I would expect every AT to produce a standard functional narration, on the lines of (but not necessarily exactly identical to):
where I agree with @davidfarmer that having a special keyword for |
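As a rough illustration of the "basic function application readout" idea, here is a small Python sketch. It assumes that `$name` in an intent value refers to the descendant carrying `arg="name"`, and it uses a placeholder wording ("head of a and b") that is purely hypothetical: it is not the narration given in this comment, nor what any AT actually produces. All function names are made up for the example, and only the small subset of the intent grammar used in this thread is handled.

```python
import re
import xml.etree.ElementTree as ET

# Tokens for the small subset handled here: names, $references, digits, ( ) ,
INTENT_TOKEN = re.compile(r"\s*(\$?[A-Za-z][\w-]*|\d+|[(),])")


def tokenize_intent(intent):
    """Split an intent value such as 'molecule($H,$num)' into tokens."""
    intent = intent.strip()
    tokens, pos = [], 0
    while pos < len(intent):
        match = INTENT_TOKEN.match(intent, pos)
        if match is None:
            raise ValueError(f"cannot tokenize intent at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_intent(tokens, index=0):
    """Parse one term; return (node, next index).

    A node is either a string (name, number or $reference) or a
    (head, [argument nodes]) tuple for a function application.
    """
    head = tokens[index]
    if head in {"(", ")", ","}:
        raise ValueError(f"unexpected {head!r}")
    index += 1
    if index < len(tokens) and tokens[index] == "(":
        args = []
        index += 1
        while True:
            arg, index = parse_intent(tokens, index)
            args.append(arg)
            if index >= len(tokens):
                raise ValueError("missing ')'")
            if tokens[index] == ")":
                return (head, args), index + 1
            if tokens[index] != ",":
                raise ValueError(f"unexpected {tokens[index]!r}")
            index += 1
    return head, index


def resolve(node, root):
    """Replace each $reference by the text of the element with that arg."""
    if isinstance(node, tuple):
        head, args = node
        return (head, [resolve(arg, root) for arg in args])
    if node.startswith("$"):
        for element in root.iter():
            if element.get("arg") == node[1:]:
                return "".join(element.itertext()).strip()
        raise KeyError(f"no element with arg={node[1:]!r}")
    return node


def speak(node):
    """Placeholder functional wording: 'head of arg and arg ...'."""
    if isinstance(node, tuple):
        head, args = node
        return f"{head} of " + " and ".join(speak(arg) for arg in args)
    return node


def functional_readout(mathml_source):
    """Read out the intent found on the root element of a MathML fragment."""
    root = ET.fromstring(mathml_source)
    tree, _ = parse_intent(tokenize_intent(root.get("intent")))
    return speak(resolve(tree, root))


WATER = """
<math intent="molecule(times($count,$H),$O)">
  <msub>
    <mi arg="H" mathvariant='normal'>H</mi>
    <mn arg="count">2</mn>
  </msub>
  <mi arg="O" mathvariant='normal'>O</mi>
</math>
"""
```

With the annotation above, `functional_readout(WATER)` returns "molecule of times of 2 and H and O". The point of the sketch is only that an unknown head such as `molecule` can still be read generically once the references are resolved; a Core entry would let AT substitute a better reading.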
Closing because we agreed chemistry notations should go into core. I have opened #398 to discuss what I think is the main issue that the chemistry notation example raises. |
I feel like I'm missing something obvious, but I don't see how to write the intent values for water: Suppose I want to send to the speech engine/AT the string H 2 0. The only solution I see is to "cheat" and directly put the values of the children into intent on msub as
There doesn't seem to be a way to generically reference the children of the msub and end up with H 2. Am I missing something?