Interlinear glossed text (IGT) in markup #23

Open
r12a opened this issue Mar 5, 2019 · 13 comments

@r12a
Owner

r12a commented Mar 5, 2019

http://r12a.github.io/blog/201708.html#20190304 shows how you can use flexbox to produce interlinear glossed text of the kind that is often found in linguistic and biblical texts. (To my mind, the name 'interlinear gloss', although apparently used for the particular type of glossing i refer to here, isn't very clear, and is confusable with approaches like ruby annotations, which feel different to me. I'd prefer a name more like 'multi-line gloss'.)

I see this as different from ruby text in that ruby text is very much, in my eyes, an inline feature. For example, ruby is typically an annotation to a part of a line of mainstream text, and one that is squeezed into the inter-linear space (eg. with no change to the dimensions of that space when used with Japanese according to JLReq). Ruby tends to be used as an appendage to the flowing main text it annotates.

The use cases for the glossing i'm referring to here are much more block (or actually, table) oriented, and much more complicated stylistically. They tend to have a legend at the start, verse indicators, etc. They commonly involve 3 or more parallel lines of text, that (importantly) wrap together when they reach the end of a line. The styling is much more complicated – each line may have different font styling, and there may be inline changes inside a segment, eg. morphological identifiers tend to be rendered with small caps within a gloss, etc.

So here i'm suggesting an approach based on flexbox. This allows 'tabular data' to wrap at the line end, and allows the author to control the spacing between 'cells' using margins as well as padding. Etc. Significantly, this approach works, right now, in all major browsers. There's no need to design and implement new markup features, it just works out of the box.
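As a rough illustration of the approach described above (not the markup from the blog post itself), a minimal sketch might look like the following. The class names `.stack` and `.gloss` are mentioned later in this thread; `.igt` and `.orig` are hypothetical names. The word pair ferma/farm is taken from a later comment, and the gloss "forever" for hamišaluǧ is an assumption based on the Leipzig Glossing Rules example.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>IGT sketch</title>
<style>
  /* Sketch only: .igt and .orig are assumed class names. */
  /* Each word is a small vertical stack holding one segment per tier. */
  .stack {
    display: inline-flex;
    flex-direction: column;
    margin-right: .75em;   /* spacing between 'cells' */
    margin-top: .5em;
  }
  /* Each tier can carry its own font styling. */
  .orig  { font-weight: bold; }
  .gloss { font-style: italic; }
</style>
</head>
<body>
  <!-- The stacks are inline-level boxes, so they wrap as units at the line end. -->
  <p class="igt">
    <span class="stack"><span class="orig">ferma</span><span class="gloss">farm</span></span>
    <span class="stack"><span class="orig">hamišaluǧ</span><span class="gloss">forever</span></span>
  </p>
</body>
</html>
```

Because every stack keeps its tiers together, narrowing the window moves a whole word, with its gloss, to the next line.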

This issue was set up to carry discussion related to the idea...

@dyacob

dyacob commented Mar 6, 2019

A few quick remarks on the Ethiopic sample:

  1. A typo, ሰዬጣን => ሰይጣን
  2. The text is in Ge'ez, so the markup language attribute should be set accordingly: lang="gez"
  3. Possibly a few words are missing from the sample: there is no mention of a king. "went.he to and-he.said-to.him" is missing the king as a subject, and so does not align with "he went to the king and said to the king ..."

@r12a
Owner Author

r12a commented Mar 6, 2019

Thanks @dyacob for catching those things. Should all be fixed now.

@Crissov

Crissov commented Mar 7, 2019

I would describe this layout as a wrappable column-primary table (with a multi-column span in the last row). Iʼm not entirely sure whether this is a better approach than a traditional but wrappable row-primary table.

Btw.: Many (layout-wise) simple examples can be found in the documentation of the Leipzig Glossing Rules.

@amundo

amundo commented Jul 3, 2021

Thanks for this discussion.

Curious about lang tags on the other tiers: the "wä-sobä sämʾä ʾIsayəyyas" tier is also Ge’ez, is it not?

@r12a
Owner Author

r12a commented Jul 6, 2021

@amundo i think you probably make a good point. I should probably add lang="gez-Latn" to the .trans tags.

@amundo

amundo commented Jul 6, 2021

As long as we’re talking lang tags, could there or should there be one for the morphological gloss tier? This is something that has long been unclear to me. <span class="gloss">and-when</span> is English, sort of…

@r12a
Owner Author

r12a commented Jul 6, 2021

I think you should assume that the language of the page as a whole has already been declared to be English, in this case. So they directly inherit that, and don't need to be relabelled, unless you wanted to specifically call out that this is an odd type of English – but then, it's not clear to me what kind of label you could use for that.
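To make the language tagging discussed in the last few comments concrete, here is a hedged sketch of a single word stack. It assumes the page declares `lang="en"`; `.orig` is a hypothetical class name, and the Ethiopic spelling is reconstructed from the transliteration "wä-sobä" rather than copied from the blog sample, so it should be checked against the original.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>IGT language tagging sketch</title>
</head>
<body>
  <!-- Sketch only: .orig is an assumed class name. -->
  <span class="stack">
    <!-- Ge'ez in Ethiopic script; spelling reconstructed from the transliteration (assumption). -->
    <span class="orig" lang="gez">ወሶበ</span>
    <!-- Same language, Latin-script transliteration. -->
    <span class="trans" lang="gez-Latn">wä-sobä</span>
    <!-- No lang attribute: the gloss inherits lang="en" from the html element. -->
    <span class="gloss">and-when</span>
  </span>
</body>
</html>
```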

@amundo

amundo commented Jul 13, 2021

Another interesting approach is to use inline-grid instead of inline-flex — that results in fewer rules overall:

.stack {
  display: inline-grid;
  margin-right: .75em;
  margin-top:   .5em;
}
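One likely reason for the saving: flex items are laid out in a row by default, so an inline-flex stack needs an explicit `flex-direction: column`, whereas a grid container with no explicit template auto-places each item in its own row of a single column. The sketch below puts the two side by side; `.stack-flex` and `.orig` are hypothetical names used only for comparison, and the gloss "forever" is an assumption based on the Leipzig Glossing Rules example.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>inline-flex vs inline-grid stacks</title>
<style>
  /* inline-flex: the direction has to be set to stack the tiers. */
  .stack-flex {
    display: inline-flex;
    flex-direction: column;
    margin-right: .75em;
    margin-top: .5em;
  }
  /* inline-grid: items stack vertically without an extra rule. */
  .stack {
    display: inline-grid;
    margin-right: .75em;
    margin-top: .5em;
  }
</style>
</head>
<body>
  <p>
    <span class="stack-flex"><span class="orig">ferma</span><span class="gloss">farm</span></span>
    <span class="stack-flex"><span class="orig">hamišaluǧ</span><span class="gloss">forever</span></span>
  </p>
  <p>
    <span class="stack"><span class="orig">ferma</span><span class="gloss">farm</span></span>
    <span class="stack"><span class="orig">hamišaluǧ</span><span class="gloss">forever</span></span>
  </p>
</body>
</html>
```

Both paragraphs should render the same way; only the number of declarations differs.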

@747

747 commented Dec 17, 2021

Hi, I just landed here from your blog post.

I think your approach is visually fine, but semantically not accurate enough in the spirit of linguistic glossing. While the most (superficially) striking feature of glosses is vertical alignment across lines, the essence of this format is that it is a container of multiple inline flows bound together.

A visually (I think) clear manifestation of what linguists would expect can be seen in the link @Crissov has posted.

[Image]

If you drag to select a span of text, each line should be continuously selected (they are the chief continuous runs). On top of this, the behavior should ideally be preserved even when line breaks are involved (see the image below; manually processed from the previous one).

[Image: gloss]

(Please also note that the last free translation line is out of this parallelism.)

I am not sure whether any rendering engine has implemented such a feature, but I believe this is the conceptually correct representation model of linguistic glosses. In other words:

  • the entire "gloss" is a container of n inline flows (sub-lines) stacked vertically (block-axis-wise)
  • but the entire "gloss" should behave like a fat inline flow (that can be broken into "lines" when the width is insufficient)
  • when the "gloss" breaks in the middle, it always does so with all inner sub-lines still stacked
  • each sub-line is, at least conceptually, extended to the longest width among its siblings (like align-items: stretch but inline-axis-wise)

I don't quite follow the discussion, but I can imagine this is what people mean when they say you need a special structure for glosses. And the "each word or morpheme must synchronize vertically" matter is a secondary styling requirement.

@amundo

amundo commented Dec 17, 2021

I don’t see how it makes sense that glossed words would not correspond to a semantic node. If the inline flows are sufficient to represent the gloss, why is the free translation line outside of parallelism? It’s because the parallelism is really at the word (and by notational implication, the morpheme) level. Otherwise the fact (for instance) that ferma and farm are related is lost, but that is the whole point.

As for selecting continuous runs, I think that usually applies to the target language content itself (as you suggest in the screenshot), but in that case a continuous representation can be captured with another tier, as is done with the free translation. IJAL, for instance, calls these the “three-line” and “four-line” format.

@747

747 commented Dec 17, 2021

@amundo While what I wrote above is obviously a declaration of my implicit mental model...

> If the inline flows are sufficient to represent the gloss, why is the free translation line outside of parallelism?

I don't think I understand this part very well, as I did not intend to insist that "the inline flows are sufficient to represent the gloss" (if by "gloss" you mean the whole "three-line" or "four-line" format). Could you perhaps explain it a bit more?

> It’s because the parallelism is really at the word (and by notational implication, the morpheme) level. Otherwise the fact (for instance) that ferma and farm are related is lost, but that is the whole point.

I did not deny this. Every discrete morpheme unit in the parallelized lines is two-dimensionally related to its adjacent positions, but since HTML does not natively support two-dimensional relations, you can only simulate the effect by nesting two levels of linear structures. In short, you have to choose whether to make the horizontal container the parent (the "ferma"-"farm" relation is obscured, as you say) or the vertical container the parent (the "ferma"-"hamišaluǧ" relation is obscured).
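The two nesting choices can be sketched in markup as follows. This is only an illustration of the trade-off, not markup proposed by anyone in the thread: `.orig` is a hypothetical class name, and the gloss "forever" for hamišaluǧ is an assumption based on the Leipzig Glossing Rules example.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Two ways to nest a gloss</title>
</head>
<body>
  <!-- (a) Vertical container as parent: one element per word, tiers inside.
       "ferma" and "farm" are siblings; "ferma" and "hamišaluǧ" sit in different parents. -->
  <p>
    <span class="stack"><span class="orig">ferma</span><span class="gloss">farm</span></span>
    <span class="stack"><span class="orig">hamišaluǧ</span><span class="gloss">forever</span></span>
  </p>

  <!-- (b) Horizontal container as parent: one element per tier, words inside.
       "ferma" and "hamišaluǧ" are siblings; "ferma" and "farm" sit in different parents. -->
  <div>
    <p class="orig"><span>ferma</span> <span>hamišaluǧ</span></p>
    <p class="gloss"><span>farm</span> <span>forever</span></p>
  </div>
</body>
</html>
```

In (a) each tier is split across stacks, so a drag selection crosses tiers; in (b) each tier is one continuous run, but the word-to-gloss pairing is left to position alone.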

> I think that usually applies to the target language content itself (as you suggest in the screenshot), but in that case a continuous representation can be captured with another tier, as is done with the free translation.

Here I wanted to demonstrate my idea visually without making a diagram myself. The actual content in that part is, as you can see, not very useful to paste elsewhere for reuse, due to artificial hyphenation, spacing, etc.

@r12a
Owner Author

r12a commented Jan 26, 2022

I suspect that the desire to select a whole line is a different requirement than an arrangement that shows the semantic relationships between the parts of the text, which is what glossing sets out to do. I don't think it should drive the structure of the text.

In my character apps i use a similar approach to explode words into characters and annotate them with transliterations, but there is a small icon close by that allows me to copy each line.

For example, go to https://r12a.github.io/pickers/deva-ks/?text=%E0%A4%B8%E0%A5%97%E0%A4%A4%E0%A5%8D%E0%A4%AF%E0%A5%8D and click on List Characters, above the large box. Look bottom right to see the 'glossed' character list. To pick up the word सॗत्य् click on the copy icon, below, with B in it. To pick up the transliteration, click on the icon with L in it. To pick up the word सॗत्य् click on the overlapping squares icon to its left. To pick up the transliteration, click on the similar icon alongside it.

@js-choi

js-choi commented Jul 18, 2022

I think this is a fascinating question, but I wonder—if these glosses are essentially a table that flows and wraps, then is there any particular reason why a <table> would not work? Semantically, every cell in a table is associated with both its column (e.g., an entire content run in one specific script/language) and its row (e.g., one phrase transcribed/translated across many scripts/languages).

<table> elements don’t have to use display: table; they can use display: grid and responsively flow and wrap.
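A hedged sketch of that idea follows. Each `<tr>` holds one word across the tiers and is laid out as a small grid; to get the rows themselves to wrap, this sketch swaps in `display: flex` with `flex-wrap` on the table rather than `display: grid`, since variable-width words do not fit fixed grid tracks well. The class name `.igt` is hypothetical, the gloss "forever" is an assumption based on the Leipzig Glossing Rules example, and whether browsers still expose table semantics to assistive technology once `display` is overridden is not something verified here.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Wrapping table sketch</title>
<style>
  /* Sketch only: .igt is an assumed class name. */
  .igt       { display: flex; flex-wrap: wrap; }   /* rows flow and wrap like words */
  .igt tbody { display: contents; }                /* let each tr become a flex item */
  .igt tr    { display: grid; margin-right: .75em; margin-top: .5em; } /* tiers stacked */
  .igt td    { display: block; padding: 0; }
</style>
</head>
<body>
  <table class="igt">
    <tbody>
      <!-- One row per word; first cell is the source word, second its gloss. -->
      <tr><td>ferma</td><td>farm</td></tr>
      <tr><td>hamišaluǧ</td><td>forever</td></tr>
    </tbody>
  </table>
</body>
</html>
```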
