Commit 4b981af

Add pre- and post-filter configuration options and update filter examples to use them.
1 parent 7b644d2 commit 4b981af

4 files changed: +271 -77 lines changed


advanced/synchronize/filters.rst

Lines changed: 132 additions & 75 deletions
@@ -9,16 +9,25 @@ pre- and post-filters associated with MathJax's input and output jax.
99
These are prioritized lists of functions that run either before or
1010
after the jax processes a :data:`MathItem`, and they can be used to
1111
pre-process or post-process MathJax's compiling and typesetting
12-
functions. Input jax have both pre- and post-filters, but output jax
13-
have only post-filters; pre-filtering can be done by an input jax
14-
post-filter, if needed.
12+
functions. Input and output jax have both pre- and post-filters, and
13+
the MathML input jax has an extra set of filters for the parsed MathML
14+
as well.
1515

16-
To add a pre- or post-filter to an input jax use
16+
When using the :ref:`MathJax Components framework <web-components>`, you
17+
can use the MathJax configuration object to specify input and output
18+
jax filters. The :data:`preFilters` and :data:`postFilters`
19+
configuration options in the :data:`tex`, :data:`mathml`,
20+
:data:`output`, :data:`chtml`, or :data:`svg` blocks allow you to
21+
specify arrays of filters (or filters together with their priorities).
22+
See the :ref:`configuring-mathjax` section for details.
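As a rough sketch of the two entry forms mentioned above, the following configuration lists one bare filter and one filter paired with a priority. The ``[function, priority]`` pair form, the default priority of 5, and the rule that lower priorities run first are assumptions made for this illustration, and ``runOrder`` is a hypothetical helper that only shows the resulting order; the :ref:`configuring-mathjax` section remains the authoritative description.

```javascript
// Two hypothetical filters; each receives an object with math, document, and data.
const trimTeX = ({math}) => { math.math = math.math.trim(); };
const logTeX = ({math}) => { console.log('TeX:', math.math); };

// In a web page, this object would be assigned to the global MathJax variable.
const config = {
  tex: {
    preFilters: [
      logTeX,        // bare function: assumed to receive the default priority
      [trimTeX, 1],  // assumed [function, priority] pair
    ],
  },
};

// Hypothetical helper: the filter names in the order they would run,
// assuming a default priority of 5 and lower priorities running first.
function runOrder(filters, defaultPriority = 5) {
  return filters
    .map((item) => (Array.isArray(item) ? item : [item, defaultPriority]))
    .sort((a, b) => a[1] - b[1])
    .map(([fn]) => fn.name);
}

console.log(runOrder(config.tex.preFilters));
```

Under these assumptions, ``trimTeX`` runs before ``logTeX`` even though it is listed second.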
23+
24+
When using direct access to the MathJax modules in node applications,
25+
to add a pre- or post-filter to an input jax use
1726

1827
.. js:function:: InputJax.preFilters.add(fn, priority)
1928
InputJax.postFilters.add(fn, priority)
2029

21-
:param (arg) => boolean|void: The filter function to be called.
30+
:param (arg)=>boolean|void: The filter function to be called.
2231
The :data:`arg` argument is an object
2332
with three keys: :data:`math`,
2433
:data:`document`, and :data:`data`.
@@ -35,7 +44,8 @@ To add a pre- or post-filter to an input jax use
3544
functions anywhere in the filter list.
3645

3746
For the TeX input jax, the :data:`data` item is the
38-
:data:`ParseOptions` object for the input jax.
47+
:data:`ParseOptions` object for the input jax, which holds
48+
configuration data about the TeX input jax.
3949

4050
For the MathML input jax, the pre-filter only runs in the case that
4151
the MathML is a serialized MathML string, as it is when converting a
@@ -54,14 +64,15 @@ input jax converts the MathML into MathJax's internal format. The
5464
The AsciiMath input jax does not currently execute any pre- or
5565
post-filters.
5666

57-
For an output jax, the post-filters can be added via
67+
For an output jax, the pre- and post-filters can be added via
5868

59-
.. js:function:: OutputJax.postFilters.add(fn, priority)
69+
.. js:function:: OutputJax.preFilters.add(fn, priority)
70+
OutputJax.postFilters.add(fn, priority)
6071

6172
with arguments as above. In this case, the :data:`data` is the
6273
``mjx-container`` node in which the output DOM elements have been
6374
placed. This will become the :data:`MathItem.typesetRoot` value, but
64-
it has not yet been set when the post-filters run.
75+
it has not yet been set when the filters run.
6576

6677
In an application that is using MathJax Components, the input jax can
6778
be obtained from :data:`MathJax.startup.document.inputJax.tex` or
@@ -74,6 +85,7 @@ to them; if not, then they can be obtained from the
7485
:js:meth:`mathjax.document()` by using that in place of
7586
:data:`MathJax.startup.document` above.
7687

88+
7789
-----
7890

7991
.. _filter-number-space:
@@ -90,20 +102,17 @@ displayed as ``12345``.
90102
91103
MathJax = {
92104
tex: {
93-
numberPattern: /^(?:[0-9]+(?:(?: +|\{,\})[0-9]+)*(?:\.[0-9]*)?|\.[0-9]+)/
94-
},
95-
startup: {
96-
ready() {
97-
MathJax.startup.defaultReady();
98-
MathJax.startup.document.inputJax.tex.postFilters.add(({data}) => {
105+
numberPattern: /^(?:[0-9]+(?:(?: +|\{,\})[0-9]+)*(?:\.[0-9]*)?|\.[0-9]+)/,
106+
postFilters: [
107+
({data}) => {
99108
for (const mn of data.getList('mn')) {
100109
const textNode = mn.childNodes[0];
101110
textNode.text = textNode.text.replace(/ /g, '');
102111
}
103-
});
104-
}
105-
}
106-
}
112+
}
113+
],
114+
},
115+
};
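The string side of this example can be tried outside of MathJax. The sketch below reuses the :data:`numberPattern` from the configuration above; ``leadingNumber`` is a hypothetical helper, not part of MathJax, and it imitates only the matching and space removal, not the creation of ``mn`` nodes.

```javascript
// The pattern from the configuration above: digits that may be grouped by
// spaces or by {,}, followed by an optional decimal part.
const numberPattern = /^(?:[0-9]+(?:(?: +|\{,\})[0-9]+)*(?:\.[0-9]*)?|\.[0-9]+)/;

// Hypothetical helper: return the leading number with its spaces removed,
// or null when the string does not start with a number.
function leadingNumber(tex) {
  const match = numberPattern.exec(tex);
  return match ? match[0].replace(/ /g, '') : null;
}

console.log(leadingNumber('12 345 + x'));
console.log(leadingNumber('x + 1'));
```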
107116
108117
We set the :data:`numberPattern` option to allow spaces within the
109118
number, and then use a post-filter to remove the spaces from the text
@@ -123,18 +132,15 @@ to better quality output.
123132
.. code-block:: js
124133
125134
MathJax = {
126-
startup: {
127-
ready() {
128-
MathJax.startup.defaultReady();
129-
MathJax.startup.document.inputJax.tex.preFilters.add(
130-
({math}) => {
131-
math.math = math.math.replace(/[\uFF01-\uFF5E]/g,
132-
(c) => String.fromCodePoint(c.codePointAt(0) - 0xFF00 + 0x20));
133-
}
134-
);
135-
}
135+
tex: {
136+
preFilters: [
137+
({math}) => {
138+
math.math = math.math.replace(/[\uFF01-\uFF5E]/g,
139+
(c) => String.fromCodePoint(c.codePointAt(0) - 0xFF00 + 0x20));
140+
}
141+
]
136142
}
137-
}
143+
};
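The character arithmetic in the replacement callback can be checked on its own. In this sketch, ``toAscii`` is a hypothetical wrapper around the same ``replace()`` call; it relies on the full-width forms U+FF01 through U+FF5E being laid out in the same order as ASCII U+0021 through U+007E.

```javascript
// Shift each full-width character down onto its ASCII counterpart.
function toAscii(tex) {
  return tex.replace(/[\uFF01-\uFF5E]/g,
    (c) => String.fromCodePoint(c.codePointAt(0) - 0xFF00 + 0x20));
}

// Full-width x, plus sign, and digit one.
console.log(toAscii('\uFF58\uFF0B\uFF11'));
```

Characters outside the full-width range, including ordinary ASCII, pass through unchanged.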
138144
139145
This uses a pre-filter to replace characters in the full-width range
140146
by an equivalent one in the usual ASCII character range. This will
@@ -155,31 +161,25 @@ and subscripts.
155161
.. code-block:: js
156162
157163
MathJax = {
158-
startup: {
159-
ready() {
160-
//
161-
// Do usual setup
162-
//
163-
MathJax.startup.defaultReady();
164-
//
165-
// The pseudoscript numbers 0 through 9, and a pattern for plus-or-minus a number
166-
//
167-
const scripts = '\u2070\u00B9\u00B2\u00B3\u2074\u2075\u2076\u2077\u2078\u2079';
168-
const scriptRE = /([\u207A\u207B])?([\u2070\u00B9\u00B2\u00B3\u2074-\u2079]+)/g;
169-
//
170-
// Add a TeX prefilter to convert pseudoscript numbers to actual superscripts
171-
//
172-
MathJax.startup.document.inputJax.tex.preFilters.add(({math}) => {
173-
math.math = math.math.replace(scriptRE, (match, pm, n) => {
174-
const N = n.split('').map(c => scripts.indexOf(c)); // convert digits
164+
//
165+
// The pseudoscript numbers 0 through 9, and a pattern for plus-or-minus a number
166+
//
167+
scripts: '\u2070\u00B9\u00B2\u00B3\u2074\u2075\u2076\u2077\u2078\u2079',
168+
scriptRE: /([\u207A\u207B])?([\u2070\u00B9\u00B2\u00B3\u2074-\u2079]+)/g,
169+
170+
tex: {
171+
preFilters: [
172+
({math}) => {
173+
math.math = math.math.replace(MathJax.config.scriptRE, (match, pm, n) => {
174+
const N = n.split('').map(c => MathJax.config.scripts.indexOf(c)); // convert digits
175175
pm === '\u207A' && N.unshift('+'); // add plus, if given
176176
pm === '\u207B' && N.unshift('-'); // add minus, if given
177177
return '^{' + N.join('') + '}'; // make it an actual power
178178
});
179-
});
180-
}
179+
}
180+
]
181181
}
182-
}
182+
};
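To see what the pre-filter does to a TeX string, the same replacement can be run as a plain function. This is a sketch only: ``expandSuperscripts`` is a hypothetical name, and the constants are copied from the configuration above rather than read from :data:`MathJax.config`.

```javascript
// Superscript digits 0 through 9, and a pattern for an optional sign plus digits.
const scripts = '\u2070\u00B9\u00B2\u00B3\u2074\u2075\u2076\u2077\u2078\u2079';
const scriptRE = /([\u207A\u207B])?([\u2070\u00B9\u00B2\u00B3\u2074-\u2079]+)/g;

// Replace each run of superscript characters by an explicit TeX power.
function expandSuperscripts(tex) {
  return tex.replace(scriptRE, (match, pm, n) => {
    const N = n.split('').map((c) => scripts.indexOf(c));  // convert digits
    if (pm === '\u207A') N.unshift('+');                    // add plus, if given
    if (pm === '\u207B') N.unshift('-');                    // add minus, if given
    return '^{' + N.join('') + '}';
  });
}

// x squared plus ten to the minus three, written with Unicode superscripts.
console.log(expandSuperscripts('x\u00B2 + 10\u207B\u00B3'));
```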
183183
184184
This uses a TeX input jax pre-filter to scan the TeX expression for
185185
Unicode superscript numerals, with optional plus or minus signs, and
@@ -204,22 +204,21 @@ those measurements to `px` units instead.
204204
.. code-block:: js
205205
206206
MathJax = {
207-
startup: {
208-
ready() {
209-
MathJax.startup.defaultReady();
210-
const fixed = MathJax.startup.document.outputJax.fixed;
211-
MathJax.startup.document.outputJax.postFilters.add(({data}) => {
207+
svg: {
208+
postFilters: [
209+
({data}) => {
210+
const fixed = MathJax.startup.document.outputJax.fixed;
212211
const svg = data.querySelector('svg');
213212
if (svg?.hasAttribute('viewBox')) {
214213
const [ , , w, h] = svg.getAttribute('viewBox').split(/ /);
215-
const em = document.outputJax.pxPerEm / 1000;
214+
const em = MathJax.startup.document.outputJax.pxPerEm / 1000;
216215
svg.setAttribute('width', fixed(w * em) + 'px');
217216
svg.setAttribute('height', fixed(h * em) + 'px');
218217
}
219-
});
220-
}
218+
}
219+
]
221220
}
222-
}
221+
};
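The unit conversion can be sketched without a DOM. Here ``viewBoxToPx`` is a hypothetical helper, and the local ``fixed`` function is only a stand-in for the output jax's :meth:`fixed()` method (assumed to round to three decimal places and drop trailing zeros); as in the filter above, the ``viewBox`` dimensions are taken to be in thousandths of an em.

```javascript
// Stand-in for the output jax's fixed() method (an assumption, see above).
function fixed(value) {
  return value.toFixed(3).replace(/\.?0+$/, '');
}

// Convert the width and height fields of a viewBox string to px lengths.
function viewBoxToPx(viewBox, pxPerEm) {
  const [ , , w, h] = viewBox.split(/ /);
  const em = pxPerEm / 1000;
  return {width: fixed(w * em) + 'px', height: fixed(h * em) + 'px'};
}

console.log(viewBoxToPx('0 -750 1500 1000', 16));
```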
223222
224223
We use an output jax post-filter to modify the ``svg`` element's
225224
attributes, taking advantage of the output jax's :meth:`fixed()`
@@ -239,18 +238,18 @@ This configuration implements a substitute for the v2 `autobold` extension.
239238
.. code-block:: js
240239
241240
MathJax = {
242-
startup: {
243-
ready() {
244-
MathJax.startup.defaultReady();
245-
MathJax.startup.document.inputJax.tex.preFilters.add(({math}) => {
241+
tex: {
242+
preFilters: [
243+
({math}) => {
246244
const styles = window.getComputedStyle(math.start.node.parentNode);
247-
if (styles.fontWeight >= 700) {
245+
if (styles.fontWeight >= 700 && !math.inputData.bolded) {
248246
math.math = '\\boldsymbol{' + math.math + '}';
247+
math.inputData.bolded = true;
249248
}
250-
});
251-
}
249+
}
250+
]
252251
}
253-
}
252+
};
254253
255254
It uses a TeX input jax pre-filter that tests if the parent element of
256255
the math string has CSS with ``font-weight`` of 700 or more (the
@@ -259,6 +258,12 @@ usual ``bold`` value), and if so, it wraps the TeX code in
259258
expression itself includes bold notation, that does not become extra
260259
bold, so may not be distinguishable from the rest of the expression.
261260
261+
We track the fact that bolding has been added using the
262+
:data:`inputData` object of the :data:`math` object. That way, if the
263+
expression needs to be reparsed (e.g., for a ``\require`` command, or
264+
other dynamic data being loaded), we won't add ``\boldsymbol`` more
265+
than once.
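The effect of the guard can be sketched with a mock object in place of a real :data:`MathItem`. In this illustration, ``addBold`` is a hypothetical function holding the body of the pre-filter, with the computed font weight passed in directly, and the second call stands in for a reparse of the same expression.

```javascript
// Body of the pre-filter, with the font weight supplied by the caller.
function addBold(math, fontWeight) {
  if (fontWeight >= 700 && !math.inputData.bolded) {
    math.math = '\\boldsymbol{' + math.math + '}';
    math.inputData.bolded = true;
  }
}

const math = {math: 'x+y', inputData: {}};  // mock MathItem
addBold(math, 700);
addBold(math, 700);  // simulated reparse: the guard prevents a second wrapper
console.log(math.math);
```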
266+
262267
-----
263268
264269
.. _filter-mathvariant:
@@ -330,6 +335,23 @@ browsers that implement MathML-Core.
330335
0x6F: 0x2134,
331336
}, '\uFE00'],
332337
'-tex-bold-calligraphic': [0, 0x1D4D0, 0x1D4EA, 0, 0, {}, '\uFE00'],
338+
'-tex-mathit': [0, 0x1D434, 0x1D44E, 0x1D6E2, 0x1D6FC, {0x68: 0x210E}],
339+
};
340+
//
341+
// Styles to use for characters that can't be translated.
342+
//
343+
const variantStyles = {
344+
bold: 'font-weight: bold',
345+
italic: 'font-style: italic',
346+
'bold-italic': 'font-weight: bold; font-style: italic',
347+
'script': 'font-family: cursive',
348+
'bold-script': 'font-family: cursive; font-weight: bold',
349+
'sans-serif': 'font-family: sans-serif',
350+
'bold-sans-serif': 'font-family: sans-serif; font-weight: bold',
351+
'sans-serif-italic': 'font-family: sans-serif; font-style: italic',
352+
'sans-serif-bold-italic': 'font-family: sans-serif; font-weight: bold; font-style: italic',
353+
'monospace': 'font-family: monospace',
354+
'-tex-mathit': 'font-style: italic',
333355
};
334356
//
335357
// The filter function
@@ -362,11 +384,20 @@ browsers that implement MathML-Core.
362384
//
363385
// Convert the text of the child nodes
364386
//
387+
let converted = true;
365388
for (const child of node.childNodes) {
366389
if (child.isKind('text')) {
367-
convertText(child, start, remap, modifier);
390+
converted &= convertText(child, start, remap, modifier);
368391
}
369392
}
393+
//
394+
// If not all characters were converted, add styles, if possible,
395+
// but not when it would already be in italics.
396+
//
397+
if (!converted &&
398+
!(['italic', '-tex-mathit'].includes(variant) && text.length === 1 && node.isKind('mi'))) {
399+
addStyles(node, variant);
400+
}
370401
});
371402
}
372403
//
@@ -380,6 +411,7 @@ browsers that implement MathML-Core.
380411
//
381412
// Loop through the characters in the text
382413
//
414+
let converted = 0;
383415
for (let i = 0; i < text.length; i++) {
384416
let C = text[i].codePointAt(0);
385417
//
@@ -395,9 +427,11 @@ browsers that implement MathML-Core.
395427
//
396428
if (map[C]) {
397429
text[i] = String.fromCodePoint(map[C] - m + start[j]) + modifier;
430+
converted++;
398431
break;
399432
} else if (remap[C] || C <= M) {
400433
text[i] = String.fromCodePoint(remap[C] || C - m + start[j]) + modifier;
434+
converted++;
401435
break;
402436
}
403437
}
@@ -406,17 +440,34 @@ browsers that implement MathML-Core.
406440
// Put back the modified text content
407441
//
408442
node.setText(text.join(''));
443+
//
444+
// Return true if all characters were converted, false otherwise.
445+
//
446+
return converted === text.length;
409447
}
410448
//
411-
// Add the input post-filters
449+
// Add styles when conversion isn't possible.
450+
//
451+
function addStyles(node, variant) {
452+
let styles = variantStyles[variant];
453+
if (styles) {
454+
if (node.attributes.hasExplicit('style')) {
455+
styles = node.attributes.get('style') + '; ' + styles;
456+
}
457+
node.attributes.set('style', styles);
458+
}
459+
}
460+
461+
//
462+
// Add the post-filters to all input jax
412463
//
413464
MathJax.startup.defaultReady();
414465
for (const jax of MathJax.startup.document.inputJax) {
415466
jax.postFilters.add(({data}) => unicodeVariants(data.root || data));
416467
}
417468
}
418469
}
419-
}
470+
};
420471
421472
This example adds a post-filter to each of the input jax that are
422473
loaded (so it will work with both the MathML input as well as TeX
@@ -425,12 +476,16 @@ elements with :attr:`mathvariant` attributes, and then converts the
425476
content of the child text nodes of those token nodes to use the proper
426477
Unicode values for any alphabetic, numeric, or Greek characters that
427478
can be represented using the Mathematical Alphanumeric and Letterlike
428-
Symbols blocks.
479+
Symbols blocks. If any characters can't be converted to something in
480+
these blocks, we use a :attr:`style` attribute, when possible, to
481+
simulate the proper output.
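The idea behind the conversion, and the point at which a style fallback is needed, can be sketched for a single variant. This is a simplified illustration, not the filter itself: ``toMathItalic`` is a hypothetical function that handles only the Latin letters of the ``-tex-mathit`` entry (U+1D434 for ``A``, U+1D44E for ``a``, and U+210E for ``h``) and reports whether every character could be converted.

```javascript
const UPPER_A = 0x1D434;        // MATHEMATICAL ITALIC CAPITAL A
const LOWER_A = 0x1D44E;        // MATHEMATICAL ITALIC SMALL A
const remap = {0x68: 0x210E};   // italic h lives in the Letterlike Symbols block

function toMathItalic(text) {
  let converted = 0;
  const chars = Array.from(text).map((c) => {
    const C = c.codePointAt(0);
    let target = null;
    if (remap[C]) target = remap[C];
    else if (C >= 0x41 && C <= 0x5A) target = C - 0x41 + UPPER_A;
    else if (C >= 0x61 && C <= 0x7A) target = C - 0x61 + LOWER_A;
    if (target === null) return c;  // no counterpart: leave the character as is
    converted++;
    return String.fromCodePoint(target);
  });
  // complete is false when some character needs a style-based fallback
  return {text: chars.join(''), complete: converted === chars.length};
}

console.log(toMathItalic('Ah'));
console.log(toMathItalic('a1'));
```

When ``complete`` is false, as for the digit in the second call, the full filter is the one that adds a :attr:`style` attribute to the token node.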
429482
430483
The :data:`ranges` variable gives the character ranges that will be
431-
converted, and the :data:`variants` object gives the data needed to
432-
make those ranges to the various Mathematical Alphanumerics characters
433-
for the different :attr:`mathvariant` values.
484+
converted, the :data:`variants` object gives the data needed to map
485+
those ranges to the various Mathematical Alphanumerics characters for
486+
the different :attr:`mathvariant` values, and the
487+
:data:`variantStyles` object holds the styles that need to be
488+
applied for each variant.
434489
435490
The special ``-tex-calligraphic`` and ``-tex-bold-calligraphic``
436491
variants are used internally in MathJax to produce the Chancery
@@ -444,7 +499,9 @@ for the TeX calligraphic variants. You may wish to add U+FE01 to the
444499
script variants to explicitly request the Roundhand versions as well.
445500
Note, however, that not all fonts support these variant specifiers, so
446501
you may get the same characters in both cases, and which you get will
447-
depend on the font.
502+
depend on the font. Some browsers may also show unknown character
503+
glyphs for these variation selectors when they don't understand how to
504+
process them.
448505
449506
450507
|-----|
