Closed
I came across a discrepancy between cmark and commonmark.js output:
$ echo '**。**话' | ./cmark/build/src/cmark
<p>**。**话</p>
$ echo '**。**话' | ./commonmark.js/bin/commonmark
<p><strong>。</strong>话</p>
So, according to spec v26,
A punctuation character is an ASCII punctuation character or anything in the Unicode classes Pc, Pd, Pe, Pf, Pi, Po, or Ps.
The character "。" (U+3002) belongs to the class Punctuation, Other [Po]
(see http://www.fileformat.info/info/unicode/char/3002/index.htm), but it's not included here:
For reference, here's the regexp from the unicode-8.0.0 package (we use it in markdown-it), which includes this character (and appears to be a lot larger):
https://github.com/mathiasbynens/unicode-8.0.0/blob/master/General_Category/Punctuation/regex.js
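The category claim can also be checked independently of either parser. Below is a minimal sketch using Python's standard `unicodedata` module; the helper name `is_punctuation` is hypothetical and only mirrors the spec wording quoted above, not the code of cmark, commonmark.js, or markdown-it.

```python
import string
import unicodedata

# ASCII punctuation characters, taken from Python's standard library.
ASCII_PUNCTUATION = set(string.punctuation)

# Unicode general categories that the quoted spec text treats as punctuation.
PUNCTUATION_CATEGORIES = {"Pc", "Pd", "Pe", "Pf", "Pi", "Po", "Ps"}


def is_punctuation(ch: str) -> bool:
    """Hypothetical helper: True if the single character ch is punctuation
    under the definition quoted from the spec."""
    return (
        ch in ASCII_PUNCTUATION
        or unicodedata.category(ch) in PUNCTUATION_CATEGORIES
    )


# Print code point, general category, and the punctuation verdict
# for the characters from the failing example.
for ch in "。话*":
    print(f"U+{ord(ch):04X} {unicodedata.category(ch)} {is_punctuation(ch)}")
```

If this reports U+3002 as Po, as the fileformat.info page does, then a parser following the quoted definition would be expected to treat "。" as punctuation when deciding whether `**` can open or close emphasis.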