In Japanese, Emphasis and Strong have a bug #1107

Closed
4 tasks done
Neos21 opened this issue Jan 18, 2023 · 4 comments
Labels
🙅 no/wontfix This is not (enough of) an issue for this project 👎 phase/no Post cannot or will not be acted on

Comments

@Neos21

Neos21 commented Jan 18, 2023

Initial checklist

Affected packages and versions

[email protected], [email protected]

Link to runnable example

No response

Steps to reproduce

I use Node.js v18.12.1 and npm v8.19.2. I wrote a plain Node.js script.

  • package.json : devDependencies
{
  "rehype-stringify": "9.0.3",
  "remark-parse": "10.0.1",
  "remark-rehype": "10.1.0",
  "unified": "10.1.2"
}
  • Code:
import { unified } from 'unified';
import remarkParse from 'remark-parse';
import remarkRehype from 'remark-rehype';
import rehypeStringify from 'rehype-stringify';

const processor = unified()
  .use(remarkParse)
  .use(remarkRehype)
  .use(rehypeStringify);
const result = processor.processSync(`
1. 日本語*強調*、日本語
2. 日本語*強調*。日本語
3. 日本語*強調、*日本語
4. 日本語*強調。*日本語
5. 日本語**強調**、日本語
6. 日本語**強調**。日本語
7. 日本語**強調、**日本語
8. 日本語**強調。**日本語
`);
console.log(result.value);
  • There are 8 examples.
  • The first four should be converted to <em>.
  • The last four should be converted to <strong>.
  • However, Nos. 3, 4, 7, and 8 were not converted.
    • I suspect a regexp has a bug involving the characters 、 and 。.

Expected behavior

<ol>
<li>日本語<em>強調</em>、日本語</li>
<li>日本語<em>強調</em>。日本語</li>
<li>日本語<em>強調、</em>日本語</li>
<li>日本語<em>強調。</em>日本語</li>
<li>日本語<strong>強調</strong>、日本語</li>
<li>日本語<strong>強調</strong>。日本語</li>
<li>日本語<strong>強調、</strong>日本語</li>
<li>日本語<strong>強調。</strong>日本語</li>
</ol>

Actual behavior

<ol>
<li>日本語<em>強調</em>、日本語</li>
<li>日本語<em>強調</em>。日本語</li>
<li>日本語*強調、*日本語</li>
<li>日本語*強調。*日本語</li>
<li>日本語<strong>強調</strong>、日本語</li>
<li>日本語<strong>強調</strong>。日本語</li>
<li>日本語**強調、**日本語</li>
<li>日本語**強調。**日本語</li>
</ol>

Runtime

Node v17, Other (please specify in steps to reproduce)

Package manager

npm 8

OS

Windows

Build and bundle tools

Other (please specify in steps to reproduce)

@github-actions github-actions bot added 👋 phase/new Post is being triaged automatically 🤞 phase/open Post is being triaged manually and removed 👋 phase/new Post is being triaged automatically labels Jan 18, 2023
@Neos21
Author

Neos21 commented Jan 18, 2023

I found another bug with the character and . Maybe more characters have the same bug.

@ChristianMurphy
Member

Welcome @Neos21! 👋
Sorry you ran into a spot of trouble.

Some background: remark implements CommonMark (https://commonmark.org/) or, with remark-gfm, GFM (https://github.github.com/gfm/).
The output you are currently seeing from remark is expected; it is how emphasis works in CommonMark and GFM.

example of rendering in the commonmark reference implementation: https://spec.commonmark.org/dingus/?text=1.%20%E6%97%A5%E6%9C%AC%E8%AA%9E*%E5%BC%B7%E8%AA%BF*%E3%80%81%E6%97%A5%E6%9C%AC%E8%AA%9E%0A2.%20%E6%97%A5%E6%9C%AC%E8%AA%9E*%E5%BC%B7%E8%AA%BF*%E3%80%82%E6%97%A5%E6%9C%AC%E8%AA%9E%0A3.%20%E6%97%A5%E6%9C%AC%E8%AA%9E*%E5%BC%B7%E8%AA%BF%E3%80%81*%E6%97%A5%E6%9C%AC%E8%AA%9E%0A4.%20%E6%97%A5%E6%9C%AC%E8%AA%9E*%E5%BC%B7%E8%AA%BF%E3%80%82*%E6%97%A5%E6%9C%AC%E8%AA%9E%0A5.%20%E6%97%A5%E6%9C%AC%E8%AA%9E**%E5%BC%B7%E8%AA%BF**%E3%80%81%E6%97%A5%E6%9C%AC%E8%AA%9E%0A6.%20%E6%97%A5%E6%9C%AC%E8%AA%9E**%E5%BC%B7%E8%AA%BF**%E3%80%82%E6%97%A5%E6%9C%AC%E8%AA%9E%0A7.%20%E6%97%A5%E6%9C%AC%E8%AA%9E**%E5%BC%B7%E8%AA%BF%E3%80%81**%E6%97%A5%E6%9C%AC%E8%AA%9E%0A8.%20%E6%97%A5%E6%9C%AC%E8%AA%9E**%E5%BC%B7%E8%AA%BF%E3%80%82**%E6%97%A5%E6%9C%AC%E8%AA%9E

and an example of GFM rendering the same content here on GitHub itself:


  1. 日本語強調、日本語
  2. 日本語強調。日本語
  3. 日本語*強調、*日本語
  4. 日本語*強調。*日本語
  5. 日本語強調、日本語
  6. 日本語強調。日本語
  7. 日本語**強調、**日本語
  8. 日本語**強調。**日本語

remark implements CommonMark to spec. The implementation as it stands matches the specified behavior; deviating from it would be a bug.
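
For readers wondering why only examples 3, 4, 7, and 8 stay literal: the CommonMark spec lets a `*` run close emphasis only if it is right-flanking, and a run that is preceded by punctuation (such as 、 or 。) and followed by an ordinary character (such as 日) is not. Below is a minimal sketch of that check. It is not remark's actual implementation; the helper names `isWhitespace`, `isPunctuation`, and `flanking` are made up for illustration, and the whitespace and punctuation tests only approximate the spec's definitions.

```javascript
// Simplified sketch of CommonMark's left/right-flanking test for a `*` run.
// `before` and `after` are the characters on either side of the run;
// pass undefined at the start or end of a line (treated as whitespace).
const isWhitespace = (ch) => ch === undefined || /\s/u.test(ch);
// Approximation: Unicode general category P (includes 、 and 。).
const isPunctuation = (ch) => ch !== undefined && /\p{P}/u.test(ch);

function flanking(before, after) {
  const leftFlanking =
    !isWhitespace(after) &&
    (!isPunctuation(after) || isWhitespace(before) || isPunctuation(before));
  const rightFlanking =
    !isWhitespace(before) &&
    (!isPunctuation(before) || isWhitespace(after) || isPunctuation(after));
  // For `*` (unlike `_`), flanking alone decides open/close ability.
  return { canOpen: leftFlanking, canClose: rightFlanking };
}

// Example 1: 日本語*強調*、日本語
console.log(flanking('語', '強')); // first `*`, the opener
console.log(flanking('調', '、')); // second `*`, just before 、
// Example 3: 日本語*強調、*日本語
console.log(flanking('、', '日')); // second `*`, just after 、
```

Under these assumptions the three calls report `{ canOpen: true, canClose: true }`, `{ canOpen: false, canClose: true }`, and `{ canOpen: true, canClose: false }`. In example 3 the second `*` can open but cannot close, so no emphasis is formed and the asterisks are left as literal text.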

The way to make a change would be to have the CommonMark standard change.
If you'd like to nudge CommonMark towards better support for more languages, feel free to reach out in the specification discussion forum (https://talk.commonmark.org/),
in particular in this thread: https://talk.commonmark.org/t/emphasis-and-east-asian-text/2491

@ChristianMurphy ChristianMurphy closed this as not planned Jan 18, 2023
@tats-u

tats-u commented Jan 26, 2025

If you landed on this page while looking for a solution, you could try https://www.npmjs.com/package/remark-cjk-friendly as a stopgap for the time being.

3 participants