Support Vietnamese for Text Extractor #36978
Labels
Product-Text Extractor: This refers to the Text Extractor PowerToy
Resolution-Already Fixed/Doesn't Apply: A change in the product has made the issue obsolete.
Description of the new feature / enhancement
The Text Extractor tool does not currently fully support Vietnamese. I propose enhancing it so that it accurately recognizes Vietnamese text, including the diacritical marks and the distinctive way words are arranged in the language.
Scenario when this would be used?
Managing documents in Vietnam for both personal and professional purposes.
Supporting translation, archiving, or data processing tasks involving Vietnamese text in fields such as education, administration, or commerce.
Supporting information
The Vietnamese alphabet has 29 letters, and vowels can carry one of five tone marks: acute (́), grave (̀), tilde (̃), hook above (̉), and dot below (̣).
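To illustrate why these marks matter for text recognition, here is a minimal Python sketch that uses only the standard `unicodedata` module. It is not how Text Extractor works internally. The helper names `tone_marks_in` and `same_text` are hypothetical and exist only for this example. It shows that the same Vietnamese word can be encoded as precomposed or decomposed code points, so recognized text should probably be normalized before it is compared or stored.

```python
import unicodedata

# The five combining tone marks listed above (name -> Unicode code point).
TONE_MARKS = {
    "acute": "\u0301",
    "grave": "\u0300",
    "tilde": "\u0303",
    "hook above": "\u0309",
    "dot below": "\u0323",
}


def tone_marks_in(text: str) -> list:
    """Hypothetical helper: names of the tone marks in text, in order."""
    # NFD splits precomposed letters into base letter + combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    by_char = {ch: name for name, ch in TONE_MARKS.items()}
    return [by_char[ch] for ch in decomposed if ch in by_char]


def same_text(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


precomposed = "Vi\u1ec7t"        # "Việt" using the single code point U+1EC7
decomposed = "Vie\u0323\u0302t"  # e + dot below + circumflex

print(precomposed == decomposed)           # raw code points differ
print(same_text(precomposed, decomposed))  # equal once normalized
print(tone_marks_in("Ti\u1ebfng Vi\u1ec7t"))
```

Both strings render identically on screen, which is why a raw string comparison of OCR output against expected text can fail even when the recognition was correct.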
References for Vietnamese language processing:
Research on Vietnamese Natural Language Processing.
Open-source libraries supporting Vietnamese, such as Pyvi and VnCoreNLP.