Skip to content

Standardize paragraph about ASCII, SBCS, and MBCS #5537

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -1,14 +1,13 @@
---
title: "Single-Byte and Multibyte Character Sets"
description: "An introduction to single and multi-byte character sets in the Microsoft runtime library."
ms.date: 11/04/2016
ms.topic: "concept-article"
ms.date: "11/04/2016"
helpviewer_keywords: ["SBCS (single byte character set)", "MBCS [C++], about MBCS", "character sets [C++], multibyte", "character sets [C++], single byte"]
ms.assetid: 2cbc78ea-33c0-4cfb-b0df-7ce2458431ce
---
# Single-byte and multibyte character sets

The ASCII character set defines characters in the range 0x00 - 0x7F. There are many other character sets, primarily European, that define the characters within the range 0x00 - 0x7F identically to the ASCII character set and also define an extended character set from 0x80 - 0xFF. An 8-bit, single-byte-character set (SBCS) is sufficient to represent the ASCII character set and the character sets for many European languages. However, some non-European character sets, such as Japanese Kanji, include many more characters than can be represented in a single-byte coding scheme, and so require multibyte-character set (MBCS) encoding.
The ASCII character set defines characters in the range 0x00 - 0x7F. There are other character sets, primarily European, that define the characters within the range 0x00 - 0x7F identically to the ASCII character set and also define an extended character set from 0x80 - 0xFF. Thus, an 8-bit, single-byte-character set (SBCS) is sufficient to represent the ASCII character set and the character sets for many European languages. However, some non-European character sets, such as Japanese Kanji, include many more characters than a single-byte coding scheme can represent, and therefore require multibyte-character set (MBCS) encoding.

> [!NOTE]
> Many Microsoft run-time library SBCS routines handle multibyte bytes, characters, and strings as appropriate. Many multibyte-character sets define the ASCII character set as a subset. In many multibyte character sets, each character in the range 0x00 - 0x7F is identical to the character that has the same value in the ASCII character set. For example, in both ASCII and MBCS character strings, the one-byte null character ('\0') has value 0x00 and indicates the terminating null character.
Expand Down
7 changes: 3 additions & 4 deletions docs/text/text-and-strings-in-visual-cpp.md
Original file line number Diff line number Diff line change
@@ -1,13 +1,12 @@
---
description: "Learn more about: Text and Strings in Visual C++"
title: "Text and Strings in Visual C++"
ms.date: "11/04/2016"
description: "Learn more about: Text and Strings in Visual C++"
ms.date: 11/04/2016
helpviewer_keywords: ["globalization [C++], character sets", "programming [C++], international", "multiple language support [C++]", "Unicode [C++]", "international applications [C++], about international applications", "portability [C++]", "translation [C++], character sets", "non-European characters [C++]", "character sets [C++]", "globalization [C++]", "Japanese characters [C++]", "Kanji character support [C++]", "local character sets [C++]", "ASCII characters [C++]", "character sets [C++], about character sets", "localization [C++], character sets", "translating code [C++]", "localization [C++]", "character sets [C++], non-European", "portability [C++], character sets", "MBCS [C++], international programming"]
ms.assetid: a1bb27ac-abe5-4c6b-867d-f761d4b93205
---
# Text and Strings in Visual C++

An important aspect of developing applications for international markets is the adequate representation of local character sets. The ASCII character set defines characters in the range 0x00 to 0x7F. There are other character sets, primarily European, that define the characters within the range 0x00 to 0x7F identically to the ASCII character set and also define an extended character set from 0x80 to 0xFF. Thus, an 8-bit, single-byte-character set (SBCS) is sufficient to represent the ASCII character set, as well as the character sets for many European languages. However, some non-European character sets, such as Japanese Kanji, include many more characters than a single-byte coding scheme can represent, and therefore require multibyte-character set (MBCS) encoding.
An important aspect of developing applications for international markets is the adequate representation of local character sets. The ASCII character set defines characters in the range 0x00 - 0x7F. There are other character sets, primarily European, that define the characters within the range 0x00 - 0x7F identically to the ASCII character set and also define an extended character set from 0x80 - 0xFF. Thus, an 8-bit, single-byte-character set (SBCS) is sufficient to represent the ASCII character set and the character sets for many European languages. However, some non-European character sets, such as Japanese Kanji, include many more characters than a single-byte coding scheme can represent, and therefore require multibyte-character set (MBCS) encoding.

## In This Section

Expand Down