Internal utf8 #4920
Open
PieterVdc wants to merge 60 commits into
This was linked to issues on Jun 14, 2026
Pull request overview
This PR moves the project toward using UTF-8 internally (instead of language-dependent custom encodings), and introduces a UniFont-based glyph source to replace DBC language fonts.
Changes:
- Added a UniFont `.hex` → `.fxfont` converter script and a helper batch file for generating font binaries.
- Refactored text rendering to decode UTF-8 codepoints and introduced a new `bflib_text` helper module.
- Simplified translation-table storage to keep UTF-8 strings (removed the prior internal-codepage conversion path).
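As a rough illustration of what decoding UTF-8 into codepoints involves, here is a minimal sketch. It is not the code from `src/bflib_text.c`: the function name `utf8_decode_sketch`, its signature, and the choice to return U+FFFD for malformed input are all assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical replacement value for malformed input (U+FFFD). */
#define REPLACEMENT_CODEPOINT 0xFFFDu

/*
 * Decodes one codepoint from the first bytes of s (at most len bytes).
 * Stores the result in *out and returns the number of bytes consumed.
 * Malformed, truncated, overlong, or surrogate sequences yield U+FFFD
 * and consume exactly one byte so the caller can resynchronise.
 */
static size_t utf8_decode_sketch(const unsigned char *s, size_t len, uint32_t *out)
{
    size_t need;   /* number of continuation bytes expected */
    uint32_t cp;   /* codepoint being assembled */
    uint32_t min;  /* smallest value allowed for this length (rejects overlongs) */
    size_t i;

    if (len == 0) {
        *out = 0;
        return 0;
    }
    if (s[0] < 0x80) {                 /* 0xxxxxxx: plain ASCII */
        *out = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) { /* 110xxxxx: 2-byte sequence */
        need = 1; cp = s[0] & 0x1F; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) { /* 1110xxxx: 3-byte sequence */
        need = 2; cp = s[0] & 0x0F; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) { /* 11110xxx: 4-byte sequence */
        need = 3; cp = s[0] & 0x07; min = 0x10000;
    } else {                            /* stray continuation or invalid lead byte */
        *out = REPLACEMENT_CODEPOINT;
        return 1;
    }
    if (len < need + 1) {               /* sequence cut off by end of buffer */
        *out = REPLACEMENT_CODEPOINT;
        return 1;
    }
    for (i = 1; i <= need; i++) {
        if ((s[i] & 0xC0) != 0x80) {    /* continuation bytes must be 10xxxxxx */
            *out = REPLACEMENT_CODEPOINT;
            return 1;
        }
        cp = (cp << 6) | (uint32_t)(s[i] & 0x3F);
    }
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        *out = REPLACEMENT_CODEPOINT;
        return 1;
    }
    *out = cp;
    return need + 1;
}
```

A renderer would call this in a loop, advancing by the returned byte count and passing each `uint32_t` codepoint to the font lookup.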
Reviewed changes
Copilot reviewed 10 out of 14 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| tools/fxfontmaker/unifont_hex_to_binary.py | New converter producing a fixed BMP index + glyph data block binary. |
| tools/fxfontmaker/make_fonts.bat | Batch helper to generate .fxfont outputs from UniFont .hex inputs. |
| src/config_translation.c | Stops converting translation strings into an internal codepage and stores UTF-8 directly. |
| src/config_keeperfx.c | Removes DBC-language setup hook during config load. |
| src/bflib_text.h | New header for UTF-8 decoding helper(s). |
| src/bflib_text.c | New UTF-8 decoding + codepage conversion utilities (used for UTF-8 decoding by renderer). |
| src/bflib_sprfnt.h | Updates font API signatures to accept uint32_t codepoints; removes old DBC font structures. |
| src/bflib_sprfnt.c | Major UTF-8 decoding/rendering changes; UniFont .fxfont loading; new DBC rendering path. |
| Makefile | Adds obj/bflib_text.o to the build. |
| config/fxdata/translation.toml | Adjusts a few language keys (RUS entries). |
…into internal_utf8

Internally use UTF-8 everywhere instead of the language-dependent custom encodings.
This also replaces the DBC language fonts with UniFont.
The language setting is far less relevant now.
Rendering can mix sprite fonts with a simpler fallback font.
It now checks which codepage a dat file was made for and loads it correctly regardless of the user's language.
So if a campaign only has Japanese text and the game is set to English, it displays the Japanese text instead of the corrupted gibberish it used to show.
The same holds in the opposite direction if the English text uses any non-ASCII characters.
QUICK_MESSAGE etc. now all accept UTF-8 as well.
The fallback font covers everything in the BMP, so if someone wants a Bamum translation or whatever weird shit they can come up with, it's likely covered already.
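The sprite-font-then-fallback idea can be sketched roughly as below. This is only an illustration: the struct layouts, the `NO_GLYPH` sentinel, and the name `pick_glyph_source` are hypothetical and do not reflect the actual `.fxfont` format or the code in `src/bflib_sprfnt.c`. The only thing taken from the PR is that the fallback font is indexed over the BMP (U+0000 to U+FFFF).

```c
#include <stddef.h>
#include <stdint.h>

#define BMP_GLYPH_COUNT 0x10000u   /* the BMP spans U+0000..U+FFFF */
#define NO_GLYPH UINT32_MAX        /* hypothetical "no glyph" marker in the index */

enum glyph_source { GLYPH_NONE, GLYPH_SPRITE, GLYPH_FALLBACK };

/* Hypothetical sprite font covering one contiguous codepoint range. */
struct sprite_font_sketch {
    uint32_t first_cp;     /* first codepoint that has a sprite */
    uint32_t glyph_count;  /* number of consecutive codepoints covered */
};

/* Hypothetical fallback font: one index entry per BMP codepoint. */
struct fallback_font_sketch {
    const uint32_t *offsets;  /* BMP_GLYPH_COUNT entries, NO_GLYPH if absent */
};

/*
 * Decides where the glyph for cp should come from: the sprite font is
 * preferred, the fallback font is used for anything else inside the BMP,
 * and codepoints outside both report GLYPH_NONE.
 */
static enum glyph_source pick_glyph_source(const struct sprite_font_sketch *spr,
                                           const struct fallback_font_sketch *fb,
                                           uint32_t cp)
{
    if (spr != NULL && cp >= spr->first_cp && cp - spr->first_cp < spr->glyph_count)
        return GLYPH_SPRITE;
    if (fb != NULL && cp < BMP_GLYPH_COUNT && fb->offsets[cp] != NO_GLYPH)
        return GLYPH_FALLBACK;
    return GLYPH_NONE;
}
```

With this shape, a codepoint above U+FFFF (an emoji, say) falls through to `GLYPH_NONE`, which matches the description of the fallback font being limited to the BMP.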
For the DBC languages I still default to DBC mode for now.
Added a console command !dbc to switch between the modes; both will display any text, there are just some exceptions spread throughout.
Eventually I want to merge DBC mode into the main one, but there are too many spots to look at, so I'll do that separately.
There are 3 fonts (Japanese, Simplified Chinese, and Traditional Chinese) so the Han unified glyphs still use a localized variant, so if you're not on Japanese the Han might look a bit Simplified Chinese style, but should be fine if language is