Commit 8f4dcf2

chore: fix checksum mismatch under Windows due to CRLF auto-conversion by updating gitattributes
1 parent f60b87f commit 8f4dcf2

3 files changed: +8 -4 lines changed

.gitattributes

+5 -2
@@ -1,5 +1,8 @@
-# Ref: https://stackoverflow.com/questions/19052834/is-it-possible-to-exclude-files-from-git-language-statistics
-data/ZhConversion.php linguist-vendored
+# Exclude external ruleset files from GitHub PL stats
+# ref: https://stackoverflow.com/questions/19052834/is-it-possible-to-exclude-files-from-git-language-statistics
+# And prevent auto CRLF conversion to avoid checksum mismatch
+data/ZhConversion.php linguist-vendored binary
+data/*.txt linguist-vendored binary
 data/cgroups/*.json linguist-vendored
 web/public/cgroups.json linguist-vendored
 benches/*.txt linguist-vendored
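
Why the `binary` attribute matters here: when Git rewrites LF line endings to CRLF on checkout, the file's bytes change, so any digest computed over those bytes changes too. The sketch below illustrates this under stated assumptions. `to_crlf` and `fnv1a64` are hypothetical helpers introduced only for illustration, and FNV-1a stands in for SHA-256 so that the example needs nothing beyond the standard library. It is not the project's code.

```rust
/// Simulates the LF -> CRLF rewrite that Git may apply to text files on
/// checkout under Windows. (Hypothetical helper for illustration only.)
fn to_crlf(text: &str) -> String {
    // Normalize to LF first so existing CRLF pairs are not doubled.
    text.replace("\r\n", "\n").replace('\n', "\r\n")
}

/// 64-bit FNV-1a over raw bytes. This is a stand-in for SHA-256: any digest
/// taken over the file's bytes is sensitive to line-ending changes.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &byte in bytes {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}
```

Comparing `fnv1a64(text.as_bytes())` with `fnv1a64(to_crlf(text).as_bytes())` for any text containing a newline shows the two digests disagree. Marking the data files as `binary` in `.gitattributes` tells Git not to perform that conversion, so the checked-out bytes match the ones the checksum was recorded for.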

build.rs

+2 -1
@@ -353,7 +353,8 @@ fn read_and_validate_file(path: &str, sha256sum: &[u8; 32]) -> String {
     assert_eq!(
         &sha256(&content),
         sha256sum,
-        "Validating the checksum of zhconv"
+        "Validating the checksum of {}",
+        path.display()
     );
     content
 }
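
The change above makes a failed validation name the offending file instead of printing a fixed string. A minimal sketch of the same idea follows, assuming a caller-supplied digest function in place of the build script's SHA-256 helper. `read_and_validate` and its `digest` parameter are hypothetical names, not the project's API. Since `display()` is a method of `std::path::Path`, the sketch takes a `&Path`.

```rust
use std::fmt::Debug;
use std::fs;
use std::path::Path;

/// Reads `path` as UTF-8 text and compares its digest against `expected`.
/// On failure the error message names the file, which makes it clear which
/// data file was altered (for example by CRLF conversion).
/// `digest` is supplied by the caller; this is an assumption of the sketch.
fn read_and_validate<D, F>(path: &Path, expected: &D, digest: F) -> Result<String, String>
where
    D: PartialEq + Debug,
    F: Fn(&[u8]) -> D,
{
    let content = fs::read_to_string(path)
        .map_err(|e| format!("failed to read {}: {}", path.display(), e))?;
    let actual = digest(content.as_bytes());
    if actual != *expected {
        return Err(format!(
            "checksum mismatch for {}: expected {:?}, got {:?}",
            path.display(),
            expected,
            actual
        ));
    }
    Ok(content)
}
```

Returning a `Result` rather than asserting is a choice made for this sketch so that the message can be inspected. A build script may reasonably prefer to panic, as the original does.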

src/lib.rs

+1 -1
@@ -3,7 +3,7 @@
 //! with the leftmost-longest matching strategy and linear time complexity with respect to the
 //! length of input text and conversion rules. It ships with a bunch of conversion tables,
 //! extracted from [zhConversion.php](https://phabricator.wikimedia.org/source/mediawiki/browse/master/includes/languages/data/ZhConversion.php)
-//! which is maintained and used by MediaWiki and Chinese Wikipedia.
+//! (maintained by MediaWiki and Chinese Wikipedia) and [OpenCC](https://github.com/BYVoid/OpenCC/tree/master/data/dictionary).
 //!
 //! While built-in datasets work well for general case, the converter is never meant to be 100%
 //! accurate, especially for professional text. In Chinese Wikipedia, it is pretty common for
