Releases: LanguageMachines/uctodata
v0.11
v0.10.1
v0.10
v0.9.1
v0.9
[Ko van der Sloot]
- fix for PREFIX rules in French and Italian
- small fix to prevent losing a character in the PREFIX rule (see LanguageMachines/ucto#87). This doesn't fix the unwanted splits, though.
- added SYMBOL, PICTOGRAM and EMOTICON to setdefinitions
- relaxed the e-mail rule a bit.
[Piroska Lendvai]
- Suggestions for German abbreviations
[Antal van den Bosch]
- New config file for English Twitter data. Recognizes and retains #hashtags and @mentions.
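To illustrate what "retains" means here, the following is a minimal Python sketch of a tokenizer that keeps hashtags and mentions as single tokens instead of splitting off the leading `#` or `@`. The `TOKEN` pattern and the `tokenize` helper are hypothetical and written only for this illustration; they are not the rules shipped in the Twitter config file, which uses ucto's own rule format.

```python
import re

# Hypothetical pattern for illustration only, NOT the rule from the uctodata config.
# Alternatives are tried in order:
#   1. '#' or '@' followed by word characters (hashtag or mention, kept whole)
#   2. a run of word characters (ordinary word or number)
#   3. any single character that is neither a word character nor whitespace (punctuation, symbols)
TOKEN = re.compile(r"[#@]\w+|\w+|[^\w\s]")


def tokenize(text):
    """Split text into tokens, keeping #hashtags and @mentions as single tokens."""
    return TOKEN.findall(text)


print(tokenize("Thanks @ucto, love #nlp!"))
# ['Thanks', '@ucto', ',', 'love', '#nlp', '!']
```

A bare `#` or `@` that is not followed by a word character falls through to the third alternative and is emitted as an ordinary symbol token.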
v0.8
[Ko van der Sloot]
- separated .abr files from their main files for all languages
- updated Italian data (thanks to @texttheater)
[Iris Hendricks]
- updated abbrev files for Portuguese, Turkish and French based on
https://en.wiktionary.org/wiki/Category:Portuguese_abbreviations and
https://en.wiktionary.org/wiki/Category:Turkish_initialisms.
- added full list of French abbreviations.
- added 'aub' to Dutch list
v0.7.1
v0.7
[Ko vd Sloot]
- tokconfig-nld-historical: typo in rule
- updated all languages with new ABBREVIATION and NUMBER-ORDINAL rules:
= accommodate ABBREVIATIONS within brackets.
= avoid needless backtracking in NUMBER-ORDINAL
[Maarten van Gompel]
- Apparent bug in Italian config
v0.6
Several fixes for problems reported in LanguageMachines/ucto#46
Notes:
- the suffix problems were already addressed in 0.5
- the colon problem is not addressed. Do we need REVERSE-SMILEY?