Collecting kanji usage frequency data from Twitter Streaming API
Updated Jun 1, 2015 - JavaScript
An assortment of word-lists and micro dictionaries in English. Especially suited to English language learning tasks.
Data from a corpus of written Hawaiian
Frequency List Wizard is a command-line program that does various useful things with... frequency lists.
Software that generates text in the style of an oeuvre passed as an argument (in plain text). Every run produces unique output, stored with a randomized integer in the output filename.
Parser for the Danish corpora Korpus 90, 2000, and 2010
A Dataset for Training and Testing Abstractive Summarizers
A growing corpus of fortune cookies (for NLP and fun). Add your fortunes!
Command-line corpus tools
Vietnamese Wikipedia Corpus
A binary-corpus system for word tagging
Scripts and resources for making spaCy understand Hungarian.
A linguistic corpus of native Czech speakers learning Italian
Massive corpus of ESL discussion and conversation questions, for ESL classrooms, data mining, or bot creation
Benchmarking various tools for counting word and phrase frequency in corpora [for Windows]