A module to quickly create Corpus objects containing TTR, tokenized sentences, lexical density, class frequencies and more.
Dependencies: nltk, numpy, re, codecs (re and codecs ship with the Python standard library).
This module should also work in Python 2.
Creates an object that holds various information about a corpus passed to the function build_corpus(). When instantiating the object, specify the maximum number of tokens to take into consideration.
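
To illustrate why a token cap matters, here is a minimal sketch of a type-token ratio (TTR) computed over at most a fixed number of tokens. It is not the module's own code: `simple_tokenize`, `type_token_ratio` and the `max_tokens` parameter are hypothetical names chosen for this example, and the regex tokenizer is only a stand-in for a real tokenizer.

```python
import re


def simple_tokenize(text):
    # Hypothetical stand-in tokenizer: lowercased runs of word characters.
    # This is an assumption for the example, not the module's tokenizer.
    return re.findall(r"\w+", text.lower())


def type_token_ratio(tokens, max_tokens=None):
    # Keep only the first max_tokens tokens, if a cap is given.
    if max_tokens is not None:
        tokens = tokens[:max_tokens]
    if not tokens:
        return 0.0
    # TTR = number of distinct tokens / total number of tokens.
    return len(set(tokens)) / float(len(tokens))


text = "the cat sat on the mat and the dog sat too"
tokens = simple_tokenize(text)

print(type_token_ratio(tokens))                # 8 types over 11 tokens
print(type_token_ratio(tokens, max_tokens=5))  # 4 types over 5 tokens
```

TTR tends to drop as a text gets longer, so capping every corpus at the same number of tokens keeps the ratios comparable.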