PHP Frequency Distribution
yooper edited this page Aug 16, 2016 · 3 revisions
A frequency distribution shows how frequently or infrequently specific words are used in a body of text. The FreqDist class expects the tokens to be normalized (for example, lowercased) before the object is instantiated.
$tokenizer = new GeneralTokenizer();
$tokens = $tokenizer->tokenize("time flies like an arrow and an arrow flies like time");
$freqDist = new FreqDist($tokens);
/*
* Get the Hapaxes, all the terms with a frequency count of 1
*/
$freqDist->getHapaxes(); 
/*
* Get the corpus size, the total number of tokens
*/
$freqDist->getTotalTokens();
/*
* Get the size of the vocabulary, the number of unique tokens
*/
$freqDist->getTotalUniqueTokens();
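
To make the three quantities above concrete, here is a minimal plain-PHP sketch that computes them by hand for the same sentence. It does not use the library: the whitespace split is only a stand-in for GeneralTokenizer, and the variable names `$counts`, `$totalTokens`, `$totalUniqueTokens`, and `$hapaxes` are illustrative assumptions, not part of the FreqDist API. The exact return types of the FreqDist methods are not stated on this page, so treat this as an illustration of the definitions rather than of the class internals.

```php
<?php
// Plain-PHP illustration of the definitions; NOT the library's implementation.
$text = "time flies like an arrow and an arrow flies like time";

// Normalize (lowercase) and split on whitespace.
// Assumption: a naive stand-in for GeneralTokenizer.
$tokens = preg_split('/\s+/', strtolower(trim($text)), -1, PREG_SPLIT_NO_EMPTY);

// Map each distinct token to the number of times it occurs.
$counts = array_count_values($tokens);

// Corpus size: every token occurrence is counted.
$totalTokens = count($tokens);          // 11

// Vocabulary size: each distinct token is counted once.
$totalUniqueTokens = count($counts);    // 6

// Hapaxes: tokens whose frequency is exactly 1.
$hapaxes = array_keys(array_filter($counts, function ($count) {
    return $count === 1;
}));                                    // ['and']

print_r($counts);
```

In this sentence every word except "and" appears twice, so "and" is the only hapax.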