Tuesday 22 November 2011

N-GRAMS from the COCA and COHA corpora of American English

Link: http://www.ngrams.info/
Intro: These n-grams are based on the largest publicly available, genre-balanced corpus of English -- the 450-million-word Corpus of Contemporary American English (COCA). With this n-gram data (2-, 3-, 4-, and 5-word sequences, each with its frequency), you can carry out powerful queries offline -- without needing to access the corpus via the web interface.
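
To give a rough idea of what an offline query could look like, here is a minimal Python sketch. It assumes a tab-separated layout with the frequency in the first column and the words in the following columns; that layout, the function names `load_ngrams` and `following`, and the sample counts are all made up for illustration, so check the actual files from the site before relying on it.

```python
import io
from collections import Counter


def load_ngrams(lines):
    """Parse 'freq<TAB>w1<TAB>w2...' lines into a Counter keyed by word tuple.

    The column layout is an assumption, not the documented file format.
    """
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip blank lines, headers, or anything without a numeric frequency.
        if len(fields) < 2 or not fields[0].isdigit():
            continue
        words = tuple(w.lower() for w in fields[1:])
        counts[words] += int(fields[0])
    return counts


def following(counts, *prefix, top=5):
    """Return the most frequent words that come right after the given prefix."""
    prefix = tuple(w.lower() for w in prefix)
    hits = Counter()
    for gram, freq in counts.items():
        if len(gram) > len(prefix) and gram[:len(prefix)] == prefix:
            hits[gram[len(prefix)]] += freq
    return hits.most_common(top)


# Invented sample rows standing in for a real 2-gram file.
SAMPLE = "120\tstrong\ttea\n45\tstrong\tcoffee\n7\tpowerful\ttea\n"

counts = load_ngrams(io.StringIO(SAMPLE))
print(following(counts, "strong"))
```

For a real file you would pass an open file handle instead of the `io.StringIO` sample, for example `load_ngrams(open(path, encoding="utf-8"))`, where `path` points at whichever n-gram file you downloaded.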