Web-Searchable Corpora
- [link] BYU Corpora
- [link] Corpus of Contemporary American English (COCA)
- [link] Corpus of Historical American English (COHA): not yet released
- [link] BYU-BNC: British National Corpus (cf. the official version below)
- [link] TIME Corpus of American English
- [link] Pitt ELI Online Data Search System (permission needed)
- [link][pdf] Child Language Data Exchange System (CHILDES)
- [link] Michigan Corpus Linguistics Home
- [link] Michigan Corpus of Academic Spoken English (MICASE)
- [link] Michigan Corpus of Upper-Level Student Papers (MICUSP)
- [link] The John Swales Conference Corpus (JSCC): no online interface; downloadable transcripts
- [link] Collins Wordbanks Online English Corpus Concordance Sampler (Part of Collins-COBUILD Corpus/Bank of English)
Downloadable/Easy-Access Corpora
- [link] The John Swales Conference Corpus (JSCC), hosted by the University of Michigan
- [link] The Lancaster-Oslo/Bergen Corpus (LOB)
- [link] The Brown Corpus
- [link] The Santa Barbara Corpus of Spoken American English
- [link] International Corpus of English (ICE)
*See also the Corpus Archives and Indexes section below.
For-Fee/Limited-Access Corpora
- [link] British National Corpus (BNC) by BNC Consortium (cf. BYU's online version above)
- [home][catalog] American National Corpus (ANC)
- [home] The Penn Treebank Project
- [home][catalog] LDC Catalog by Type And Source
- [blog][link] Web 1T 5-Gram Corpus by Google
- [link] Cambridge International Corpus
- [link] Cambridge and Nottingham Corpus of Discourse in English (CANCODE)
- [link] Cambridge Learner Corpus (CLC)
- [link] Cambridge and Nottingham Spoken Business English Corpus (CANBEC)
- ... and many more.
- [link] The Collins-COBUILD Corpus / The Bank of English Corpus
- [link] Penn Parsed Corpora of Historical English
- [link][book] International Corpus of Learner English (ICLE)
- [link] Louvain Corpus of Native English Essays (LOCNESS)
- [link] Longman Learners' Corpus
Corpus Archives and Indexes
- [home][catalogue] The Oxford Text Archive
- [link] NLTK (Natural Language Toolkit) Corpora
- [home][index] Corpus Resource Database (CoRD) at the University of Helsinki
- [home][catalog] LDC Catalog by Type And Source
- [home] LinguistList Texts and Corpora page
Corpora in Other Languages
(Web-searchable ones only)
- [link] Corpus del Español (at BYU)
- [link] Corpus do Português (at BYU)
- [link] COMPARA: a bidirectional parallel corpus of English and Portuguese
- [link] Corpus de Referencia del Español Actual (CREA)
- [link][ENG] The Russian Reference Corpus
- [link] Hungarian National Corpus (Registration required)
- [link] CORIS/CODIS: Corpus of Written Italian (Registration required)
- [link] The German National Corpus
- [link] The Hellenic National Corpus
- [link] FLLOC: French Learner Language Oral Corpora
- [link] SPOLLOC: Spanish Learner Language Oral Corpora
- [home][concordancer] The Lancaster Corpus of Mandarin Chinese (LCMC)