Week 8 - Discourse and Corpora

  • Corpus linguistics is the analysis of corpora
  • Corpus = Collection of naturally occurring language texts, chosen to characterise a state or variety of a language. Usually used to study real-world language and generate statistics

Types of Corpora

  • General Corpora - Reference corpora, e.g. the BNC (British National Corpus) = representative sample of a language variety. Internet-based, e.g. Google and Wikipedia
  • Specialist (sub)corpora - Type, e.g. Twitter; Genre, e.g. newspapers, poetry; Time-specific, e.g. 19th century or contemporary; Single author, e.g. all works of Dickens

Corpus Tools and Terminology

  • Concordance = Listing of the occurrences of a word or phrase, each shown in its context. The point is to see lots of examples at once
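
A concordance is often displayed in keyword-in-context (KWIC) layout, with the search word lined up in a central column. The sketch below is only an illustration of that idea, not the method of any particular corpus tool: the function name concordance, the width parameter and the sample sentence (the opening of Dickens's A Tale of Two Cities) are all assumptions made for this example.

```python
import re

def concordance(text, keyword, width=30):
    """Return keyword-in-context (KWIC) lines for each whole-word match.

    Hypothetical helper for illustration; real corpus tools offer far more
    (sorting, lemma search, frequency counts).
    """
    lines = []
    # \b restricts matches to whole words; matching ignores case.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so the keyword forms a column.
        lines.append(f"{left:>{width}} [{m.group()}] {right:<{width}}")
    return lines

sample = ("It was the best of times, it was the worst of times, "
          "it was the age of wisdom, it was the age of foolishness.")

for line in concordance(sample, "age", width=20):
    print(line)
```

Each printed line shows one occurrence of "age" with up to 20 characters of context on either side, so the patterns around the word (here "the age of ...") are easy to compare.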



