DiaCollo: Corpora

Public Corpora

Historical Corpora

  • dta: Deutsches Textarchiv (1600-1900)
  • dingler: Polytechnisches Journal (1820-1931)
  • DSDK: Digitale Sammlung Deutscher Kolonialismus (1884-1919)
  • grenzboten: Die Grenzboten (1841-1922)
  • rem: Referenzkorpus Mittelhochdeutsch (1050-1350)

Newspaper Corpora

  • bz: Berliner Zeitung (1994-2005)
  • tagesspiegel: Tagesspiegel (1996-2004)
  • zeit: ZEIT (1946-2018)

Synchronic Corpora

Aggregated Corpora

  • dta+dwds: DTA+DWDS (1600-1999)
  • public: public (+newspapers, 1600-2018)

Non-German Corpora


Restricted Corpora

CLARIN Corpora (*)

* non-public: authentication via CLARIN credentials required

DWDS Corpora (**)

** non-public: authentication via www.dwds.de credentials required
  • ibk_dchat: Dortmund Chat Corpus (1998-2006)
  • ibk_web_2016c: Webcorpus 2016c (2001-2016)
  • textberg: Jahrbuch des Schweizer-Alpenclubs (1864-2015; academic use only)
  • ... see https://www.dwds.de/r/ for an up-to-date list of all DiaCollo instances currently hosted by the DWDS project at the BBAW. Click the DiaCollo icon in the "Tools" column to open the DiaCollo GUI for a particular corpus.