-
Corpus
Oxford Text Archive Core Collection
Date of publication:
2020
Description:
The CorCenCC corpus contains over 11 million words (circa 14.4m tokens) from written, spoken and electronic (online, digital texts) Welsh language sources, taken from a range of genres, language varieties (regional and ...
This item contains 1 file (49.41 KB).
Publicly Available
-
-
Corpus
Oxford Text Archive Core Collection
Date of publication:
2004
Author(s):
Unknown author
Description:
The Lancaster Corpus of Mandarin Chinese (LCMC) is designed as a Chinese match for the FLOB and FROWN corpora for modern British and American English. The corpus is suitable for use in both monolingual research into modern ...
This item contains 1 file (6.33 MB).
Publicly Available
-
-
Corpus
Oxford Text Archive Core Collection
Date of publication:
2004
Description:
Mode of access: Online. OTA website. The rudimentary form of the Sheffield Corpus of Chinese contains a limited body of representative texts from Medieval (MedC) and Modern Chinese (ModC) periods. They are of two text types: ...
This item contains 1 file (138.18 KB).
Publicly Available
-
-
Collection, Sound
Oxford Text Archive Core Collection
Date of publication:
2015
Description:
The resource is a speech corpus, with digital audio files, text transcripts, and files containing time stamps of the phoneme boundaries. There are 1813 .wav files containing spoken utterances, 1813 .lab files containing ...
This item contains 3 files (1.97 MB).
Publicly Available
-
-
Corpus
Oxford Text Archive Core Collection
Date of publication:
2002-2004
Description:
Mode of access: Online. Application to OTA. This corpus contains 979,831 words, made up of 1723 articles taken from three daily French newspapers: Le Monde (576 articles ...
This item contains 1 file (3.34 MB).
Publicly Available
-
-
Corpus
Oxford Text Archive Core Collection
Date of publication:
2003
Author(s):
Unknown author
Description:
The collection consists of: thirty million words of monolingual written data (Gujarati, Tamil, Hindi, Punjabi - news website articles); 600,000 words of monolingual spoken data (Hindi, Urdu, Punjabi, Bengali, Gujarati - radio ...
This item contains 9 files (108.26 MB).
Publicly Available
-
-
Corpus
Oxford Text Archive Core Collection
Date of publication:
2001-2009
Description:
The download now also includes an updated version of VOICE XML (VOICE 2.0 XML) and a part-of-speech tagged and lemmatized version of VOICE (VOICE POS XML). The primary language of the corpus is English as a lingua franca, ...
This item contains 1 file (48.05 MB).
-
-
Collection, Text
Oxford Text Archive Core Collection
Date of publication:
2001
Description:
The four major objectives of the project were: i) to establish an electronic corpus of (a) conversations, from the British National Corpus (BNC) and (b) oral narratives, from Lancaster's Centre for North Western Regional ...
This item contains 1 file (2.02 MB).
-
-
Collection, Sound
Oxford Text Archive Core Collection
Date of publication:
2014-2016
Description:
Publications based on the data include:
Ayafor, Miriam and Melanie Green (2017). Cameroon Pidgin English: A comprehensive grammar [London Oriental and African Language Library 20]. Amsterdam: John ...
This item contains 1 file (1.41 MB).
Publicly Available
-
-
Corpus
Oxford Text Archive Core Collection
Date of publication:
2002-2005
Description:
Mode of access: Online. OTA website. Gidian Archives is a database created from the press cuttings preserved by André Gide in his personal collection. The archive consists of articles published during his life on the author's ...
This item contains 1 file (1.67 MB).
Publicly Available
-