British Academic Spoken English corpus
dc.contributor | Thompson, Paul (University of Reading, Reading) |
dc.contributor.author | Nesi, Hilary |
dc.contributor.author | Thompson, Paul |
dc.date.accessioned | 2018-07-27 |
dc.date.accessioned | 2022-08-19T15:55:58Z |
dc.date.available | 2022-08-19T15:55:58Z |
dc.date.created | 1999-2005 |
dc.date.issued | 2006-12 |
dc.identifier | ota:2525 |
dc.identifier.uri | http://hdl.handle.net/20.500.14106/2525 |
dc.description.abstract | The BASE corpus consists of 160 lectures and 39 seminars recorded in a variety of university departments. Holdings are distributed across four broad disciplinary groups, each represented by 40 lectures and 9 or 10 seminars. These groups are: Arts and Humanities, Social Studies and Sciences, Physical Sciences, and Life and Medical Sciences. The lectures and seminars have been transcribed and annotated using a system devised in accordance with the TEI Guidelines. A DTD file named 'base.dtd' must be kept in the same folder as the corpus files. The transcription and mark-up conventions are described in the 'BASE manual' document (PDF format), and the holdings are described in the Excel spreadsheet 'BASE corpus holdings.xls'. The token count for the entire corpus is 1.6 million, and the files contain the transcripts of nearly 200 hours of recording. Nesi, H. and H. Basturkmen (2006) 'Lexical bundles and discourse signalling in academic lectures'. International Journal of Corpus Linguistics 11(3), 147-168. Thompson, P. (2006) 'A corpus perspective on the lexis of lectures, with a focus on Economics lectures'. In K. Hyland and M. Bondi (eds) Academic Discourse Across Disciplines. Bern: Peter Lang, pp. 253-270. Nesi, H. (2002) 'An English spoken academic word list', in Braasch, A. and Povlsen, C. (eds) Proceedings of the Tenth EURALEX International Congress, Copenhagen: Center for Sprogteknologi. Nesi, H. (2001) 'A corpus based analysis of academic lectures across disciplines', in Cotterill, J. and Ife, A. (eds) Language Across Boundaries, London: Continuum Press. Also available at: http://www2.warwick.ac.uk/fac/soc/celte/research/base/ |
dc.description.sponsorship | Arts and Humanities Research Council (AHRC) |
dc.description.sponsorship | British Academy |
dc.format.extent | CollectionSound 206 files: ca. 18.5 MB |
dc.format.medium | Digital bitstream |
dc.language | English |
dc.language.iso | eng |
dc.publisher | University of Oxford |
dc.relation.ispartof | Oxford Text Archive Core Collection |
dc.rights | Available for non-commercial use on condition that this header is included in its entirety with any copy distributed. |
dc.rights.uri | https://hdl.handle.net/20.500.14106/licence-ota |
dc.rights.label | ACA |
dc.subject.lcsh | Linguistics |
dc.subject.lcsh | Linguistic analysis (Linguistics) |
dc.subject.other | Linguistic corpora |
dc.title | British Academic Spoken English corpus |
dc.title.alternative | BASE |
dc.type | CollectionSound |
has.files | yes |
branding | Oxford Text Archive |
files.size | 4125441 |
files.count | 3 |
relation.uri | https://downloads.it.ox.ac.uk/ota-public/audio/2525.zip |
otaterms.date.range | 1900-1999 |
Files for this item
- Name
- 2525.zip
- Format
- unknown
- Description
- Note
- This file is hosted on an external server
- URI
- https://downloads.it.ox.ac.uk/ota-public/audio/2525.zip
Download all local files for this item (3.93 MB)
- Name
- xml.zip
- Size
- 3.76 MB
- Format
- application/zip
- Description
- Compressed file containing the resource file or files
- xml
- ls
- lslct004.xml (95 kB)
- lslct036.xml (43 kB)
- base.dtd (72 kB)
- lslct024.xml (45 kB)
- lssem003.xml (217 kB)
- lslct012.xml (30 kB)
- lslct032.xml (99 kB)
- lslct020.xml (51 kB)
- lslct019.xml (97 kB)
- lslct040.xml (87 kB)
- lslct039.xml (111 kB)
- lslct007.xml (69 kB)
- lslct027.xml (40 kB)
- lssem006.xml (66 kB)
- lslct015.xml (49 kB)
- lslct035.xml (46 kB)
- lslct003.xml (81 kB)
- lslct023.xml (61 kB)
- lssem002.xml (94 kB)
- lslct011.xml (142 kB)
- lslct031.xml (82 kB)
- lssem010.xml (59 kB)
- lssem009.xml (80 kB)
- lslct018.xml (60 kB)
- lslct006.xml (58 kB)
- lslct038.xml (64 kB)
- lssem005.xml (56 kB)
- lslct026.xml (64 kB)
- lslct014.xml (47 kB)
- lslct002.xml (94 kB)
- lslct034.xml (88 kB)
- lslct022.xml (73 kB)
- lssem001.xml (47 kB)
- lslct010.xml (46 kB)
- lslct009.xml (78 kB)
- lslct030.xml (84 kB)
- lslct029.xml (73 kB)
- lssem008.xml (54 kB)
- lslct017.xml (56 kB)
- lslct005.xml (67 kB)
- lslct037.xml (52 kB)
- lslct025.xml (51 kB)
- lssem004.xml (58 kB)
- lslct013.xml (33 kB)
- lslct001.xml (85 kB)
- lslct033.xml (143 kB)
- lslct021.xml (59 kB)
- lslct008.xml (57 kB)
- lssem007.xml (60 kB)
- lslct028.xml (68 kB)
- lslct016.xml (34 kB)
- ah
- ahlct037.xml (85 kB)
- base.dtd (72 kB)
- ahlct005.xml (71 kB)
- ahlct025.xml (86 kB)
- ahsem004.xml (90 kB)
- ahlct013.xml (73 kB)
- ahlct001.xml (97 kB)
- ahlct033.xml (60 kB)
- ahlct021.xml (75 kB)
- ahlct008.xml (57 kB)
- ahlct028.xml (73 kB)
- ahsem007.xml (162 kB)
- ahlct016.xml (74 kB)
- ahlct036.xml (69 kB)
- ahlct004.xml (71 kB)
- ahsem003.xml (72 kB)
- ahlct024.xml (78 kB)
- ahlct012.xml (65 kB)
- ahlct032.xml (57 kB)
- ahlct020.xml (75 kB)
- ahlct019.xml (76 kB)
- ahlct040.xml (51 kB)
- ahlct007.xml (70 kB)
- ahlct039.xml (80 kB)
- ahlct027.xml (64 kB)
- ahsem006.xml (54 kB)
- ahlct015.xml (78 kB)
- ahlct003.xml (71 kB)
- ahlct035.xml (58 kB)
- ahlct023.xml (63 kB)
- ahsem002.xml (23 kB)
- ahlct011.xml (70 kB)
- ahlct031.xml (69 kB)
- ahsem010.xml (71 kB)
- ahsem009.xml (52 kB)
- ahlct018.xml (66 kB)
- ahlct038.xml (70 kB)
- ahlct006.xml (68 kB)
- ahlct026.xml (66 kB)
- ahsem005.xml (74 kB)
- ahlct014.xml (77 kB)
- ahlct034.xml (87 kB)
- ahlct002.xml (60 kB)
- ahlct022.xml (64 kB)
- ahsem001.xml (60 kB)
- ahlct010.xml (71 kB)
- ahlct009.xml (83 kB)
- ahlct030.xml (75 kB)
- ahlct029.xml (72 kB)
- ahsem008.xml (113 kB)
- ahlct017.xml (67 kB)
- ss
- sslct027.xml (76 kB)
- sssem006.xml (55 kB)
- base.dtd (72 kB)
- sslct015.xml (52 kB)
- sslct003.xml (158 kB)
- sslct035.xml (97 kB)
- sslct023.xml (77 kB)
- sssem002.xml (85 kB)
- sslct011.xml (122 kB)
- sslct031.xml (86 kB)
- sssem010.xml (116 kB)
- sssem009.xml (59 kB)
- sslct018.xml (55 kB)
- sslct038.xml (89 kB)
- sslct006.xml (112 kB)
- sslct026.xml (69 kB)
- sssem005.xml (67 kB)
- sslct014.xml (64 kB)
- sslct034.xml (84 kB)
- sslct002.xml (128 kB)
- sslct022.xml (54 kB)
- sssem001.xml (35 kB)
- sslct010.xml (72 kB)
- sslct009.xml (147 kB)
- sslct030.xml (69 kB)
- sssem008.xml (48 kB)
- sslct029.xml (89 kB)
- sslct017.xml (68 kB)
- sslct005.xml (122 kB)
- sslct037.xml (64 kB)
- sslct025.xml (78 kB)
- sssem004.xml (78 kB)
- sslct013.xml (43 kB)
- sslct033.xml (68 kB)
- sslct001.xml (116 kB)
- sslct021.xml (55 kB)
- sslct008.xml (119 kB)
- sslct028.xml (67 kB)
- sssem007.xml (45 kB)
- sslct016.xml (59 kB)
- sslct036.xml (73 kB)
- sslct004.xml (87 kB)
- sslct024.xml (88 kB)
- sssem003.xml (44 kB)
- sslct012.xml (102 kB)
- sslct032.xml (134 kB)
- sslct020.xml (71 kB)
- sslct019.xml (64 kB)
- sslct040.xml (58 kB)
- sslct007.xml (93 kB)
- sslct039.xml (82 kB)
- ps
- pslct021.xml (58 kB)
- base.dtd (72 kB)
- pslct008.xml (58 kB)
- pslct028.xml (51 kB)
- pssem007.xml (60 kB)
- pslct016.xml (68 kB)
- pslct004.xml (43 kB)
- pslct036.xml (62 kB)
- pslct024.xml (68 kB)
- pssem003.xml (72 kB)
- pslct012.xml (57 kB)
- pslct032.xml (61 kB)
- pslct020.xml (48 kB)
- pslct019.xml (75 kB)
- pslct040.xml (57 kB)
- pslct039.xml (54 kB)
- pslct007.xml (54 kB)
- pslct027.xml (85 kB)
- pssem006.xml (53 kB)
- pslct015.xml (76 kB)
- pslct003.xml (55 kB)
- pslct035.xml (71 kB)
- pslct023.xml (75 kB)
- pssem002.xml (68 kB)
- pslct011.xml (74 kB)
- pssem010.xml (51 kB)
- pslct031.xml (53 kB)
- pslct018.xml (62 kB)
- pslct038.xml (47 kB)
- pslct006.xml (87 kB)
- pslct026.xml (36 kB)
- pssem005.xml (54 kB)
- pslct014.xml (52 kB)
- pslct002.xml (132 kB)
- pslct034.xml (59 kB)
- pslct022.xml (78 kB)
- pssem001.xml (60 kB)
- pslct010.xml (72 kB)
- pslct009.xml (43 kB)
- pslct030.xml (54 kB)
- pslct029.xml (64 kB)
- pssem008.xml (18 kB)
- pslct017.xml (71 kB)
- pslct005.xml (74 kB)
- pslct037.xml (47 kB)
- pslct025.xml (56 kB)
- pssem004.xml (101 kB)
- pslct013.xml (106 kB)
- pslct033.xml (71 kB)
- pslct001.xml (69 kB)
- ls
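The file names in the listing above appear to follow a regular pattern: a two-letter folder code (ah, ls, ps, ss), then "lct" or "sem", then a three-digit number. The sketch below tallies files by that pattern so the counts can be compared with the figures given in the abstract. The pattern is inferred from the listing and is not documented in this record, as is the reading of "lct" as lecture and "sem" as seminar. The names NAME_RE, classify and tally are hypothetical helpers introduced here for illustration only.

```python
import re
from collections import Counter

# Assumed naming pattern, inferred from the file listing (not documented here):
# two-letter folder code, "lct" or "sem", three-digit number, ".xml".
NAME_RE = re.compile(r"^(ah|ls|ps|ss)(lct|sem)(\d{3})\.xml$")


def classify(filename):
    """Return (group, kind, number) for a matching file name, else None."""
    match = NAME_RE.match(filename)
    if match is None:
        return None
    group, kind, number = match.groups()
    return group, kind, int(number)


def tally(filenames):
    """Count files per (group, kind); non-matching names such as base.dtd are skipped."""
    counts = Counter()
    for name in filenames:
        parsed = classify(name)
        if parsed is not None:
            counts[parsed[:2]] += 1
    return counts


if __name__ == "__main__":
    # A few names copied from the listing; a real run would pass every
    # file name found in the unpacked xml.zip archive.
    sample = ["lslct004.xml", "lssem003.xml", "base.dtd", "ahlct037.xml"]
    for (group, kind), count in sorted(tally(sample).items()):
        print(group, kind, count)
```

Run over the full listing, a tally like this would make it easy to check each folder against the per-group figures in the abstract and to spot any gap in the numbering.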