=========================================
Readme for the Lancaster Newsbooks Corpus
=========================================

This corpus consists of two collections of seventeenth-century English "newsbooks". Both were drawn from the Thomason Tracts collection, which is held at the British Library and available in graphical form via Early English Books Online (EEBO). The construction of these keyboarded versions was in both cases funded by the British Academy.

The FIRST collection (1654_newsbooks) consists of every newsbook published in London and still surviving in the Thomason Tracts from the first half of 1654 (to be precise, from the second half of December 1653 to the end of May 1654, with one or two additions from the first week of June 1654). This was constructed for the project "Looking at text re-use in a corpus of seventeenth-century news reportage", funded by the British Academy, grant reference SG-33825.

The SECOND collection (mercurius_fumigosus) consists of every surviving published issue of the highly idiosyncratic newsbook "Mercurius Fumigosus", written by John Crouch between summer 1654 and early autumn 1655. This was constructed for the project "Decoding the news - Mercurius Fumigosus as a source of news in the interregnum, 1654-1655", funded by the British Academy, grant reference LRG-35423.

This is version 1.0 of the corpus, released April 2007; it supersedes earlier versions circulated informally.

For more information about the corpus, see www.ling.lancs.ac.uk/newsbooks

Format
======

The corpus is stored as a set of 303 XML files. The XML markup complies with a customised DTD, largely based on the tags used in the TEI and in XHTML. XML comments (<!-- -->) have been used at many points in the corpus for transcribers to add comments about layout, uncertain readings, and other issues.

The total size of the corpus is 7.6 MB.

The two collections are stored in separate folders.
For convenience, the DTD and files containing the XHTML entity references used in the corpus are repeated in each folder.

Credits
=======

This corpus was created by the Department of Linguistics and English Language, Lancaster University.

Project leader: Tony McEnery
Corpus editor: Andrew Hardie

Word counts:
============

--Newsbooks titles with long runs--

The true and Perfect Dutch-Diurnall
Total Wordcount: 21,687

Every Day's Intelligence
Total Wordcount: 67,618

The Faithful Scout
Total Wordcount: 64,677

The Faithful Scout (printed by George Horton)
Total Wordcount: 22,662

Mercurius Democritus, by John Crouch
Total Wordcount: 8,462

Mercurius Fumigosus, by John Crouch
Total Wordcount: 130,282

Mercurius Politicus, by Marchamont Nedham
Total Wordcount: 118,667

The Moderate Intelligencer
Total Wordcount: 40,490

A Perfect Account
Total Wordcount: 55,967

Perfect Diurnall Occurrences
Total Wordcount: 16,946

The Perfect Diurnall of some Passages and Proceedings
Total Wordcount: 133,381

Proceedings of State Affairs
Total Wordcount: 138,980

The Weekly Intelligencer of the Commonwealth
Total Wordcount: 68,225

The Weekly / Politique Post
Total Wordcount: 55,876

--Short Titles--

Perfect Occurrences
Total Wordcount: 8,634

The True (and Perfect) Informer
Total Wordcount: 7,697

The Loyal Intelligencer
Total Wordcount: 2,528

Mercurius Aulicus 1654
Total Wordcount: 8,069

The Loyal Messenger
Total Wordcount: 2,903

Mercurius Poeticus
Total Wordcount: 1,950

Mercurius Nullus
Total Wordcount: 4,360

The Politique Informer
Total Wordcount: 6,913

The Grannd Politique Post (pirate)
Total Wordcount: 3,495

Perfect and Impartial Intelligence
Total Wordcount: 8,779

Grand total word count: 999,248
868,966 in the 1654_newsbooks section; 130,282 in the mercurius_fumigosus section.
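
The totals above can be cross-checked against the per-title figures. The following is a minimal sketch in Python, not part of the corpus distribution. The dictionary and function names are illustrative only, and it assumes that every title other than Mercurius Fumigosus belongs to the 1654_newsbooks section, which the stated section totals imply.

```python
# Per-title word counts copied from the lists above.
LONG_RUNS = {
    "The true and Perfect Dutch-Diurnall": 21687,
    "Every Day's Intelligence": 67618,
    "The Faithful Scout": 64677,
    "The Faithful Scout (printed by George Horton)": 22662,
    "Mercurius Democritus, by John Crouch": 8462,
    "Mercurius Fumigosus, by John Crouch": 130282,
    "Mercurius Politicus, by Marchamont Nedham": 118667,
    "The Moderate Intelligencer": 40490,
    "A Perfect Account": 55967,
    "Perfect Diurnall Occurrences": 16946,
    "The Perfect Diurnall of some Passages and Proceedings": 133381,
    "Proceedings of State Affairs": 138980,
    "The Weekly Intelligencer of the Commonwealth": 68225,
    "The Weekly / Politique Post": 55876,
}

SHORT_TITLES = {
    "Perfect Occurrences": 8634,
    "The True (and Perfect) Informer": 7697,
    "The Loyal Intelligencer": 2528,
    "Mercurius Aulicus 1654": 8069,
    "The Loyal Messenger": 2903,
    "Mercurius Poeticus": 1950,
    "Mercurius Nullus": 4360,
    "The Politique Informer": 6913,
    "The Grannd Politique Post (pirate)": 3495,
    "Perfect and Impartial Intelligence": 8779,
}

# Assumption: this is the only title in the mercurius_fumigosus section.
FUMIGOSUS_TITLE = "Mercurius Fumigosus, by John Crouch"


def section_totals(long_runs, short_titles, fumigosus_title):
    """Return (grand total, 1654_newsbooks total, mercurius_fumigosus total)."""
    grand = sum(long_runs.values()) + sum(short_titles.values())
    fumigosus = long_runs[fumigosus_title]
    return grand, grand - fumigosus, fumigosus


grand_total, newsbooks_1654, fumigosus = section_totals(
    LONG_RUNS, SHORT_TITLES, FUMIGOSUS_TITLE
)
print(f"Grand total: {grand_total:,}")
print(f"1654_newsbooks: {newsbooks_1654:,}")
print(f"mercurius_fumigosus: {fumigosus:,}")
```

Run as written, the three sums agree with the grand total and section totals stated above.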