Posted on 2019-07-10, 08:55. Authored by Nicholas Smith, Cathleen Waters
Corpus-based studies of specialized registers typically sample texts as randomly
as possible, but they disregard the social characteristics of the
speakers/writers. In contrast, in corpus-based studies of conversation and quantitative
sociolinguistic studies, sampling is more typically designed to optimize social
representation. To our knowledge, this study is the first to compare linguistic
outcomes from random versus sociolinguistic sampling in a specialized register. Our
data comes from the biographical radio chat show, Desert Island Discs (DID), at
different points in time. We constructed two versions of a DID corpus: a
sociolinguistic judgment sample based on guest demographics, and a random sample.
We compare grammatical usage between them using an inductive (‘key POS-tags’)
method and close manual analysis, uncovering some evidence of significant
grammatical differences between the samples and differing patterns of diachronic
change. We discuss the implications of our research for corpus design,
representativeness and analysis in specialized registers.
Funding
We acknowledge the University of Leicester for periods of study leave
and a small research grant.
Citation
International Journal of Corpus Linguistics, 2019, 24 (2), pp. 169-201
Author affiliation
College of Social Sciences, Arts and Humanities, School of Arts
The file associated with this record is under embargo until publication, in accordance with the publisher's self-archiving policy. The full text may be available through the publisher links provided above.