Studies in Variation, Contacts and Change in English

Volume 20 – Corpus Approaches into World Englishes and Language Contrasts

Introduction

Hanna Parviainen
Language Centre, University of Helsinki
Faculty of Information Technology and Communication Sciences, Tampere University

The 39th Annual Conference of the International Computer Archive for Modern and Medieval English (ICAME 39) was organised by the University of Tampere in Finland from May 30 to June 3, 2018. The theme of the conference was Corpus Linguistics and Changing Society, and the event included four pre-conference workshops, 73 full papers, 13 work-in-progress reports (WIPs), 10 posters and one software demonstration. Some of the papers focusing on the main theme of the conference will be published by John Benjamins in the sister volume Corpora and the Changing Society: Studies in the Evolution of English (Rautionaho, Nurmi & Klemola forthcoming). Many others formed smaller thematic groups and were selected for publication in the present volume, which thus follows in the footsteps of several previous ICAME conferences whose proceedings have also been published in the VARIENG Studies in Variation, Contacts and Change in English series (Hoffmann et al. 2017; Huber & Mukherjee 2013; Ebeling et al. 2012).

The present volume brings together studies exploring various features of World Englishes, learner Englishes and language contact situations. The structural elements under investigation range from verb complementation and noun phrase complexity to linguistic innovations. Some of the chapters are more theoretical in nature, introducing methodologies that can be employed to reveal new dimensions of existing data, while others address challenges faced by contemporary corpus linguists by presenting new methods for compiling and presenting data gathered from online sources. Despite the multifaceted nature of the topics examined in this volume, several key elements bind them together. Firstly, together they present an excellent snapshot of the variation that exists in modern-day corpus linguistics. Secondly, they explore the contacts between languages and society and the ways in which corpora can be used to study the dynamics between the two. Thirdly, all the articles in the present volume deal with change, be it from the perspective of languages or of the newest trends in the collection and analysis of data.

The chapters in this publication are organised around the following thematic groups:

  1. Innovation and variation around the world
  2. Translingualism
  3. New directions in corpus linguistics

The first four chapters investigate the variation and innovations that can be found in World Englishes and learner varieties spoken around the world. Hilde Hasselgård uses two corpora (VESPA and BAWE) to examine how novice L1 and L2 users of academic English differ in their use of the ‘the N1 of the N2’ colligation. Because the use of complex phrasal patterns is usually tied to high language proficiency, the results of the study are surprising: they indicate that the two groups are actually quite close to one another in their use of the structure. Nevertheless, some discrepancies between the groups could be observed, such as the greater tendency of L1 speakers to use nominalized N1s, whereas members of the L2 group used the construction slightly more frequently with possessive and partitive meanings. Another chapter focusing on the use of English in academic contexts comes from Yu-Hua Chen, Simon Harrison and Robert Weekly, who examine the complicated relationship between ‘learner mistakes’ and ‘features of ELF’. By presenting comparable samples from VOICE and CAWSE (a corpus of L2 English from China), the authors argue that the difference between ‘mistakes’ and ‘features’ often depends on the context. The World Englishes perspective is also present in the paper by Raquel P. Romasanta, who uses data from GloWbE to study the complementation profile of the verb REGRET (I regret (that) I said that / saying that) in AmE, BrE, HKE and NigE. The results of the study indicate that, unlike the L1 varieties, the L2 varieties favour the declarative finite that/zero-complement structure over the gerund, which, according to Romasanta, can be explained by a combination of substrate influence, SLA processes and the cognitive effects of contact language situations in the L2. Other intra- and extra-linguistic factors contributing to the patterns of preference (e.g. passivization and subject type) are also examined using binary logistic regression analysis. The final chapter of the thematic group Innovation and variation around the world comes from Patricia Ronan, who uses two corpora (COCA and GloWbE) to trace the diversifying uses of X-much, while also mapping the geographical spread of the construction. According to the results of the study, the feature is used most frequently in AmE, followed by some Pacific varieties of English, whereas the lowest usage was found in some African and South Asian Englishes.

The second thematic group in this volume consists of articles focusing on different aspects of translingualism. Gisle Andersen examines phraseological borrowings using data from two English corpora (COCA and COHA) and two Norwegian corpora (the Norwegian Newspaper Corpus and the National Library’s Text Archive). By applying a diachronic-contrastive method to corpus studies, Andersen shows how corpora can be used to reveal whether parallel phrasemes in two languages are likely to be the result of borrowing or are simply parallel developments in the two languages. Another study that deals with multilingualism comes from Antorlina Mandal and Leonie Wiemeyer, who use the CALE corpus to examine the presence and function of foreign elements in linguistic research papers written by EFL learners whose L1 is German. Their results indicate that the foreign elements found in the texts were mostly individual words and phrases, but unlike in spoken EFL, they were not used to fill lexical gaps. Instead, their use was motivated by the writers’ academic goals, and thus actually exemplifies their multilingual competences.

The last theme of the volume focuses on new directions in corpus linguistics by providing snapshots of some of the more recent technical and methodological developments in the field. Seth Mehl provides a critical exploration of two traditional methods used in corpus linguistics, Mutual Information and Pearson’s chi-square test, and compares them to the results obtained by calculating the statistical probability of co-occurrence based on a grammatical part of speech (POS) baseline. The results of the study indicate that the two traditional tests can potentially give artificially significant results, while the author also recognises that the POS baseline might not provide significant improvements to the way the commonly used top ten co-occurrence pairs for a given word are formulated. Some recent technological advances in the field of data collection are discussed in a chapter by Martin Weisser, the developer of ICEweb 2, a programme designed to help compilers of the new ICE corpora collect written data from online sources by reducing their need to use multiple tools when collecting, processing and analysing the data. The last chapter in this publication is by Aris Alissandrakis, Nico Reski, Mikko Laitinen, Jukka Tyrkkö and Magnus Levin, who use data from the Nordic Tweet Stream corpus to show how immersive virtual reality (VR) can be used to gain new perspectives on traditional corpus data when examining variation in space and time.

Our deepest thanks go both to the authors, without whose contributions this publication would not have been possible, and to the anonymous reviewers, whose valuable feedback helped us in the making of this volume. We would also like to thank our sponsors (City of Tampere, University of Tampere, University of Eastern Finland, the Federation of the Finnish Learned Societies, John Benjamins Publishing Company, Peter Lang Publishing Group, Cambridge Scholars Publishing and the Emil Aaltonen Foundation) for their financial support of ICAME 39, as well as all the people who contributed to the making of the conference.

References

Ebeling, Signe Oksefjell, Jarle Ebeling & Hilde Hasselgård, eds. 2012. Aspects of Corpus Linguistics: Compilation, Annotation, Analysis (Studies in Variation, Contacts and Change in English 12). Helsinki: VARIENG. http://www.helsinki.fi/varieng/series/volumes/12/

Hoffmann, Sebastian, Andrea Sand & Sabine Arndt-Lappe, eds. 2017. Exploring Recent Diachrony: Corpus Studies of Lexicogrammar and Language Practices in Late Modern English (Studies in Variation, Contacts and Change in English 18). Helsinki: VARIENG. http://www.helsinki.fi/varieng/series/volumes/18/

Huber, Magnus & Joybrato Mukherjee, eds. 2013. Corpus Linguistics and Variation in English: Focus on Non-Native Englishes (Studies in Variation, Contacts and Change in English 13). Helsinki: VARIENG. http://www.helsinki.fi/varieng/series/volumes/13/

Rautionaho, Paula, Arja Nurmi & Juhani Klemola, eds. Forthcoming. Corpora and the Changing Society: Studies in the Evolution of English.

University of Helsinki