Studies in Variation, Contacts and Change in English

Volume 20 – Corpus Approaches into World Englishes and Language Contrasts

Phraseology in a cross-linguistic perspective: Introducing the diachronic-contrastive corpus method

Gisle Andersen
Department of Professional and Intercultural Communication, NHH Norwegian School of Economics


The inventory of phrasemes in a language is not static, but new patterns of lexical co-occurrence evolve over time, and such new patterns may be the result of external influence due to language contact. Thus, a cross-linguistically parallel phraseme such as English to go for X and Norwegian å gå for X, in the sense of ‘choose among several options’, for instance from a menu, may – but need not – be the result of indirect borrowing (Backus 2014). In this paper I investigate ‘the largely unexplored area of phraseological borrowing’ (Fiedler 2017: 90). I first present a typological survey that draws on the work of Granger and Paquot (2008) and Fiedler’s (2017) recent work on phraseological Anglicisms in German. Next, I show how a diachronic-contrastive corpus method can be devised to investigate the question of whether cross-linguistically parallel phrasemes are the result of borrowing or parallel developments, and as a vehicle for rejecting preconceived ideas about a form’s alleged origin in English. The approach is based on diachronic and synchronic corpora of English (COCA and COHA) and Norwegian (the Norwegian Newspaper Corpus and the National Library’s Text Archive).


It is well known that the English language exerts considerable influence on other languages at the lexical level, as seen from extensive borrowing of terminology and everyday words into many languages of the world (i.e. Anglicisms such as controller, blogg, awesome!, etc.). Although much less studied, it is also clear that the ‘phrasicon’ (Granger 2009) of a language, i.e. its ‘inventory of communicative formulae, catchphrases, slogans and other multi-word items’ (Fiedler 2017: 90) may be similarly affected by such external influence (cf. Andersen 2010, 2017; Fiedler 2012, 2014, 2017). Recent studies of language contact have seen a shift in focus from the individual lexeme (word, or term, in domain-specific contexts) towards longer units of discourse, multiword expressions and phraseology. The study of phraseological borrowing has been launched as a subfield of contact linguistics, and Fiedler observed that this ‘has not received much scholarly attention so far’ (Fiedler 2017: 89). Citing a handful of publications, she observes that phraseological units like greetings, discourse markers, proverbs, catchphrases and other types of pre-fabricated constructions ‘have significant pragmatic implications, because they are closely related to culturally influenced text patterns, discourse norms and speaker attitudes’ (Fiedler 2017: 89). The aim of this paper is to consolidate a generic taxonomy of phraseology with a taxonomy of borrowing by means of a survey that shows how the English language has influenced the phrasicon of Norwegian. Further, the aim is to explore ways of adding empirical evidence that may support or reject hypotheses of borrowing in (often contentious) cases that do not involve any direct use of source language material. Relevant examples of this are expressions such as the discourse marker tingen er at ‘the thing is that’, det er opp til deg ‘it is up to you’, or å gjøre en forskjell ‘to make a difference’, etc. 
These are commonly criticised as alleged loan translations from English into Norwegian, but it is rarely documented via empirical means that they actually constitute cases of borrowing. The article thus focuses on phraseological units that involve a formal and functional parallelism across two languages that may be contact-induced. However, we cannot rule out alternative explanations, since the parallelism may instead be due to parallel developments in the two languages or part of a common Germanic phraseological heritage. I devise a research method that involves comparison of diachronic corpora. This will be used to test empirically individual hypotheses of borrowing for two phrasemes, the linking adverbial i et nøtteskall ‘in a nutshell’ and the discourse marker tingen er at ‘the thing is that’.


Phraseology can be defined as ‘the study of the structure, meaning and use of word combinations’ (Cowie 1994: 3168). The ‘phrasicon’ of a language (Granger 2009) covers many different types of structures, from common collocations such as strong tea via phrasal verbs such as go for and idioms like by and large or start from scratch, to a range of different idiomatic multiword units. The individual unit studied in phraseology, the phraseological unit, ‘phraseme’ (Mel’čuk 2012: 32) or ‘phraseologism’ (Gries 2008: 3) can be defined as a symbolic unit constituted as a conventionalised association of a form and a meaning/function with the following properties:

This definition and its criteria of inclusion capture a whole range of phenomena. As phraseology is a notoriously many-faceted object of study, and since ‘[t]ypologies abound in the literature’ (Granger & Paquot 2008: 35), it is not my intention here to account fully for existing categorisations. Rather, I wish to consolidate Fiedler’s (2017) formally based classification with the functionally based classification of Granger and Paquot (2008) into various types of referential, textual and communicative phrasemes. The main purpose of this is to show the variability of structures that can be borrowed, exemplified through cases of English to Norwegian borrowing.

Fiedler (2017) considers the linguistic source of the lexical material of the phraseological unit as the main characteristic. She uses the criteria for substitution and importation of lexical items put forth by Haugen (1950) and Weinreich (1953) as a basis for her classification. A phraseological borrowing can either be a) direct (or unadapted), as in big business and point of no return in Norwegian, where a phrase has been borrowed wholesale from English, b) hybrid, which amounts to a partial substitution with at least one source language (SL) form intact, as in ta det easy ‘take it easy’, or c) indirect, involving a complete loan translation from the SL to the recipient language (RL), as in det er opp til deg ‘it is up to you’ (cf. Graedler & Johansson 1997: 10). [1] Fiedler (2014, 2017) shows that for German, loan translations by far outnumber the other categories. Like borrowings generally, phraseological borrowings quickly adapt to the morphology of the RL. A case in point is the verbal collocation å sette (en) deadline ‘to set a deadline’, a hybrid which displays the regular paradigmatic variability of the RL verb (sette-setter-satte-satt) and the domestication of the indefinite article, whilst the semantic ‘pivot’ of meaning or the ‘base’ of the collocation (Mel’čuk 2012: 36, 39), deadline, remains intact in its SL form.

In addition to linguistic form, a cross-linguistic account of phraseological borrowing needs to consider the functional properties in each language, more specifically the degree of functional parallelism of a phraseme in the SL and RL. As is well known, lexical borrowing is often characterised by post hoc modification in terms of semantic narrowing, broadening or shift; cf. the Anglicism mail, which has a narrower meaning (‘email’) in Norwegian than its English etymon. Similarly, we can expect differences between the SL and the RL in the functional properties of phrasemes. Although we can generally assume that many or most pragmatic functions are transferred in borrowing, it is clear that a phraseme may also undergo subsequent functional change (Andersen 2014). A case in point is the politeness phrase thank you, which, alongside sorry and please, has been studied in Cypriot Greek by Terkourafi (2011). She shows that ‘once borrowed, the English terms are gradually bleached of their speech-act signalling potential and increasingly come to function as discourse markers, serving to locally manage sequential aspects of discourse structure’ (Terkourafi 2011: 218). Thus, a comprehensive study of phraseological borrowing should investigate the functional range of a borrowed phraseme in both the SL and RL.

Further, the empirical study of borrowing needs robust data and methods to provide support for claims that a certain phraseme in a language is indeed the outcome of contact with another language. In order to document, say, that the Dutch expression ik ga for de steak ‘I’ll go for the steak’ is a loan translation of English to go for X, as Backus (2014) assumes, there is a need to explore language resources in both languages, which is indeed the main tenet of this article. The investigation should include primary sources like corpora and text archives, as well as secondary sources like dictionaries – especially large documentation dictionaries rich in information about etymology and early usage. Exploring a hypothesis of phraseological borrowing requires rigorous analysis of the phrasemic equivalents and their communicative functions in diachronic and contemporary corpora in both the SL and RL. Such resources enable us to document first occurrence, longitudinal frequency developments and contemporary usage.

Finally, a comprehensive account of phraseological borrowing should ideally also address the social indexicality associated with various borrowed forms and how this may be retained or altered in the transfer from the SL to the RL, although this will not be pursued in this article. It is well known that what is idiomatic in language differs across time and speaker groups. What is much less documented is how borrowing is governed by social and contextual factors. Of particular interest are the sociolinguistic factors that govern the use of so-called non-catachrestic loans (Onysko & Winter-Froemel 2011), i.e. novel foreign expressions that emerge despite the existence of a domestic (near-)equivalent. We can assume that sociolinguistic factors like age, gender and social background have a bearing on speakers’ choice between the domestic and the foreign alternative. Terkourafi (2011) has suggested that the formality of the speech situation correlates with the choice between thank you and its domestic alternative, and that the use of this form differs significantly in Standard vs. Cypriot Greek, in that in the former it is ‘almost exclusively restricted to youth language’ (Terkourafi 2011: 224), while in Cypriot Greek it has a much wider social distribution.

Table 1 is intended to document the existence of equivalents of Fiedler’s (2017) German examples and show that the same formal variability exists in Norwegian.

| Category | Phraseme | Corpus example |
|---|---|---|
| Direct phraseological borrowing: phrases and sentences that have been imported wholesale from English | meet and greet | Personen med det høyeste budet vil ikke bare sikre seg konsertbilletter, men også en meet and greet med den canadiske superstjernen. (NNC/DB/2012-10-19) ‘The highest bidder will not only get tickets but also a meet and greet with the Canadian superstar.’ |
| | blind date | Tiger møtte advokatstudenten på en blind date. (NNC/DB/2000-08-13) ‘Tiger met the law student on a blind date.’ |
| Hybrid forms: characterised by partial substitution/retention of at least one SL component | å sette (en) deadline ‘to set (a) deadline’ | Til slutt måtte vi bare sette en deadline og få gang på det. (NNC/AA/2015-08-20) ‘Finally we just had to set a deadline and get it started.’ |
| | å ha noe/være/ligge i pipeline ‘to have something/be/lie in the pipeline’ | På Oslo Børs ligger det nå åtte selskaper i pipeline for børsnotering. (NNC/AP/2005-10-24) ‘On the Oslo Stock Exchange there are now eight companies in the pipeline to be listed.’ |
| Indirect phraseological borrowing (loan translation): lexical material fully substituted by morphemes of the RL | X er ikke rakettvitenskap ‘X is not rocket science’ | Å bli mobilspiller er heller ikke rakettvitenskap. (NNC/DB/2013-04-03) ‘To become a mobile player is also not rocket science.’ |
| | å adressere et problem ‘to address a problem’ | Vi håper vi kan adressere de samme problemene i Norge som vi har gjort i Sverige (NNC/DN/2002-01-09) ‘We hope that we can address the same problems in Norway as we have done in Sweden’ |

Table 1. Fiedler’s (2017: 90f) classification exemplified in Norwegian.

Fiedler (2017) points out that a single SL item may emerge in the RL with multiple categorical membership, as in the case of Nice try/Netter Versuch, realised both as a direct and indirect borrowing (Fiedler 2017: 91). The same pair of variants occurs in Norwegian as Nice try and Godt forsøk. Importantly, however, this variability does not entail that the domestic form is a result of linguistic borrowing, as the expressions may have co-existed for a long time. As noted above, one needs a cross-linguistic empirical approach to support that a form is contact-induced.

Granger and Paquot (2008) provide not only a useful survey of phraseological taxonomies, but also their own functionally based classification. This reconciles typologies stemming from work in lexicology and lexicography (most notably Cowie 1994 and Mel’čuk 1995, 1998) with distributionally based typologies, thus, ‘integrat[ing] the new insights derived from the corpus-based approach’, where phrasemes are identified from n-gram statistics. Inspired by Burger’s (1998) classification, which is ‘primarily based on the function of phraseological units in discourse’ (Granger & Paquot 2008: 9), the authors distinguish three broad categories of phrasemes with referential, textual or communicative functions, each comprising several phrasemic types. Table 2 presents their categorical survey supplied with examples of phraseological borrowing from English into Norwegian.

| Functional category | Phrasemic category | Characteristics | Example Granger & Paquot (2008) | Borrowing ENG→NOR |
|---|---|---|---|---|
| Referential phrasemes | Lexical collocations | preferred syntagmatic relations between lexemes | perform a task, heavy rain | føl deg fri til å ‘feel free to’, gratis lunsj ‘free lunch’ |
| | Idioms | constructed around a verbal nucleus, semantic non-compositionality | to spill the beans | å gå den ekstra milen ‘to go the extra mile’ |
| | Irreversible bi-/trinominals | fixed 2-3 word form sequences linked with and/or | bed and breakfast, left, right and centre | bed and breakfast, knask eller knep ‘trick or treat’ |
| | Similes | stereotyped comparisons | to swear like a trooper, as old as the hills | sprek som ei fele ‘fit as a fiddle’* |
| | Compounds | two or more lexemes with independent status outside word combination | goldfish, black hole | kirsebærplukking ‘cherrypicking’ (also plukke kirsebær V), sort hull ‘black hole’ |
| | Phrasal verbs | combinations of verb and adverbial particle | blow up, make out | frike ut ‘freak out’, work out |
| | Grammatical collocations | restricted combinations of lexical and grammatical word, typ. V/N/adj. + prep. | depend on, cope with | kaffe to go, for en kaffe |
| Textual phrasemes | Complex prepositions | grammaticalised prep. + N/adv./adj. + prep. combinations | with respect to, apart from, in addition to | |
| | Complex conjunctions | grammatical sequences w. conjunction function | so that, as soon as, even though | |
| | Linking adverbials | various phrases w. adverbial function | last but not least, in other words | by the way |
| | Textual sentence stems | routinised fragments w. text-organising function, usu. w. subject + verb | another thing is, it will be shown that | når det kommer til ‘when it comes to’ |
| Communicative phrasemes | Speech act formulae | routine formulae w. discourse-pragmatic function, greetings, etc. | good morning, take care, how do you do | thank you, see you, love you |
| | Attitudinal formulae (including attitudinal sentence stems) | signal speaker attitude | in fact, to be honest, it is clear that | for X's sake, holy X, get over it, OMG |
| | Proverbs and proverb fragments | express general ideas by means of non-literal meaning (metaphor, metonymy) | When in Rome …; A bird in the hand is worth two in the bush. | When in Rome … |
| | Commonplaces | non-metaphorical complete sentences expressing truisms/tautologies | Enough is enough; We only live once. | Shit happens. |
| | Slogans | short directive phrases used repeatedly in politics or advertising | Make love, not war; Coke is it | Coke is it |
| | Idiomatic sentences | (listed but not explicitly discussed or exemplified by G&P) | | What’s in it for me? |
| | Quotations | (listed but not explicitly discussed or exemplified by G&P) | | We shall never surrender. |

Table 2. Functional classification (Granger & Paquot 2008) and exemplification of phraseological borrowing.
*non-repetitive, not found in corpora but only in dictionary

The examples listed in the rightmost column are all phrasemes according to the criteria listed above – they are polylexemic, idiomatic (non-transparent), lexicalised (ready-made, not created productively by user), and syntactically and semantically stable. Appendix 1 documents this in the form of a table with corpus examples and frequency statistics of the listed forms. The survey shows that phraseological borrowing is a phenomenon that is widespread enough to cover most of the phrasemic categories laid out by Granger and Paquot (2008). A few remarks about the inventory are called for. Firstly, the category of Similes of the type to swear like a trooper is very scarcely documented in the Norwegian data consulted (cf. Section 3). I only found one token of the expression fit as a fiddle in its adapted form (Appendix 1), and this was not in an authentic language corpus but an example from a dictionary. I therefore consider this at best to be a marginal category of phraseological borrowing. Secondly, two categories do not meet the criteria of what counts as borrowing, namely Slogans and Quotations. Given their quotative status, these must rather be construed as cases of code-switching and not of borrowing. Finally, no examples were found of complex prepositions and complex conjunctions that result from borrowing. This interesting fact conforms well with the long-accepted thesis that grammatical words are much less prone to borrowing than lexical words (and phrases) and are thus placed lower in hierarchies of borrowability (e.g. Haugen 1950).

Considering the cross-cutting taxonomies of Fiedler (2017) and Granger and Paquot (2008) in combination, it becomes clear that direct borrowings, hybrids and indirect borrowings are all represented in Table 2, but their distribution across the categories is skewed in the case of English-to-Norwegian borrowing. Granger and Paquot’s (2008) third group, the communicative phrasemes, are exclusively represented by direct borrowings; e.g. the idiomatic sentence What’s in it for me?, the attitudinal formula get over it, etc. Textual phrasemes are represented by a direct and an indirect borrowing, in the linking adverbial by the way and the textual sentence stem (a.k.a. discourse topic marker; cf. Fraser 1996; Andersen 2016) når det kommer til ‘when it comes to’. Referential phrasemes display the greatest formal variability and include direct and indirect borrowings, e.g. bed and breakfast and føl deg fri til å ‘feel free to’, respectively, as well as hybrids, e.g. frike ut ‘freak out’, which consists of a spelling-adapted form of ‘freak’ and the domestic preposition ut ‘out’. It is uncertain, however, what generalisations can be made about the form/function relationship beyond this survey, as the given forms are meant as paradigmatic examples of a category and not as a comprehensive inventory of all the forms that occur in actual usage. A final point to be made is that the indirect borrowings include a category that entails only minimal change of an existing grammatical collocation that appears to be English-induced, namely the expression for en kaffe ‘for a coffee’. In its usage context of a pleasant invitation to, or offer of, a drink, a meal or similar, it seems likely that this is a novel version of an existing domestic expression, på en kaffe, lit. ‘on a coffee’, whose anglification solely involves the RL-internal replacement of one domestic preposition with another to match an English model.
Gottlieb (2004) makes a similar observation of an English-induced change in prepositional choice in Danish.


The criteria outlined in the previous section require access to language resources that can shed light on usage and frequency of phrasemes in both the SL and RL. In principle, any empirical documentation of a phraseme may be of anecdotal interest, but a convincing analysis must extend beyond haphazard collection of individual tokens whether observed in media or spoken discourse. In this section I first describe a set of resources needed for a non-intuitive documentation of English-to-Norwegian borrowing, before I outline the diachronic-contrastive corpus method.


In order to assess whether similar meanings and usage patterns are found in the source language and recipient language, two large and generally comparable contemporary corpora have been selected, namely the Corpus of Contemporary American English (COCA; Davies 2009) and the Norwegian Newspaper Corpus (NNC; Andersen & Hofland 2012). The NNC is seen as relevant for analysis of Anglicism candidates in the RL, since forms that are used repeatedly and consistently in journalistic writing have reached a stage of maturity and adaptation at which they can justifiably be considered not idiosyncratic code-switches but relatively stable linguistic borrowings. COCA (1990–2017) and NNC (1998–present) are comparable corpora given their time span and composition, although they are not identical in design. Both are web-based monitor corpora that contain large amounts of newspaper text. It should be pointed out, though, that COCA also contains spoken dialogue, while NNC also contains readers’ comments to published articles. Since the tertium comparationis of this study is not frequency of use in these two contemporary corpora, but degree of functional equivalence and the phraseme’s frequency developments over time, I consider the slight difference in composition between COCA and NNC to be of minor significance.

In order to study diachronic development, I have searched the National Library’s Text Archive (Nasjonalbiblioteket; henceforth NB; a.k.a. Bokhylla.no) as a significant empirical source. [3] This is a searchable digital text archive containing the full collection of the National Library’s printed material dating from 1690 to 2013. This massive collection includes all books published in Norway during this period, as well as a wide selection of newspapers, periodicals and several other text types. For reasons of comparison and practicality, the searches were restricted to newspapers. The corpus used to document diachronic development in English was the Corpus of Historical American English (COHA), which contains some 400 million words of fiction, popular magazines, newspapers and non-fiction books covering the period from 1810 to 2010.

In addition to these primary sources representing authentic language use, I consulted a set of dictionaries for documentation of the earliest attestation of specific usage types of phrasemes, to the extent that these were recorded there. These include the OED Online, the NAOB dictionary for Norwegian Bokmål and the dictionary Norsk ordbok (henceforth NO) for Norwegian Nynorsk.


Using these historical and contemporary sources to document borrowing involves the following analytical steps.

  1. Establish whether a phraseme constitutes a functionally equivalent pair in the two languages by comparing its use in contemporary corpora;
  2. check if the phraseme is registered in large documentation dictionaries and document its earliest attestation;
  3. determine the frequency profile, i.e. the longitudinal frequency development observed in diachronic corpora in both languages, and assess whether the statistics demonstrate a significant frequency development (increase or decrease), a common fluctuation, or stability;
  4. match the earliest occurrence and frequency profiles against each other in order to estimate likely time of transfer in cases of borrowing.

As a proof of concept, I will in what follows demonstrate this method through the analysis of an expression which is unquestionably an EN→NO borrowing, namely shit happens. This classifies as a directly borrowed attitudinal formula in Granger and Paquot’s (2008) terms, and its function is described as follows in the OED:

orig. U.S. shit happens: bad things often happen unavoidably. Also (esp. as a rejoinder) expressing a resigned attitude to any state of affairs or course of events: these things happen, such is life. (OED: shit P26.)


Examples of the phrase in the contemporary corpora are the following.

(1) I didn’t look at that and say, “Oh, this poor woman lost her house in a tornado.” Yeah, shit happens. It was when she said “I’m actually an atheist.” (COCA 2013 MAG)
(2) Dermed ble han skjøvet ned på fjerdeplass. - Shit happens, kommenterte han til Dagbladet.no. (NNC DB 2008-08-25)
Consequently he was relegated to fourth place. Shit happens, he commented to Dagbladet.no.

That the contemporary corpora contain many examples with a persistent form/function mapping soon becomes evident if one produces concordances of the phrase:

Figure 1. Concordance of shit happens in COCA.

Figure 2. Concordance of shit happens in NNC.

In COCA the phrase occurs 70 times and it is clear from Figure 1 that a majority fit the OED interpretations of ‘bad things often happen unavoidably’ or ‘these things happen, such is life’. These ready-made and lexicalised units are marked in blue in the figure. This contrasts with tokens where the two words enter compositionally into larger syntactic clauses with shit and happens as subject and verb, respectively (cf. Like I do when that shit happens to me; COCA FIC 2011 Bk:KillerRoutineLast). In NNC, the phrase occurs 173 times. It is clear after manual checking that none of the Norwegian tokens is part of longer citations of English discourse but all are genuine cases of the Anglicism with the same function as described in the OED. In fact, some tokens even emphasise this function through co-occurrence with an equivalent domestic expression:

(3) Det er slikt som skjer, shit happens, sier han til dansk TV 2. (NNC DB 2009-07-14)
That’s the kind of thing that happens, shit happens, he says to Danish TV2.

Hence, it is established beyond doubt that the phraseme is a functionally equivalent pair in the two languages.
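Concordances of the kind shown in Figures 1 and 2 were produced with the corpus interfaces themselves. Purely as an illustration, a keyword-in-context (KWIC) listing can be sketched in a few lines of Python. The function name `kwic`, the context width and the toy text (adapted from two of the corpus examples quoted above) are assumptions made for this sketch and are not part of the study.

```python
import re


def kwic(text, phrase, width=30):
    """Return (left context, match, right context) for every hit of `phrase`."""
    # Whole-word, case-insensitive match of the search phrase.
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
    hits = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        hits.append((left, match.group(0), right))
    return hits


# Toy "corpus" (hypothetical): two sentences adapted from the examples above.
sample = (
    "Yeah, shit happens. It was when she said I'm actually an atheist. "
    "Like I do when that shit happens to me."
)

for left, node, right in kwic(sample, "shit happens"):
    # Right-align the left context so that the node word lines up in a column.
    print(f"{left:>30} [{node}] {right}")
```

A listing like this only retrieves candidates. Deciding whether a hit is the lexicalised phraseme or a compositional subject-verb sequence (as in the second hit) still requires manual inspection.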


As shown above, the OED has a separate entry for the phraseme shit happens. The oldest attestation dates from 1983 and refers to another dictionary, namely the University of North Carolina Campus Slang. As it takes time from the emergence of a new form in spoken or written discourse to its being recorded in a dictionary, the OED attestations are not synchronous with the first instance, and we can have little hope of catching a form in its infancy using conventional empirical techniques such as corpus linguistic methods. What can be concluded from the dictionary entry, though, is that by 1983 the form/function mapping of the phraseme had become sufficiently well established across language users to merit a listing in the OED. For Norwegian, the NAOB Bokmål dictionary lists the expression under the entry for shit and asserts that it is borrowed from American English. Its oldest attestation is from 2000. The Nynorsk NO dictionary does not have an entry for shit or shit happens.


The next step amounts to searching for the phrase in the diachronic corpora, registering the number of tokens per time period and applying statistical techniques for the detection of trends in the two languages. This will establish when the phrase emerged in each language and how its frequency subsequently developed. The frequency profiles of shit happens are shown in Figure 3.

Figure 3. Frequency profiles for shit happens in English and Norwegian.
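A frequency profile of this kind can be sketched as follows. This is a minimal illustration that assumes each corpus hit is dated by year and that the size of the corpus per decade is known. The function name `frequency_profile`, the normalisation per million words and all the figures in the example are hypothetical and are not taken from COHA or NB.

```python
from collections import Counter


def frequency_profile(hit_years, words_per_decade, per=1_000_000):
    """Map each decade to the number of hits per `per` running words."""
    # Bin every dated hit into its decade, e.g. 1994 -> 1990.
    hits = Counter((year // 10) * 10 for year in hit_years)
    profile = {}
    # Only decades with a known corpus size are reported.
    for decade in sorted(words_per_decade):
        size = words_per_decade[decade]
        profile[decade] = hits.get(decade, 0) * per / size
    return profile


# Hypothetical hit dates and corpus sizes, for illustration only.
hit_years = [1985, 1988, 1991, 1994, 1994, 1998]
words_per_decade = {1980: 2_000_000, 1990: 2_000_000}

profile = frequency_profile(hit_years, words_per_decade)
first_occurrence = min(hit_years)
print(profile, first_occurrence)
```

Normalising by corpus size is a design choice of the sketch. It matters when the amount of text differs between periods, whereas raw token counts per period are sufficient when the periods are of comparable size.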


Although observations can be made impressionistically by looking at the graphs, a statistical measure should be applied to detect trends in the data more rigorously. To this end, Kendall’s τ correlation coefficient may be used, a non-parametric measure of the ordinal association between two measured quantities, in our case time and frequency (Hilpert & Gries 2009). For Norwegian, Kendall’s τ is 0.5572031. This means that there is a positive correlation between time and frequency which, although quite far from the theoretical maximum of 1 (maximal growth; the minimum of -1 would indicate maximal decrease), is nevertheless statistically significant (p-value = 8.313e-05). For English, Kendall’s τ is 1, i.e. a perfect positive correlation between time and frequency, which is likewise statistically significant (p-value = 0.01667). Coupled with what is visually observable from the scatterplot graphs, we can conclude that there has been a significant increase in the use of the phraseme after its emergence in both languages, although in Norwegian the frequency has decreased considerably since its peak in 2010.
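To make the measure concrete, the following sketch computes Kendall’s τ (in its tie-corrected τ-b form) for a series of time periods and frequencies, together with an exact two-sided p-value obtained by enumerating all orderings of the frequencies. The article does not say which software or test variant produced the reported values, so the function names and the example data here are assumptions. The enumeration is only practical for short series.

```python
from itertools import permutations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b: (concordant - discordant) pairs, corrected for ties."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: ignored
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denominator = sqrt((untied + tied_x_only) * (untied + tied_y_only))
    if denominator == 0:
        raise ValueError("tau is undefined when one variable is constant")
    return (concordant - discordant) / denominator


def exact_p_value(x, y):
    """Two-sided p-value: share of orderings of y with |tau| >= observed |tau|."""
    observed = abs(kendall_tau_b(x, y))
    taus = [abs(kendall_tau_b(x, perm)) for perm in permutations(y)]
    # Small tolerance guards against floating-point noise in the comparison.
    extreme = sum(t >= observed - 1e-12 for t in taus)
    return extreme / len(taus)


# Hypothetical data: five time periods with strictly increasing frequencies.
periods = [1, 2, 3, 4, 5]
frequencies = [0, 2, 5, 9, 14]

tau = kendall_tau_b(periods, frequencies)
p = exact_p_value(periods, frequencies)
print(tau, round(p, 5))
```

For five untied data points that rise monotonically, τ is 1 and the exact two-sided p-value is 2/5! ≈ 0.01667, which coincides with the value reported for English. Whether the English series in fact consists of five time periods is not stated in the text and is only an inference from that figure.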


Considered in conjunction, the corpus and dictionary data as well as the frequency profiles and significance testing lead to a number of relevant observations:

Based on the assumption that formally and structurally similar patterns may represent products of language contact (borrowing), the timeline and frequency profiles reflect possible trajectories of language contact. As is inevitably the case in contact linguistics and sociolinguistics more generally, we can never expect to find irrefutable evidence that some linguistic feature actually spreads from one group of language users to another, but the listed observations can be seen as supporting evidence of a theory of borrowing from English to Norwegian. It is clear that the pairing of form and meaning/function is well established in English some three decades before it emerges in Norwegian. Naturally, such an ordering of events is a prerequisite for a conclusion about borrowing. It is also clear that there has been a significant increase in its usage in the SL preceding the emergence in the RL. In order for borrowing to occur, a sufficiently large number of RL users must be aware of the item and start using it in relevant contexts. A surge in SL usage makes this scenario more likely, but an increase here is not a prerequisite for borrowing, as one can also envisage borrowing in cases where an item fluctuates or remains stable in the SL. However, the form must be salient and frequent enough to catch the interest of innovative RL users and to resonate with their interlocutors. The surge in the SL seen for shit happens in the 1980s and 1990s makes it a more likely candidate for borrowing than if its frequency development were otherwise. Zenner et al. (2013) and Fiedler (2017) use the term ‘catchphrases’ about items which are frequent in a SL and gain a foothold in a RL. In this particular case, the accumulated quantitative evidence leads to the – admittedly unsurprising – conclusion that the idiomatic phraseme shit happens has emerged as a borrowing from English to Norwegian around 1990. 
Its final fall in frequency may reflect that the phrase is no longer as much in vogue as it was in its earliest decades, but this post hoc development does not refute the borrowing hypothesis.


I now turn to the far more contentious cases of cross-linguistically parallel phrasemes, namely potential indirect borrowings (loan translations). Let us first consider what Backus (2014) writes about the matter:

When I was younger, I would indicate my choice from a menu in a restaurant with the Dutch phrase ‘ik neem de steak’ (a more or less literal rendition in English would be ‘I’ll take the steak’). Nowadays, I may or may not use the same expression, but if I don’t, I’m likely to say ‘ik ga for de steak’ (‘I’ll go for the steak’). What has happened in between is that my Dutch has adapted to what I hear being said around me in the Dutch speech community, and the Dutch nowadays routinely use the expression ‘go for X’ to convey ‘having made a choice from among a selection of options’. The construction is most likely a loan translation from English. It has made its way into Dutch, presumably first in the speech of bilinguals such as myself, and then got dispersed throughout the speech community (Backus 2014: 91).

Backus’ interpretation raises the question: How can we be sure it is from English? As we saw above, the diachronic-contrastive approach proposed here uses corpora to add supporting evidence to Backus’ ‘most likely’. In this section the method is illustrated with two potential cases of borrowing, namely the linking adverbial i et nøtteskall ‘in a nutshell’ (4.1) and the textual sentence stem tingen er at ‘the thing is that’ (4.2).

4.1 The linking adverbial i et nøtteskall ‘in a nutshell’

The fixed expression in a nutshell is classified as a linking adverbial, i.e. the category described as ‘various phrases with adverbial function’ (Granger & Paquot 2008: 44), on a par with expressions such as last but not least and in other words. Its function is described in the OED as follows:

d. in a nutshell: (used as adj. or adv.) in a few words; concisely stated, encapsulated. Also in to put in a nutshell. (OED nutshell)

Its use with this summarising function is evidenced in both the contemporary corpora:

(4) In a nutshell, who is Steve Bannon? (COCA 2017 SPOK)
(5) Dette er dagens Portugal i et nøtteskall, ungt og frodig, gammelt og behagelig. (NNC AA 1998-01-30)
This is today’s Portugal in a nutshell, young and lavish, old and pleasant.

As virtually no literal senses of the phrase are found in either the English or the Norwegian corpus, it is easy to establish that the metaphorical linking adverbial is functionally equivalent in the two languages. The oldest record in the OED is from 1822, and the phraseme has existed in Norwegian long enough to merit a record in both the NAOB and NO, the oldest example being from 1934 (NAOB quoting the newspaper Morgenbladet). The frequency profiles are as shown in Figure 4.

Figure 4. Frequency profiles for in a nutshell / i et nøtteskall.

The oldest token of the adverbial in COHA with the sense ‘concisely stated’ dates from 1852, i.e. 30 years after the oldest OED record, which might suggest that the phraseme is of British English origin. In the Norwegian text archive the oldest token is from 1922, a century after its emergence in BrE. The statistics suggest a steady growth in English from 1830 that continues into and beyond its emergence in Norwegian around 1920. For Norwegian, the Kendall’s τ correlation coefficient is 0.5231592. There is thus a statistically significant positive correlation of time and frequency (p-value = 2.44e-13). For English, the Kendall’s coefficient τ is 0.61138; thus there is also a significant correlation for this language (p-value = 0.000467). The dictionary entries and frequency profiles support the borrowing hypothesis, as the idiomatic phraseme is established in English about a century before its emergence in Norwegian (cf. OED vs. NB). At the time of its emergence in Norwegian, the phraseme has reached a stage of relatively high frequency of around 20 tokens per decade in COHA, although its subsequent development is that of a moderately s-shaped curve which takes a dip around 1940–60 followed by a sharp increase. Subsequent to its adoption into the RL, there is a steady increase for a period of 60 years until it reaches a peak around 1980. In conclusion, the empirical data support the hypothesis that the linking adverbial i et nøtteskall is a phraseological borrowing from English into Norwegian.
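The trend test reported above can be illustrated with a small sketch. The function below computes Kendall’s τ (the tau-b variant, which tolerates tied values) between decade and frequency in plain Python. The function name is hypothetical, and the decade counts in the usage lines are invented for illustration; they are not the COHA or National Library figures. The sketch does not compute a p-value; a statistics package would be needed for that, for instance R’s cor.test with method = "kendall".

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(xs, ys):
    """Kendall's tau-b between two equal-length numeric sequences.

    Hypothetical helper for illustration: counts concordant and
    discordant pairs and corrects the denominator for ties.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")

    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        dx, dy = x2 - x1, y2 - y1
        if dx == 0 and dy == 0:
            continue              # tied on both variables: ignored
        elif dx == 0:
            ties_x += 1           # tied on time only
        elif dy == 0:
            ties_y += 1           # tied on frequency only
        elif dx * dy > 0:
            concordant += 1       # both increase (or both decrease)
        else:
            discordant += 1       # one increases, the other decreases

    denom = sqrt((concordant + discordant + ties_x)
                 * (concordant + discordant + ties_y))
    if denom == 0:
        raise ValueError("tau is undefined when one variable is constant")
    return (concordant - discordant) / denom


# Invented example data: tokens per decade for a hypothetical phraseme.
decades = [1900, 1910, 1920, 1930, 1940]
tokens = [1, 3, 2, 5, 8]
print(kendall_tau_b(decades, tokens))
```

A value close to 1 indicates that frequency rises almost monotonically with time, a value close to 0 indicates no monotonic trend, and a negative value indicates decline.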

4.2 The textual sentence stem tingen er at ‘the thing is that’

The phraseme tingen er at ‘the thing is that’ is classified as a textual sentence stem, i.e. it belongs to the category of ‘routinised fragments with text-organising function, usually containing a subject and a verb’ (Granger & Paquot 2008: 44). The OED describes its sense as follows:

OED thing 7. colloq. With the. b. The special, important, or notable point; esp. that which is specially required; (more generally) that which is to be considered, the truth or the facts of the matter (esp. in the thing is (that) …, used to draw attention to a following statement; …

This discourse marker usage is well documented in the two contemporary corpora:

(6) That's for somebody else to decide. But the thing is that I am disappointed. (COCA SPOK 2017)
(7) Jeg har folk i livet mitt som hadde angret på at de ikke gjorde mer. Men tingen er at de har gjort mer enn nok, (NNC AP 2015-07-16)
I have people in my life who had regretted that they didn’t do more. But the thing is that they have done more than enough,

The oldest example from the OED goes as far back as 1748. The Norwegian equivalent is found in the NO dictionary for Nynorsk, the oldest record being from 1916–1921. In this case, the frequency profiles tell a markedly different story from the previous case:

Figure 5. Frequency profiles for the thing is that / tingen er at.

The frequency profiles show no clear pattern of a sharp or steady increase in either language. For Norwegian, the Kendall’s τ correlation coefficient is 0.3278427 and there is not a statistically significant positive correlation of time and frequency (p-value = 0.07905). For English Kendall’s τ is 0.4565137 and there is not a statistically significant correlation here either (p-value = 0.05155). Thus, there is no evidence to suggest that the discourse marker was established in English before it emerged in Norwegian, nor is there any significant increase to trigger borrowing of the phrase from English in any particular period. However, the most important evidence against a theory of borrowing in this case is that there are tokens in Norwegian that date much further back than in the previous case. The oldest example dates from 1747:

Meningen heri er ikke, at nogle skulde foregive, at Sneelinien paa et Sted skulde tage sin Begyndelse i en Afstand ad 9000de og andre grave paa same Sted hertil dertil 12000de Fod; men Tingen er, at paa et Sted begynder denne Linie i en Høide af 9000de og derimod paa et anded Sted først i en Afstand af 12000de Fod; (NB 1787 Fleischer, Esaias Forsøg til en natur-historie. 2 1: Forsøg til en Natur-Historie over Luften og de i og med Luften forefaldende og forbundne Tildragelser. København: Gyldendal)
… and others dig in the same place hitherto 12,000 feet; but the thing is that in one place this line starts at a height of 9000 …

Subsequent to this, the discourse marker occurs fairly regularly across the decades in Norwegian. Thus the data give no evidence of a time gap between the emergence in English and in Norwegian; rather, they suggest that the use emerged in parallel by the mid-18th century in both languages, the oldest attestations being from 1747–1748. This makes it a highly unlikely candidate for borrowing, especially since the massive influence of English on Norwegian lexis and phraseology is a post-WW2 phenomenon, though some influence on lexis is observable from the first half of the 19th century in specific domains such as maritime terminology (Stene 1945, Graedler 1998). Given the limited influence of English on Norwegian before 1900, it seems reasonable to conclude that the phraseme has not emerged in Norwegian as a result of borrowing from English, although there is a possibility that its use has been boosted by English usage in periods when it increases, such as from 1900 onwards (Figure 5). [4]
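The chronological prerequisite applied in both case studies, namely that the SL attestation must clearly precede the RL attestation, can be expressed as a simple check. The sketch below uses the first-attestation years cited in this section. The function names and the 20-year minimum gap are illustrative assumptions and not part of the method as described. Precedence is a necessary but not a sufficient condition for borrowing, so a positive result only keeps the hypothesis alive.

```python
def attestation_gap(sl_years, rl_years):
    """Years between the earliest SL record and the earliest RL record.

    A positive value means the source language (SL) is attested first.
    """
    return min(rl_years) - min(sl_years)


def precedence_holds(sl_years, rl_years, min_gap=20):
    """True if the SL predates the RL by at least min_gap years.

    min_gap is an assumed, adjustable threshold for illustration.
    """
    return attestation_gap(sl_years, rl_years) >= min_gap


# First-attestation years as cited in the text.
nutshell_en = [1822, 1852]   # OED, COHA
nutshell_no = [1922, 1934]   # National Library text archive, NAOB
thing_en = [1748]            # OED
thing_no = [1747]            # National Library, as dated in the text

print(attestation_gap(nutshell_en, nutshell_no))   # 100
print(precedence_holds(nutshell_en, nutshell_no))  # True
print(attestation_gap(thing_en, thing_no))         # -1
print(precedence_holds(thing_en, thing_no))        # False
```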


This article has investigated a set of formally and functionally equivalent phrasemes in English and Norwegian, and hypothesised that Norwegian usage is the result of borrowing from English due to language contact. It was shown that the idiomatic phraseme shit happens was unquestionably a case of borrowing from English to Norwegian that took place around 1990. In the case of the linking adverbial i et nøtteskall, the data drawn from dictionaries and corpora in both languages strongly support the borrowing hypothesis, since the English etymon, the conventionalised metaphor in a nutshell, was shown to be established well before the equivalent expression emerged in Norwegian around 1920. In the final case, the discourse marker tingen er at, the diachronic-contrastive corpus method led to the rejection of the borrowing hypothesis. This was concluded on the grounds that the discourse marker had existed in Norwegian for at least two centuries before the massive influx from English set in (mid-1900s), and there is no evidence to suggest that its emergence in English predates its emergence in Norwegian.

The approach taken here seems promising as a vehicle for empirically testing cases of functional and formal parallelism that are alleged to be contact-induced. Importantly, it must be acknowledged that the kind of structural parallelism that has been the focus of this article need not be a product of language contact; other explanations for the parallelism must – at least occasionally – be sought. It could be the result of more or less parallel developments triggered separately in the two languages, or the expressions could be part of a common Germanic phraseological heritage, although the use of such a native expression could at times be boosted by an English model, as might be the case for tingen er at. The empirical data pertaining to the discourse marker prompted the detection of what could be called ‘the Anglicism illusion’, on a par with Zwicky’s (2005a, 2005b) ‘recency illusion’ and ‘frequency illusion’. Just as language users tend to believe that something is new in the language because it is new to them, and just as they tend to believe that recently noticed phenomena happen ‘a whole lot’ (Zwicky 2005b), there is also – I argue – a tendency for speakers of languages with extensive borrowing from English to believe that ‘everything’ stems from English. In the case of tingen er at, this belief now stands corrected in light of corpus data. Whether this is also the case for other alleged English-induced phrasemes is for future research to uncover.


[1] Unless stated otherwise, the examples in this article are of English borrowings into Norwegian.

[2] See Sources for links to the language resources used in this study.

[3] https://sporbiblioteket.nb.no/faq/182202

[4] Carstensen (1979: 94) notes this third possibility for similar structures in German. I thank the reviewer for pointing this out.


Documentation dictionaries

English: Oxford English Dictionary Online (OED), http://www.oed.com/

Norwegian Bokmål: Det norske akademis ordbok (NAOB), https://www.naob.no/
Entry for i et nøtteskall: https://www.naob.no/ordbok/n%C3%B8tteskall
Entry for shit happens: https://www.naob.no/ordbok/shit_2

Norwegian Nynorsk: Norsk ordbok 2014 (NO), http://no2014.uib.no/
Entry for i et nøtteskall: http://no2014.uib.no/perl/ordbok/no2014.cgi?soek=nøtteskal
Entry for tingen er at: http://no2014.uib.no/perl/ordbok/no2014.cgi?soek=ting


English, contemporary: Corpus of Contemporary American English (COCA), https://www.english-corpora.org/coca/

English, diachronic: Corpus of Historical American English (COHA), https://www.english-corpora.org/coha/

Norwegian, contemporary: Norsk aviskorpus / The Norwegian Newspaper Corpus (NNC), http://clarino.uib.no/korpuskel/corpus-list

Norwegian, diachronic: Nasjonalbiblioteket / The National Library, https://www.nb.no/search


Andersen, Gisle. 2010. “A contrastive approach to vague nouns”. New Approaches to Hedging, ed. by Gunther Kaltenboeck, 35–48. Bingley: Emerald.

Andersen, Gisle. 2014. “Pragmatic borrowing”. Journal of Pragmatics 67: 17–33.

Andersen, Gisle. 2016. “Speaking of X: Discourse topic markers from a variationist perspective”. Paper presented at the 37th Annual Conference of the International Computer Archive for Modern and Medieval English, The Chinese University of Hong Kong, 25–29 May 2016.

Andersen, Gisle. 2017. “A corpus study of pragmatic adaptation: The case of the Anglicism [jobb] in Norwegian”. Journal of Pragmatics 113: 127–143.

Andersen, Gisle & Knut Hofland. 2012. “Building a large monitor corpus based on newspapers on the web”. Exploring Newspaper Language – Using the Web to Create and Investigate a Large Corpus of Modern Norwegian, ed. by Gisle Andersen, 1–28. Amsterdam: John Benjamins.

Backus, Ad. 2014. “Towards a usage-based account of language change: Implications of contact linguistics for linguistic theory”. Questioning Language Contact: Limits of Contact, Contact at its Limits, ed. by Robert Nicolaï, 91–118. Leiden: Brill.

Burger, Harald. 1998. Phraseologie. Eine Einführung am Beispiel des Deutschen. Berlin: Erich Schmidt.

Carstensen, Broder. 1979. “Evidente und latente Einflüsse des Englischen auf das Deutsche”. Fremdwort-Diskussion, ed. by Peter Braun, 90–94. Munich: Fink.

Cowie, Anthony P. 1994. “Phraseology”. The Encyclopedia of Language and Linguistics, ed. by Robert E. Asher, 3168–3171. Oxford: Oxford University Press.

Davies, Mark. 2009. “The 385+ million word Corpus of Contemporary American English (1990–2008+): Design, architecture, and linguistic insights”. International Journal of Corpus Linguistics 14: 159–190.

Fiedler, Sabine. 2012. “Der Elephant im Raum ... The influence of English on German phraseology”. The Anglicization of European Lexis, ed. by Cristiano Furiassi, Virginia Pulcini & Félix Rodríguez González, 239–259. Amsterdam: John Benjamins.

Fiedler, Sabine. 2014. “Gläserne Decke” und “Elefant im Raum” – Phraseologische Anglizismen im Deutschen. Berlin: Logos.

Fiedler, Sabine. 2017. “Phraseological borrowing from English into German: Cultural and pragmatic implications”. Journal of Pragmatics 113: 89–102.

Fraser, Bruce. 1996. “Pragmatic markers”. Pragmatics 6(2): 167–190.

Gottlieb, Henrik. 2004. “Danish echoes of English”. Nordic Journal of English Studies 3(2): 39–65.

Graedler, Anne-Line. 1998. Morphological, Semantic and Functional Aspects of English Lexical Borrowings in Norwegian. Oslo: Universitetsforlaget.

Graedler, Anne-Line & Stig Johansson. 1997. Anglisismeordboka: Engelske Lånord i Norsk. Oslo: Universitetsforlaget.

Granger, Sylviane. 2009. “Comment on: Learner Corpora: A Window onto the L2 Phrasicon”. Researching Collocations in Another Language. Multiple Interpretations, ed. by Andy Barfield & Henrik Gyllstad, 60–65. Houndmills: Palgrave Macmillan.

Granger, Sylviane & Magali Paquot. 2008. “Disentangling the phraseological web”. Phraseology: An Interdisciplinary Perspective, ed. by Sylviane Granger & Fanny Meunier, 27–50. Amsterdam: John Benjamins.

Gries, Stefan Th. 2008. “Phraseology and linguistic theory: A brief survey”. Phraseology: An Interdisciplinary Perspective, ed. by Sylviane Granger & Fanny Meunier, 3–26. Amsterdam: John Benjamins.

Haugen, Einar. 1950. “The analysis of linguistic borrowing”. Language 26: 210–231.

Hilpert, Martin & Stefan Th. Gries. 2009. “Assessing frequency changes in multistage diachronic corpora: Applications for historical corpus linguistics and the study of language acquisition”. Literary and Linguistic Computing 24(4): 385–401. http://www.stgries.info/research/2009_MH-STG_AssessingFrequencyChanges_LitLingComp.pdf

Mel’čuk, Igor. 1995. “Phrasemes in language and phraseology in linguistics”. Idioms: Structural and Psychological Perspectives, ed. by Martin Everaert, Erik-Jan Van der Linden, Andre Schenk & Rob Schreuder, 167–232. Hillsdale: Lawrence Erlbaum Associates.

Mel’čuk, Igor. 1998. “Collocations and lexical functions”. Phraseology. Theory, Analysis, and Applications, ed. by Anthony P. Cowie, 23–53. Oxford: Oxford University Press.

Mel’čuk, Igor. 2012. “Phraseology in the language, in the dictionary, and in the computer”. Yearbook of Phraseology 3(1): 31–56.

Onysko, Alexander & Esme Winter-Froemel. 2011. “Necessary loans – luxury loans? Exploring the pragmatic dimension of borrowing”. Journal of Pragmatics 43: 1550–1567.

Stene, Aasta. 1945. English Loan-words in Modern Norwegian: A Study of Linguistic Borrowing in the Process. Oxford & London: Oxford University Press.

Terkourafi, Marina. 2011. “Thank You, Sorry and Please in Cypriot Greek: What happens to politeness markers when they are borrowed across languages?”. Journal of Pragmatics 43: 218–235.

Weinreich, Uriel. 1953. Languages in Contact: Findings and Problems. New York: Linguistic Circle of New York.

Zenner, Eline, Dirk Speelman & Dirk Geeraerts. 2013. “What makes a catchphrase catchy? Possible determinants in the borrowability of English catchphrases in Dutch”. New Perspectives on Lexical Borrowing: Onomasiological and Phraseological Innovations, ed. by Eline Zenner & Gitte Kristiansen, 41–64. Boston: De Gruyter.

Zwicky, Arnold. 2005a. “Just between Dr. Language and I”. Language Log. 7 August 2005. http://itre.cis.upenn.edu/~myl/languagelog/archives/002386.html

Zwicky, Arnold. 2005b. “More illusions”. Language Log. 17 August 2005. http://itre.cis.upenn.edu/~myl/languagelog/archives/002407.html


| Phrasemic category | Borrowing ENG→NOR | Example | Translation | Freq. (NNC) |
|---|---|---|---|---|
| Lexical collocations | føl deg fri til å ‘feel free to’ | «Føl deg fri til å tegne!» var søndagens motto i Oslo (NNC AP 2015-01-11) | “Feel free to draw!” was Sunday’s motto in Oslo | |
| Idioms | å gå den ekstra milen ‘to go the extra mile’ | Vi ønsker å gå den ekstra milen for våre gjester. (NNC AP 2007-03-29) | We wish to go that extra mile for our guests. | |
| Irreversible bi-/trinominals | bed and breakfast | Stedet er egentlig bed and breakfast, men er du heldig, lager vertinnen Annette et måltid mat til deg. (NNC DB 2014-08-23) | The place is really bed and breakfast, but if you’re lucky, the hostess Annette will cook you a meal. | |
| Similes | sprek som ei fele ‘fit as a fiddle’* | n.a. | | - |
| Compounds | kirsebærplukking ‘cherrypicking’ | - Regjeringen kan ikke drive kirsebærplukking fra programmet til Venstre. (NNC AP 2014-06-17) | The government cannot cherrypick from Venstre’s (political party) program. | |
| Phrasal verbs | frike ut ‘freak out’ | - Jeg håper de ikke friker helt ut, men det kommer sikkert noe kreativt fra dem. (NNC AA 2004-09-21) | I hope they don’t freak out totally, but they will probably come up with something creative. | |
| Grammatical collocations | X to go | Tilsett en dobbel cortado to go, og du har en knallstart på dagen (NNC DA 2007-06-28) | Add a cortado to go and you have a great start of the day | |
| Complex prepositions | n.a. | n.a. | | - |
| Complex conjunctions | n.a. | n.a. | | - |
| Linking adverbials | by the way | 30. desember! Som by the way er et håpløst tidspunkt. (NNC AA 2005-12-30) | 30 December! Which, by the way, is a ridiculous date. | |
| Textual sentence stems | når det kommer til ‘when it comes to’ | Og når det kommer til etnisk rensing, har ikke Tudjman stått tilbake for Milosevic. (NNC AP 1999-04-25) | And when it comes to ethnic cleansing, Tudjman has not been any less persistent than Milosevic. | |
| Speech act formulae | thank you, see you, love you | - OK, gutta! Thank you!! brølte treneren. (NNC VG 2001-10-14) | OK guys! Thank you! the trainer shouted. | |
| Attitudinal formulae | get over it | - Man får ikke snudd utviklingen, så «get over it», og skriv en ny og bedre hit i stedet, sier Ina Wroldsen. (NNC AP 2015-03-26) | You can’t stop this development, so get over it and write a new and better hit instead, says Ina Wroldsen. | |
| Proverbs and proverb fragments | When in Rome … | Men vi ønsker ikke å fordømme han: Det er vel sånn at "when in Rome, do as the Romans". (NNC DB 1998-11-25) | But we do not wish to condemn him: I guess it is a case of “When in Rome, do as the Romans”. | |
| Commonplaces | YOLO | Noen har gått ut på den utrygge isen på Nordåsvatnet for å gi en beskjed. «YOLO»: «Du lever bare én gang» har noen skrevet på Nordåsvatnet. (NNC BT 2013-01-24) | Someone has gone out on the thin ice on Lake Nordås to give a message. “YOLO”: “You only live once”, somebody has written on Lake Nordås. | |
| Slogans | Coke is it | Coke is it. Cola-reklamene var alltid de gjeveste. Som 12-åring elsket jeg dem dypt og inderlig. (NNC DA 2014-02-15) | Coke is it. The Coca Cola ads were always the best. As a 12-year-old I loved them dearly. | |
| Idiomatic sentences | What’s in it for me? | Fokuset på egen økonomi, eller "what's in it for me"-tankegangen, legger ikke noe godt grunnlag for verdiskaping (NNC DN 2006-11-16) | The focus on own economy, or the “what’s in it for me”-line of thinking does not lay a good foundation for value creation | |
| Quotations | We shall never surrender. | Konsertintroen med Winston Churchills avsluttende ord "we shall never surrender" har åpenbart mer enn en patriotisk valør. (NNC DB 2008-07-23) | The concert intro with Winston Churchill’s closing words “we shall never surrender” obviously has more than a patriotic value. | |

