2019 Meier FormulaicLanguage
net/publication/333132412
Formulaic language and text routines in football live text commentaries and
match reports. A cross- and corpus-linguistic approach
1 author:
Simon Meier-Vieracker
Technische Universität Dresden
All content following this page was uploaded by Simon Meier-Vieracker on 16 May 2019.
Abstract: Since writers of football live text commentaries and online match reports have to write under
high time pressure, they make extensive use of formulaic sequences. Still, they have to stage the games as
emotional and emotionalizing events and therefore have to avoid the impression of being routinized.
Based on large corpora of German and English data, the present article makes use of data-driven methods
to investigate the writers’ strategies to meet this challenge. Recurrent syntactic patterns serve as templates
for describing recurrent events and can be enriched by a notably large set of mostly expressive lexical
items. Moreover, idioms are frequently used for giving summarizing and evaluating accounts of the
games. Beyond the cross-linguistic differences with regard to the lexis and some syntactic details, the
linguistic routines used by the writers are rather similar in both languages.
1 Introduction
Authors of football live text commentaries and online match reports must, first and
foremost, be quick. In live text commentaries, the entries need to be published ‘minute
by minute’, and even match reports must go online shortly after or even at the final
whistle. As the text producers have to describe these open-ended games under high time
pressure, they have to make extensive use of sequences of formulaic language that may
be activated as prefabricated patterns (Wray & Perkins 2000:1) and thus relieve them of
encoding efforts. However, they also need to display emotional involvement in order to
deliver appealing and entertaining narratives and to create suspense (Kern 2014), and
must therefore avoid the impression of acting merely on a routine basis.
In the present paper, I will make use of data-driven, corpus linguistic methods to inves-
tigate the writers’ strategies to meet this challenge of reconciling linguistic routines on
the one hand and the task of emotionalization on the other. Based on large corpora of
German and English data (approx. 13 million tokens), I will focus on two types of formulaicity.
First, I will show how writers make use of recurrent schematic constructions
(Croft 2001:25) that are filled with an extraordinarily rich set of mostly expressive synonyms
of, say, motion verbs. Second, I will discuss the role of idioms that allow for a
routinized, yet vivid and community-building narration of sports events. The results of
my corpus analyses will provide evidence for what Sinclair (1991:109f.) has described as
the complementarity of the open-choice principle on the one hand and the idiom principle
on the other (Erman/Warren 2000). The production of texts oscillates between
word-for-word combinations and preconstructed patterns, yet in a register-specific way
that is tied to the communicative and social needs in the domain of sports coverage.
1
Following Giulianotti’s (2002) taxonomy of spectator identities in football, the term ‘fans’ refers to persons
with a ‘hot’, that is, affective, relationship to a team, a player or the like. But as opposed to the ‘supporters’,
whose emotional investment typically resists the commodification of sports (Merkel 2012), the
fans’ “identification with the club and its players is […] authenticated most readily through the consumption
of related products” (Giulianotti 2002:36) like fan articles and the whole variety of media services.
where, for example, military terms like marching orders are not to be taken literally.
Finally, by their register-marking functions, formulaic sequences may serve as a shibbo-
leth for asserting and reaffirming group identity (Wray & Perkins 2000:14), which in
the context of this paper is the collective identity of the football fans. Formulaic se-
quences are expected by the followers of football and thus “both serve to include those
who are familiar with the phraseology and to exclude those who are not” (Levin
2008:146).
Popular accounts of formulaic language in football coverage usually focus on set
phrases, idiomatic expressions and lexicalized metaphors. From a linguistic perspective,
however, less salient routines like recurrent syntactic patterns are of interest too. Following
Feilke (2010), I call them ‘text routines’; they can be defined as domain- and
register-specific procedures of writing. They can take the shape of (more or less lexicalized)
grammatical constructions or of recurring means of structuring texts.2 Like formulaic
sequences in the narrower sense, text routines also serve as convenient and register-marking
solutions for the task of writing under time pressure. Moreover, as they normally
have empty slots that can be flexibly filled with expressive lexical or phraseological
items (Croft 2001:25; Erman/Warren 2000:34), they can counterbalance the stereotypicality
and reconcile it with the task of staging the football game as a uniquely emotional
and emotionalizing event. Only on the basis of routines can the “dramatic embellishment”
(Bryant, Comisky & Zillmann 1977:140) of the commentator’s narrative be
achieved, and “the background of habitualized activity opens up a foreground for deliberation
and innovation” (Berger & Luckmann 1966:71). As I will show below, corpus
linguistic methods are particularly suitable for demonstrating how this reconciliation works
within the register of online football coverage.
2
Good examples include patterns used in weather reports like with occasional X_time_of_day Y_weather_phenomenon or text struc-
(Schmid 1994)4 and then uploaded to the web-based tool CQPweb (Hardie 2012),
which allows flexible queries of the annotated data and various statistical analyses. Eve-
ry text is enriched with metadata including a URL link to the original text in its multi-
modal appearance.5
                           No. of texts   No. of tokens
Kicker LTC (2006–16)              3,058       4,055,353
Kicker MR (2006–16)               3,057       2,145,189
Sportsmole LTC (2012–17)          1,530       5,810,800
Sportsmole MR (2012–17)           1,727         719,155
total                             9,327      12,730,497
Table 1: Corpora6
As reference corpora, I am using random sets of 100,000 sentences each from English
and German online news of 2015 taken from the Leipzig Corpora Collection
(Goldhahn, Eckart & Quasthoff 2012). By contrasting the football data with these thematically
non-specific corpora, i.e. by keyword analysis (Bondi 2010)7, I will determine
some lexical items as a starting point for further analyses, including ngram analysis
and collocation analysis, in order to detect (partly) schematic constructions which
are not fully lexically specified, and their highly variable filler items (section 4.1). Furthermore,
I will make use of the Ngram Statistics Package (Banerjee & Pedersen 2003),
which allows users to detect statistically significant ngrams in a corpus without any lexical
specification, based on association measures alone. This method will help to find idiomatic
expressions and to show their pervasiveness in the corpus (section 5).
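The keyword step described above can be illustrated with a small sketch. The text does not state which keyness statistic was used, so the log-likelihood measure below is an assumption, as are the function name and the toy frequencies.

```python
import math

def log_likelihood_keyness(freq_target, size_target, freq_ref, size_ref):
    """Log-likelihood (G2) of a word's frequency in a target vs. a reference corpus."""
    total = freq_target + freq_ref
    # Expected frequencies if the word were equally common in both corpora.
    exp_target = size_target * total / (size_target + size_ref)
    exp_ref = size_ref * total / (size_target + size_ref)
    g2 = 0.0
    for observed, expected in ((freq_target, exp_target), (freq_ref, exp_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2

# Hypothetical counts: 900 hits in a 1M-token football corpus
# versus 50 hits in a 2M-token reference corpus.
score = log_likelihood_keyness(900, 1_000_000, 50, 2_000_000)
print(round(score, 1))
```

The score itself is unsigned; whether a word is unusually frequent or unusually infrequent in the target corpus has to be read off the relative frequencies.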
4
Legends for the part-of-speech tags set by the TreeTagger can be found at http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/.
Note that the German and English tagsets differ.
5
In this paper, I will not go into the topic of multimodality any further. See Werner (this volume).
6
The number of texts does not exactly match the number of games, because some links on both websites
are misdirected. Also note that on sportsmole.co.uk reports on selected FA Cup and Champions
League games with the participation of Premier League clubs are tagged as “Premier League”, too. However,
the vast majority of the texts are about Premier League games.
7
“In a quantitative perspective, keywords are those whose frequency (or infrequency) in a text or a cor-
pus is statistically significant, when compared to the standards set by a reference corpus” (Bondi 2010:3).
Kicker: MR        nach [after], Minute [minute], Ball [ball], Tor [goal], gegen [against], Meter [meter], Partie [match], Strafraum [box], links [left], Gast [guest], Chance, Führung [lead], erst [first], Spiel [game], rechts [right]
Sportsmole: LTC   ball, goal, League, but, side, chance, match, minute, game, Premier, box, half, corner, win, shot, season
Sportsmole: MR    minute, goal, ball, League, Premier, the, half, side, effort, wide, corner, shot, chance, when, box
8
Note that live text commentaries include automatized entries (or parts of them) indicating the begin-
nings and ends of the games, substitutions, yellow/red cards etc. – even more so in the German data than
in the English ones.
9
The sigla of the corpus documents are structured as follows:
{source}_{competition+season}_{texttype}_{ID}. My translations are rather literal and seek to preserve
as much of the German syntax and semantics as possible, even though they are sometimes at odds with a
canonical English translation.
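As a minimal sketch, a siglum of this shape can be split into its components as follows. The class and function names are hypothetical, and the assumption that the competition code consists of capital letters followed by a four-digit season is inferred only from sigla such as k_BL0809_spb_652 and spm_PL1617_lt_1418.

```python
import re
from typing import NamedTuple

class Siglum(NamedTuple):
    source: str
    competition: str
    season: str
    texttype: str
    doc_id: int

# Assumed layout: capital letters for the competition, four digits for the season.
SIGLUM_RE = re.compile(r"^([a-z]+)_([A-Z]+)(\d{4})_([a-z]+)_(\d+)$")

def parse_siglum(siglum: str) -> Siglum:
    """Split a siglum like 'k_BL0809_spb_652' into its parts."""
    match = SIGLUM_RE.match(siglum)
    if match is None:
        raise ValueError(f"unexpected siglum format: {siglum!r}")
    source, competition, season, texttype, doc_id = match.groups()
    return Siglum(source, competition, season, texttype, int(doc_id))

print(parse_siglum("k_BL0809_spb_652"))
```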
aus linker/rechter/… Position [from the left/right/… side]   1015 / 539   1488 / 656 (366.9 / 305.8)
aus spitzem Winkel [from a tight angle]                      1120 / 490   1464 / 545 (361.0 / 254.1)
aus kurzer/langer/… Distanz [from close/long/… range]        866 / 835    1057 / 989 (260.6 / 461.0)
aus vollem Lauf [at full speed]                              192 / 45     202 / 45 (50.6 / 21.0)
Table 3: Most frequent adpositional phrases of the type aus _ADJA _NN (Kicker corpus)
While in LTCs the time of every entry is displayed as a piece of metadata (e.g. in a
separate column or highlighted in bold and colour), in MRs the time specifications have
to be verbalized. Aside from constructions with the key lemma Minute like in der x.ten
Minute [in the xth minute] (3,117 hits, cf. Levin 2008 for English), constructions with the
temporal preposition nach [after], which is the most significant key lemma, are also
highly frequent. On the one hand, they are used to indicate time periods relative to the
basic time structure of football games (nach der Pause [after the break] etc.). On the
other hand, adpositional phrases of the type nach (_ART) _ADJA _NN (3,451 hits) allow
for very condensed but vivid descriptions of what happened right before a player’s
move (4–5).
(4) Nach schöner Vorarbeit von Matmour traf Neuville erneut nur den Pfosten.
(k_BL0809_spb_652)
After a beautiful assist of Matmour, Neuville, again, only hit the post.
(5) Rudnevs kam nach Jansens Flanke knapp zu spät (9.), Skjelbred traf den Ball nach
toller Kombination im Zentrum nicht voll (11.). (k_BL1213_spb_1952)
Rudnevs was a little late after Jansen’s cross (9.), Skjelbred didn’t quite hit the ball af-
ter a great combination in the centre (11.).
With constructions of this type a whole series of preparatory moves (among the most
frequent nouns we find Vorarbeit [assist], Zuspiel [pass], Kombination [combination]
and Solo [solo run]) can be expressed in just three words. Beyond a merely temporal
determination, the preposition is given a causal reading according to the implicature-
based principle post hoc, ergo propter hoc (Pinto 1995) and is therefore particularly
suitable for giving an account of the emergence (Entstehung) of a shot or the like.
Moreover, the adjective slot allows for an evaluative description of the scene. The most
frequent adjectives with primarily evaluative meaning are the following:
gut [good], schön [beautiful], toll [great], fein [fine], stark [strong], gelungen [successful],
schwach [weak], sehenswert [worth seeing], klasse [great], glänzend [brilliant], schlimm
[bad], klug [clever], katastrophal [disastrous], perfekt [perfect], präzise [precise]
In the noun slot, we can often find compounds with proper names of players or teams as
in (6), making the description even more concise.
(6) Diouf lief nach feinem Pinto-Pass alleine auf das Tor zu, wurde dann aber noch von
Russ eingeholt, der klären konnte (58.). (k_BL1112_spb_1796)
After a fine Pinto pass, Diouf ran towards the goal alone, but Russ caught up and
was able to clear (58.)
Thus, the syntactic pattern nach _ADJA _NN provides a useful and flexible template for
MRs that helps the writers to re-narrate and evaluate the temporal unfolding of the game
and at the same time to provide causal explanations of what was going on.
The prevalence of evaluative means, as was shown for the pattern nach _ADJA _NN,
also holds for the descriptions of shots with the noun Ball, which is the most significant
key lemma for the LTC subcorpus. In this subcorpus, the most frequent POS
trigram is VVFIN ART Ball (8,843 hits); when vernacular synonyms for Ball are added to the
query, as many as 16,050 hits are returned.10 In most cases the trigram is extended to the right
by an adpositional phrase like ins Netz [into the net] (8,082 hits) to indicate the path of
the ball or by an adverbial adjective (2,179 hits).11 A count of the verbs used to instantiate
this pattern reveals many expressive jargon words. Some of them specify the
manner of shooting, e.g. by describing the shooting technique (löffeln [to spoon], spitzeln
[to poke], zirkeln [to curl]). But especially for powerful shots there seems to be a
rich variety of synonyms (in descending frequency):
treiben, jagen, hämmern, dreschen, knallen, hauen, nageln, donnern, feuern, zimmern,
schweißen, bolzen, kloppen, ballern, prügeln, semmeln
Since the core meaning of all these verbs is something like ‘to shoot powerfully’, differ-
ences of their denotational semantics can hardly be determined. Some of them seem to
be metaphorical loans from the source domain of craft (hämmern [to hammer], nageln
[to nail], dreschen [to thresh], schweißen [to weld] etc.), but the very specifics of these
actions are not projected on the target domain of shooting except the feature of power-
fulness. The main function of this richness of synonyms is likely to be the diversifica-
tion of descriptions of the ever-repeating act of shooting to keep them vivid and sus-
penseful.12 This may be further intensified by adverbial adjectives. Some of them char-
acterize the manner of shooting with respect to its motion-related aspects like flach
[flat] or quer [across], but the majority are more expressive than descriptive (in de-
scending frequency):
stark [strongly], perfekt [perfectly], sehenswert [worth seeing], schön [beautifully], gefühl-
voll [with feeling], elegant [elegantly], souverän [confidently], humorlos [humourlessly],
wunderbar [wonderfully], mühelos [effortlessly], trocken [dryly], artistisch [artistically],
…
Rather than specifying, for example, the path of the ball these adjectives express the
writer’s emotional attitude towards the effort in the first place. Again, the pattern
10
In German, a rich variety of (metaphorical and metonymical) synonyms is used to denote the ball (in
descending frequency): Leder [leather], Kugel [bullet], Spielgerät [game equipment], Rund [round],
Sportgerät [sports equipment], Pille [pill], Kirsche [cherry], Ei [egg], Murmel [marble].
11
In German, adjectives that are used as adverbs are not marked by a suffix as in English -ly and there-
fore not formally distinguishable from predicate adjectives. For this reason, the German equivalents to
e.g. strongly are classified and tagged as ‘adverbial adjectives’ (_ADJD) rather than just adverbs.
12
If at all, a distributional semantic specification of these verbs based on combinatorial preferences seems
feasible (Dalmas et al. 2015). As a tendency, vernacular terms for the ball go together with metaphorical
verbs (e.g. drischt das Leder [thrashes the leather]), whereas the unmarked Term Ball is combined with
neutral verbs like schießen [to shoot] or non-agentive verbs like landen [to land]).
_VVFIN den Ball _ADJD _APPR provides a useful and flexible template for describing
the repetitive event of shooting in ever new and appealing ways.
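The count of verb fillers for this kind of pattern can be sketched as follows. This is only an illustration over a handful of invented, hand-tagged sentences, not the CQPweb query used for the figures above; the function name and the small set of ball nouns are assumptions.

```python
from collections import Counter

# Illustrative subset of the ball synonyms; the full query list is not given here.
BALL_NOUNS = {"Ball", "Leder", "Kugel"}

def verb_fillers(tagged_tokens):
    """Count verb lemmas in trigrams of finite verb + article + ball noun."""
    counts = Counter()
    trigrams = zip(tagged_tokens, tagged_tokens[1:], tagged_tokens[2:])
    for (_, pos1, lemma1), (_, pos2, _), (_, _, lemma3) in trigrams:
        if pos1 == "VVFIN" and pos2 == "ART" and lemma3 in BALL_NOUNS:
            counts[lemma1] += 1
    return counts

# Invented sentences as (word, POS tag, lemma) triples.
tokens = [
    ("Müller", "NE", "Müller"), ("hämmert", "VVFIN", "hämmern"),
    ("den", "ART", "die"), ("Ball", "NN", "Ball"),
    ("ins", "APPRART", "in"), ("Netz", "NN", "Netz"), (".", "$.", "."),
    ("Reus", "NE", "Reus"), ("zirkelt", "VVFIN", "zirkeln"),
    ("das", "ART", "die"), ("Leder", "NN", "Leder"),
    ("über", "APPR", "über"), ("die", "ART", "die"),
    ("Mauer", "NN", "Mauer"), (".", "$.", "."),
    ("Kruse", "NE", "Kruse"), ("hämmert", "VVFIN", "hämmern"),
    ("die", "ART", "die"), ("Kugel", "NN", "Kugel"),
    ("an", "APPR", "an"), ("die", "ART", "die"),
    ("Latte", "NN", "Latte"), (".", "$.", "."),
]
fillers = verb_fillers(tokens)
print(fillers.most_common())
```

Matching on lemmas rather than word forms groups inflected variants such as hämmert and hämmerte under one filler type.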
Lastly, the adversative conjunction aber [but] (30,984 hits) is a highly significant key
lemma for the LTC subcorpus. Adversative constructions with aber or other conjunc-
tions like doch are used recurrently to describe wasted chances or other misses, and
since in football failed shots are much more frequent than successful ones, many fixed
phrases, some of them with metaphorical meanings, have emerged. They can be detect-
ed by a collocation analysis (with a collocation window of 7 to the right). In the follow-
ing examples (7–9) the significant collocates are italicized, while all of the adversative
phrases are typical, but rather flexible instantiations:
(7) Der Flügelangreifer feuert das Leder scharf nach innen, findet aber keinen Abneh-
mer. (k_BL1516_lt_2921)
The winger fires a sharp cross into the centre, but does not find any buyers.
(8) Der Bremer Angriffs-Allrounder bleibt aber an der Berliner Abwehr hängen.
(k_BL1516_lt_2919)
But the Bremen allrounder gets stuck in Berlin’s defence.
(9) Den Abpraller krallt sich Holtby, wird aber wegen Abseits zurückgepfiffen.
(k_BL1415_lt_2500)
Holtby grabs the rebound, but he is whistled back for offside.
Of course, the miss can be framed as an intervention from the defender’s perspective
(10–11).
(10) Gute Flanke von Chandler, aber Tah klärt. (k_BL1516_lt_3015)
Good cross by Chandler, but Tah clears.
(11) Dost kommt im Strafraum an den Ball, aber Wiedwald hat gut aufgepasst.
(k_BL1516_lt_3017)
Dost gets to the ball in the penalty area, but Wiedwald has been very attentive.
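The collocation window used here can be made concrete with a short sketch that collects everything occurring up to seven tokens to the right of aber within a sentence. It only produces raw counts; ranking collocates by significance would additionally require an association measure, which is not specified above. The function name and the whitespace tokenization are assumptions.

```python
from collections import Counter

def right_collocates(sentences, node, window=7):
    """Count tokens occurring up to `window` positions to the right of `node`."""
    counts = Counter()
    for tokens in sentences:
        lowered = [t.lower() for t in tokens]
        for i, tok in enumerate(lowered):
            if tok == node:
                counts.update(lowered[i + 1 : i + 1 + window])
    return counts

# Examples (7) and (10), tokenized on whitespace.
sentences = [
    "Der Flügelangreifer feuert das Leder scharf nach innen , findet aber keinen Abnehmer .".split(),
    "Gute Flanke von Chandler , aber Tah klärt .".split(),
]
collocates = right_collocates(sentences, "aber")
print(collocates.most_common(3))
```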
To sum up so far, the described sets of more or less fixed syntactic patterns already pro-
vide the basic means to describe a prototypical scene of a football game. In combina-
tion, they form a three-step text routine available to the writers both of LTCs and MRs
containing 1) the description of some preparatory steps, e.g. a one-on-one or an assist,
2) the effort or shot itself and 3) the miss or the intervention. Most schematically, this
routine can be represented as follows:
[nach …]PREPARATION/ASSIST […]EFFORT aber/doch […]MISS/INTERVENTION
(12–13) are two examples from the corpus which put the players’ effort into words in
exactly the way described above, that is with an adjective and an adpositional phrase
indicating the path of the ball:13
(12) [Nach einem Eckball von der linken Seite]ASSIST [verlängert Bender den Ball am kurzen
Pfosten geschickt mit dem Kopf aufs Tor]EFFORT doch [Zieler passt auf und hält knapp vor
13
More examples can be found with the following query: "nach"%c [pos="ART"]? [pos="ADJA"]?
[pos="NN"] [pos!="\$.*"]* "," "aber|.*doch"
[After a corner from the left]ASSIST [Bender skilfully extends the ball towards the goal at
the near post with his head]EFFORT but [Zieler is attentive and saves just in front of the
(13) [Erst nach einer Eckballhereingabe von Hajnal]ASSIST [köpfte Eggimann gefährlich aufs
[Only after a corner kick from Hajnal]ASSIST [Eggimann headed dangerously on goal]EFFORT
This schema forms the basic linguistic structure of a most common and register-typical
text routine. As a rather abstract schema, it can be adapted easily and variably
enriched with a notably large set of expressive and vivid filler items. Although the single
instantiations seem to be unique descriptions of unique scenes, a corpus linguistic
analysis shows their extremely schematic nature, “an underlying rigidity of phraseology,
despite a rich superficial variation” (Sinclair 1991:121), and this very interaction of
idiomaticity and open choice is constitutive of the register of football coverage.
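To make the schematic nature of the routine tangible, the following sketch checks whether a POS-tagged sentence contains the sequence nach + optional article + optional attributive adjective + noun, followed later by a comma and aber or a word ending in doch. It is a rough Python approximation of a corpus query of this kind, not the CQPweb query itself; the function name is hypothetical, and the first test sentence is invented.

```python
def matches_routine(tagged):
    """tagged: list of (word, STTS tag). True if the three-step routine is found."""
    n = len(tagged)
    for i, (word, _) in enumerate(tagged):
        if word.lower() != "nach":
            continue
        j = i + 1
        if j < n and tagged[j][1] == "ART":   # optional article
            j += 1
        if j < n and tagged[j][1] == "ADJA":  # optional attributive adjective
            j += 1
        if j >= n or tagged[j][1] != "NN":    # obligatory noun
            continue
        j += 1
        # Skip non-punctuation tokens (STTS punctuation tags start with "$").
        while j < n and not tagged[j][1].startswith("$"):
            j += 1
        if j + 1 < n and tagged[j][0] == ",":
            nxt = tagged[j + 1][0].lower()
            if nxt == "aber" or nxt.endswith("doch"):
                return True
    return False

# Invented sentence following the schema.
hit = [("Nach", "APPR"), ("schöner", "ADJA"), ("Flanke", "NN"), ("von", "APPR"),
       ("Müller", "NE"), ("köpft", "VVFIN"), ("Meier", "NE"), ("aufs", "APPRART"),
       ("Tor", "NN"), (",", "$,"), ("aber", "KON"), ("Neuer", "NE"),
       ("hält", "VVFIN"), (".", "$.")]
# Example (4), which has the nach phrase but no adversative continuation.
miss = [("Nach", "APPR"), ("schöner", "ADJA"), ("Vorarbeit", "NN"), ("von", "APPR"),
        ("Matmour", "NE"), ("traf", "VVFIN"), ("Neuville", "NE"), ("erneut", "ADV"),
        ("nur", "ADV"), ("den", "ART"), ("Pfosten", "NN"), (".", "$.")]
print(matches_routine(hit), matches_routine(miss))
```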
Many of the findings presented above also hold for the English data. For example, spatial
descriptions with the preposition from (like from (about/around) CD yards (2,904
hits) or from the edge of the area/box (1,706 hits)) can be found frequently. Again, the
pattern from (a) _JJ _NN is suitable for both genres, even if the frequencies in Table 4 show
a clear dominance in the MRs, where exact details can be omitted:
                                     No. of texts   Absolute and relative (pmw)
                                     (LTC/MR)       no. of tokens (LTC/MR)
from close/long/point-blank range    636 / 536      974 / 662 (167.6 / 920.52)
from a tight/narrow/acute angle      384 / 156      460 / 162 (79.2 / 225.3)
from a good/wide/central position    118 / 48       126 / 49 (21.7 / 68.1)
Table 4: Most frequent adpositional phrases of the type from (a) _JJ _NN (Sportsmole corpus)
In the LTC subcorpus, a highly frequent part-of-speech trigram with the key lemma ball
is VVZ DT ball (5,276 hits). It is mostly followed by a preposition (2,551 hits) (14) or
an adverb (1,335 hits) (15), both of which are mainly used to specify the path of the
ball.
(14) Gerrard moves forward and whips the ball into the box. (spm_PL1415_lt_978)
(15) Welbeck breaks away down the left and pulls the ball back for Bellerin.
(spm_PL1516_lt_1282)
As in the German data, a count of the filler items for the verb slot reveals a rich variety
of synonyms. The most frequent agentive verbs within this pattern, gives, plays and
sends (the ball), are rather neutral and unspecific, but there are many others which characterize
the manner of shooting like heads, lifts, pokes, curls (the ball). And again, for
denoting powerful shots the range of available verbs seems to be particularly wide:
flicks, whips, knocks, fires, smashes, lashes, strikes, punches, blasts, hammers, fizzes, slams,
blazes, flashes, powers, pumps, whacks, cannons, smacks, thunders
In the LTC corpus, the adversative conjunction but is also a highly significant key lemma
(53,448 hits, 9,198 pmw). A collocation analysis with the same settings as above, that is,
with a window of 7 to the right, points to typical ways of describing misses or, from the
defender’s perspective, interventions, as in the following examples (16–19) (significant
collocates, which all rank among the first 20 positions in the collocation list, are italicized):
(16) West Ham continue to threaten as Payet delivers a brilliant free kick onto the head of
Collins, but his effort is wide of the post. (spm_PL1617_lt_1418)
(17) Almost a chance for Liverpool from a corner as Mane finds a bit of space, but can’t
get enough on his header at the front post. (spm_PL1617_lt_1502)
(18) Janmaat clips the ball into the box, but it is far too close to Cech and he is able to
collect. (spm_PL1617_lt_1525)
(19) De Bruyne and Sterling link up before the England international delivers in a cross,
but Watford clear the danger. (spm_PL1617_lt_1479)
Counterparts of the patterns with nach can also be found in the English data. A query
for the pattern after _DT _JJ _NN returns 956 (LTC) and 254 (MR) hits, and again it is
used to indicate the overall time structure of the game (after the half-hour mark), but
also the preparatory steps of specific scenes (after a great/clever/fabulous
run/pass/cross).
In the English data the temporal subordinating conjunction before is frequent, too (see
(20–23)). Like after, it links two different moves into one scene, but focusses more
strongly on the preceding event, while the subsequent event is, grammatically speaking,
subordinated. In most cases it is combined with a gerund. This pattern is distributed as
shown in Table 5.
       No. of texts   No. of tokens
LTC    1,327          5,280 (908.7 pmw)
MR     747            1,091 (1517.1 pmw)
Table 5: before _VVG
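The relative frequencies per million words (pmw) in Table 5 follow directly from the hit counts and the subcorpus sizes given in Table 1. A minimal sketch of the calculation, with hypothetical variable and function names:

```python
def per_million(hits, corpus_tokens):
    """Relative frequency per million words (pmw)."""
    return hits / corpus_tokens * 1_000_000

SPORTSMOLE_TOKENS = {"LTC": 5_810_800, "MR": 719_155}  # subcorpus sizes from Table 1
BEFORE_VVG_HITS = {"LTC": 5_280, "MR": 1_091}          # hit counts from Table 5

for subcorpus, hits in BEFORE_VVG_HITS.items():
    print(subcorpus, round(per_million(hits, SPORTSMOLE_TOKENS[subcorpus]), 1))
# LTC 908.7
# MR 1517.1
```

Normalizing in this way is what makes the much smaller MR subcorpus comparable with the LTC subcorpus.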
As was shown for the pattern nach _ADJA _NN in the German data, the pattern before
_VVG can be combined into a three-step account of a scene, which turns out to be a widely
used text routine in LTCs (20–21).14
(20) [Ibrahimovic collects a long pass]ASSIST before [cutting inside and shooting]EFFORT but [Dann
makes the block]INTERVENTION (spm_PL1617_lt_1480)
(21) [Sigurdsson gets the ball on the edge of the box]ASSIST before [trying to swerve it past
14
More examples can be found with the query: "before" [pos="VVG"] [pos!="SENT"]* "but"
(22) [The midfielder played a one-two with Lukaku]PREPARATION before [thumping a shot at
the left of the area]PREPARATION before [curling a strike towards goal]SHOT [but it cannoned off
Moreover, in MRs constructions with the adverb when (which proves to be a key lemma
for the MR subcorpus) are also frequent. A collocation analysis of the 4,935 hits in the
MR subcorpus shows that in 1,596 cases (32%) it is preceded by an exact time specifi-
cation with the lemma minute. This time specification is usually combined with an indi-
cation of the state of the match or, more precisely, a change of this state. To give some
examples from the corpus:
(24) [Reading regained the lead]STATE OF MATCH [on 62 minutes]TIME when [Noel Hunt headed home]GOAL
(25) [Lambert made the breakthrough]STATE OF MATCH [after 32 minutes]TIME when [he reacted quickest
to a deflected Lallana free kick]PREPARATION [to poke home from close range]GOAL
(spm_PL1213_spb_105)
(26) [Steve Clarke’s side did equalise]STATE OF MATCH [in the 43rd minute]TIME when [another Brunt
corner was met by Jonas Olsson and Bunn could only tip onto the bar]ASSIST [with Gera
(spm_PL1213_spb_144)
With this text routine, the task of reporting both on the overall game and its decisive
scenes can be fulfilled in a most economical way. As already noted for the German data,
the text routine leaves some open choices and is flexible enough to accommodate highly varied
and often very expressive descriptions of the players’ moves, but it still provides a useful
template for combining these components in a proven manner. Thus, not only the lexical
items themselves, but also their embedding in established syntactic patterns is constitutive
and indicative of the register of football coverage.
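A rough way to operationalize the time slot of this routine is a regular expression that looks for a minute specification immediately before when. The sketch below is an approximation on plain text rather than the collocation analysis described above; the pattern, the function name and the final counter-example sentence are assumptions.

```python
import re

# Assumed surface forms: "on 62 minutes", "after 32 minutes", "in the 43rd minute".
TIME_WHEN_RE = re.compile(
    r"\b(?:on|after|in the)\s+(\d+)(?:st|nd|rd|th)?\s+minutes?\s+when\b",
    re.IGNORECASE,
)

def minute_before_when(sentence):
    """Return the minute given right before 'when', or None if there is none."""
    match = TIME_WHEN_RE.search(sentence)
    return int(match.group(1)) if match else None

examples = [
    "Reading regained the lead on 62 minutes when Noel Hunt headed home",
    "Lambert made the breakthrough after 32 minutes when he reacted quickest",
    "Steve Clarke's side did equalise in the 43rd minute when another Brunt corner was met",
    "He was unlucky when his shot hit the post",  # invented counter-example
]
minutes = [minute_before_when(s) for s in examples]
share = sum(m is not None for m in minutes) / len(minutes)
print(minutes, share)
```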
is rather small and the analysis is based on a manual count of qualitative categoriza-
tions.
In addition to this interpretative procedure, I follow a data-driven approach of detecting
idioms based on statistical ngram analysis. In contrast to the type of ngram analysis ap-
plied in section 4, which looked for fixed patterns with one or more specified items (at
least on the part-of-speech level), statistical ngram analysis works without any previous
specification but calculates the statistical association score for every ngram in a corpus
(Evert 2009). I am using the open source Ngram Statistics Package (Banerjee &
Pedersen 2003). In a first step, the software counts all occurrences of all ngrams (here:
bigrams or word pairs) in the corpus within a window of up to five words. Then, the
statistical association is calculated for every word pair type by comparing its observed
frequency with its expected frequency based on the assumption of a random distribu-
tion. The results are ranked by their association scores (i.e. significance) according to a
chi-square test. That way, the software detects word pairs that may occur relatively rare-
ly (a minimum frequency of 5 is set as standard), but if they occur at all, they do so in
this very combination (co-occurrence). In other words, the software detects fixed word
pairs.
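The procedure can be sketched in a few lines. This is not the implementation of the Ngram Statistics Package but an illustration of the underlying idea: word pairs are counted within a window, and a chi-square score is computed from the 2x2 contingency table of each pair. The function names are hypothetical, and reading the window as a span that includes both words is an assumption.

```python
from collections import Counter

def window_pairs(tokens, window=5):
    """Yield ordered pairs (w1, w2) with w2 following w1 inside the window."""
    for i, w1 in enumerate(tokens):
        for w2 in tokens[i + 1 : i + window]:
            yield (w1, w2)

def chi_square_scores(tokens, window=5, min_freq=5):
    """Rank word pairs by the chi-square score of their 2x2 contingency table."""
    pairs = Counter(window_pairs(tokens, window))
    first, second = Counter(), Counter()
    for (w1, w2), count in pairs.items():
        first[w1] += count
        second[w2] += count
    total = sum(pairs.values())
    scores = {}
    for (w1, w2), n11 in pairs.items():
        if n11 < min_freq:
            continue
        n1p, np1 = first[w1], second[w2]
        n12, n21 = n1p - n11, np1 - n11
        n22 = total - n1p - np1 + n11
        denom = n1p * (total - n1p) * np1 * (total - np1)
        if denom:
            scores[(w1, w2)] = total * (n11 * n22 - n12 * n21) ** 2 / denom
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy data with adjacent pairs only (window=2) and a lowered frequency threshold.
toy = "bragging rights the game the ball bragging rights the goal".split()
ranking = chi_square_scores(toy, window=2, min_freq=2)
print(ranking[0])
```

In the toy data, bragging and rights only ever occur together and therefore reach the maximum score. The frequency threshold matters because a pair occurring just once between two words that are otherwise unattested would reach the same maximum.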
As the software works on the basis of bare counting of word forms without any linguistic
(morpho-syntactic or lexical) information, it does not differentiate between the various
classes of phrasemes as defined in the literature (e.g. idioms vs. structural
phrasemes like with regard to). Moreover, proper names like Garry Monk are typically
ranked very high because they best meet the requirement of co-occurrence of their components.
But after taking out proper names (of persons, clubs and stadiums), the remaining
word pairs are components of mostly idiomatic expressions. Table 6 presents the first
15 word pairs ranked by their association score, together with their absolute frequencies in the
subcorpora (every word pair was checked in the corpus to determine the shape it usually
takes in the texts; the main components according to the software output are highlighted
in bold).
Sportsmole LTC                      Sportsmole MR
to claim bragging rights (56)       to claim bragging rights (10)
to whet the appetite (25)           to be given one’s marching orders (26)
huff(ing) and puff(ing) (37)        south coast (16)
to be at sixes and sevens (10)      to add/put the icing on the cake (14)
to get into the nitty gritty (7)    one-way traffic (15)
to be surplus to requirements (5)   free kick (854)
a collector’s item (5)              relegation zone (166)
to be licking one’s lips (23)       for a second bookable offence (16)
doom and gloom (38)                 from a tight angle (112)
alarm bells ringing (12)            this afternoon (744)
to throw the kitchen sink (25)      for large parts (22)
last-chance saloon (7)              to/a share (of) the spoils (64)
the grand scheme of things (14)     to keep one’s clean sheet (intact) (62)
ladies and gentlemen (28)           five-at-the-back system (6)
one-way traffic (84)                exchanged passes (39)
Kicker LTC15                                                    Kicker MR
Herzlich Willkommen [welcome] (32)                              ad acta legen [to shelve sth.] (8)
Lucky Punch [i.e. decisive (last-minute) goal] (60)             weder Fisch noch Fleisch [neither fish nor fowl] (8)
auf Messers Schneide [on a knife edge] (9)                      unter Dach und Fach [approx. wrapped up] (56)
never change a winning team (5)                                 never change a winning team (6)
Wechselkontingent ausgeschöpft [out of substitutions] (43)      auf Messers Schneide [on a knife edge] (19)
Standing Ovations (8)                                           in trockene Tücher packen [approx. to make sth. being home and dry] (13)
der Drops ist gelutscht [approx. it’s done and dusted] (7)      wie das Kaninchen vor der Schlange [approx. like a rabbit caught in the headlights] (10)
Freund und Feind [friend and foe] (245)                         oberstes Gebot [top priority] (5)
weder Fisch noch Fleisch [neither fish nor fowl] (16)           freies Schussfeld [free field of fire] (6)
Dreh- und Angelpunkt [pivotal point] (5)                        im wahrsten Sinne des Wortes [in the truest sense of the word] (9)
in höchster Not [in the nick of time] (299)                     Freund und Feind [friend and foe] (35)
rote Hosen und blaue Stutzen [red shorts and blue socks] (132)  im Großen und Ganzen [approx. generally speaking] (28)
Hin und Her [back and forth] (8)                                mit Zähnen und Klauen [tooth and nail] (8)
Big Point (7)                                                   sich die Butter vom Brot nehmen lassen [to let sb. take the bread out of one’s mouth] (6)
mit vereinten Kräften [with united forces] (81)                 mit angezogener Handbremse [with the handbrake applied] (18)
das rettende Ufer [approx. dry land] (28)                       Wechselbad der Gefühle [approx. roller-coaster of emotions] (13)
Table 6: Statistically significant word pairs
As the table shows, the software is able to detect word pairs that can easily be completed
to meaningful and conventional formulaic expressions by any human reader. Most of
them are semantically opaque idioms (to be at sixes and sevens), often with figurative
meanings (to put the icing on the cake), but routine formulae (ladies and gentlemen)
are also detected. In some cases, the idiomaticity derives from metaphoric transfer from
another domain (one-way traffic; mit angezogener Handbremse [with the handbrake
applied]). Some word pairs, especially in the Sportsmole MR subcorpus, are technical
terms (free kick, relegation zone, five-at-the-back system).16 However, it is striking that
(at least through the lens of the applied methods of automatized ngram detection) the
majority of the detected formulaic expressions in both languages are not sport-specific
idioms but are borrowed from other domains, including (but not limited to) the well-known
domain of warfare (Bergh 2011). A great number of the idioms used thus link
football coverage to everyday language. They may sound ‘sporty’, especially to sports
followers who know that they are frequently used in sports, but a specific trait of the language
of sports coverage seems to be not only its technical terminology but rather its
15
I am excluding word pairs that are caused by automatized LTC entries like Gelbe Karte [yellow card]
or Anpfiff 1./2. Halbzeit [kickoff 1st/2nd half].
16
Due to morphological reasons, in German such technical terms are realized as compounds (Freistoß
[free kick], Abstiegszone [relegation zone]).
integration of idiomatic patterns that are used in other domains too.17 It is rather the
frequency and the combination of idioms than the idioms themselves that constitute the
register specifics of football coverage and that will be recognized and appreciated by
those who are familiar with it.
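To make the detection of statistically significant word pairs discussed above more concrete, the following Python sketch ranks adjacent word pairs by a log-likelihood association score. It is an illustration only: the choice of measure, the function names (log_likelihood, rank_word_pairs), the frequency threshold and the toy token list are assumptions made here, not the settings or data of the study.

```python
import math
from collections import Counter


def _term(observed, expected):
    """One cell's contribution to the log-likelihood sum (0 for empty cells)."""
    return observed * math.log(observed / expected) if observed > 0 else 0.0


def log_likelihood(o11, o12, o21, o22):
    """Log-likelihood ratio (G2) for the 2x2 contingency table of a word pair."""
    n = o11 + o12 + o21 + o22
    row1, row2 = o11 + o12, o21 + o22
    col1, col2 = o11 + o21, o12 + o22
    return 2 * (
        _term(o11, row1 * col1 / n)
        + _term(o12, row1 * col2 / n)
        + _term(o21, row2 * col1 / n)
        + _term(o22, row2 * col2 / n)
    )


def rank_word_pairs(tokens, min_count=2):
    """Return (pair, score) tuples for adjacent word pairs, strongest first."""
    pairs = Counter(zip(tokens, tokens[1:]))
    n = sum(pairs.values())
    first = Counter(tokens[:-1])   # words in the first slot of a pair
    second = Counter(tokens[1:])   # words in the second slot of a pair
    scored = []
    for (w1, w2), o11 in pairs.items():
        if o11 < min_count:        # hypothetical frequency threshold
            continue
        o12 = first[w1] - o11      # w1 followed by another word
        o21 = second[w2] - o11     # w2 preceded by another word
        o22 = n - o11 - o12 - o21  # pairs containing neither
        # Keep only pairs that co-occur more often than chance predicts.
        if o11 > first[w1] * second[w2] / n:
            scored.append(((w1, w2), log_likelihood(o11, o12, o21, o22)))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy input, invented for illustration (not corpus data).
tokens = ("they throw the kitchen sink at them and the kitchen sink again "
          "while the referee and the crowd watch the kitchen sink fly").split()
for pair, score in rank_word_pairs(tokens):
    print(pair, round(score, 2))
```

On real corpora, such a ranking would still need manual inspection, since high-scoring pairs include technical terms and automatically generated entries as well as idioms.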
While the syntactic patterns described in section 4 serve as templates that still need to
be filled individually, idioms can be described rather as prefabricated building blocks
that can be used as they are. Also, their functional range differs: Whereas the syntactic
patterns are used for the description of single game scenes, the idioms mostly serve to
give more general assessments of a game. Since the writers of both LTCs and MRs have
to give summarizing and evaluative accounts of the game events, the mostly figurative
idioms are a functional and, in terms of language production, economical choice. As
well-established means of description, they can provide vivid and intuitively accessible
accounts without the need to give further details.
Although idioms usually are lexically fixed and lose their idiomaticity after changing
one of their components, they still show a certain degree of variability on the text level
(Levin 2008:145). For reasons of textual cohesion and coherence, but also for demonstrating
creativity as a journalist, even idioms with non-exchangeable parts can be modified
to some degree, as long as their default form is still recognizable. In (27) the idiom to
throw the kitchen sink at sb., which can be paraphrased as a team’s ‘trying to break
down the other side with everything they have got’, is intensified by insertion of new
elements (Jaki 2014:24).
(27) Now, will Fulham throw the kitchen sink, the microwave and the utensils at Chel-
sea? (spm_PL1314_lt_537)
Cases of clipping (Jaki 2014:25) can be found, too (28–29):
(28) It’s kitchen sink time for Liverpool. (spm_PL1516_lt_1218)
(29) It’s well and truly kitchen sink stuff here but West Ham holding so, so strong.
(spm_PL1516_lt_1084)
These formulations might be described as elliptic forms of time to throw the kitchen
sink or the like, but as parallel forms to set exclamations like it’s party time! they also
gain new pragmatic functions.
In the German data, the idiomatic expression mit angezogener Handbremse [with the
handbrake applied], which approx. means ‘decelerated’, is used in that exact form in 29
out of 36 cases. However, the idiom can be syntactically adjusted as in (30).
(30) Mit zunehmender Spieldauer zog der FC die Handbremse immer weiter an und ach-
tete nun stärker auf die Defensive. (k_BL1415_spb_2696)
As the game progressed, the club applied the handbrake more and more and paid more
attention to its defense.
17 The pervasiveness of the detected expressions beyond sports can be checked by querying them in the
British National Corpus or the Deutsches Referenzkorpus.
Arguably, the modifier immer weiter [more and more] would not make sense for real
hand brakes, yet it shows that the idiom’s figurative meaning of ‘decelerating’ remains activated.
In other cases, the corresponding verb is reversed (31).
(31) Nach dem Seitenwechsel löste der FSV die Handbremse und legte den Vorwärts-
gang ein. (k_BL1314_spb_2443)
After changing sides, the FSV released the handbrake and engaged forward gear.
In this new variant, the idiom can be combined with another one from the same domain
of car driving (Vorwärtsgang einlegen [to engage forward gear]), thus providing a fig-
urative means of establishing textual coherence. In (32) the idiom is strongly truncated:
(32) Das Passspiel ist gewohnt sicher, nach Handbremse sieht es nicht wirklich aus.
(k_BL1415_lt_2582)
Passing is safe as usual, it doesn’t really look like handbrake.
Here, the (syntactically non-embedded) lexical item Handbremse still invokes the over-
all judgement usually passed by the whole idiom, while this very judgement is rejected
with regard to the actual impression of the game (sieht nicht nach x aus [doesn’t look
like x]). Thus, this example shows a creative and allusive use of idioms that fulfils
community-building functions (Wray & Perkins 2000:14). Although the idiom is used
in other domains, too, this truncated formulation will be fully understandable only for
those who are familiar with the register of football coverage.
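The distinction between canonical and modified uses of an idiom, as drawn above for mit angezogener Handbremse, could be counted roughly as in the following Python sketch. The regular expressions, the function name classify_idiom_uses and the sample list are assumptions for illustration; one sentence is constructed, and the others are examples (30) to (32) quoted above.

```python
import re

# Canonical form of the idiom and its key noun (illustrative patterns).
CANONICAL = re.compile(r"\bmit\s+angezogener\s+Handbremse\b", re.IGNORECASE)
KEY_NOUN = re.compile(r"\bHandbremse\b")


def classify_idiom_uses(sentences):
    """Count sentences with the exact idiom vs. other uses of its key noun."""
    counts = {"canonical": 0, "modified": 0}
    for sentence in sentences:
        if CANONICAL.search(sentence):
            counts["canonical"] += 1
        elif KEY_NOUN.search(sentence):
            counts["modified"] += 1
    return counts


sentences = [
    # Constructed sentence showing the canonical form (not from the corpus).
    "Die Gäste spielen mit angezogener Handbremse.",
    # Examples (30), (31) and (32) from the text.
    "Mit zunehmender Spieldauer zog der FC die Handbremse immer weiter an "
    "und achtete nun stärker auf die Defensive.",
    "Nach dem Seitenwechsel löste der FSV die Handbremse und legte den "
    "Vorwärtsgang ein.",
    "Das Passspiel ist gewohnt sicher, nach Handbremse sieht es nicht "
    "wirklich aus.",
    # Constructed sentence without the idiom.
    "Das Spiel beginnt.",
]
print(classify_idiom_uses(sentences))
```

Surface matching of this kind cannot tell figurative from literal uses of the noun and would miss inflected or compounded variants, so the counts would only be a starting point for manual classification.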
6 Conclusion
In this study, I have used data-driven, corpus linguistic methods to investigate two types
of formulaic sequences within German and English football live text commentaries and
match reports. First, syntactic patterns that serve as templates for describing recurring
events like efforts, misses or goals were described. As rather schematic patterns, they
can be flexibly adjusted and filled with a large set of mostly expressive and vivid lexical
items to render them unique and emotionalizing despite their schematic nature. In com-
bination, these patterns serve as text routines that are available to the writers and serve
as register-marking devices. Second, methods for automatized detection of idioms were
applied. It was shown that, besides special phrasemes, idioms from other domains are
also frequently used for giving summarizing and evaluating accounts of the games. Although
these idioms link football coverage to everyday language, their density and combination
serve as a register-marking device. Moreover, variations and modifications of
these idioms can be found, which, by alluding to register-typical ways of writing, have
community-building functions. To take up Sinclair (1991:109f.) again, the corpus ana-
lytic results have shown that writers rely on a broad range of prestructured patterns,
which still leave enough open choices to demonstrate creativity and deliver appealing
narratives of the games. These corpus linguistic findings largely confirm those of previ-
ous work on the topic, but due to the quantity of data and the applied methods a more
comprehensive inventory of formulaic language use could be established. Moreover, the
study has shown the common features of live text commentaries and match reports with
regard to formulaicity.
This study focussed on German and English data and showed that, apart from minor
differences in the syntactic details, many types of formulaic sequences can be found in
both languages. Exploratory analyses based on random samples already suggest that
many of the findings hold for other languages, too, and indicate that football coverage
can be seen as a cross-cultural register (Werner 2016:298ff.). For example, the three-step
text routine described above, which links the description of an assist, an effort and an
interception, can also be found in French (33), Spanish (34) and Italian (35) live text
commentaries.
(33) Après une frappe lointaine d’Iloki déviée en corner, le centre de Thomasson est
repris par Sigthorsson, mais Barrada dégage le ballon de la tête sur sa ligne.
(med_L1_15_lt_404)18
After a long-range shot by Iloki deflected to the corner, Thomasson’s centre is tak-
en up by Sigthorsson, but Barrada clears the ball with his head on the line.
(34) Se dedicó el turco a regatear hasta a dos rivales dentro del área antes de sacar un cen-
tro entre dos jugadores béticos al límite del área pequeña, pero lo rechazó bien el Be-
tis. (as_PD1314_lt_287)19
The Turk dribbles towards two opponents in the box before crossing from between
two Betis players at the edge of the penalty area, but it is well denied by Betis.
(35) Dopo una bella serpentina De Paul la mette in mezzo dalla destra ma Badu non in-
quadra la porta di testa. (git_SA1617_lt_61)20
After a beautiful serpentine [slalom], De Paul puts it into the center from the right,
but Badu can’t head the ball into the goal.
Just as football is a transcultural practice, so are at least some of the formulaic patterns
of writing about it. Future research might consider in more detail if this transculturality
also affects other linguistic and textual levels like the use of metaphors, ways of staging
orality and emotional involvement. Also, recent developments in automatized text gen-
eration and translation which are supposed to homogenize football coverage will raise
new issues regarding formulaicity and transculturality.
18 http://www.matchendirect.fr/foot-score/2046024-nantes-marseille.html
19 https://resultados.as.com/resultados/futbol/primera/2013_2014/directo/regular_a_29_2673
20 http://www.goal.com/it/match/udinese-vs-lazio/2305821/live-commentary
References
Banerjee, Satanjeev & Ted Pedersen. 2003. The Design, Implementation, and Use of the Ngram Statistics
Package. Proceedings of the 4th International Conference on Computational Linguistics and
Intelligent Text Processing, 370–381. (CICLing’03). Berlin, Heidelberg: Springer-Verlag.
Berger, Peter L & Thomas Luckmann. 1966. The Social Construction of Reality: A Treatise in the
Sociology of Knowledge. Garden City: Anchor Books.
Bergh, Gunnar. 2011. Football is war. A case study of minute by minute football commentary. Veredas
2011(2). 83–93.
Bondi, Marina. 2010. Perspectives on keywords and keyness: An introduction. In Marina Bondi & Mike
Scott (eds.), Studies in Corpus Linguistics, 1–18. Amsterdam: Benjamins.
Bryant, Jennings, Paul Comisky & Dolf Zillmann. 1977. Drama in sports commentary. Journal of
Communication 27(3). 140–149.
Chovanec, Jan. 2015. Participant roles and embedded interactions in online sports broadcasts. In Marta
Dynel & Jan Chovanec (eds.), Participation in public and social media interactions, vol. 256,
67–95. Amsterdam: John Benjamins Publishing Company. doi:10.1075/pbns.256.04cho.
https://benjamins.com/catalog/pbns.256.04cho (13 July, 2018).
Erman, Britt & Beatrice Warren. 2000. The idiom principle and the open choice principle. Text 20(1).
29–62. doi:10.1515/text.1.2000.20.1.29.
Croft, William. 2001. Radical construction grammar: syntactic theory in typological perspective. Oxford,
New York: Oxford University Press.
Dalmas, Martine, Dmitrij Dobrovol’skij, Dirk Goldhahn & Uwe Quasthoff. 2015. Bewertung durch
Adjektive. Ansätze einer korpusgestützten Untersuchung zur Synonymie. Zeitschrift für
Literaturwissenschaft und Linguistik 45(1). 12–29.
Evert, Stefan. 2009. Corpora and collocations. In Anke Lüdeling (ed.), Corpus Linguistics. An
International Handbook, 1212–1248. Berlin, Boston: De Gruyter Mouton.
Feilke, Helmuth. 2010. „Aller guten Dinge sind drei“ – Überlegungen zu Textroutinen & literalen
Prozeduren. In Iris Bons, Thomas Gloning & Dennis Kaltwasser (eds.), Fest-Platte für Gerd Fritz.
http://www.festschrift-gerd-fritz.de/files/feilke_2010_literale-prozeduren-und-textroutinen.pdf.
Ferguson, Charles A. 1983. Sports Announcer Talk: Syntactic Aspects of Register Variation. Language in
Society 12(2). 153–172.
Giulianotti, Richard. 2002. Supporters, followers, fans, and flaneurs. A taxonomy of spectator identities
in football. Journal of Sport and Social Issues 26(1). 25–46.
Goldhahn, Dirk, Thomas Eckart & Uwe Quasthoff. 2012. Building Large Monolingual Dictionaries at the
Leipzig Corpora Collection: From 100 to 200 Languages. Proceedings of the 8th International
Conference on Language Resources and Evaluation (LREC’12).
http://www.lrec-conf.org/proceedings/lrec2012/pdf/327_Paper.pdf.
Gumperz, John J. 1982. Discourse strategies. (Studies in Interactional Sociolinguistics 1). Cambridge,
New York: Cambridge University Press.
Günthner, Susanne. 2007. Intercultural communication and the relevance of cultural specific repertoires
of communicative genres. In Helga Kotthoff & Helen Spencer-Oatey (eds.), Handbook of Intercul-
tural Communication, 127–152. Berlin, New York: Mouton de Gruyter.
Hardie, Andrew. 2012. CQPweb — combining power, flexibility and usability in a corpus analysis tool.
International Journal of Corpus Linguistics 17(3). 380–409.
Jaki, Sylvia. 2014. Phraseological substitutions in newspaper headlines: “More than Meats the Eye.”
(Human Cognitive Processing (HCP). Cognitive Foundations of Language Structure and Use vol-
ume 46). Amsterdam; Philadelphia: Benjamins.
Jucker, Andreas H. 2010. “Audacious, brilliant!! What a strike!” Live text commentaries on the Internet
as real-time narratives. In Christian R. Hoffmann (ed.), Narrative revisited. Telling a story in the
age of new media, 57–78. Amsterdam: Benjamins.
Kern, Friederike. 2010. Speaking dramatically: The prosody of live radio commentary of football match-
es. In Dagmar Barth-Weingarten, Elisabeth Reber & Margret Selting (eds.), Prosody in interac-
tion, 217–238. (Studies in Discourse and Grammar 23). Amsterdam: Benjamins.
Kern, Friederike. 2014. “und der schlägt soFORT nach VORne” – Zur Konstitution von Spannung und
Raum in Fußball-Livereportagen im Radio. In Peter Auer & Pia Bergmann (eds.), Sprache im Ge-
brauch: räumlich, zeitlich, interaktional. Festschrift für Peter Auer, 327–342. (Oralingua 9). Hei-
delberg: Universitätsverlag Winter.
Kirschner, Heiko & Michael Wetzels. 2017. “We sell emotions”. Die kommunikative Konstruktion von
Sportübertragungen am Beispiel Fußball und eSport. In Jo Reichertz & René Tuma (eds.), Der
kommunikative Konstruktivismus bei der Arbeit, 256–290. Weinheim, Basel: Beltz Juventa.
Kuiper, Koenraad. 1996. Smooth talkers: the linguistic performance of auctioneers and sportscasters.
Mahwah, N.J: L. Erlbaum Associates.
Levin, Magnus. 2008. “Hitting the back of the net just before the final whistle”: High-frequency phrases
in football reporting. In Eva Lavric et al. (ed.), The linguistics of football, 143–153. (Language in
Performance 38). Tübingen: Narr.
Matulina, Željka & Zrinka Ćoralić. 2008. Idioms in Football reporting. In Eva Lavric et al. (ed.), The
linguistics of football, 101–111. (Language in Performance 38). Tübingen: Narr.
Meier, Simon. 2017. Korpora zur Fußballlinguistik – eine mehrsprachige Forschungsressource zur Spra-
che der Fußballberichterstattung. Zeitschrift für germanistische Linguistik 45(2). 345–349.
doi:10.1515/zgl-2017-0018.
Merkel, Udo. 2012. Football fans and clubs in Germany. Conflicts, crises and compromises. Soccer &
Society 13(3). 359–376.
Nordin, Henrik. 2008. The use of conceptual metaphors by Swedish and German football commentators.
In Eva Lavric et al. (ed.), The linguistics of football, 113–120. (Language in Performance 38). Tü-
bingen: Narr.
Pfeiffer, Christian. 2014. Phraseologie in der Fußballberichterstattung der Printmedien. Eine quantitative
Analyse. In Vida Jesensek & Dmitrij Dobrovol’skij (eds.), Phraseologie und Kultur, 491–515.
Maribor: Univerza v Mariboru.
Pinto, Robert C. 1995. Post hoc, ergo propter hoc. In Hans V. Hansen & Robert C. Pinto (eds.), Fallacies.
Classical and contemporary readings, 302–311. University Park PA: Pennsylvania State Universi-
ty Press.
Raack, Alex. 2015. Den MUSS er machen! Phrasen, Posen, Plattitüden - die wunderbare Welt der Fuß-
ball-Klischees. Hamburg: Edel.
Schmid, Helmut. 1994. Probabilistic Part-of-Speech Tagging Using Decision Trees. Proceedings of
International Conference on New Methods in Language Processing, Manchester, UK.
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.28.1139.
Schmidt, Thomas. 2009. The Kicktionary – a multilingual lexical resource of football language.
Multilingual FrameNets in Computational Lexicography: Methods and Applications, 101–131. Berlin,
Boston: De Gruyter Mouton.
Sinclair, John. 1991. Corpus, concordance, collocation. Oxford: Oxford University Press.
Telegraph Sport. 2016. The football buzzwords, clichés and stock phrases that need to die, immediately.
The Telegraph.
https://www.telegraph.co.uk/football/0/the-football-buzzwords-clichs-and-stock-phrases-that-need-to-die/
(9 March, 2018).
Werner, Valentin. 2016. Real-time online text commentaries: A cross-cultural perspective. In Christoph
Schubert & Christina Sanchez-Stockhammer (eds.), Variational text linguistics. Revisiting register
in English, 271–306. Berlin, Boston: De Gruyter.
Wilton, Antje. 2017. The interactional construction of evaluation in post-match football interviews. In
David Caldwell, John Walsh, Elaine W. Vine & Jon Jureidini (eds.), The discourse of sport: anal-
yses from social linguistics, 92–112. (Routledge Studies in Sociolinguistics 12). New York, Lon-
don: Routledge.
Wray, Alison & Michael R. Perkins. 2000. The functions of formulaic language. An integrated model.
Language & Communication 20(1). 1–28.