Using Corpora For Language and Linguistic Research: Exercise 1: Learn The Basics

This document provides instructions for using the Corpus of Contemporary American English (COCA) to conduct linguistic research. It outlines five exercises that cover the basics of searching the corpus, finding collocates, comparing words, looking up lemmas versus inflected forms, and using partial constructions. Users are shown how to register for an account, search for words, view contexts, find frequency and distribution information, and save search results. The exercises demonstrate how to retrieve collocates of a word by part of speech and distance, compare synonymous words, obtain all inflected forms by using square brackets in searches, and search for missing parts using an asterisk.

University of Bahrain

Department of English Language and Literature

Conference and Seminars Committee
2016-2017

Using Corpora for Language and Linguistic Research

1. Go to http://corpus.byu.edu/coca/
2. Register, so that you can log in every time you need to use the corpus.
3. Go to ACCOUNT > Usage Limits to see how many queries you are allowed per day, depending on your
user status.
4. Check out the corpus information by clicking on these tabs. This will give you information about the size of the
corpus, the different genres included in it, etc.

Exercise 1: Learn the basics

5. Go to SEARCH, type the word nice, then hit Find matching strings.
6. Check out the FREQ of the word, then tick the box next to the word to retrieve all the contexts where the
word has been used.

7. Notice how many pages of results there are.

8. Also notice where each context has been retrieved from, and from what year.
9. Notice also that you can download a random sample from the corpus consisting of 100, 200, 500, or 1000
words. Choose your sample size, hit the button, then copy and paste the contexts into an Excel
sheet.
10. If you’re interested in particular contexts you can save them in a list that you can go back to later. You need to
provide a title for your list – read more about the function of save list under HELP.

11. You can find out more about the distribution of a word (or structure) in the different genres within the corpus
by clicking on Chart.

12. You may also search for strings of words. Type lose weight and see what you can get.
13. REMEMBER: always hit reset before you start a new search.
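
To get a feel for what the list of contexts in steps 6 to 8 represents, here is a minimal Python sketch of a keyword-in-context (KWIC) display. It is only an illustration, not how COCA works internally: the function name kwic, the width parameter, and the sample sentence are all invented for this example.

```python
def kwic(tokens, keyword, width=3):
    """Return (left context, keyword, right context) for every hit of `keyword`."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits


# Toy text invented for illustration; it is NOT taken from COCA.
text = "it was a nice day and she had a nice smile"
tokens = text.split()

lines = kwic(tokens, "nice", width=2)
for left, kw, right in lines:
    # Right-align the left context so the keyword lines up in one column.
    print(f"{left:>12} | {kw} | {right}")
```

Each hit is shown with a few words on either side, which is the same idea as the context lines the corpus returns when you tick the box next to a word.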

DELL – March 2017 – Dr. Dana Abdulrahim Page 1

Exercise 2: Find out the most frequent collocates of a word

14. Hit the Collocates button, and then type nice in the given search box, then hit Find collocates.

15. You can limit your search to particular words that collocate with your search word, which you need to provide
in the second box. Again remember, if the corpus gives you error messages, hit the reset button to make sure
you don’t give any conflicting search commands.
16. Notice that you can specify the part of speech of the collocates you’re interested in, as well as the
distance/location of that word in relation to the KWIC (i.e. before or after the KWIC, two/three/four words
after the KWIC, etc.)
17. You may choose the part of speech of the collocates that you're interested in. What do you find when you type
the following?

18. What do you think the following search command is looking for?
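
The idea behind steps 16 and 17 (collocates restricted by part of speech and by distance from the KWIC) can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not COCA's actual method: the function collocates, its left/right/pos parameters, the tag names (DET, ADJ, NOUN, ...), and the hand-tagged sample sentence are all hypothetical.

```python
from collections import Counter


def collocates(tagged, node, left=0, right=1, pos=None):
    """Count words found within `left` words before and `right` words after `node`.

    `tagged` is a list of (word, part_of_speech) pairs. If `pos` is given,
    only collocates carrying that tag are counted.
    """
    counts = Counter()
    for i, (word, _) in enumerate(tagged):
        if word != node:
            continue
        window = tagged[max(0, i - left):i] + tagged[i + 1:i + 1 + right]
        counts.update(w for w, p in window if pos is None or p == pos)
    return counts


# Toy sentence with hand-assigned tags, invented for illustration (not COCA data).
tagged = [
    ("a", "DET"), ("very", "ADV"), ("nice", "ADJ"), ("day", "NOUN"),
    ("and", "CONJ"), ("a", "DET"), ("nice", "ADJ"), ("guy", "NOUN"),
    ("had", "VERB"), ("a", "DET"), ("nice", "ADJ"), ("day", "NOUN"),
]

# Nouns directly after "nice".
nouns_after = collocates(tagged, "nice", left=0, right=1, pos="NOUN")
# Any word up to two positions before "nice".
words_before = collocates(tagged, "nice", left=2, right=0)

print(nouns_after.most_common())
print(words_before.most_common())
```

Changing left, right, or pos changes which words are counted, which mirrors how the results change in the corpus when you adjust the distance and part-of-speech settings.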

Exercise 3: Compare the use of two words

19. If you need to check out the difference between synonymous words, you can do that by hitting the Compare
button, and typing both words in the given search boxes.

20. Notice how the results change when you change the conditions of the search (e.g. part of speech of the
collocates, distance of the collocate, etc.)


21. The results table helps you compare the uses of these words by showing the number of times each of
the search words was found to collocate with a given word. Hit W1 and W2 to find specific contexts of use.
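
As a rough illustration of what a comparison table like this contains, the Python sketch below counts the word that directly follows each of two search words and lines the counts up side by side. Everything here is an assumption made for the example: the word pair small/little, the helper names right_collocates and compare, and the sample text. The corpus itself may rank and score collocates differently.

```python
from collections import Counter


def right_collocates(tokens, node):
    """Count the word that immediately follows each occurrence of `node`."""
    return Counter(tokens[i + 1] for i in range(len(tokens) - 1) if tokens[i] == node)


def compare(tokens, w1, w2):
    """Return rows of (collocate, count with w1, count with w2), sorted by collocate."""
    c1 = right_collocates(tokens, w1)
    c2 = right_collocates(tokens, w2)
    return [(c, c1[c], c2[c]) for c in sorted(set(c1) | set(c2))]


# Toy text invented for illustration; it is NOT taken from COCA.
tokens = "a small town a little town a little girl a small amount a little girl".split()

table = compare(tokens, "small", "little")
print(f"{'collocate':<10} {'W1':>3} {'W2':>3}")
for collocate, n1, n2 in table:
    print(f"{collocate:<10} {n1:>3} {n2:>3}")
```

A collocate with a high count for one word and a zero for the other is the kind of contrast the W1 and W2 columns make visible.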

Exercise 4: Looking up lemmas vs. inflected forms

22. So far we’ve experimented with inflected forms. However, if you’re searching for a verb, e.g. ‘go’, and would
like to retrieve contexts containing all inflected forms of ‘go’ (go, went, gone, going, goes, etc.), you need to
type your search word in square brackets.

23. The lexical form in brackets is the lemma (i.e. the un-inflected form of the lexical item, listed in the
dictionary) and typing it in the search box will yield all of its inflected forms; whereas when we remove the
brackets in our search we are looking for particular inflected forms.
24. You can try that again now with [nice]. What inflected forms can you find?
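
The difference between searching with and without square brackets can be mimicked with a small lookup table, as in the Python sketch below. This is only a toy model: the dictionary LEMMA_FORMS, the function search, and the sample sentence are invented for illustration, and the forms listed for 'go' are the ones named in step 22. A real corpus relies on full lemmatization rather than a hand-made table.

```python
# Hypothetical lemma table; only 'go' is listed, with the forms named in step 22.
LEMMA_FORMS = {
    "go": {"go", "goes", "going", "went", "gone"},
}


def search(tokens, query):
    """Return matching tokens: [lemma] matches every listed form, a bare word only itself."""
    if query.startswith("[") and query.endswith("]"):
        lemma = query[1:-1]
        forms = LEMMA_FORMS.get(lemma, {lemma})
    else:
        forms = {query}
    return [t for t in tokens if t in forms]


# Toy sentence invented for illustration; it is NOT taken from COCA.
tokens = "she goes where he went and they go on going".split()

exact_hits = search(tokens, "go")
lemma_hits = search(tokens, "[go]")
print(exact_hits)
print(lemma_hits)
```

The bare query finds only the exact form typed, while the bracketed query finds every form listed under the lemma.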

Exercise 5: Using partial constructions to search for varieties

25. Type *more in the search box and examine the frequency tables. Now type more* and see what you get.
26. Type more * than and again check out the frequency table. This is a partial construction where the
asterisk (*) indicates a missing part that the corpus fills in with existing words/morphemes.
27. What does the following string mean and what outcomes do you expect the corpus to provide?
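
The asterisk searches in steps 25 and 26 can be approximated with Python's standard fnmatch module, as sketched below. The assumptions are mine, not the handout's: an asterisk attached to a word stands for any run of characters inside that word, and an asterisk on its own stands for exactly one whole word. The function match_sequence and the sample sentence are invented for illustration.

```python
from fnmatch import fnmatchcase


def match_sequence(tokens, pattern):
    """Find every run of tokens matching a space-separated wildcard pattern.

    Each pattern part is matched against one token; '*' inside a part
    stands for any run of characters (including none).
    """
    parts = pattern.split()
    n = len(parts)
    hits = []
    for i in range(len(tokens) - n + 1):
        window = tokens[i:i + n]
        if all(fnmatchcase(tok, part) for tok, part in zip(window, parts)):
            hits.append(" ".join(window))
    return hits


# Toy sentence invented for illustration; it is NOT taken from COCA.
tokens = ("furthermore he is more careful than anyone "
          "and moreover more often than not").split()

ends_with = match_sequence(tokens, "*more")
starts_with = match_sequence(tokens, "more*")
construction = match_sequence(tokens, "more * than")
print(ends_with)
print(starts_with)
print(construction)
```

Note how *more and more* return different word lists, while more * than returns three-word strings in which the middle slot is filled by whatever word occurs there.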

Now that you’ve learned the basics of corpus search, go ahead and have fun experimenting with other lexical
items/constructions!

