corpus Flashcards

(395 cards)

1
Q

What is a corpus? [Corpus Linguistics]

A

A principled collection of texts, either spoken or written, stored on a computer and available for analysis using specialized software. [O’Keeffe, McCarthy and Carter]

2
Q

What is required for a collection of texts to be considered a true corpus? [Corpus Design]

A

It must represent a specific theme or domain, and its validity depends on how representative it is of the language variety it targets.

3
Q

How is a spoken corpus made representative? [Corpus Design]

A

By including accurate, varied samples from different geographic areas, educational backgrounds, and social classes.

4
Q

What is the etymology of the word “corpus”? [Corpus Linguistics]

A

It originates from the Latin word “corpus” (body), meaning a body of text.

5
Q

What are the three core analytical functions of corpus software applications? [Corpus Tools]

A

Generating word lists, concordances, and keyword lists.

6
Q

What does a word and multi-word frequency list provide? [Corpus Analysis]

A

Quantitative counts detailing which individual words or phrases occur most frequently within the dataset.

7
Q

What is a concordance? [Corpus Analysis]

A

A search result displaying an individual word or phrase centered in the middle of a line, flanked by its immediate left and right co-text.
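A concordance display of this kind can be approximated in a few lines of Python. This is only a minimal sketch: the `kwic` function, its `width` parameter, and the sample sentence are invented for illustration and do not reflect how any particular corpus tool implements its concordancer.

```python
def kwic(tokens, node, width=3):
    """Return (left, node, right) tuples for every occurrence of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == node.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


text = "she has blonde hair and he has dark hair too"
for left, node, right in kwic(text.split(), "hair"):
    # Right-align the left co-text so the node word lines up in a column.
    print(f"{left:>20}  {node}  {right}")
```

Sorting the returned tuples by the first word of the right co-text, or the last word of the left co-text, is one way to make recurring patterns easier to spot.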

8
Q

What is a collocation? [Corpus Analysis]

A

A statistical tendency of specific words to frequently co-occur or follow one another (e.g., “blonde hair” or “bargain with”).

9
Q

What is an n-gram? [Corpus Analysis]

A

A contiguous sequence of n words (e.g., a two-word or four-word cluster) that functions as a unit, often carrying a specific discourse function.
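As a rough illustration, n-grams can be extracted by sliding a window of n tokens across a text and counting the results. The `ngrams` helper and the sample utterance below are assumptions made for this sketch, not part of any corpus tool.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return an iterator over every contiguous run of n tokens."""
    return zip(*(tokens[i:] for i in range(n)))


tokens = "do you know what i mean do you know".split()
bigram_counts = Counter(ngrams(tokens, 2))
print(bigram_counts.most_common(2))
```

Changing the second argument to 4 yields four-word clusters from the same token list.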

10
Q

What is a keyword list? [Corpus Analysis]

A

A comparative metric contrasting a target corpus against a larger reference corpus to identify unusually frequent words, revealing the “aboutness” of the text.
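One simple way to sketch this comparison is a smoothed ratio of per-million frequencies in the target and reference corpora. The `keyness` function, the `smoothing` constant, and the toy corpora below are assumptions for illustration; corpus tools may use different statistics (log-likelihood, for example).

```python
from collections import Counter


def keyness(target_tokens, reference_tokens, smoothing=1.0):
    """Score each target word by a smoothed ratio of per-million frequencies."""
    target, reference = Counter(target_tokens), Counter(reference_tokens)
    t_total, r_total = len(target_tokens), len(reference_tokens)
    scores = {}
    for word, count in target.items():
        t_pm = count / t_total * 1_000_000
        r_pm = reference[word] / r_total * 1_000_000
        # The smoothing constant avoids division by zero for unseen words.
        scores[word] = (t_pm + smoothing) / (r_pm + smoothing)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


target = "the corpus shows the corpus data".split()
reference = "the cat sat on the mat with the data".split()
top_word, top_score = keyness(target, reference)[0]
print(top_word)
```

Words that are equally common in both corpora, such as "the" here, score close to 1 and drop to the bottom of the list.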

11
Q

Why is the British National Corpus (BNC) frequently used for keyword comparison? [Corpus Reference]

A

It is a massive, 100-million-word dataset representing a wide variety of spoken and written English, making it an ideal baseline.

12
Q

How has corpus linguistics changed modern lexicography? [Lexicography]

A

Dictionaries are now informed by authentic language data, often prioritizing actual frequency and real-world usage over literal definitions (e.g., the figurative use of “fire”).

13
Q

How can corpus tools distinguish near-synonyms like “terrible” and “horrible”? [Semantic Analysis]

A

By using functions like “Word Sketch Difference” to isolate and compare the unique collocational behaviors of the similar words.

14
Q

What characterizes word frequencies in spoken English compared to written English? [Sociolinguistics]

A

Spoken English exhibits a higher frequency of discourse markers, personal pronouns, and politeness strategies.

15
Q

What is Sketch Engine? [Corpus Tools]

A

A cloud-based application used for corpus analysis, featuring functions like word sketches, concordances, and n-grams.

16
Q

What are the primary pedagogical applications of corpus research? [Language Teaching]

A

It informs curriculum design, vocabulary prioritization, dictionary creation, and the evaluation of both learner competence and language errors.

17
Q

What is the focus of the 90% assignment in the AL7732 module? [Module Assessment]

A

Writing a 2,000-3,000 word case study report profiling a learner’s strengths and errors using corpus tools to determine if they are accurately placed at a C1 level.

18
Q

What is required for the 10% forum assignment in the AL7732 module? [Module Assessment]

A

Reviewing a single lesson using five specific headings and replying to at least one classmate’s review with a minimum of 100 words.

19
Q

Assessment 1 Requirements [Module Assessment]

A

Review a lesson from the resource book Teaching English with Corpora using five specific headings (title/page, level, summary, liked, improved) and reply to at least one classmate.

20
Q

Assessment 1 Word Count Breakdown [Module Assessment]

A

A 50-word summary, 100 words on what was liked, 100 words on what could be improved, and a 100-word peer reply.

21
Q

Assessment 1 Deadline [Module Assessment]

A

Friday, April 24, 2026.

22
Q

Word Sketch [Sketch Engine]

A

A tool that generates categorized tables and visualizations of a word’s collocations, grouping them by syntactic relationships like modifiers, verbs, and prepositional phrases.

23
Q

Semantic Nuance via Modifiers [Corpus Linguistics]

A

Analyzing the specific adverbs that modify a word (e.g., “surprisingly warm”) to reveal idiomatic, metaphoric, or interpersonal meanings rather than literal definitions.

24
Q

Word Sketch Difference [Sketch Engine]

A

A comparative tool used to distinguish near-synonyms (like “hot” vs. “warm”) by contrasting their aggregate frequency counts and visually mapping their shared and exclusive collocations.

25
Binomial Sequencing Analysis [Corpus Linguistics]
Utilizing concordance lines to empirically prove the fixed, most frequent word order of paired items (e.g., confirming "bread and butter" while showing zero occurrences of "butter and bread").
26
Lemma [Corpus Linguistics]
A base linguistic form that groups together all morphological inflections for frequency counting (e.g., tallying "go," "goes," "going," and "went" as a single unit).
27
Filtered Word Lists [Sketch Engine]
The ability to generate frequency lists strictly restricted to specific grammatical categories, allowing researchers to isolate all prepositions or adverbs in a corpus.
28
LexTutor [Pedagogical Tools]
A web-based platform (lextutor.ca) utilizing corpora like the BNC and COCA to facilitate vocabulary assessment, testing, and materials development.
29
Automated Cloze Generation [Pedagogical Tools]
Using LexTutor to automatically create interactive gap-fill exercises by selectively removing vocabulary based on corpus frequency bands (e.g., post-1000 words) or specific parts of speech.
30
What is the primary objective of the 90% learner profile report? [Module Assessment]
To write a 2,000–3,000-word case study analyzing a learner's writing to empirically determine if they are accurately placed at the C1 CEFR level.
31
How must linguistic errors be formatted when coding the learner text? [Corpus Coding]
They must be systematically tagged using custom codes enclosed within angle brackets (e.g., <CODE>greens</CODE>, where CODE stands for the chosen error code).
32
What is required to accompany the error coding system in the final report? [Corpus Coding]
A clear key that explicitly defines what each custom error code represents.
33
What is the purpose of the competency analysis in the learner profile? [Learner Profiling]
To identify and evaluate what the learner does correctly, focusing on their successful command of advanced vocabulary, syntax, and pragmatic structures.
34
What is the Open Cambridge Learner Corpus (OpenCLC)? [Corpus Resources]
A 2.9-million-word subset of student exam responses used to cross-reference a learner's output against peers with similar demographic or linguistic backgrounds.
35
What is Text Inspector? [Corpus Tools]
A web-based analytical tool that estimates the CEFR level of a submitted text by calculating metrics such as lexical diversity, density, and academic word percentage.
36
What is the English Vocabulary Profile (EVP)? [Corpus Resources]
A reference database that profiles specific words, idioms, and phrases to indicate the typical CEFR proficiency level at which a learner can use them accurately.
37
What is the English Grammar Profile (EGP)? [Corpus Resources]
A reference database that maps syntactic and grammatical structures to specific CEFR levels using descriptive "can-do" statements.
38
What is the weighting and primary task of Assessment 2? [Module Assessment]
90% of the grade; writing a case study report profiling a learner named Joseph to determine if C1 is his correct proficiency level using corpus tools.
39
What are the demographic and linguistic background details of the case study learner "Joseph"? [Learner Profile]
A 50-year-old from the Democratic Republic of Congo whose first language is French, placed in a C1 English class.
40
What specific format is required for the Assessment 2 submission? [Module Assessment]
A 2,000–3,000-word case study report that utilizes screenshots of corpus tools to substantiate findings, rather than a traditional academic essay.
41
How must linguistic errors be strictly formatted for Sketch Engine to automatically count them? [Error Coding]
They must be enclosed in angle brackets with a custom code and a closing tag with a forward slash (e.g., <CODE>error</CODE>, where CODE stands for the chosen error code).
42
What is the functional purpose of the closing tag (e.g., </CODE>) in text markup? [Error Coding]
It explicitly indicates to the corpus software exactly where the erroneous text ends so frequencies can be automatically calculated.
43
What critical component must be included in the report alongside the coded text? [Error Coding]
A clear, tabular key that explicitly defines what each custom error code represents (e.g., which code stands for a missing article).
44
Why is 40% of the Assessment 2 grade dedicated to competency analysis? [Learner Profiling]
To ensure the evaluation empirically identifies what the learner does successfully (e.g., advanced syntax, pragmatics) rather than solely focusing on their mistakes.
45
What specific linguistic elements demonstrate communicative competence even when surrounding grammar is flawed? [Learner Profiling]
The accurate use of de-lexical verbs, collocations, idiomatic language, discourse markers, or academic distancing phrases.
46
What metrics does Text Inspector evaluate to estimate the CEFR level of a submitted text? [Corpus Tools]
Lexical diversity, density, and the percentage of academic vocabulary and pragmatic discourse markers used.
47
How does the "Tagger" function in Text Inspector facilitate grammatical analysis? [Corpus Tools]
It categorizes the entire text by specific parts of speech (e.g., isolating all coordinating conjunctions) to evaluate structural complexity.
48
What is the primary function of the English Vocabulary Profile (EVP) and English Grammar Profile (EGP)? [Corpus Tools]
To cross-reference and verify the specific CEFR proficiency level of individual words, idioms, or syntactic structures used by a learner.
49
How is the Open Cambridge Learner Corpus (OpenCLC) utilized in comparative learner profiling? [Corpus Tools]
To compare the frequency of a specific learner's lexical choices against a vast dataset of other learners segmented by age, first language, and CEFR level.
50
What pedagogical principle dictates that language should be taught as phrases and strings rather than isolated single words? [Corpus Theory]
The concept that "words hunt in packs" and do not occur alone. [J.R. Firth, 1957]
51
What is a de-lexical verb? [Lexical Semantics]
A verb that carries little independent meaning but forms a meaningful semantic phrase when combined with a specific noun (e.g., "make an effort," "have a party").
52
What is a multi-word verb? [Lexical Semantics]
A verb consisting of a main verb and one or more particles that together create a distinct, often non-literal meaning (e.g., "take after").
53
What phrase is cited as a prime example of an idiomatic chunk where the individual words lose their independent meaning when separated? [Lexical Semantics]
"Of course." [John Sinclair, 1991]
54
What does the idiom "take him to the cleaners" mean? [Idiomaticity]
To sue someone and legally take all of their money.
55
Why should English language learners be encouraged to make "lots and lots of new mistakes"? [Pedagogy]
Because making new errors, rather than repeating fossilized ones, is essential for language acquisition and provides teachers with clear instructional focus. [Liz Regan, 2003]
56
What specific linguistic behavior marks a learner as advanced (C1-C2 or IELTS Band 7+) across assessment frameworks? [Language Assessment]
The accurate and frequent use of figurative language strings, de-lexical verbs, collocations, and idioms.
57
What analogy is used to describe the dual perspective of seeing learner errors versus learner competence? [Learner Language Evaluation]
The duck-rabbit illusion, illustrating that it is often easier to see the "duck" (errors) than the "rabbit" (competence).
58
What is contrastive analysis? [Applied Linguistics History]
A 1930s-1950s behaviorist approach that compared a learner's L1 and L2 to predict difficulties based on structural similarities and differences. [Robert Lado]
59
What term describes the phenomenon where structural similarities between an L1 and L2 facilitate language learning? [Contrastive Analysis]
Positive transfer. [Robert Lado]
60
What term describes the phenomenon where structural differences between an L1 and L2 cause learning difficulties? [Contrastive Analysis]
Negative transfer (or interference). [Robert Lado]
61
Which language teaching method was directly informed by contrastive analysis and relied on habit formation and repetitive drills? [Language Pedagogy]
The Audiolingual method.
62
Who advocated for error analysis in the 1960s, viewing errors not as bad habits but as "windows into the learning process"? [Applied Linguistics History]
Stephen Pit Corder.
63
What are developmental errors? [Error Analysis]
Intralingual errors that are common to both first and second language acquisition (e.g., overgeneralizing the past tense to say "eated"). [Stephen Pit Corder]
64
What are interlingual errors? [Error Analysis]
Errors caused by the direct interference or negative transfer from the learner's first language. [Stephen Pit Corder]
65
What is interlanguage? [Second Language Acquisition]
The concept that learner language is not just an imperfect copy of the target language, but a developing, rule-governed linguistic system in its own right.
66
Who established the first systematic electronic collection of learner language (the International Corpus of Learner English)? [Corpus Linguistics]
Sylviane Granger.
67
What is corpus-based competency analysis? [Corpus Linguistics]
The systematic use of learner corpora to describe and identify what learners are successfully able to do at different proficiency levels, rather than just what they do wrong.
68
What is the English Vocabulary Profile (EVP)? [Corpus Resources]
A database developed by Cambridge that identifies the specific CEFR level at which learners typically acquire certain words and their varying senses (e.g., "live in" at A1 vs. "live on" at B2). [Annette Capel]
69
What is the English Grammar Profile (EGP)? [Corpus Resources]
A database containing over 1,200 descriptors that map specific grammatical structures and competencies to their corresponding CEFR levels. [Anne O'Keeffe and Geraldine Mark]
70
What tool integrates the English Vocabulary Profile to automatically analyze the percentage of words in a learner's text that belong to each CEFR level? [Corpus Tools]
Text Inspector.
71
What massive database contains 55 million words of learner exam scripts, 32 million of which are error-coded? [Corpus Resources]
The Cambridge Learner Corpus.
72
What is the accuracy-complexity trade-off? [Second Language Acquisition]
The phenomenon where learners taking risks to produce more complex, higher-level language structures temporarily experience an increase in their error rate. [Jennifer Tuveson]
73
Why is stabilization in advanced (C1/C2) learners not considered "linguistic rigor mortis"? [Second Language Acquisition]
Because continued errors at advanced levels often indicate ongoing risk-taking and organic language development rather than a cessation of learning. [Diane Larsen-Freeman]
74
What is the Cambridge Learner Corpus? [Corpus Resources]
A dataset containing over 55 million words from 200,000 Cambridge exam scripts, featuring 32 million manually error-coded words across 143 first languages.
75
What is the Open Cambridge Learner Corpus? [Corpus Resources]
A publicly accessible subset containing nearly 3 million words that is aligned with CEFR levels but lacks manual error coding.
76
Why is tracking both the "exam taken" and "performance achieved" metadata critical for analyzing learner data? [Corpus Methodology]
Because a learner might sit for an exam at one CEFR level (e.g., B2) but actually perform at a higher or lower true proficiency level (e.g., C1 or B1).
77
What is Corpus Query Language (CQL)? [Corpus Tools]
A specialized syntax used in software like Sketch Engine to conduct advanced searches using wildcards, regular expressions, and Part-of-Speech tags.
78
How do you construct a CQL query to find any word beginning with the letters "wh"? [Corpus Query Language]
[word="wh.*"]
79
How do you construct a CQL query to find any word ending with the letters "ing"? [Corpus Query Language]
[word=".*ing"]
80
What CQL tag is used to search strictly for regular adverbs? [Corpus Query Language]
tag="RB"
81
What CQL tag is used to search for regular adverbs as well as their comparative and superlative forms? [Corpus Query Language]
tag="RB.*"
82
What CQL tag is used to isolate all nouns? [Corpus Query Language]
tag="N.*"
83
What CQL tag is used to isolate all adjectives? [Corpus Query Language]
tag="J.*"
84
How do you construct a CQL query to find a sequence of an adverb followed immediately by an adjective? [Corpus Query Language]
[tag="RB"] [tag="J.*"]
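The logic of such a sequence query can be imitated in Python over a list of tagged tokens. This is a toy emulation, not Sketch Engine itself: the `find_sequences` function and the sample sentence are hypothetical, the tags are assumed to be Penn-style, and each pattern is assumed to have to match the whole attribute value (hence `re.fullmatch`).

```python
import re


def find_sequences(tagged_tokens, query):
    """Return word sequences where each position fully matches its pattern.

    `tagged_tokens` is a list of {"word": ..., "tag": ...} dicts and
    `query` is a list of (attribute, regex) pairs, one per position.
    """
    hits = []
    for start in range(len(tagged_tokens) - len(query) + 1):
        window = tagged_tokens[start:start + len(query)]
        if all(re.fullmatch(pattern, tok[attr])
               for tok, (attr, pattern) in zip(window, query)):
            hits.append([tok["word"] for tok in window])
    return hits


sentence = [
    {"word": "a", "tag": "DT"},
    {"word": "surprisingly", "tag": "RB"},
    {"word": "warm", "tag": "JJ"},
    {"word": "morning", "tag": "NN"},
]
# An adverb followed immediately by any adjective tag.
print(find_sequences(sentence, [("tag", "RB"), ("tag", "J.*")]))
```

Under the whole-value assumption, a pattern such as `RB.` would require exactly one extra character after `RB` and so would not match the plain `RB` tag, whereas `RB.*` matches `RB` and any longer tag beginning with it.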
85
How does the use of adverbs developmentally shift from B1 to C2 proficiency? [Learner Language Development]
B1 learners rely heavily on basic intensifiers (e.g., "very," "really"), whereas C2 learners utilize greater lexical variety, hedging, and discourse organization (e.g., "especially," "moreover").
86
What is error coding in the context of learner corpora? [Corpus Methodology]
The systematic process of identifying and marking errors within a text using a specific bracketed syntax so corpus software can automatically tabulate them.
87
How do you correctly format the opening tag when coding an error? [Error Coding Syntax]
Insert angle brackets containing a chosen symbol representing the error type immediately before the erroneous text (e.g., <CA> for a capitalization error).
88
How do you correctly format the closing tag when coding an error? [Error Coding Syntax]
Insert angle brackets containing a forward slash followed by the same error symbol immediately after the erroneous text (e.g., </CA> for a capitalization error).
89
What is the functional purpose of the closing tag (e.g., </CA>)? [Error Coding Syntax]
It explicitly indicates to the corpus software exactly where the error sequence ends, allowing for accurate automated counting and frequency analysis.
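To see why paired tags make automated counting possible, the tallying step can be sketched with a regular expression. The coded sentence is invented, and CA and AG are used purely as illustrative codes; this is not how any particular corpus tool processes markup.

```python
import re
from collections import Counter

# Hypothetical coded learner text; CA and AG are illustrative codes only.
coded = "<CA>ireland</CA> is where she <AG>live</AG> and <AG>work</AG>."

# Match an opening tag, the erroneous text, and the matching closing tag.
pattern = re.compile(r"<(\w+)>(.*?)</\1>")
errors = [(m.group(1), m.group(2)) for m in pattern.finditer(coded)]
totals = Counter(code for code, _ in errors)
print(totals)
```

Without the closing tag, the pattern would have no way of knowing where each erroneous stretch ends, so neither the error text nor reliable totals could be recovered.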
90
What critical document must you create and submit alongside your coded text? [Corpus Methodology]
A comprehensive key or table that explicitly defines what each custom error code represents, ideally including an example sentence for each.
92
What must you include in your final report regarding your chosen coding system? [Corpus Methodology]
A reflective rationale explaining the analytical decisions behind your specific coding design, justified by the patterns you observed in the data.
93
What is a lemma? [Lexical Semantics]
The foundational base form of a word (e.g., "run") that serves as the root for all of its grammatical and morphological variants.
94
What are inflectional morphemes? [Morphology]
Grammatical markers or affixes added to a word to indicate tense, aspect, number, or comparison (e.g., the "-s" in "runs" or the "-er" in "happier").
95
What kind of error occurs when a learner fails to follow words like "must" or "should" with a base verb? [Error Types]
A modal verb error.
96
What defines a run-on sentence or comma splice? [Error Types]
The grammatical error of joining two fully independent clauses with only a comma, rather than separating them with a period or semicolon or linking them with a comma plus a coordinating conjunction.
97
What kind of error is occurring when a learner writes "Republic of ireland" instead of "Republic of Ireland"? [Error Types]
A capitalization or proper noun error.
98
What is a word form error? [Error Types]
A morphological mistake where the incorrect grammatical variant of a word is used, such as writing the adjective "economical" when the adverb "economically" is required.
99
What error is occurring if a learner writes "damage that smoking can cause in human health" instead of "to human health"? [Error Types]
A preposition error.
100
What are the five sequential steps in the error coding workflow? [Corpus Methodology]
1. Scan the text for recurring patterns. 2. Design a draft coding key. 3. Systematically code the texts. 4. Compute the totals using corpus software. 5. Analyze the resulting data to draft the final report.
101
Mistake [Learner Error Analysis]
An unintentional performance-related lapse or slip where the learner possesses the underlying linguistic competence and can typically self-correct.
102
Error [Learner Error Analysis]
A systematic manifestation of a deficit in interlanguage competence requiring pedagogical intervention, as the learner lacks the underlying knowledge to self-correct.
103
Affective Filter Hypothesis [Applied Linguistics]
Stephen Krashen's theory stating that students learn best in low-anxiety environments where motivation is high and errors are treated as a natural part of acquisition.
104
Capitalization (CA) error [Error Categories]
Failing to capitalize the first word of a sentence or a proper noun.
105
Subject-Verb Agreement (AG) error [Error Categories]
Failing to match the subject and verb in number, such as omitting the third-person singular "s".
106
Pluralization (PL) error [Error Categories]
Incorrectly pluralizing non-count nouns, such as writing "furnitures" or "evidences".
107
Parallelism (PRL) error [Error Categories]
Lacking structural consistency within a series or list, such as mixing gerunds and infinitives.
108
Fragmentation (FRG) error [Error Categories]
Punctuating a dependent subordinate clause as if it were a complete, independent sentence.
109
Relative Pronoun (PRO) error [Error Categories]
Confusing animate pronouns (who/that) with inanimate pronouns (which).
110
Gender Balance error [Error Categories]
Using exclusively male pronouns to refer to generic nouns, rather than using gender-inclusive forms.
111
Reflective Rationale [Learner Profile Report]
A section reflecting on the challenges faced while designing the error coding system and justifying the chosen codes.
112
General Observation [Learner Profile Report]
Broad introductory statements characterizing a specific error type before presenting detailed analysis.
113
Data Reporting [Learner Profile Report]
Presenting quantitative data drawn from corpus software, such as tables showing error frequencies among different students.
114
Deep Analysis & Sub-categorization [Learner Profile Report]
Providing qualitative analysis by breaking down errors into specific patterns observed in concordance lines, evidenced by screenshots.
115
Pedagogical Recommendations [Learner Profile Report]
Suggesting specific instructional interventions to help the learner improve based on the corpus analysis findings.
116
What questions about language learning are corpus linguistics tools generally poorly suited to answer? [Corpus Linguistics Limitations]
Questions regarding the cognitive storage of vocabulary (the mental lexicon) and the psychological motivations of learners. [McCarthy, 2026]
117
Why is a raw frequency list of a large corpus largely ineffective for identifying a core lexical vocabulary? [Corpus Methodology]
Because it will be overwhelmingly dominated by grammatical function words (e.g., "the," "of," "and") rather than lexical content words. [McCarthy, 2026]
118
What is a stop list? [Corpus Methodology]
A finite list of known grammatical function words that the computer is instructed to ignore when generating a lexical frequency list. [McCarthy, 2026]
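The effect of a stop list on a frequency count can be shown with a very small sketch. The `STOP_LIST` contents and the sample text are made up for illustration; a real stop list would be far longer.

```python
from collections import Counter

# A tiny illustrative stop list; real lists are much longer.
STOP_LIST = {"the", "of", "and", "a", "in", "to"}

tokens = "the history of the war and the end of the war".split()
raw = Counter(tokens)
lexical = Counter(t for t in tokens if t not in STOP_LIST)
print(raw.most_common(1))      # a function word tops the raw list
print(lexical.most_common(1))  # a content word tops the filtered list
```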
119
What is lemmatization? [Corpus Methodology]
The process of grouping words by their base form (lemma) so that inflected variants (e.g., take, took, taken) are counted as a single lexical unit. [McCarthy, 2026]
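A minimal sketch of the idea, assuming a hand-made lookup table (`LEMMA_OF`, hypothetical) in place of the morphological resources a real lemmatizer would use:

```python
from collections import Counter

# Hand-made lookup for illustration; real tools use full morphological lexicons.
LEMMA_OF = {"take": "take", "takes": "take", "took": "take",
            "taken": "take", "taking": "take"}

tokens = "she took it and he takes it but they have taken it".split()
# Words missing from the lookup are counted under their own form.
lemma_counts = Counter(LEMMA_OF.get(t, t) for t in tokens)
print(lemma_counts["take"])
```

Counting by surface form would instead record "took", "takes", and "taken" as three separate items with one occurrence each.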
120
What distinguishes a word family from a lemma? [Lexicology]
A word family includes not only the base form and its inflections but also its derivations (prefixes and suffixes), such as grouping "war," "pre-war," and "post-war." [McCarthy, 2026]
121
What is a de-lexical verb? [Lexical Semantics]
A highly frequent verb (e.g., get, go, make, take) that possesses very little independent lexical content and instead derives its meaning from the words it collocates with. [McCarthy, 2026]
122
What primary linguistic elements comprise the approximately one-third of vocabulary that differs between spoken and written corpora? [Sociolinguistics]
Words used to create and maintain interpersonal social relations, and words that form cemented multi-word chunks. [McCarthy, 2026]
123
What is a collocation? [Lexical Semantics]
A frequent pairing of words that maintain their associative strength even when separated by other intervening words (e.g., "blonde, beautiful, long hair"). [McCarthy, 2026]
124
What is a lexical chunk? [Lexical Semantics]
A multi-word item that is firmly cemented together in a specific sequence, where altering or interrupting the sequence destroys its meaning (e.g., "by the way"). [McCarthy, 2026]
125
What is corpus coverage? [Corpus Methodology]
The percentage of a given text or corpus that is accounted for by a specific number of the most frequent words. [McCarthy, 2026]
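Coverage can be computed by summing the counts of the most frequent word types and dividing by the total number of running words. The `coverage` function and the sample text below are a sketch under that definition, not a reproduction of any published calculation.

```python
from collections import Counter


def coverage(tokens, top_n):
    """Share of running words accounted for by the top_n most frequent types."""
    counts = Counter(tokens)
    covered = sum(count for _, count in counts.most_common(top_n))
    return covered / len(tokens)


tokens = "the cat and the dog and the bird".split()
print(f"{coverage(tokens, 2):.1%}")
```

On a real corpus the same function would be called with values such as 2,000 to see how much of the text the most frequent words account for.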
126
What percentage of a typical corpus is covered by the first 2,000 most frequent words? [Vocabulary Acquisition]
Approximately 83%. [McCarthy, 2026]
127
What percentage of corpus coverage is generally required for a learner to read or listen comfortably without losing motivation? [Vocabulary Acquisition]
Approximately 95%, which requires knowledge of roughly the top 10,000 words. [McCarthy, 2026]
128
Since teaching 10,000 words in a standard classroom is unrealistic, what is the primary pedagogical strategy for vocabulary acquisition? [Language Pedagogy]
Equipping students with independent vocabulary learning strategies, particularly training them to learn words in multi-word chunks rather than as single isolated units. [McCarthy, 2026]
129
Why is learning vocabulary in multi-word chunks considered a vital cognitive strategy for language learners? [Psycholinguistics]
Because the mental lexicon stores chunks rather than just single words, and retrieving chunks allows for instantaneous access and greater speaking fluency. [McCarthy, 2026]
130
What prerequisite knowledge is required to extract accurate and useful corpus data using an Artificial Intelligence platform? [Artificial Intelligence Tools]
A solid foundational understanding of corpus linguistics, which is necessary to write precise prompts for specific outputs like "keyword lists" or "collocations." [McCarthy, 2026]
131
Market-driven nature of ELT publishing [Materials Development]
Publishers commission books based on commercial market research and teacher expectations rather than academic linguistic research. [Burton, 2026]
132
Coursebook vs. Novelist production model [Materials Development]
Unlike novelists, ELT authors are commissioned by publishers to write to a specific concept under massive time pressure (often less than one year). [Burton, 2026]
133
Conditional construction discrepancy [Grammar Misrepresentation]
The standard ELT 4-type system accounts for only 44% of real-world "if" clauses found in the BNC. [Gabrielatos; Burton, 2026]
134
Hypothetical vs. Past "if" clauses [Grammar Misrepresentation]
Coursebooks teach "if + past simple" as hypothetical; however, 1/3 of corpus examples refer to actual past events (e.g., "If I offended you, I'm sorry"). [Burton, 2026]
135
Present Perfect over-representation [Grammar Misrepresentation]
Coursebooks over-rely on "since" and "for" while ignoring the highly frequent co-occurrence of the adverb "now" with the present perfect. [Shortall; Burton, 2026]
136
Quotatives in reported speech [Spoken Grammar]
Coursebooks focus on "said/asked," while corpora show high frequencies of spoken introducers like "be like," "be all," or "go." [Burton, 2026]
137
Ellipsis in spoken discourse [Spoken Grammar]
The natural omission of subjects or auxiliary verbs (e.g., "[I] didn't know you used boiling water") commonly found in corpora but ignored in "clean" coursebook dialogues. [McCarthy & Carter; Burton, 2026]
138
Fronted Topic [Spoken Grammar]
Placing an extra element at the start of a clause for focus (e.g., "This friend of mine, her son..."), a feature common in speech but absent from materials. [McCarthy & Carter; Burton, 2026]
139
Continuative/Comment "which" clauses [Spoken Grammar]
Using "which" to add a comment on a previous statement (e.g., "...which is nice"); these make up 70% of spoken relative clauses but are treated as "advanced" or secondary in books. [Tao & McCarthy; Burton, 2026]
140
Three-part exchange structure [Discourse Analysis]
Authentic conversation follows a "Question-Answer-Comment" pattern; coursebook dialogues often skip the third "Comment" part, making them sound robotic. [McCarthy & Carter; Burton, 2026]
141
Inauthenticity of transactional dialogues [Discourse Analysis]
Real-world transactions contain "messy" features like false starts, overlaps, and back-channeling, which are typically removed from "clean" coursebook recordings. [Gilmore; Burton, 2026]
142
Arbitrary lexical selection [Lexis Misrepresentation]
Studies show as little as 1% overlap in lexical phrases across different coursebooks, suggesting authors rely on intuition rather than principled corpus data. [Koprowski; Burton, 2026]
143
Barriers to corpus use among authors [Materials Development]
Surveyed authors cited a lack of time, lack of access to software, and a perceived lack of technical expertise as reasons for not using corpora. [Burton, 2012/2026]
144
The "Disconnect" in ELT [Applied Linguistics]
The gap between academic linguists (who suggest changes based on data) and publishers (who only change if the market demands it). [Burton, 2026]
145
English Profile Project impact [Materials Development]
Newer series like "Cambridge English Empower" use EVP/EGP data to select level-appropriate meanings of words (e.g., "just") based on actual learner performance. [Burton, 2026]
146
What characterizes the "canon" at the advanced level (C1-C2) compared to lower levels? [Learner Levels]
A lack of consensus on scope and sequence; grammar often becomes a "rag bag" of obscure items rather than a clear progression. [McCarthy, 2026]
147
What is the "ever-reducing returns" problem in advanced vocabulary? [Vocabulary Acquisition]
After the core 2,000–3,000 words, additional words are statistically rare and provide very little additional text coverage. [McCarthy, 2026]
148
Paradigmatic [Linguistics Theory]
The vertical axis of choice within a set (e.g., choosing "red" instead of "blue"); these meaning contrasts dominate lower-level learning. [Halliday; McCarthy, 2026]
149
Syntagmatic [Linguistics Theory]
The horizontal axis of structure and combination (e.g., collocation and word order); this becomes the primary focus for advanced learners. [Halliday; McCarthy, 2026]
150
Lexicogrammar [Linguistics Theory]
Michael Halliday’s concept that grammar and lexis are not separate but exist on a single continuum. [Halliday; McCarthy, 2026]
151
Grammaticalization [Linguistics Theory]
The process where a grammatical structure freezes into a routine, formulaic lexical chunk (e.g., "you know" becoming a discourse marker). [McCarthy, 2026]
152
Irreversible Binomials [Lexical Semantics]
Two-word phrases connected by "and" with a fixed, unchangeable order (e.g., "pros and cons," "wear and tear"). [McCarthy, 2026]
153
What common fossilized error do advanced learners make with irreversible binomials? [Learner Errors]
Incorrect word order, such as saying "white and black" instead of "black and white." [McCarthy, 2026]
154
How can learning the Academic Word List (AWL) help overcome the vocabulary coverage dilemma? [Vocabulary Acquisition]
Mastering the 570 word families in the AWL can increase text coverage by 10%, a massive gain compared to learning random rare words. [McCarthy, 2026]
155
Attrition vs. Deactivation [Psycholinguistics]
The phenomenon where older, core knowledge is not lost permanently but becomes "pushed out" or less available as the brain is crowded with new, advanced items. [McCarthy, 2026]
156
What is the assumption-based present usage of the Future Perfect (will have been)? [Grammar Functions]
Using the form to make assumptions about the present or past knowledge (e.g., "You will have heard the news") rather than projecting into the future. [McCarthy, 2026]
157
Imperative Conditional [Grammar Patterns]
A non-standard conditional structure that uses a command to express a condition (e.g., "Go into any shop and you will see..."). [McCarthy, 2026]
158
What three patterns of the English Subjunctive are attested in corpora? [Grammar Patterns]
1. Verb patterns (insist that he go); 2. Noun patterns (requirement that he wear); 3. Adjective patterns (crucial that he be). [McCarthy, 2026]
159
Nominalization (Grammatical Metaphor) [Academic Writing]
The process of transforming verbs or adjectives into nouns (e.g., "I decided" → "My decision"), which is statistically linked to higher grades and academic success. [Halliday; McCarthy, 2026]
160
Double Modality [Academic Writing]
The pairing of a modal verb with a modal adverb (e.g., "could possibly," "might well") to achieve greater stylistic sophistication in writing. [McCarthy, 2026]
161
What is the "Flip Test" in ELT publishing? [Materials Development]
A marketing reality where teachers reject advanced textbooks if they don't see traditional (though infrequent) items like the subjunctive when flipping through the pages. [McCarthy, 2026]
162
What is the CEFR framework? [Language Assessment]
A universal scale (A1 to C2) that defines what learners "can do" at each language proficiency level.
163
What communication skills characterize the A1/A2 (Basic) levels? [Language Assessment]
Introducing oneself and talking about daily routines using the present simple tense and frequency adverbs.
164
What communication skills characterize the B1/B2 (Independent) levels? [Language Assessment]
Describing experiences, dreams, and plans, as well as discussing abstract topics like politics or technology.
165
What communication skills characterize the C1/C2 (Proficient) levels? [Language Assessment]
Expressing complex ideas clearly in professional or academic settings and understanding nuances, slang, and jokes.
166
What tool determines the specific CEFR level of a vocabulary word based on its exact meaning or usage context? [Corpus Tools]
The English Vocabulary Profile (EVP).
167
Why might a single word like "head" appear multiple times in the EVP across different CEFR levels? [Corpus Tools]
Because its CEFR level changes depending on whether it is used as a basic noun (A1), an abstract noun (B1), a verb (B2), or an idiom (C2).
168
How can teachers utilize the EVP to design level-appropriate course materials? [Corpus Methodology]
By generating topic-specific word lists restricted to a specific CEFR level (e.g., retrieving only C1 words related to politics).
169
What software is used to gauge a learner's vocabulary proficiency by calculating the percentage of words belonging to each CEFR level? [Corpus Tools]
Text Inspector.
170
What is the difference between a "token" and a "type" in corpus linguistics? [Lexical Analysis]
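A token is each running word in a text, counted every time it occurs, so the token count is the total number of words including repetitions; a type is each distinct word form, counted only once, so the type count is the number of unique words.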
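The distinction can be illustrated with a minimal Python sketch. The tokenisation rule (lowercase, letters and apostrophes only) and the sample sentence are assumptions made for illustration; real corpus tools apply their own tokenisation rules.

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z']+", text.lower())

def type_token_counts(text):
    """Return (number of tokens, number of types) for a text."""
    tokens = tokenize(text)   # every running word, repetitions included
    types = set(tokens)       # each distinct word form, counted once
    return len(tokens), len(types)

# Invented sample sentence: "the" occurs three times and "sat" twice.
sample = "The cat sat on the mat and the dog sat too"
n_tokens, n_types = type_token_counts(sample)
print(n_tokens, n_types)
```

Here the repeated words add to the token count but not to the type count.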
171
What does it indicate if 98% of a learner's text belongs to the top 2,000 most frequent BNC words? [Lexical Analysis]
It indicates a very basic vocabulary with low word coverage of specialized or lower-frequency terms.
172
What tool is used to measure the lexical density of a text? [Corpus Tools]
Lexical Tutor (LexTutor).
173
What are content words? [Lexical Analysis]
Words that carry meaning, specifically nouns, verbs, adjectives, and adverbs.
174
What are function words? [Lexical Analysis]
Words that provide grammatical structure, such as prepositions, pronouns, and conjunctions.
175
What is lexical density? [Lexical Analysis]
The ratio of content words to the total number of words in a text.
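As a rough illustration, the ratio can be sketched in Python. Tools such as LexTutor identify content words with their own resources; the short `FUNCTION_WORDS` set below is a hypothetical stand-in, and the sample sentence is invented.

```python
import re

# Hypothetical, deliberately short function-word list (an assumption for this sketch).
FUNCTION_WORDS = {
    "the", "a", "an", "and", "or", "but", "of", "in", "on", "at", "to",
    "for", "with", "by", "from", "i", "you", "he", "she", "it", "we",
    "they", "is", "are", "was", "were", "be", "that", "this",
}

def lexical_density(text):
    """Return content words divided by total words (0.0 for an empty text)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(content) / len(tokens)

sample = "The students analysed the corpus and found frequent patterns"
print(f"{lexical_density(sample):.2f}")
```

Six of the nine tokens in the sample are content words, so the density is about 0.67.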
176
What does a high lexical density indicate about a learner's language proficiency? [Lexical Analysis]
It indicates a more advanced, sophisticated vocabulary compared to an over-reliance on basic function words.
177
What tool maps specific grammatical structures and syntactical patterns to their corresponding CEFR levels? [Corpus Tools]
The English Grammar Profile (EGP).
178
How do you empirically prove a learner's grammatical competence using Sketch Engine and the EGP? [Corpus Methodology]
By finding specific grammatical constructions in the learner's text via Sketch Engine and consulting the EGP to verify what CEFR level those constructions represent.
179
What visual evidence should be included in a learner profile report to prove a student is utilizing specific grammar structures or vocabulary? [Corpus Methodology]
Screenshots of concordance lines generated from Sketch Engine.
180
What must a learner profile report provide after presenting the numeric data and discussing the corpus analysis? [Language Assessment]
Pedagogical recommendations for the teacher on how to practically enhance the learner's competences.
181
Competence-based Approach [Corpus Linguistics]
A pedagogical and analytical shift focusing on what learners can achieve linguistically (the "glass half full") rather than focusing on learner errors.
182
Types [Linguistic Metrics]
The count of distinctive, unique words in a text excluding all repetitions.
183
Tokens [Linguistic Metrics]
The total number of words in a text, including every instance of repeated functional and content words.
184
Lexical Density [Linguistic Metrics]
The proportion of content words (nouns, verbs, adjectives, adverbs) relative to the total number of words in a sample.
185
English Vocabulary Profile (EVP) [Applied Linguistics]
An online resource that identifies which specific words and meanings are typically mastered at each CEFR level.
186
English Grammar Profile (EGP) [Applied Linguistics]
A resource that maps grammatical structures and learner usage patterns to the corresponding CEFR proficiency levels.
187
Text Inspector [Corpus Software]
A tool used to analyze learner data by providing a percentage breakdown of vocabulary across CEFR levels (A1–C2).
188
LexTutor [Corpus Software]
A platform used to measure lexical density and provide vocabulary profiles using tools like VP Classic.
189
Corpus Matrix [Research Design]
A systematic planning document defining variables and specifications (e.g., L1, medium of instruction, age) for building a research corpus.
190
Word Coverage [Lexical Analysis]
The measure of how much of a learner's text falls within a specific frequency benchmark, such as the BNC 2,000 most frequent words.
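A coverage figure of this kind can be sketched as a percentage of tokens found in a benchmark list. The `toy_list` below is a hypothetical four-word stand-in for a real frequency list such as the BNC 2,000, and the sentence is invented.

```python
def coverage(tokens, wordlist):
    """Return the percentage of tokens that appear in the benchmark word list."""
    if not tokens:
        return 0.0
    covered = sum(1 for t in tokens if t in wordlist)
    return 100 * covered / len(tokens)

# Hypothetical benchmark list; a real one would hold about 2,000 items.
toy_list = {"the", "manager", "will", "risk"}
tokens = "the manager will mitigate the risk".split()
print(f"{coverage(tokens, toy_list):.1f}%")
```

Five of the six tokens are on the list, so only "mitigate" falls outside the benchmark.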
191
What is the primary function of Google Ngram Viewer? [Corpus Tools]
A diachronic analysis tool tracking the frequency of words or phrases across digitized books from the 1500s to 2022.
192
How must multiple search terms be formatted in Google Ngram Viewer to generate a comparative timeline? [Corpus Tools]
They must be typed sequentially, separated by commas, with no spaces between the terms.
193
How can Google Ngram Viewer be applied pedagogically for historical linguistic analysis? [Language Pedagogy]
Students can investigate the emergence of idioms, compare usage timelines of semantically related terms, and click specific chronological periods to reveal actual source texts.
194
What is SkELL? [Corpus Tools]
A free, streamlined interface derived from Sketch Engine, designed to allow language learners to extract and analyze corpus data without the complexity of the full software suite.
195
How do concordance lines in SkELL facilitate inductive learning? [Language Pedagogy]
They display target phrases within authentic sentence contexts, allowing students to independently deduce meaning and identify strict syntactical positioning.
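A concordance display of this kind can be approximated with a short keyword-in-context (KWIC) sketch. This is not how SkELL itself is implemented; the function name `kwic`, the context width, and the two sample sentences are assumptions made for illustration.

```python
import re

def kwic(text, phrase, width=3):
    """Return (left context, node, right context) for each hit of `phrase`."""
    tokens = re.findall(r"[A-Za-z']+", text)
    target = phrase.lower().split()
    n = len(target)
    lines = []
    for i in range(len(tokens) - n + 1):
        if [t.lower() for t in tokens[i:i + n]] == target:
            left = " ".join(tokens[max(0, i - width):i])
            node = " ".join(tokens[i:i + n])
            right = " ".join(tokens[i + n:i + n + width])
            lines.append((left, node, right))
    return lines

# Invented sample text containing the node phrase twice.
sample = "The offer came out of the blue. She called me out of the blue yesterday."
for left, node, right in kwic(sample, "out of the blue"):
    print(f"{left:>20}  [{node}]  {right}")
```

Because this sketch ignores punctuation, the context window can run across a sentence boundary; a fuller version would split the text into sentences first.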
196
What specific SkELL feature provides a summary of a word's collocational behavior by categorizing typical verbs, nouns, and modifiers? [Corpus Tools]
Word Sketches.
197
What is the function of the "Similar Words" feature in SkELL? [Corpus Tools]
It generates visual clusters of synonyms and near-synonyms based purely on their structural and distributional behavior within the corpus.
198
What is Write and Improve? [Assessment Tools]
An automated, corpus-driven writing evaluation platform powered by the Cambridge Learner Corpus designed for real-time formative assessment.
199
How does Write and Improve quantify and track a learner's writing proficiency? [Assessment Tools]
It calculates and assigns a CEFR proficiency level to submitted text, highlights syntactic and lexical errors, and plots longitudinal progress on a visual graph.
200
Which two major native-speaker corpora primarily power the diagnostic tools available on LexTutor? [Corpus Tools]
The British National Corpus (BNC) and the Corpus of Contemporary American English (COCA).
201
What is the primary purpose of the Vocabulary Levels Test on LexTutor? [Assessment Tools]
To assess a learner's receptive vocabulary knowledge across distinct frequency bands, advancing from the first 2,000 words up to the 10,000-word level or the University Word List.
202
Which specific LexTutor assessment provides a quantitative estimate of the absolute total number of English words a learner possesses in their mental lexicon? [Assessment Tools]
The Vocabulary Size Test.
203
What is the function of the Phrase Profiler tool within LexTutor? [Corpus Tools]
It parses submitted text to extract and categorize multi-word units, cross-referencing them against established databases such as the Academic Collocations List and specific idiomatic inventories.
204
What is Diachronic Analysis? [Linguistic Terminology]
The study and tracking of how language, specifically the frequency and usage of words or phrases, evolves over a timeline.
205
What is Inductive Learning? [Language Pedagogy]
A discovery-based approach where learners are exposed to raw data and must independently deduce the underlying linguistic rules, meanings, or patterns.
206
What is Receptive Vocabulary Knowledge? [Vocabulary Acquisition]
The lexical items that a learner can recognize and comprehend during reading or listening tasks, distinct from the vocabulary they can actively produce.
207
What is the primary function of Google Ngram Viewer? [Corpus Linguistics Tools]
A diachronic analysis tool tracking the frequency of words or phrases across digitized books from the 1500s to 2022.
208
How must multiple search terms be formatted in Google Ngram Viewer to generate a comparative timeline? [Corpus Linguistics Tools]
They must be typed sequentially, separated by commas, with no spaces between the terms (e.g., Albert Einstein,Sherlock Holmes).
209
How can Google Ngram Viewer be applied for literary analysis in the classroom? [Corpus Pedagogy]
Students can query terms from required reading and inspect the chronological timeline to reveal the exact source texts, publications, and page numbers from specific historical eras.
210
What sociolinguistic trend did the Ngram query for "dirty cop" reveal in the lecture? [Corpus Analysis]
The phrase had near-zero historical frequency but spiked exponentially between 1980 and 2000, allowing students to analyze modern discourse on corruption.
211
What diachronic relationship did the Ngram query for "empathy" and "sympathy" reveal? [Corpus Analysis]
"Sympathy" maintained consistent historical usage, whereas "empathy" emerged almost entirely post-1960.
212
What specific phrase demonstrated cyclical historical fluctuations before spiking in the 1990s via an Ngram query? [Corpus Analysis]
"Raining cats and dogs."
213
What is SkELL? [Corpus Linguistics Tools]
A free, streamlined interface derived from Sketch Engine, designed for language learners to extract corpus data without complex querying syntax.
214
How do concordance lines in SkELL facilitate inductive learning? [Corpus Pedagogy]
They display target phrases within authentic sentence contexts, allowing learners to independently deduce semantic meaning and syntactical positioning without explicit teacher definition.
215
What syntactical positioning rule was deduced using SkELL's concordance lines for the idiom "out of the blue"? [Corpus Analysis]
The concordance data proved the idiom functions almost exclusively in the sentence-final position.
216
How did SkELL differentiate the near-synonyms "purse" and "wallet" in the lecture? [Corpus Analysis]
Concordance lines highlighted gendered and descriptive collocations, associating "purse" with women and specific colors, and "wallet" with men.
217
What does the "Word Sketches" feature in SkELL provide? [Corpus Linguistics Tools]
A summary of a word's collocational behavior, categorizing the specific verbs, objects, and modifiers (e.g., "short-lived" for "happiness") that typically co-occur with the target term.
218
What is the function of the "Similar Words" feature in SkELL? [Corpus Linguistics Tools]
It generates visual clusters of synonyms and near-synonyms based purely on their structural and distributional behavior within the corpus.
219
What is Write and Improve? [Automated Assessment Tools]
An automated, corpus-driven writing evaluation platform powered by the Cambridge Learner Corpus designed for real-time formative assessment.
220
How does Write and Improve quantify and track a learner's writing proficiency? [Automated Assessment Tools]
It instantly calculates and assigns a CEFR proficiency level (A1–C2) to a submitted text, highlights errors as "suspicious words," and plots longitudinal progress on a visual graph.
221
What specialized English examination modules are integrated into the Write and Improve platform? [Automated Assessment Tools]
IELTS Academic, IELTS General, B2 First certificate, Business English, and English for Healthcare.
222
What administrative capabilities does the "Class View" dashboard in Write and Improve offer instructors? [Corpus Pedagogy]
Instructors can establish digital workbooks, assign timed writing tasks, and monitor individual or class-wide progression and error frequency metrics.
223
Which two major corpora primarily power the diagnostic tools on LexTutor? [Corpus Linguistics Tools]
The British National Corpus (BNC) and the Corpus of Contemporary American English (COCA).
224
What specific lexical ranges does the Vocabulary Levels Test on LexTutor evaluate? [Vocabulary Assessment]
Receptive vocabulary knowledge across distinct frequency bands: the 2,000-word, 3,000-word, 5,000-word, and 10,000-word levels, and the University Word List.
225
Which LexTutor assessment provides a quantitative estimate of the absolute total number of English words a learner possesses? [Vocabulary Assessment]
The Vocabulary Size Test.
226
What does the Phrasal Verb Test on LexTutor evaluate? [Vocabulary Assessment]
Comprehension of phrasal verbs based on empirical frequency data from the British National Corpus (BNC).
227
What is the function of the Phrase Profiler tool within LexTutor? [Corpus Linguistics Tools]
It parses a submitted text to extract and categorize multi-word units by cross-referencing them against established academic, structural, and idiomatic databases.
228
Which specific database does LexTutor's Phrase Profiler use to identify academic collocations? [Corpus Linguistics Tools]
The Academic Collocations List.
229
What is the Martinez and Schmitt idiom list, also known as the PHRASal Expressions List (PHRASE List)?
An inventory of 505 multiword expressions (MWEs) that occur within the 5,000 most frequent word families in the BNC, specifically selected for being semantically non-transparent for L2 learners (Martinez & Schmitt, 2012).
230
Which specific database does LexTutor's Phrase Profiler use to identify colligations and structural transitions? [Corpus Linguistics Tools]
The Oxford Placement Test lexicon.
231
What is the CANCODE Corpus? [Corpus Linguistics]
The Cambridge and Nottingham Corpus of Discourse in English, a five-million-word spoken corpus used to study conversational language. [McCarthy & Carter]
232
What three distinct elements are required for "conversational success" according to CANCODE Corpus data? [Discourse Analysis]
The accurate use of chunks, a repertoire of small interactive words, and confluence. [McCarthy]
233
What is the primary interactive function of the most frequent four-word spoken chunk, "you know what I mean"? [Discourse Analysis]
Listener monitoring, used to continuously check if the speaker and listener are on the same wavelength. [McCarthy]
234
What is the communicative function of high vagueness in spoken English? [Discourse Analysis]
It projects a shared worldview and assumes shared knowledge between speakers, preventing listener fatigue. [McCarthy]
235
What is a Vague Category Marker (VCM) or General Extender? [Discourse Analysis]
A phrase used to cap off a list, signaling an assumed shared category without explicitly naming every item (e.g., "and things like that"). [McCarthy]
236
What is an exemplar in the context of Vague Category Markers? [Discourse Analysis]
The specific noun phrase or clause (e.g., "bone development") provided immediately before a marker to tune the listener into the correct category. [McCarthy]
237
What did research on Norwegian secondary school students reveal about fluency? [Applied Linguistics]
There is a direct statistical correlation between high fluency grades and the deployment of small interactive words like "just," "so," "then," and "actually." [Hasselgreen, 2004]
238
What phrase is used to describe the outsized interpersonal impact of short, frequent words like "just" and "actually"? [Linguistic Theory]
"Small words with big meanings." [Sinclair]
239
What is the strategic function of placing "actually" in the turn-initial position? [Spoken Grammar]
To politely correct a listener's assumption without causing a face-threat. [McCarthy]
240
At what proficiency level do English learners typically begin using "actually" naturally in the strategic turn-initial position? [Applied Linguistics]
The B2 or C1 levels. [McCarthy]
241
What geographical metaphor describes how speakers smoothly stitch their utterances together to create a single, unbroken flow of interaction? [Discourse Analysis]
Confluence. [McCarthy]
242
What characterizes the linguistic phenomenon of "Turn Openers"? [Spoken Grammar]
The first words of a speaker's turn explicitly demonstrate their reaction to what they have just heard, rather than transmitting new content. [Hongyin Tao, 2003]
243
What are the strict positional preferences for the turn openers "Oh," "Well," and "Basically"? [Corpus Linguistics]
"Oh" and "Well" strongly prefer position 1, while "Basically" overwhelmingly prefers position 2. [McCarthy]
244
Why do highly frequent articles like "the" and "a" rarely operate as spoken turn openers? [Spoken Grammar]
They refer to objects and fail to provide the vital interactive or retroactive link required to maintain conversational flow. [McCarthy]
245
What is Syntactic Co-construction in spoken English? [Spoken Grammar]
The practice of using grammatically dependent items (like "which" or "if" clauses) as freestanding turns to hook directly onto the previous speaker's main clause. [McCarthy]
246
What is the maximum temporal threshold speakers typically allow before intervening to co-construct a sentence or end a silence? [Discourse Analysis]
2.5 seconds. [Riga, 2003]
247
What three-step pedagogical process is required to teach subconscious conversational flow strategies? [Language Pedagogy]
1. Noticing (via listening), 2. Input (explaining the function), and 3. Drilling/Practice. [McCarthy]
248
Why is social conversation utilized as the baseline for analyzing specialized spoken language? [Corpus Methodology]
Because it is the type of discourse people spend the vast majority of their lives engaging in, providing a standard measure against which specialized corpora can be compared. [McCarthy]
249
What percentage of occurrences of the verb "know" in spoken academic English are found specifically within the chunk "you know"? [Academic Spoken English]
64%, closely mirroring its 66% frequency in casual social conversation. [McCarthy]
250
What does the high frequency of the chunk "you know" indicate about the transmission of academic knowledge? [Academic Spoken English]
It demonstrates that knowledge transmission relies heavily on projecting shared worlds and assumptions to ensure the speaker and listener are on the same wavelength. [McCarthy]
251
How is "keyness" determined in corpus linguistics? [Corpus Linguistics]
By comparing a target corpus with a reference corpus (here, a general spoken corpus) and having the software measure how much more or less frequent specific words are in the target corpus, and whether that difference is statistically significant. [McCarthy]
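The card does not say which statistic the software uses. One widely used option is the log-likelihood statistic, sketched below as an assumption; the frequencies and corpus sizes in the usage lines are made-up numbers, not results from any corpus.

```python
import math

def log_likelihood(freq_target, size_target, freq_ref, size_ref):
    """Signed log-likelihood keyness of one word.

    Positive: relatively more frequent in the target corpus.
    Negative: relatively less frequent (a "negative keyword").
    """
    total = size_target + size_ref
    combined = freq_target + freq_ref
    # Expected frequencies if the word were spread evenly across both corpora.
    expected_target = size_target * combined / total
    expected_ref = size_ref * combined / total
    ll = 0.0
    if freq_target > 0:
        ll += freq_target * math.log(freq_target / expected_target)
    if freq_ref > 0:
        ll += freq_ref * math.log(freq_ref / expected_ref)
    ll *= 2
    if freq_target / size_target < freq_ref / size_ref:
        ll = -ll
    return ll

# Invented counts: 120 hits in a 10,000-word target vs 40 in a 10,000-word reference.
print(round(log_likelihood(120, 10_000, 40, 10_000), 2))
```

Values above roughly 3.84 correspond to p < 0.05 with one degree of freedom, although keyword lists are usually built with much stricter cut-offs.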
252
What does the high keyness of the discourse marker "OK" in academic speaking indicate about its specific function? [Academic Spoken English]
It functions structurally as a sectional paragraph marker to signal the stages of progression in the development of knowledge. [McCarthy]
253
What discourse roles are highlighted by the fact that 95% of "OK" and "right" tokens are deployed by lecturers and tutors? [Discourse Analysis]
It highlights the unequal power dynamics and the subtle control teachers exert over academic interactions, despite the appearance of a shared conversation. [McCarthy]
254
What do "negative keywords" in a keyword list reveal about a target corpus compared to a baseline? [Corpus Linguistics]
They identify the words that do not distinguish the two corpora, revealing areas where the specialized language and the baseline are similar or identical. [McCarthy]
255
What specific negative keywords are shared equally between social conversation and academic speaking? [Academic Spoken English]
Pronouns and hesitations such as "well", "er", "I", and "you". [McCarthy]
256
What specific multi-word chunk ranks highly (number 9 in CANCODE, number 3 in CLASS) across British and Irish academic corpora? [Academic Spoken English]
"In terms of". [McCarthy]
257
What does the massive frequency of "in terms of" in academic speaking reveal about its underlying epistemology? [Academic Spoken English]
It reveals that academic knowledge is structured and taught by explicitly relating ideas to one another, rather than presenting them as isolated facts. [McCarthy]
258
What does the MICASE corpus reveal about the distribution of "in terms of" across different academic disciplines? [Corpus Linguistics]
It is highly frequent in the social sciences and education, but considerably less frequent in the physical sciences and engineering. [McCarthy]
259
What is the most frequent four-word chunk built around the word "sense" in both the CANCODE and MICASE corpora? [Academic Spoken English]
"In the sense that". [McCarthy]
260
What specific epistemological function does the chunk "in the sense that" perform in an academic setting? [Academic Spoken English]
It narrows down a general idea by explicitly focusing on a more precise, specific meaning. [McCarthy]
261
What is the pragmatic function of the chunk "you might want to" when used by a tutor toward a student? [Pragmatics]
It operates strictly as a masked directive or command. [McCarthy]
262
Why do academic tutors routinely use masked directives like "you might want to" instead of explicit commands? [Pragmatics]
To foster a cooperative interaction and subtly invite the student into membership within a discourse community or community of practice. [McCarthy]
263
What two structural components make up a Vague Category Marker (VCM)? [Discourse Analysis]
An exemplar (or multiple exemplars) followed by a general extender or marker. [McCarthy]
264
What syntactical feature strictly defines an adjunctive Vague Category Marker? [Discourse Analysis]
It begins with the word "and" (e.g., "and that sort of thing"). [McCarthy]
265
What is the precise pragmatic effect of using an adjunctive Vague Category Marker? [Discourse Analysis]
It signals the assumption that the listener knows the parameters and can more or less completely fill the members of the referenced category. [McCarthy]
266
What syntactical feature strictly defines a disjunctive Vague Category Marker? [Discourse Analysis]
It begins with the word "or" (e.g., "or whatever"). [McCarthy]
267
What is the precise pragmatic effect of using a disjunctive Vague Category Marker? [Discourse Analysis]
It leaves the category deliberately open-ended, serving as an invitation for the listener to creatively expand the parameters. [McCarthy]
268
What specific computational technique measures how equally individual items are distributed across an entire corpus? [Corpus Linguistics]
Consistency analysis. [McCarthy]
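One simple way to read "consistency" is as range: the share of texts in a corpus that contain an item at least once. The sketch below takes that reading as an assumption; the function name `text_range` and the four mini-texts are invented for illustration.

```python
import re

def text_range(word, texts):
    """Return the fraction of texts in which `word` occurs at least once."""
    if not texts:
        return 0.0
    hits = sum(
        1 for text in texts
        if word in re.findall(r"[a-z']+", text.lower())
    )
    return hits / len(texts)

# Invented mini-corpus of four short "texts".
texts = [
    "you know what I mean",
    "I think you might want to check",
    "you could try again",
    "what do you mean",
]
print(text_range("you", texts), text_range("i", texts))
```

In this toy corpus "you" appears in every text while "I" appears in only half of them, which is the kind of contrast a consistency analysis brings out.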
269
How does the distribution of "I" and "you" fundamentally change between social conversation and academic speaking? [Corpus Linguistics]
In conversation they are distributed equally, whereas in academic speaking "you" achieves much wider distribution while "I" becomes narrower and dependent on the specific event type. [McCarthy]
270
What discourse strategy is employed when a lecturer combines VCMs with the possessive determiner "your" (e.g., "your population statistics")? [Discourse Analysis]
The lecturer is explicitly framing the academic concepts as belonging to the student in order to pull them into an inclusive, shared world. [McCarthy]
271
What is the MICASE corpus? [Corpus Linguistics]
The Michigan Corpus of Academic Spoken English, a 1.7-million-word corpus developed by John Swales and his team. [McCarthy]
272
What is the CLASS corpus? [Corpus Linguistics]
The Cambridge Limerick and Shannon corpus, containing academic data collected from a hotel management college. [McCarthy]
273
What is the BASE corpus? [Corpus Linguistics]
The British Academic Spoken English corpus, functioning as a parallel corpus to the American MICASE. [McCarthy]
274
What specific psychological and cognitive phenomena can corpus linguistics not easily answer? [Wh-Question, Corpus Linguistics]
The nature of the mental lexicon and the psychology of learning vocabulary, although artificial intelligence may change this limitation. (McCarthy, 2025)
275
What are the five primary methodological approaches for searching a corpus? [Wh-Question, Corpus Linguistics]
Frequency lists, keywords (statistically significant words), collocations and chunks, coverage measures, and concordances. (McCarthy, 2025)
276
What are the two primary filtering processes utilized when generating frequency lists? [Wh-Question, Corpus Linguistics]
Stop lists and lemmatisation. (McCarthy, 2025)
277
Stop list [Noun, Corpus Linguistics]
A filtering process that removes specific categories of words, such as all grammar words, from corpus data before analysis. (McCarthy, 2025)
278
Lemmatisation [Noun, Corpus Linguistics]
The amalgamation of word-forms to include both the base form and its inflected forms, which is distinct from 'word families' that also include derivations. (McCarthy, 2025)
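The two filtering steps can be sketched together when building a frequency list. Real tools use full stop lists and lemma lists; the tiny `STOP_LIST` and `LEMMA_MAP` below are hypothetical stand-ins, and the sample sentence is invented.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny resources (assumptions for this sketch).
STOP_LIST = {"the", "a", "of", "and", "to", "in", "it"}
LEMMA_MAP = {
    "is": "be", "was": "be", "are": "be",
    "does": "do", "did": "do",
    "has": "have", "had": "have",
}

def frequency_list(text, stop_list=frozenset(), lemma_map=None):
    """Return (lemma, count) pairs, most frequent first."""
    lemma_map = lemma_map or {}
    tokens = re.findall(r"[a-z']+", text.lower())
    # Step 1: drop stop-listed words. Step 2: fold inflected forms into a base form.
    lemmas = [lemma_map.get(t, t) for t in tokens if t not in stop_list]
    return Counter(lemmas).most_common()

sample = "It was late and the plan is what it is"
print(frequency_list(sample, STOP_LIST, LEMMA_MAP))
```

With lemmatisation, "was" and both instances of "is" are counted under "be", which is why "be" tops the list instead of appearing as three separate, smaller entries.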
279
What are the top three most frequent verbs identified in the spoken BNC2014 corpus? [Wh-Question, Corpus Linguistics]
Be, do, and have. (McCarthy, 2025)
280
Key words [Noun, Corpus Linguistics]
Words which occur with significantly high or low frequency in a given corpus when statistically compared to a reference corpus. (McCarthy, 2025)
281
What semantic category of vocabulary dominates the top 20 keywords in written Business English? [Wh-Question, Corpus Linguistics]
High-frequency content words related to operational concepts, such as management, strategy, and performance. (McCarthy, 2025)
282
How do frequency lists and keyword lists diverge regarding top "-ly" adverbs in spoken Business English? [Wh-Question, Corpus Linguistics]
"Actually" is the most frequent adverb by sheer occurrence, whereas "effectively" possesses the highest statistical keyness. (McCarthy, 2025)
283
What is the average quantitative size of a core vocabulary required for everyday written and spoken use? [Wh-Question, Corpus Linguistics]
2,500 to 3,000 words. (Nation and Waring, 1997; Schmitt and Schmitt, 2014; Szudarski, 2018; McCarthy, 2023)
284
What accounts for the '35%' difference in lexical composition found in spoken vocabulary? [Wh-Question, Corpus Linguistics]
Items of unusually high frequency that are implicated in the creation and maintenance of social relations, which are often involved in chunks. (McCarthy, 2025)
285
What are the two main typologies of word combinations that demonstrate vocabulary extends beyond single words? [Wh-Question, Corpus Linguistics]
Collocation and chunks. (McCarthy, 2025)
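Recurrent word combinations can be found by counting contiguous n-word sequences. The sketch below is a minimal version under simple assumptions (lowercased tokens, no turn or sentence boundaries respected); the sample utterance is invented.

```python
import re
from collections import Counter

def chunks(text, n):
    """Count every contiguous n-word sequence in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(
        " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )

sample = "I don't know if you know but I don't know what you mean"
print(chunks(sample, 3).most_common(1))
```

Raw counts like these only show recurrence; identifying collocations in the statistical sense usually also involves an association measure that compares observed and expected co-occurrence.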
286
Collocation [Noun, Corpus Linguistics]
A type of word combination characterized by the frequent and expected co-occurrence of specific words, such as "business strategy" or "market share". (McCarthy, 2025)
287
What lexical features distinguish the top 20 two-word collocations in spoken Business English from written Business English? [Wh-Question, Corpus Linguistics]
Spoken Business English features a critically high frequency of discourse markers, hedges, and phrasal verb collocations, such as "you know" and "I mean". (McCarthy, 2025)
288
What are the most frequent 3-word chunks in spoken Business English versus spoken Academic English? [Wh-Question, Corpus Linguistics]
"I don't know" is the most frequent chunk in spoken Business English, while "a lot of" is the most frequent in spoken Academic English. (McCarthy, 2025)
289
Why is it analytically necessary to read concordances for meaning rather than relying solely on frequency data? [Wh-Question, Corpus Linguistics]
Because special vocabularies develop pragmatic specialisations and specific meanings for words and chunks that frequency lists cannot reveal. (McCarthy, 2025)
290
How has the phrase "going forward" developed pragmatic specialisation within spoken Business English? [Wh-Question, Corpus Linguistics]
It has shifted from denoting physical movement to signifying progression "into the future". (McCarthy, 2025)
291
What specific capability do modern artificial intelligence tools provide for the development of large data collections? [Corpus Linguistics]
They allow for the automatic gathering of data according to defined curation parameters to rapidly assemble and grow corpora. [O'Keeffe and McCarthy, 2021]
292
Through what specific technological medium has access to multi-billion-word corpora across numerous languages been significantly enhanced in the last decade? [Corpus Linguistics]
Online corpus interfaces. [O'Keeffe and McCarthy, 2021]
293
What type of corpus is designed to quickly compile data as a major societal event occurs to track opinions and linguistic coinage? [Corpus Linguistics]
Rapid-response corpora. [O'Keeffe and McCarthy, 2021]
294
What philosophical concern arises from the automated and rapid curation of large corpora regarding the documentation of society? [Corpus Linguistics]
Whether the rapid curation accurately reflects or refracts our shared social reality. [O'Keeffe and McCarthy, 2021]
295
What specific technological method has modernized the gathering of spoken corpus data by utilizing personal mobile devices? [Corpus Linguistics]
Crowdsourcing. [O'Keeffe and McCarthy, 2021]
296
Multi-modal (adjective) [Corpus Linguistics]
Characterized by the combination of various communicative modes, such as speech, body language, and text, within a single dataset. [O'Keeffe and McCarthy, 2021]
297
What methodological deficit currently causes stagnation in the corpus analysis of data from social media and online streaming platforms? [Corpus Linguistics]
Reducing rich communication into one-dimensional, impoverished orthographic transcripts. [O'Keeffe and McCarthy, 2021]
298
Despite the ease of automated text harvesting, what traditional tenets of corpus creation must researchers continue to safeguard? [Corpus Linguistics]
The principles of careful sampling, corpus design, and representativeness. [O'Keeffe and McCarthy, 2021]
299
Contrastive (adjective) [Corpus Linguistics]
Pertaining to the traditional method of error analysis in learner corpora that isolates L1 versus L2 linguistic interference. [O'Keeffe and McCarthy, 2021]
300
What is the primary objective of the recent "profiling turn" within learner corpus research? [Corpus Linguistics]
To empirically test calibrated proficiency scales to demonstrate what language learners can do. [O'Keeffe and McCarthy, 2021]
301
What specific operational function do learner corpora fulfill when integrated into modern machine learning systems? [Corpus Linguistics]
They serve as training data for algorithms to process learner performances for automated assessment and feedback. [O'Keeffe and McCarthy, 2021]
302
Three-dimensional (adjective) [Corpus Linguistics]
Pertaining to a data cubing technique in machine learning that models learner data across three axes: the learner, their academic discipline, and change over time. [O'Keeffe and McCarthy, 2021]
303
What specific type of corpus is currently being developed to train machine learning chatbots to deliver adaptive pedagogical feedback? [Corpus Linguistics]
Chatroom corpora of teacher-learner interactions. [O'Keeffe and McCarthy, 2021]
304
Longitudinal (adjective) [Corpus Linguistics]
Pertaining to learner corpora that gather data over extended periods, requiring partnerships between SLA and CL experts to fully utilize. [O'Keeffe and McCarthy, 2021]
305
What theoretical perspective on language learning is rapidly growing due to the integration of corpus linguistics into Second Language Acquisition research? [Corpus Linguistics]
Usage-based perspectives. [O'Keeffe and McCarthy, 2021]
306
What historical practice from the thirteenth century represents the methodological spirit of modern corpus concordancing? [Corpus Linguistics]
The manual, line-by-line indexing of words and their citations by biblical scholars. [O'Keeffe and McCarthy, 2021]
307
Key Word In Context (noun phrase) [Corpus Linguistics]
An early automated concordancing format developed by library and information scientists in the late 1950s. [O'Keeffe and McCarthy, 2021]
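As a rough illustration of the format, the sketch below prints each hit with the node word bracketed between a few words of left and right co-text. The function name `kwic`, the `width` parameter, and the sample sentence are assumptions made for this example, not part of any particular concordancer.

```python
def kwic(tokens, node, width=3):
    """Return concordance lines with the node word between left and right co-text."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == node.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            # Right-align the left co-text so the node words line up vertically
            lines.append(f"{left:>30}  [{tok}]  {right}")
    return lines

# Invented sample sentence for illustration only.
sample = "we need a plan going forward and going forward we will review it".split()
for line in kwic(sample, "forward", width=2):
    print(line)
```

Reading the aligned lines, rather than only counting them, is what lets an analyst see how a word or chunk is actually being used.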
308
What practical necessity drove early pre-Chomskyan structural linguists and lexicographers to amass large collections of data slips? [Corpus Linguistics]
The need to collect reliable, empirical samples of language usage to compile dictionaries and grammatical descriptions. [O'Keeffe and McCarthy, 2021]
327
What is the Common European Framework of Reference for Languages (CEFR)? [Applied Linguistics]
An established, language-neutral benchmark for language competence comprising six levels (A1 to C2) defined by intuitively derived "can-do statements". (O’Keeffe & Mark, 2017)
328
Why is the CEFR criticized by language test developers and researchers? [Applied Linguistics]
Its descriptors are generic, intuitively derived, and lack empirical evidence, leading to arbitrary and inconsistent interpretations. (O’Keeffe & Mark, 2017)
329
What is the purpose of the English Profile (EP)? [Applied Linguistics]
To provide empirical detail about learner English competence, supplementing and adapting the generic, language-neutral CEFR. (O’Keeffe & Mark, 2017)
330
What is the English Grammar Profile (EGP)? [Applied Linguistics]
A database of over 1,200 empirically-derived statements detailing what learners can do with English grammar at each CEFR level, based on the Cambridge Learner Corpus. (O’Keeffe & Mark, 2017)
331
How do usage-based theories explain the acquisition of grammatical knowledge? [Second Language Acquisition]
They posit that frequently occurring form-meaning pairings become entrenched in a learner's mind through repeated use and experience. (O’Keeffe & Mark, 2017)
332
What is the accuracy-complexity trade-off effect? [Second Language Acquisition]
A phenomenon where advanced learners attempt to use more complex language, which inherently increases their risk of error and hinders measurable improvements in accuracy. (O’Keeffe & Mark, 2017)
333
Why might a learner over-represent specific linguistic features in exam data? [Corpus Linguistics]
They may rely on language they are comfortable with or engage in exam display to exhibit knowledge of structures perceived as complex. (O’Keeffe & Mark, 2017)
334
Why is establishing baseline comparability difficult in L2 corpus research? [Corpus Linguistics]
There is rarely an L1 corpus that perfectly matches the learner corpus in terms of representativeness, task constraints, and context. (O’Keeffe & Mark, 2017)
335
Reference Level Descriptors (Noun phrase) [Applied Linguistics]
Performance-based "can-do statements" derived by experts to define the minimum requirements for each stage within a proficiency framework. (O’Keeffe & Mark, 2017)
336
Ceiling effect (Noun phrase) [Second Language Acquisition]
A stabilization phase in language development, typically between B2 and C2 levels, where error rates stop decreasing significantly despite continued learning. (O’Keeffe & Mark, 2017)
337
Entrenched (Adjective) [Psycholinguistics]
The state of a form-meaning pairing becoming firmly established as grammatical knowledge in a learner's mind due to frequency of use. (O’Keeffe & Mark, 2017)
338
Exam display (Noun phrase) [Corpus Linguistics]
A task effect where a learner deliberately uses specific linguistic features to demonstrate their knowledge during a test. (O’Keeffe & Mark, 2017)
339
Idealized competence (Noun phrase) [Applied Linguistics]
A theoretical, homogenous L1 target based on a consensus of native-speaker success, which rarely exists in reality. (O’Keeffe & Mark, 2017)
340
Interlanguage (Noun) [Second Language Acquisition]
The developing linguistic system of an L2 learner, which can be observed and compared across different proficiency levels. (O’Keeffe & Mark, 2017)
341
What is the primary focus of the English Grammar Profile (EGP) regarding learner grammar? [Applied Linguistics]
The development of grammar competence across proficiency levels, rather than tracking patterns of error decline, plateau, or regression. (O’Keeffe & Mark, 2017)
342
How does vocabulary expansion affect grammatical competence across CEFR levels? [Second Language Acquisition]
Knowing more lexis allows learners to expand the repertoire of grammatical and pragmatic uses for a specific syntactic form. (O’Keeffe & Mark, 2017)
343
How does the use of the past simple affirmative form differ between A1 and B1 levels regarding lexical range? [Corpus Linguistics]
At A1, the syntactic pattern is stable but used with a limited range of verbs; by B1, the identical morphosyntactic pattern is applied to a wide range of verbs. (O’Keeffe & Mark, 2017)
344
Grammatical polysemy (Noun phrase) [Second Language Acquisition]
The phenomenon where a single grammatical pattern is progressively deployed across a wider range of meanings and for greater pragmatic effect as a learner's lexical repertoire grows. (O’Keeffe & Mark, 2017)
345
How does "grammatical polysemy" mirror lexical polysemy in learner development? [Second Language Acquisition]
Just as learners acquire multiple meanings for a single vocabulary word over time, they acquire multiple pragmatic and semantic functions for a single, stabilized grammatical structure. (O’Keeffe & Mark, 2017)
346
How does the "adverb + adjective" pattern evolve from A1 to C1 proficiency? [Corpus Linguistics]
At A1, the pattern is restricted to simple combinations like "very + adjective," whereas at C1, it utilizes advanced lexis to add pragmatic force and function as a focusing device. (O’Keeffe & Mark, 2017)
347
What awareness do learners develop about stabilized grammatical patterns as they progress to higher proficiency levels? [Applied Linguistics]
They become increasingly aware of the collocational and colligational limitations of the pattern, understanding which specific lexical items are primed to fill syntactic slots. (O’Keeffe & Mark, 2017)
348
Developmental endpoint (Noun phrase) [Second Language Acquisition]
The stage at which a grammatical form reaches syntactic stabilization, typically at lower proficiency levels, before later being deployed with greater semantic complexity. (O’Keeffe & Mark, 2017)
349
How do the EGP findings reinterpret the "ceiling effect" in learner grammar development? [Second Language Acquisition]
Rather than indicating an end to learning, the stabilization of form at A and B levels serves as a foundational baseline for deploying those forms with greater meaning complexity and dexterity at higher levels. (O’Keeffe & Mark, 2017)
350
Why is learner grammar considered a dynamic system? [Applied Linguistics]
It is in constant development and is never fully complete, continually evolving in complexity even after specific syntactic forms appear to have stabilized. (O’Keeffe & Mark, 2017)
351
Pragmatic competence (Noun phrase) [Pragmatics]
The ability to skillfully manipulate an acquired syntactic form for subtlety of meaning, focus, or social functions. (O’Keeffe & Mark, 2017)
352
How does the pragmatic use of the past simple tense evolve at the B2 level? [Pragmatics]
The form is deployed for pragmatic effect, utilizing verbs like "wondered" or "wanted" as politeness structures for requesting or thanking rather than indicating past time. (O’Keeffe & Mark, 2017)
353
What are the primary methodological limitations of the EGP study regarding the Cambridge Learner Corpus? [Corpus Linguistics]
The data is strictly limited to written examinations, and proficiency calibration relies exclusively on the assessment criteria of a single examination board. (O’Keeffe & Mark, 2017)
354
Why is parallel corpus research using non-Cambridge and spoken exam data necessary? [Corpus Linguistics]
To expose anomalies in competence calibration, compare findings across different testing frameworks, and examine learner competence beyond the constraints of written exams. (O’Keeffe & Mark, 2017)
355
Paradigmatic level (Noun phrase) [Linguistics]
The axis of linguistic choice where a learner selects specific, contextually primed lexical items to fill a stable syntactic slot. (O’Keeffe & Mark, 2017)
356
What are the two primary future functions of the phrase "going to" that are typically presented to learners of English as a Foreign Language (EFL)? [Pedagogical Grammar]
Expressing plans or intentions, and making predictions based on present evidence. [Burton, 2021]
357
What additional grammatical restriction regarding "going to" is frequently included in ELT materials despite its absence from major academic reference grammars? [Pedagogical Grammar]
The rule asserting that speakers typically avoid using the verb "go", the verb "come", or verbs of movement generally, immediately following the future marker "going to". [Burton, 2021]
358
How do materials designers frequently word pedagogical grammar rules to withhold complete commitment to their categorical accuracy? [Pedagogical Grammar]
They utilize hedging language, framing rules in relative terms as tendencies (e.g., stating that speakers "tend not to" or "avoid" a form rather than strictly prohibiting it). [Burton, 2021]
359
What is the historical rationale behind using frequency-based arguments in ELT grammar rules? [Pedagogical Grammar]
It implies that if native speakers use a specific structure infrequently, learners should also avoid it, because frequent use of that structure would be considered erroneous. [Burton, 2021]
360
Where does historical evidence suggest the rule proscribing "go" and "come" after "going to" originated? [Pedagogical Grammar]
It first appeared explicitly in Harold Palmer's 1924 pedagogical text, "A Grammar of Spoken English on a Strictly Phonetic Basis". [Burton, 2021]
361
Which two corpora were selected to source empirical evidence for testing the accuracy of the "going to go" restriction? [Corpus Linguistics]
The British National Corpus (BNC) and the Corpus of Contemporary American English (COCA). [Burton, 2021]
362
Why were the verbs "say" and "take" selected as comparative benchmarks against "go" and "come" during the corpus analysis? [Corpus Linguistics]
Because their overall frequencies as lemmas in the BNC are highly similar to "go" and "come", providing a baseline to test if "go" and "come" appear unexpectedly less often after "going to". [Burton, 2021]
363
What did raw frequency tests reveal regarding the use of "going to go" and "going to come" compared to other infinitive chunks? [Corpus Linguistics]
Both strings are highly attested; "going to go" is the eighth most frequent "going to + infinitive" chunk in the BNC, directly refuting the claim that the combination is avoided. [Burton, 2021]
364
How did the introduction of the modal verb "will" assist in testing the hypothesis that semantic constraints (e.g., the need to discuss travel vs. speech) cause lower frequencies of "going to go"? [Corpus Linguistics]
Analyzing "will go" versus "will say" provided a baseline for how often speakers talk about future travel versus future speech, demonstrating that the lower frequency of "going to go" compared to "going to get" contradicts expected usage patterns if the rule were true. [Burton, 2021]
365
What did the Pointwise Mutual Information (PMI) and Odds Ratio calculations conclude about the relationship between "going to" and the verbs "go" and "come"? [Corpus Linguistics]
The calculations yielded neutral or positive scores (e.g., Odds Ratio above 1), providing no statistical evidence that the structure "going to" repels or avoids the verbs "go" and "come". [Burton, 2021]
366
Why do commercial English Language Teaching publications continue to perpetuate the inaccurate "going to go" rule? [Pedagogical Grammar]
The rule is part of an unexamined "canon" of ELT grammar points; coursebook writers often lack the time or training to consult corpora, and publishers hesitate to update descriptions for fear of alienating traditional teachers. [Burton, 2021]
367
Pedagogical grammar (noun) [Pedagogical Grammar]
Grammatical descriptions designed specifically for second language learners, which frequently compromise absolute descriptive truth in favor of clarity, simplicity, or conceptual parsimony. [Burton, 2021]
368
Corpus (noun) [Corpus Linguistics]
A large, structured database of machine-readable spoken and written text used by researchers to identify empirical linguistic frequencies and usage patterns. [Burton, 2021]
369
Lemma (noun) [Corpus Linguistics]
The dictionary base form of a word that encompasses all its inflected variations (e.g., the lemma "know" includes "know," "knows," "knowing," "knew," and "known"). [Burton, 2021]
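A minimal sketch of lemma-based counting follows. The lookup table `LEMMA_OF` is a hypothetical hand-built mapping covering only the example above; a real project would rely on a lemmatiser or a lemma-annotated corpus instead.

```python
from collections import Counter

# Hypothetical hand-built lookup for illustration; it covers only "know".
LEMMA_OF = {
    "know": "know",
    "knows": "know",
    "knowing": "know",
    "knew": "know",
    "known": "know",
}

def lemma_counts(tokens):
    """Count tokens under their lemma; unknown forms fall back to the lowercased token."""
    return Counter(LEMMA_OF.get(t.lower(), t.lower()) for t in tokens)

print(lemma_counts("She knew what he knows and I know it".split())["know"])
```

Counting by lemma rather than by word form is what makes frequency comparisons between verbs such as "go", "come", "say" and "take" meaningful.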
370
Multi-word unit (noun) [Corpus Linguistics]
A sequential string of words that operate together as a single grammatical or semantic entity, such as "going to go". [Burton, 2021]
371
Hedging (noun) [Pedagogical Grammar]
A linguistic device used to express a lack of categorical commitment to the truth of a statement, often seen in rules phrased as "we usually avoid" rather than "we never use." [Burton, 2021]
372
Pointwise Mutual Information / PMI (noun) [Corpus Linguistics]
A statistical measure of association that calculates whether two linguistic elements co-occur more or less often than would be expected by chance; a score below zero indicates the elements repel each other. [Burton, 2021]
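One common formulation of PMI compares the observed probability of a pair with the probability expected if the two items were independent: PMI = log2( P(x, y) / (P(x) * P(y)) ). The sketch below assumes this base-2 formulation and uses invented counts; whether Burton (2021) computed the score in exactly this way is not stated in these cards.

```python
import math

def pmi(pair_count, x_count, y_count, total):
    """Pointwise mutual information (base 2) from raw corpus counts."""
    if min(pair_count, x_count, y_count, total) <= 0:
        raise ValueError("all counts must be positive")
    observed = pair_count / total                      # P(x, y)
    expected = (x_count / total) * (y_count / total)   # P(x) * P(y) under independence
    return math.log2(observed / expected)

# Invented counts for illustration only.
print(pmi(pair_count=20, x_count=100, y_count=100, total=1000))  # attraction: score above 0
print(pmi(pair_count=5, x_count=100, y_count=100, total=1000))   # repulsion: score below 0
```

A score near zero means the pair occurs about as often as chance predicts, which is the "neutral" outcome mentioned in the findings above.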
373
Odds Ratio (noun) [Corpus Linguistics]
A measure of association between two events; in corpus statistics, a score below 1 indicates that two linguistic features are observed together less frequently than expected, suggesting avoidance. [Burton, 2021]
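For a 2x2 contingency table, the odds ratio is usually computed as (a * d) / (b * c). The sketch below assumes that standard formulation; the cell labels and the counts are illustrative and are not Burton's actual figures.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 contingency table.

    a = "going to" followed by the target verb
    b = "going to" followed by any other verb
    c = other contexts followed by the target verb
    d = other contexts followed by any other verb
    """
    if b == 0 or c == 0:
        raise ValueError("cells b and c must be non-zero")
    return (a * d) / (b * c)

# Invented counts for illustration only.
print(odds_ratio(20, 80, 10, 90))  # above 1: the pair is attracted
print(odds_ratio(5, 95, 10, 90))   # below 1: the pair is avoided
```

In practice, analysts often add a small constant to each cell or report a confidence interval, since a single zero cell makes the plain ratio undefined.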
374
Grammar Canon (noun) [Pedagogical Grammar]
A well-established, collective agreement within the language teaching profession regarding which specific grammatical structures must be taught, resulting in rules being repeated uncritically across materials. [Burton, 2021]
375
What percentage of naturally occurring conditional sentences does the traditional ELT four-type system fail to account for according to Jones and Waller's analysis? [Pedagogical Grammar]
Approximately 46 percent, as the traditional system only accounts for 54 percent of actual usage. [Burton, 2022]
376
Why do ELT coursebooks and pedagogic grammars persist in using the demonstrably inadequate four-conditional categorization system? [Pedagogical Grammar]
It offers a "pedagogic convenience" because the four conditionals act as discrete teaching points that fit easily into a structural syllabus and are simple to test. [Burton, 2022]
377
How does the proposed conditional system shift its grammatical focus compared to the traditional ELT model? [Pedagogical Grammar]
It abandons treating the conditional sentence as a single unified structure and focuses primarily on the verb form in the if-clause. [Burton, 2022]
378
Why does the proposed system stop teaching the main clause as a specific conditional structure, with the exception of the unreal past? [Pedagogical Grammar]
Because tense choice in the main clause functions the same as in other contexts, meaning learners can simply apply existing knowledge of modal verbs and future forms rather than learning them as part of a conditional pair. [Burton, 2022]
379
What historical evidence demonstrates that the "zero conditional" is not an original part of the ELT conditional paradigm? [Pedagogical Grammar]
The term "zero conditional" only appeared relatively recently, coined likely because the numbers 1 to 3 were already assigned, and was absent from major studies of conditionals even in the late 1980s and early 1990s. [Burton, 2022]
380
Where did the original three-way distinction of first, second, and third conditionals originate? [Pedagogical Grammar]
It appears to have originated in W.S. Allen's 1947 learner's grammar, "Living English Structure". [Burton, 2022]
381
How does the proposed categorization system utilize data from the English Grammar Profile (EGP)? [Pedagogical Grammar]
It uses the EGP's empirical data on learner competence at each CEFR level to organize conditional instruction into "core" and "non-core" information for multi-level syllabuses. [Burton, 2022]
382
What distinguishes "core" information from "non-core" information in the proposed conditional syllabus? [Pedagogical Grammar]
Core information represents the basic structural knowledge and earliest competences demonstrated by learners, while non-core information includes variations and expanding repertoires typically acquired at higher proficiency levels. [Burton, 2022]
383
How did Maule structurally categorize conditional sentences in his 1988 diagram? [Pedagogical Grammar]
He categorized them using a four-way matrix distinguishing between past and non-past reference, and real and unreal (counterfactual) situations. [Burton, 2022]
384
What widely used conditional structure, omitted from the traditional ELT system, is accounted for in Maule's diagram and the new proposed system? [Pedagogical Grammar]
The "real, past" conditional (e.g., "If it rained, the streets always flooded"), which Gabrielatos found to make up over one-third of the uses of the past simple in if-clauses. [Burton, 2022]
385
What verb forms characterize "Conditional A" (real, non-past) in the proposed system? [Pedagogical Grammar]
The core use of a present tense in the if-clause to refer to actions or states in the present or future. [Burton, 2022]
386
What verb forms characterize "Conditional B" (real, past) in the proposed system? [Pedagogical Grammar]
The core use of "if" plus the past simple to talk about repeated events in the past or events that may or may not have occurred on a specific past occasion. [Burton, 2022]
387
What verb forms characterize "Conditional C" (unreal, non-past) in the proposed system? [Pedagogical Grammar]
The core use of the past simple or past continuous in the if-clause to talk about the hypothetical present or future. [Burton, 2022]
388
What verb forms characterize "Conditional D" (unreal, past) in the proposed system? [Pedagogical Grammar]
The core use of the past perfect (simple or continuous) in the if-clause, accompanied by "would have" in the main clause. [Burton, 2022]
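The four categories above form a two-by-two matrix (real or unreal, past or non-past), which can be written down as a simple lookup. The sketch below only restates the core if-clause forms given in these cards; the names `Conditional`, `SYSTEM`, and `classify` are invented for illustration.

```python
from typing import NamedTuple

class Conditional(NamedTuple):
    label: str
    if_clause_form: str

# Core if-clause forms as summarised in the cards above.
SYSTEM = {
    ("real", "non-past"): Conditional("A", "present tense"),
    ("real", "past"): Conditional("B", "past simple"),
    ("unreal", "non-past"): Conditional("C", "past simple or past continuous"),
    ("unreal", "past"): Conditional("D", "past perfect (simple or continuous)"),
}

def classify(reality, time_reference):
    """Look up the conditional category for a reality/time combination."""
    return SYSTEM[(reality, time_reference)]

print(classify("real", "past"))
```

Keying the table on the if-clause alone reflects the system's shift away from treating the whole sentence as a single paired structure.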
389
Why are "mixed conditionals" explicitly excluded from the proposed categorization system? [Pedagogical Grammar]
They are very infrequent, and by shifting the grammatical focus to individual clauses rather than the whole sentence, mixed structures no longer require a separate, dedicated analysis for learners to produce them. [Burton, 2022]
390
What percentage of naturally occurring conditional sentences is the newly proposed A/B/C/D system able to account for? [Pedagogical Grammar]
It accounts for 86 percent of the data, which increases to 89 percent if specific modal and continuous forms are categorized under the past simple or continuous. [Burton, 2022]
391
Indicative conditional (noun) [Linguistics]
A categorization used in semantics and logic that conflates the traditional zero and first conditionals. [Burton, 2022]
392
Counterfactual conditional (noun) [Linguistics]
A categorization used in semantics and logic that corresponds to the traditional second and third conditionals, describing unreal or hypothetical situations. [Burton, 2022]
393
Relevance conditional (noun) [Linguistics]
Sentences where the if-clause shows the relevance of the main clause rather than setting up a condition for its truth, a form rarely covered in ELT. [Burton, 2022]
394
Biscuit conditional (noun) [Linguistics]
A philosophical term synonymous with relevance conditionals, derived from J. L. Austin's example: "There are biscuits on the sideboard if you want them". [Burton, 2022]
395
English Grammar Profile / EGP (noun) [Linguistics]
A database of over 1,000 grammar competency statements mapped to CEFR levels based on an analysis of the Cambridge Learner Corpus. [Burton, 2022]