ABSTRACT
This article considers the problem of how to bring foreign language students with a limited vocabulary knowledge, consisting mainly of high-frequency words, to the point where they are able to adequately comprehend authentic texts in a target domain or genre. It proposes bridging the vocabulary gap by first determining which word families account for 95% of the target domain's running words, and then having students learn these word families by reading texts in an order that allows for the incremental introduction of target vocabulary. This is made possible by a recently developed computer program that sorts through a collection of texts and a) finds texts with a suitably high proportion of target words, b) ensures that over the course of these texts, most or all target words are encountered five or more times, and c) creates an order for reading these texts, such that each new text contains a reasonably small number of new target words and a maximum number of familiar words. A computer-based study, involving the sorting of 293 Voice of America news texts, resulted in the finding that a) the introduction of new target vocabulary in each text could be kept to a reasonably small amount for the majority of texts, and b) the number of target vocabulary items occurring fewer than five times could be kept to a minimum when the list of target vocabulary accounted for 96% of the domain's running words, rather than 95%.
THE PROBLEM: L1 VERSUS L2 VOCABULARY ACQUISITION
There is considerable evidence that L1 learners acquire a large amount of their vocabulary through guessing from context (Nagy & Herman, 1987; Sternberg, 1987). The frequency at which the L1 learner encounters words, and the variety of contexts in which words are encountered, ensure that the learner will eventually come across most new words in a context where the word is guessable. Research suggests, however, that foreign language students do not undergo the same rich and varied exposure to vocabulary (Singleton, 1999). As a result, although EFL elementary-level students quickly learn many of the high-frequency words that occur in teaching materials, they experience a breakdown in their ability to guess from context when faced with the much lower frequency words found in unsimplified texts. This is because the low-frequency words found in unsimplified texts make up too large a proportion of those texts. In other words, since there are not enough familiar words in the text for the learner to use as clues, guessing unfamiliar words from context becomes extremely difficult or impossible.
The problem, then, is how to expand a student's vocabulary knowledge to the point where he or she recognizes enough of the words in unsimplified texts to be able to guess unfamiliar words from context. Put another way: what is needed is a strategy for bridging the gap between a knowledge of the kinds of high-frequency words found in elementary texts, and a knowledge of the words necessary for the student to be able to resume incidental vocabulary learning. The problem can be broken into two parts: a) Which words are needed in order to bridge this gap? b) Which methods should be used to teach these words quickly and effectively?
Which Words
Carroll, Davies, and Richman (1971) pointed out, nearly three decades ago, that about 80% of the running words (tokens) in any English text are accounted for by the 2,000 most frequent word families(1) of English. Nation (1990) has drawn to our attention the importance of knowing these word families to reading comprehension. A reader who is familiar with 80% of the tokens in a text, however, is still not able to adequately comprehend the text. Studies by Liu & Nation (1985) and Laufer (1989) point toward 95% as the amount of coverage required in order for a reader to adequately understand a text and guess new words from context. Finding a reasonably-sized vocabulary list that accounts for 95% of the tokens of all unsimplified texts, however, has proven difficult. Instead, it may be more feasible to focus on moving the student from elementary-level texts to texts in a specific domain or genre.
How to Get 95% Coverage in Academic Texts
Researchers interested in vocabulary acquisition by students enrolled in ESP (English for Specific Purposes) courses point out that just over 90% of the running words in academic texts can be accounted for by two word-lists, West's General Service List (GSL; 1953) -- which includes the 2,000 most frequent word families of English -- and Xue & Nation's University Word List (UWL; 1984) -- which is made up of words frequently found in academic texts (Nation & Hwang, 1995).(2) In addition, academic texts contain a number of word families specific to the academic domain that is the subject of the text (Sutarsyah, Nation, & Kennedy, 1994). In one study, researchers working with an economics textbook found that word families from the GSL and UWL accounted for over 91% of tokens in the text, and estimated the number of domain-specific word families at 460 (Sutarsyah et al., 1994). Unpublished research by the author suggests that domain-specific word families (defined by their greater frequency of occurrence in a narrow range of texts circumscribed by the domain) account for more than 4% of academic economics texts' tokens, thus bringing the total to 95%.(3) If we assume that this figure holds true for other academic domains, we can conclude that for academic texts it is possible to come up with a reasonably-sized combination of word lists (GSL at 2,000 word families + UWL at 800 word families + economics domain list at 460 word families) that accounts for 95% coverage of the text. Knowing these word families should allow learners to comprehend the texts and attempt to guess the remaining 5% of tokens from context.
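To make the notion of token coverage concrete, the following is a minimal sketch in Python of how the proportion of running words accounted for by a set of word families might be computed. The function names, the tiny form-to-family map, and the two word lists are invented for illustration; they are not drawn from the GSL, the UWL, or any program described in this article.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphabetic running words (tokens)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def coverage(tokens, family_of, known_families):
    """Return the fraction of tokens whose word family is in known_families.

    family_of maps an inflected or derived form to its family headword;
    forms missing from the map are treated as their own headword.
    """
    if not tokens:
        return 0.0
    known = sum(1 for t in tokens if family_of.get(t, t) in known_families)
    return known / len(tokens)

# Toy data, assumed for illustration only.
family_of = {"prices": "price", "rising": "rise", "rose": "rise", "markets": "market"}
general_list = {"the", "in", "as", "price", "rise", "market"}
domain_list = {"inflation"}

tokens = tokenize("Prices rose in the markets as inflation accelerated")
print(coverage(tokens, family_of, general_list))                # 0.75
print(coverage(tokens, family_of, general_list | domain_list))  # 0.875
```

In this toy sentence, adding a single domain-specific family raises coverage from 6 of 8 tokens to 7 of 8, which mirrors on a small scale the way a domain list is used above to close the gap between general-list coverage and the 95% threshold.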
Which Method
Once a word list or combination of word lists accounting for 95% of tokens in the target domain has been found, the next question to consider is which method is best suited to acquainting students with the word families on this list quickly and effectively. Some interesting solutions to this problem have been suggested by researchers interested in the problem of ESP vocabulary acquisition. At issue for these researchers is how to integrate the speed of explicit instruction with the traditional benefits of reading-based vocabulary acquisition. In response to this problem, a number of instructional strategies have been devised which attempt to teach target vocabulary items quickly, while ensuring that each item is supplied with some form of meaningful context.
One strategy that has shown much promise over the last few years is vocabulary instruction via computer-based concordancing. First, a computer-based corpus is created by scanning texts in the students' target domain into a computer. Subsequently, any word that exists in the corpus can be viewed by the student surrounded by its immediate context (or contexts, as there are usually multiple instances of the word in the corpus). Cobb and Horst (2001) argue that a concordance-based tutor has three advantages over incidental reading-based and traditional word list learning strategies: a) computer concordancing conserves the efficiency of list targeting while allowing for exposure to the new word in multiple contexts, b) it provides a way to ensure that each word is encountered a minimum of five times, and c) the learner can choose among the example sentences generated by the concordancer for one that makes sense to him or her. Note that, relevant to the second argument, a study by Saragi, Nation, & Meister (1978) has shown that a word needs to be encountered at least five times in order to be well retained.(4)
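The basic operation of a concordancer, showing every occurrence of a word with its immediate context, can be sketched in a few lines of Python. This is an illustrative keyword-in-context routine only, not the tutor described by Cobb and Horst; the function name, the `width` parameter, and the three-sentence corpus are assumptions made for the example.

```python
import re

def concordance(text, target, width=3):
    """Return each occurrence of target with up to `width` words of context per side."""
    words = re.findall(r"[A-Za-z]+", text)
    lines = []
    for i, w in enumerate(words):
        if w.lower() == target.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            # Mark the keyword with brackets so it stands out in each line.
            lines.append(f"{left} [{w}] {right}".strip())
    return lines

# Toy corpus, assumed for illustration only.
corpus = ("Demand fell as prices rose. The fall in demand surprised analysts. "
          "Analysts expect demand to recover next year.")

for line in concordance(corpus, "demand"):
    print(line)
```

For simplicity the context window here ignores sentence boundaries; a classroom tool would more likely present whole sentences. The length of the returned list also gives a direct check on whether a word is available in the minimum number of contexts.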
Other computer-based lexical tutors have been drawing attention in recent years. Of note is a tutor developed by Peter Groot (2000) named CAVOCA (Computer Assisted VOCabulary Acquisition). CAVOCA is designed to operationalize current theories about how lexical storage works. Hence, students using CAVOCA are introduced to a word by having to guess the word from context, think about correct versus incorrect usage of the word, read the word in the context of example sentences, and finally produce the word in a CLOZE exercise. According to Groot, this kind of rigorous involvement with the word should encourage deeper processing and longer-term retention than traditional learning strategies like bilingual word list memorization.
PROVIDING CONTROLLED EXPOSURE TO TARGET VOCABULARY THROUGH THE SCREENING AND ARRANGING OF TEXTS
The strategies mentioned above offer alternatives to reading-based incidental vocabulary learning, which, as these researchers point out, is not necessarily best suited to ESP. The three major complaints about reading-based vocabulary acquisition are that a) it is an inefficient strategy for learning target words (readers must wade through many other words, in haphazard fashion, before they come across a target word), b) even if a target word is encountered during reading, there is no guarantee it will be encountered five or more times, and c) even if a and b were not problems, the high proportion of unfamiliar words in unsimplified texts ensures that for L2 learners with a limited vocabulary of high-frequency words, guessing new target words from context is difficult or impossible.
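One way to picture how a collection of texts might be screened and arranged to address these complaints is the greedy sketch below, written in Python. It is a simplified heuristic under stated assumptions, not the sorting program reported in this article: the function names, the tie-breaking rule, and the toy texts are all hypothetical. The first function orders texts so that each next text introduces as few unseen target words as possible; the second lists target words that occur fewer than five times across the whole collection.

```python
from collections import Counter

def order_texts(texts, target_words):
    """Greedily order texts so each next text adds as few unseen target words as possible.

    texts maps a title to its list of tokens. Ties are broken by title so the
    result is deterministic. Illustrative heuristic only.
    """
    remaining = {title: set(tokens) & target_words for title, tokens in texts.items()}
    seen, order = set(), []
    while remaining:
        title = min(remaining, key=lambda t: (len(remaining[t] - seen), t))
        new_words = remaining.pop(title) - seen
        order.append((title, sorted(new_words)))
        seen |= new_words
    return order

def under_exposed(texts, target_words, minimum=5):
    """Return target words that occur fewer than `minimum` times across all texts."""
    counts = Counter(tok for tokens in texts.values() for tok in tokens
                     if tok in target_words)
    return sorted(w for w in target_words if counts[w] < minimum)

# Toy data, assumed for illustration only.
texts = {
    "A": "vote vote election vote".split(),
    "B": "vote election ballot election".split(),
    "C": "ballot senate vote election ballot".split(),
}
target = {"vote", "election", "ballot", "senate"}

print(order_texts(texts, target))
print(under_exposed(texts, target))
```

With the toy data, the texts are read in the order A, B, C, with two, one, and one new target words respectively, and every target word except "vote" falls short of five occurrences. A fuller treatment would also have to screen out texts whose proportion of familiar words is too low, which this sketch does not attempt.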
If these three problems could somehow be resolved, however, there may be good reason for encouraging reading-based vocabulary acquisition over non-reading-based strategies. Krashen (1989) has argued vigorously that extensive reading is the only strategy that provides the learner with complete and nonsuperficial knowledge of a word. The pleasure that many learners experience when reading a whole text is also an important factor to consider, since, ideally, it creates the motivation to read more (and hence, learn more words). Finally, important reading skills are exercised during the reading of whole texts that are not exercised during the reading of example sentences (making predictions, recognizing genre, etc.). Developing these skills may be crucial to further reading (and again, further vocabulary learning).



