
Integrating corpus consultation in language studies.


ABSTRACT

Alongside developments in language research, the potential of corpora as a resource in language learning and teaching has been evident to researchers and teachers since the late 1960s. Despite publications which emphasise the benefits of corpus consultation for language learners (Bernardini, 2002; Kennedy & Miceli, 2001), there is little evidence to suggest that direct corpus consultation is coming to be seen as a complement or alternative to consultation of a dictionary, course book, or grammar by the majority of learners. There is thus a need for research to underpin the integration of corpora and concordancing in the language-learning environment.

This study begins with an account of published research relating to course design and structure in the area of corpus consultation by language learners. The focus then narrows to the initial training of learners in corpus consultation, using as an example a course involving undergraduate students on several language degree programmes. The results of the students' consultation of the corpora are examined, including choice of search word(s), analytical skills, the problems encountered, and their evaluation of the activity. The results reveal how corpus consultation can complement traditional language-learning resources, while also raising questions concerning its integration in the language-learning environment.

INTRODUCTION

Since large computerised corpora of English were created in the 1960s, there has been a steady increase in the number of publications devoted to their use in the context of language teaching and learning. The pioneering work of Johns (1986) and Tribble and Jones (1990) was followed by an explosion of studies on the use of corpora in language learning in a range of contexts, for example the publications resulting from the TALC (Teaching and Language Corpora) conferences (see, e.g., Burnard & McEnery, 2000; Kettemann & Marko, 2002). From the early 1990s onward, corpora were clearly being consulted by language teachers, and also by learners, at least in courses run by researchers and enthusiasts, and this activity was gaining in popularity by a process which McEnery and Wilson (1997, p. 5) describe as percolation. This has created a need for research to underpin the new development, focusing on aspects such as the type of corpora to be consulted: large or small, general or domain-specific, tagged or untagged.

Other pedagogic issues also require investigation, such as the advantages of direct access to corpora as opposed to mediation by the teacher through the preparation of corpus-based worksheets, the strategies which learners need to acquire to benefit from direct consultation, and, last but not least, the means by which this new activity can best be integrated into the language-learning environment. Some of these issues are already receiving considerable attention from researchers, with a number of studies recommending the use of small corpora tailored to the learners' needs (Aston, 1997; Roe, 2000), while others champion large corpus concordancing (Bernardini, 2000; Cheng, Warren, & Xun-feng, 2003). Direct access to corpora by learners is the subject of a number of studies (see, e.g., Bernardini, 2002; Chambers & O'Sullivan, 2004; Kennedy & Miceli, 2001, 2002), with a cautionary note from Johns (1997, p. 113) recommending the use of corpus results mediated by the teacher as a first stage. While there is already a substantial and increasing body of research in several aspects of direct corpus consultation by learners, there is still considerable scope for developments, particularly in the area of course design and structure, concerning how one can successfully integrate corpus consultation into a programme of language study in higher education.

The publications on corpus consultation quoted in this study give varying amounts of information on the types of course structure within which they are operating, including the aims of their courses and the time allotted to them. But all this is presented as a given, understandably so, as the studies do not aim to investigate issues arising from course design and structure. The aim of this study is to examine a number of aspects of course design in corpora and language learning involving direct access by learners, focusing not on the training of corpus linguists but rather on the popularisation of corpus consultation by a wide spectrum of learners. After a brief overview of the types of courses which are described in the studies referred to above and other similar publications, one example will be examined in more detail, namely a section of a second-year undergraduate course on language and technology which aims to encourage the learners to use corpora as a resource in their language learning alongside other resources such as the dictionary, course book, and grammar. The course aims, structure, content, and assessment will be briefly described, paying particular attention to the training provided in concordancing and corpus analysis, the corpus resources used, the students' choice of an aspect of the language to be studied, the strategies which they require to benefit from the corpus consultation, their success or otherwise in analysing the results, and their evaluation of the activity. This will enable us to draw some conclusions concerning the factors which favour the integration of corpora and concordancing into the language-learning environment and the obstacles which remain to be surmounted.

COURSE DESIGN IN CORPORA AND LANGUAGE LEARNING

Within the disciplinary area of language studies, corpora and corpus-based methods are increasingly used outside language learning per se, in areas such as the teaching of literature (see, e.g., Kettemann, 1995; Louw, 1997) and of translation (see, e.g., Bowker, 1998; Zanettin, 2001). This section, however, will include only research concerning those wishing to learn about language either as linguistic researchers or language learners. Fligelstone (1993, p. 98) proposes what he terms a simple framework for assessing "the factors relevant to good teaching practice," grouping corpus-related activities into three categories:

TEACHING ABOUT (i.e., teaching about corpora/corpus linguistics)

TEACHING TO EXPLOIT (i.e., teaching students to exploit corpus data)

EXPLOITING TO TEACH (i.e., exploiting corpus resources in order to teach)

Even from reading only the small selection of studies of direct corpus consultation by learners referred to above, it is clear that there is considerable variation in the nature of the courses on which they are based, ranging from courses clearly designed as part of a programme of study in linguistics, to a limited amount of training included in a language course so that the learners can benefit from consulting a corpus. Davies (2000), for example, uses corpora of historical and dialectal texts when teaching an advanced course in Spanish linguistics. Similarly, the description of Paul Thompson's (2004) postgraduate module in corpora in applied linguistics in the University of Reading clearly situates it within the discipline of corpus linguistics. At the other end of the scale, in the sense not of being inferior but of having very different aims and therefore content, certain courses, mostly at undergraduate level, include a very limited amount of training in corpus consultation with the practical aim of enabling the learners to consult corpora to improve their language skills. A comparison of one such course, part of a second-year undergraduate module at the University of Limerick, and the Reading postgraduate course, reveals both the similarities and differences between them (see Table 1).

While both courses include lectures on corpus linguistics and on the analysis of corpora, alongside practical laboratory sessions, the postgraduate course is a specialist option embedded within the already specialised context of masters programmes in Applied Linguistics and ELT. It is allotted time to allow for greater depth of study and familiarisation with the tagging of corpora, while the core undergraduate teaching is, as we shall see, part of a second-year module and is obliged to make room for other aspects of technology and language study, also considered as core elements of the degree programme. It is this situation which creates the challenge of popularising corpus consultation, informing students of its potential benefits and giving them the skills to benefit from it in a very limited amount of time, as well as providing access to resources for future use and guidelines on how best to benefit from them.

Before examining the undergraduate course in more detail, it is important to note that the publications relating to corpus consultation by learners do not all fit neatly into the two types of course described in Table 1, or into one of Fligelstone's three categories. Several other studies contain elements of both the postgraduate and undergraduate courses, supporting Fligelstone's (1993, p. 98) comment that there is a certain amount of interaction between his three categories. Aston (1997, p. 61), for example, notes that the analysis of small corpora for language-learning purposes can serve as a useful starting point for students who may later wish to move on to the analysis of larger corpora in a research context. Dodd (1997), referring to the use of unedited corpus data with advanced students at undergraduate and postgraduate level, comments,

In another context, Cheng et al. (2003) are able to devote a much more substantial amount of classroom and laboratory contact hours over two semesters to corpus design and analysis than their Limerick counterparts. Corpora and concordancing are taught by them as a substantial part of second-year undergraduate courses on Information Technology and Discourse Analysis, within an English language major undergraduate programme. Their aims include both research in corpus linguistics and the practical benefits of language learning, firstly, placing the students "in the role of language researchers finding out for themselves about the English language" (p. 178), and secondly, at the same time encouraging them "to reflect on their experiences as language learners and English language majors from this form of data-driven learning" (p. 178). The much greater amount of time available to them enables them to introduce the students to work with larger corpora and to move further into the study of corpus linguistics as a discipline than the shorter undergraduate course. In the context of popularising corpus consultation, however, the Limerick course is interesting by its very limitation, in that it can be seen as a component of a course which one could reasonably envisage being included in all undergraduate language degree programmes. Kennedy and Miceli (2001, 2002) are very possibly examples of other researchers working within similar parameters, in that there is no evidence in their publications that the degree programmes involved have a noticeable bias towards Information Technology or Discourse Analysis, as in the case of Cheng et al. (2003).

Looking at the variety of course design and structure within the publications which study corpus consultation by learners, it seems clear, without in any way undermining the validity of Fligelstone's framework, that the range of courses or parts of courses devoted to direct access to corpora can be situated on a continuum rather than within a clearly defined category.

COPYRIGHT 2005 University of Hawaii, National Foreign Language Resource Center Reproduced with permission of the copyright holder. Further reproduction or distribution is prohibited without permission.

Copyright 2005, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

NOTE: All illustrations and photos have been removed from this article.

