WEB-BASED LANGUAGE TESTING.


ABSTRACT

This article describes what a Web-based language test (WBT) is, how WBTs differ from traditional computer-based tests, and what uses WBTs have in language testing. After a brief review of computer-based testing, WBTs are defined and categorized as low-tech or high-tech. Since low-tech tests are the more feasible, they will constitute the focus of this paper. Next, item types for low-tech WBTs are described, and validation concerns that are specific to WBTs are discussed. After a brief overview of the marriage of computer-adaptive and Web-based tests, the general advantages as well as design and implementation issues of WBTs are considered before examining the role that testing consequences play in deciding whether a WBT is an appropriate assessment instrument. It is argued that WBTs are most appropriate in low-stakes testing situations; but with proper supervision, they can also be used in medium-stakes situations although they are not generally recommended for high-stakes situations. Some possible areas for future research are suggested.

INTRODUCTION

Interest in Web-based testing is growing in the language testing community, as was obvious at recent LTRC conferences, where it was the topic of a symposium on the DIALANG project (Alderson, 2001), a paper (Roever, 2000), several in-progress reports (Malone, Carpenter, Winke, & Kenyon, 2001; Sawaki, 2001; Wang et al., 2000), and poster sessions (Carr, Green, Vongpumivitch, & Xi, 2001; Bachman et al., 2000). Web-based testing is also considered in Douglas's recent book (Douglas, 2000). It is the focus of research projects at UCLA and the University of Hawai'i at Manoa, and a number of online tests for various purposes are available at this time and are listed on Glenn Fulcher's Resources in Language Testing Web site (Fulcher, 2001). This paper is intended to advance the Web-based language testing movement by outlining some of the fundamental theoretical and practical questions associated with its development. Simply defined, a Web-based language test (WBT) is a computer-based language test which is delivered via the World Wide Web (WWW). WBTs share many characteristics of more traditional computer-based tests (CBTs), but using the Web as their delivery medium adds specific advantages while also complicating matters.

COMPUTER-BASED AND WEB-BASED TESTS

The precursors of Web-based language tests (WBTs) are computer-based tests (CBTs; for a recent discussion see Drasgow & Olson-Buchanan, 1999), delivered on an individual computer or a closed network. CBTs have been used in second language testing since the early 1980s (Brown, 1997), although the use of computers in testing goes back a decade further (Chalhoub-Deville & Deville, 1999). Computers as a testing medium attracted the attention of psychometricians because they allow the application of item response theory for delivering adaptive tests (Wainer, 1990), which can often pinpoint a test taker's ability level faster and with greater precision than paper-and-pencil tests. Based on the test taker's responses, the computer selects items of appropriate difficulty, thereby avoiding items that are too difficult or too easy for the test taker and instead delivering more items at the test taker's level of ability than a non-adaptive test could include. But even for non-adaptive testing, computers as the testing medium feature significant advantages. CBTs can be offered at any time, unlike mass paper-and-pencil administrations, which are constrained by logistical considerations. In addition, CBTs consisting of dichotomously scored items can provide feedback on the test results immediately upon completion of the test. They can also provide immediate feedback on each test taker's responses -- a characteristic that is very useful for pedagogical purposes. The seamless integration of media enhances the testing process itself, and the tracing of a test taker's every move can provide valuable information about testing processes as part of overall test validation.
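
The adaptive item selection described above can be illustrated with a minimal sketch. It assumes the one-parameter (Rasch) model, under which the most informative item is the one whose difficulty lies closest to the current ability estimate. The function names, the step size, and the simple update rule are illustrative assumptions, not the procedure of any particular test; operational adaptive tests typically use full maximum-likelihood or Bayesian ability estimation.

```python
import math
from typing import Dict, Set


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def pick_next_item(ability: float, difficulties: Dict[str, float], used: Set[str]) -> str:
    """Return the unused item whose difficulty is closest to the ability estimate."""
    candidates = {name: b for name, b in difficulties.items() if name not in used}
    if not candidates:
        raise ValueError("item pool exhausted")
    return min(candidates, key=lambda name: abs(candidates[name] - ability))


def update_ability(ability: float, difficulty: float, correct: bool, step: float = 0.5) -> float:
    """Crude update (an assumption): shift the estimate by step * (observed - expected)."""
    expected = rasch_probability(ability, difficulty)
    observed = 1.0 if correct else 0.0
    return ability + step * (observed - expected)


# Hypothetical three-item pool with difficulties on the logit scale.
pool = {"easy": -1.5, "medium": 0.0, "hard": 1.5}
estimate = 0.0
first = pick_next_item(estimate, pool, set())
estimate = update_ability(estimate, pool[first], correct=True)
second = pick_next_item(estimate, pool, {first})
```

In this sketch a correct answer raises the estimate, so the next item drawn is a harder one; an incorrect answer would lower it and lead to an easier item.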

On the negative side, problems with CBTs include the introduction of construct-irrelevant variance due to test takers' differing familiarity with computers (Kirsch, Jamieson, Taylor, & Eignor, 1998), the high cost of establishing new testing centers, and the possibility of sudden and inexplicable computer breakdowns.

Types of WBTs

A WBT is an assessment instrument that is written in the "language" of the web, HTML. The test itself consists of one or several HTML files located on the tester's computer, the server, and downloaded to the test taker's computer, the client. Downloading can occur for the entire test at once, or item by item. The client computer makes use of web-browser software (such as Netscape Navigator or Microsoft Internet Explorer) to interpret and display the downloaded HTML data. Test takers respond to items on their (client) computers and may send their responses back to the server as FORM data, or their responses to dichotomously scored items may be scored clientside by means of a scoring script written in JavaScript. A script can provide immediate feedback, adapt item selection to the test taker's needs, or compute a score to be displayed after completion of the test. The same evaluation process can take place on the server by means of serverside programs.
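
The scoring of dichotomously scored items described above can be sketched as follows. The logic is written in Python purely for illustration; a clientside scoring script of the kind the text describes would be written in JavaScript. The function name, the item identifiers, and the rule of ignoring case and surrounding whitespace are assumptions made for this example.

```python
from typing import Dict, List, Tuple


def score_responses(key: Dict[str, str], responses: Dict[str, str]) -> Tuple[int, List[str]]:
    """Score dichotomous items: one point per response that matches the key.

    Comparison ignores case and surrounding whitespace (an assumption).
    Unanswered items are scored as incorrect.
    """
    total = 0
    feedback = []
    for item_id, correct_answer in key.items():
        given = responses.get(item_id, "").strip().lower()
        if given == correct_answer.strip().lower():
            total += 1
            feedback.append(f"{item_id}: correct")
        else:
            feedback.append(f"{item_id}: incorrect")
    return total, feedback


# Hypothetical answer key and one test taker's responses.
answer_key = {"q1": "b", "q2": "went", "q3": "a"}
submitted = {"q1": "B", "q2": "goed"}
score, messages = score_responses(answer_key, submitted)
```

The feedback list could be shown item by item for pedagogical purposes, while the total could be displayed once the test is complete.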

Many different kinds of WBTs are possible, depending on the developer's budget and programming expertise, as well as computer equipment available to test takers. On the low end of the continuum of technological sophistication are tests that run completely clientside and use the server only for retrieving items and storing responses. This type of test is the easiest to build and maintain because it does not require the tester to engage in serverside programming, which tends to involve complex code writing and requires close cooperation with server administrators. In a low-tech WBT, the server only holds the test or the item pool while the selection of the next test item is accomplished by means of a script located clientside. Test-taker responses are either scored clientside or sent to the tester's email box and stored for later downloading. This low-tech approach is preferable if limited amounts of test data can be expected, adaptivity is crude or unnecessary, item pools are small, and testers are interested in remaining independent of computer and software professionals.

A high-tech WBT, on the other hand, makes heavy use of the server, for example, by having the server handle item selection through adaptive algorithms or by placing a database program on the server to collect and analyze test-taker responses. Both tasks require testers to become highly familiar with the relevant software or involve computer specialists in test setup and maintenance. This high-tech approach is preferable in cases where large amounts of test data have to be handled, complex adaptive algorithms are used, item banks are large, and budgets allow for the purchase of expensive software and the hiring of computer professionals.

In this paper, I will focus on the low-tech versions of Web-based tests, which give testers maximum control over test design, require very small operating budgets, and make the advantages of computer-based testing available to testers at many institutions.

What to Test on the Web and How to Test It

The first step in any language testing effort is a definition of the construct to be tested. Will the test results allow inferences about aspects of students' overall second language competence in speaking, reading, listening, and writing (Bachman, 1990; Bachman & Palmer, 1996)? Or will the test directly examine their performance on second language tasks from a pre-defined domain (McNamara, 1996; Norris, Hudson, Brown, & Yoshioka, 1998; Shohamy, 1992, 1995), such as leaving a message for a business partner, writing an abstract, or giving a closing argument in a courtroom?

Whether a test focuses on aspects of second language competence or performance, its construct validity is the overriding concern in its development and validation. To that end, the test developer must be able to detect sources of construct-irrelevant variance and assess whether the construct is adequately represented, in addition to considering the test's relevance, value implications, and social consequences (Messick, 1989). The developer must also examine the test's reliability, authenticity, interactiveness, impact, and practicality (Bachman & Palmer, 1996).

In the following section, appropriate content and item types for WBTs will be discussed and some WBT-specific validation challenges briefly described.

Item Types in WBTs

The Web is not automatically more suited for the testing of general second language competence or subject-specific second language performance than are other testing media. To the extent that the performance to be tested involves the Web itself (e.g., writing email, filling in forms), performance testing on the Web is highly authentic and very easy to do since testers only have to create an online environment that resembles the target one. However, a WBT or any computer-based test can never truly simulate situations like "dinner at the swanky Italian bistro" (Norris et al., 1998, pp. 110-112). Rather than analyzing the possibilities of Web-based testing primarily along the lines of the competence-performance distinction, it is more useful to consider which item types are more and which ones are less appropriate for Web-based testing.

It is fairly easy to implement discrete-point grammar and vocabulary tests using radio buttons to create multiple choice items, cloze tests and C-tests with textfields for brief-response items, discourse completion tests or essays with large text areas, as well as reading comprehension tests with frames, where one frame displays the text and the other frame displays multiple-choice or brief-response questions. If the test items are dichotomous, they can be scored automatically with a scoring script. Such items can be contextualized with images (but see Gruba, 2000, for some caveats). They can also include sound and video files, although the latter are problematic: These files are often rather large, which can lead to unacceptably long download times, and they require an external player, a plug-in, which is beyond the tester's control. This plug-in allows test takers to play a soundfile repeatedly simply by clicking the plug-in's "Play" button.
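
The radio-button multiple-choice items mentioned above can be sketched as a small generator of HTML markup. The function name, the item identifier, and the lettering of option values are assumptions for illustration only; the sketch uses Python's standard `html.escape` so that characters such as `<` in an item stem are displayed rather than interpreted as markup.

```python
from html import escape
from typing import List


def render_multiple_choice(item_id: str, stem: str, options: List[str]) -> str:
    """Build an HTML fragment for one multiple-choice item with radio buttons.

    All radio buttons share the same name, so the browser allows only one
    selection per item. Option values are lettered a, b, c, ... (an assumption).
    """
    lines = [f"<p>{escape(stem)}</p>"]
    for index, option in enumerate(options):
        value = chr(ord("a") + index)
        lines.append(
            f'<label><input type="radio" name="{escape(item_id)}" value="{value}"> '
            f"{escape(option)}</label><br>"
        )
    return "\n".join(lines)


# Hypothetical grammar item.
fragment = render_multiple_choice("q1", "She ___ to school yesterday.", ["go", "went", "gone"])
```

The resulting fragment would be placed inside a FORM element so that the selected value can either be sent to the server or read by a scoring script.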

COPYRIGHT 2001 University of Hawaii, National Foreign Language Resource Center Reproduced with permission of the copyright holder. Further reproduction or distribution is prohibited without permission.

Copyright 2001, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

NOTE: All illustrations and photos have been removed from this article.

