
Emerging technologies: tag clouds in the blogosphere: electronic literacy and social networking.


Electronic literacy today is a moving target. How and why we read and write online are evolving at the fast pace of Internet time. One of the most striking developments in the past few years has been how new social networking phenomena on the Web like community tagging, shared bookmarking, and blogs have created convergences between consumers and creators, between reading and writing, between public and private spaces. Blogs invite us to write responses to items we have read, to move from observer to participant. Shared tagging invites us to analyze texts and sum up their distinctiveness in keywords. Writing online may involve coding or scripting, as we try to add distinctiveness in formatting or interactive functionality to our texts, blurring the lines between writing and programming. Web browsing and reading must be supplemented by abilities in sorting, navigation, and critical thinking. Integration of other media into texts complicates further the notion of literacy. We will examine in this column some of the ways in which these developments are reflected in new tools, services, and approaches to finding, creating, and transforming texts on the Web, and what this might mean for language learning.

Discovery: Tagging and the Semantic Web

One of the challenges we face in using the Web, whether as language learners or instructors, is in finding the resources appropriate to our needs. We know there is a wealth of information and opportunity on the Web: authentic texts in all languages, on-line communities of learners and practitioners, wonderfully inviting Web sites spotlighting cultural practices, vibrant exchanges of views on all subjects under the sun, and all manner of opportunities for reading and writing--if only we could find them. New methods of finding and identifying Web resources involve fundamental skills of analysis, contextualization, and conceptualization, not to mention reading and writing themselves. You can't "tag" a Web resource without being able to extract the salient points the author makes, consider how to summarize in keywords what's important, and place that text in the context of others.

Of course, the traditional and most widely used means of finding texts, or other Web resources, is to perform a search, most often using Google. With the vastness of the Web today (one report indicates Google indexes over 8 billion Web pages) and the proliferation of junk, googling can be a hit-or-miss proposition. An alternative to searching is browsing by classification, as in the original Yahoo model. This is an area in which librarians and professional organizations have contributed mightily by evaluating, collecting, and annotating categories of texts and resources. A site/service such as Merlot offers expert-based reviewing and ranking of Web sites, including excellent collections of language learning sites. Communities of practice such as Webheads also contribute. On the other hand, many sites that purport to be site collectors are simply commercial endeavors or just place-holders for advertising. As such sites proliferate, students more than ever need skills in critical thinking to be able to sift and evaluate.

One of the proposed solutions to the chaos of the Web, going back to a suggestion from Tim Berners-Lee, the creator of the World Wide Web, is the implementation of what has been called the Semantic Web, a system in which meaningful information about Web texts can be extracted automatically from Web pages and collected by intelligent "agents". Agents are computer programs launched from a server which function autonomously over a period of time, similar to the crawling programs used by search engines to discover and catalog Web pages. By adding meaning to information, the Semantic Web holds the promise of powerful opportunities for creating educational content through combining resources from many sources, using human or machine means, to build a variety of customized learning resources.

The challenge in fulfilling this vision is its reliance on 1) the inclusion of meta-data and 2) an established set of ontologies which explain terms and relations in a given subject area. The ontologies allow agents to make sense of the resource's meta-data. Creating such ontologies is not an easy process, nor one on which consensus is likely to be easy to reach. A recent development that might be of help is the creation of a Web ontology language, OWL, a markup language for publishing and sharing ontologies on the Web. The other technical requirement for the Semantic Web is wide-spread use of meta-data--this has been a tough sell to Web authors. Although meta-data systems such as the Dublin Core and IMS LOM have been in place for some time, they are by no means universally used (even by search engines). The meta-data specification most often associated with the Semantic Web is RDF (Resource Description Framework). RDF describes resources in XML and is meant to be used in situations in which the information needs to be processed by applications, rather than to be displayed to people. The recently proposed RDF/A specification streamlines considerably the creation of RDF by allowing it to be directly embedded into the HTML of a page (added as a simple tag attribute) rather than contained in a separate file or in the page header.
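To make the idea of machine-readable meta-data more concrete, the following minimal Python sketch builds an RDF/XML description of a Web page using a handful of Dublin Core elements. The resource URL, the field values, and the helper name describe_resource are illustrative assumptions, not taken from any of the tools discussed here; only the RDF and Dublin Core namespace URIs are the standard ones.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF and the Dublin Core element set.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Ask ElementTree to serialize with the conventional prefixes.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def describe_resource(url, title, creator, language, subjects):
    """Return an RDF/XML string describing one Web resource.

    Hypothetical helper: the choice of fields is an assumption
    made for illustration only.
    """
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": url}
    )
    for name, value in (("title", title), ("creator", creator), ("language", language)):
        ET.SubElement(description, f"{{{DC_NS}}}{name}").text = value
    # A resource may carry several subject keywords.
    for subject in subjects:
        ET.SubElement(description, f"{{{DC_NS}}}subject").text = subject
    return ET.tostring(root, encoding="unicode")


# Made-up example resource.
xml_text = describe_resource(
    "http://example.org/lesson",
    "Spanish verbs",
    "A. Teacher",
    "es",
    ["Spanish", "grammar"],
)
print(xml_text)
```

An agent or search tool that understands the Dublin Core vocabulary could read the language and subject fields from such a description without having to interpret the page itself.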

The promise of the Semantic Web is evident in the experimental "semantic browser", Magpie, an add-on to Internet Explorer or Mozilla/Firefox, which associates words and phrases in a Web text with available ontologies and keeps track of key terms in dynamically created "collectors". The unique feature of Magpie is that it does not require manually annotated texts but searches and collects based on keywords in the appropriate ontology. Another alternative browser, Conzilla, is a "concept browser" which presents information in the form of context maps. W3C's Amaya is an experimental browser that leverages the combination of ontologies and RDF; it makes use of a W3C project called Annotea, which features shared annotations stored on a central server. An implementation of the kind of text mining and collecting envisioned by the Semantic Web can be seen in the daily news analysis (Europe Media Monitor) available from the Joint Research Centre of the EU. It searches out articles written in a variety of languages in a given subject area, extracts and stores references to places, people, and organizations and generates a geographical map (highlighting mentioned locations) and a set of commented links. As more keywords are used in different news clusters, the system learns over time which entities are associated with one another.
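How a system might learn such associations can be illustrated with simple co-occurrence counting. The sketch below is not the method used by the Europe Media Monitor, whose internals are not described here; it assumes that entity extraction has already produced a list of names for each news cluster, and the sample clusters and the function name count_cooccurrences are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def count_cooccurrences(clusters):
    """Count how often each pair of entities appears in the same cluster.

    `clusters` is a list of lists of entity names (assumed to be
    already extracted from the news articles).
    """
    pairs = Counter()
    for entities in clusters:
        # Deduplicate and sort so each pair has one canonical form.
        for first, second in combinations(sorted(set(entities)), 2):
            pairs[(first, second)] += 1
    return pairs


# Invented sample data: entities mentioned in three news clusters.
clusters = [
    ["Brussels", "European Commission", "Paris"],
    ["Brussels", "European Commission"],
    ["Paris", "UNESCO"],
]

pairs = count_cooccurrences(clusters)
for (first, second), count in pairs.most_common():
    print(f"{first} / {second}: {count}")
```

Pairs that keep turning up together across many clusters receive high counts, which is one plausible basis for treating two entities as associated.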

While the Semantic Web has been mostly of scholarly interest and not widely discussed outside of academic and techie circles, another effort to create order out of chaos on the Web has proven to be explosively popular. Community tagging is a bottom-up, grass-roots phenomenon, in which users classify resources with searchable keywords. The tags are free-form labels chosen by the user, not selected from a controlled vocabulary. The first wide-spread use was on flickr, a site which offers photo-sharing services. Users of flickr are able to add their own tags to any photo. Users can also aggregate pictures into photosets, create public or private groups, and easily add flickr-stored photos to a blog. In the past two years there have been a number of sites and services which make use of this kind of open tagging system. Some of the better-known are del.icio.us, a bookmarking service, Technorati, a blog cataloging site, and digg, a gathering place for tech fans. These sites create clickable "tag clouds" for resources, groupings of tags arranged alphabetically, with the most used or popular keywords highlighted through being shown in a larger font. Figure 1 below shows the most popular tags on flickr from the middle of March, 2006.
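The display logic behind a tag cloud is simple enough to sketch in a few lines of Python. The example below counts how often each tag is used, sorts the tags alphabetically, and scales the font size linearly between a minimum and a maximum. The function name build_tag_cloud, the point sizes, and the sample tags are assumptions made for illustration; flickr and similar sites may well weight tags differently.

```python
from collections import Counter


def build_tag_cloud(tags, min_size=10, max_size=32):
    """Return (tag, font_size) pairs in alphabetical order.

    Font sizes are scaled linearly between min_size and max_size
    according to how often each tag occurs (an assumed scheme).
    """
    counts = Counter(tag.lower() for tag in tags)
    if not counts:
        return []
    lowest = min(counts.values())
    spread = max(counts.values()) - lowest
    cloud = []
    for tag in sorted(counts):
        if spread == 0:
            # All tags equally popular: use the smallest size for all.
            size = min_size
        else:
            size = min_size + (counts[tag] - lowest) * (max_size - min_size) // spread
        cloud.append((tag, size))
    return cloud


# Invented sample of tags applied to a set of photos.
sample_tags = ["wedding", "party", "japan", "wedding", "travel", "japan", "wedding"]
for tag, size in build_tag_cloud(sample_tags):
    print(f"{tag}: {size}pt")
```

A Web page would then render each tag as a link whose font size is the computed value, which produces the familiar visual emphasis on popular keywords.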

[FIGURE 1 OMITTED]

Figure 1. Tag cloud from flickr

One should note that the tags represented here are all in English, but on some sites (particularly on Technorati) other languages are also used. There are tagging sites which cater to other languages such as French (BlogMarks) and Japanese (Livemark). Many such sites make use of RSS (Really Simple Syndication) to notify interested users of changes and new developments. In flickr, RSS feeds can be attached to individual tags, or to photos and discussions. In addition to RSS, flickr and other social networking sites typically offer functions such as search (for users and tags), comments (and comment trails), and APIs (application programming interfaces) for posting to or from the tools, used especially in combination with blogs. An interesting use of RSS in combination with tagging is at the Flashcard Exchange, where, for example, one can view or subscribe to all flashcards posted for learning Spanish (or other languages).
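Because an RSS feed is plain XML, following a tag by RSS amounts to fetching a document and reading its item elements. The following sketch parses a small, made-up RSS 2.0 feed with Python's standard library; the feed contents and the example.org links are placeholders, not the actual output of flickr or the Flashcard Exchange.

```python
import xml.etree.ElementTree as ET

# Invented RSS 2.0 feed for a single tag; a real feed would be
# downloaded from the tagging site instead.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Flashcards tagged spanish</title>
    <link>http://example.org/tags/spanish</link>
    <description>Newest items for one tag</description>
    <item>
      <title>Irregular verbs</title>
      <link>http://example.org/cards/1</link>
    </item>
    <item>
      <title>Food vocabulary</title>
      <link>http://example.org/cards/2</link>
    </item>
  </channel>
</rss>"""


def list_feed_items(feed_xml):
    """Return (title, link) pairs for every item in an RSS 2.0 feed."""
    channel = ET.fromstring(feed_xml).find("channel")
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in channel.findall("item")
    ]


for title, link in list_feed_items(SAMPLE_FEED):
    print(f"{title} -> {link}")
```

A feed reader repeats this step at intervals and shows only the items it has not seen before, which is how users are notified of new material under a tag.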

The tagging process is by no means simply technical--a way of categorizing resources--it also has a strong social dimension as users of the site find common interests and create on-line communities. It represents another example of the blurred line between consumers and creators on the Web today. A contribution to a tagging site, seen by other users, may cause additional tags or comments to be added, automatically building and updating and thus ultimately defining a resource. Instead of one person making a judgment about a blog entry, photo, or other resource, a consensual classification is created. In effect, a text or object identifies itself over time. This creation of "folksonomies", as they have been called, can be seen as a democratic implementation of the Semantic Web. The idea of users becoming creators is one of the key concepts behind what some refer to as Web 2.0. It also involves the kind of social networking and "collective filtering" that can be seen on sites such as Amazon, eBay, or Netflix, in which users' reviews and comments build a self-generating database of information. The emphasis is on the Web as a gathering place in which users both benefit and contribute. Of course, in the process a lot of reading and writing is being done in discussion forums. Feedback or comment forms are part of all social or community networking sites.
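One simple way to picture how a consensual classification emerges is to keep only the tags that several users have applied independently. The sketch below is a hypothetical illustration, not the algorithm of any particular tagging site; the user names, the tags, and the min_users threshold are invented.

```python
from collections import Counter


def consensus_tags(taggings, min_users=2):
    """Return tags applied by at least `min_users` different users.

    `taggings` maps a user name to the list of tags that user gave
    to one resource. Results are ordered by popularity, then by name.
    """
    counts = Counter()
    for user_tags in taggings.values():
        # Normalize case and count each tag once per user.
        counts.update({tag.strip().lower() for tag in user_tags})
    agreed = [tag for tag, users in counts.items() if users >= min_users]
    return sorted(agreed, key=lambda tag: (-counts[tag], tag))


# Invented example: three users tag the same photo.
taggings = {
    "ana": ["Paris", "travel"],
    "ben": ["paris", "eiffel"],
    "chen": ["travel", "paris", "night"],
}

print(consensus_tags(taggings))
```

Idiosyncratic tags used by a single person drop out, while the labels on which users converge come to define the resource, which is the sense in which a text or object "identifies itself over time".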

COPYRIGHT 2006 University of Hawaii, National Foreign Language Resource Center Reproduced with permission of the copyright holder. Further reproduction or distribution is prohibited without permission.

Copyright 2006, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

NOTE: All illustrations and photos have been removed from this article.

