INTRODUCTION
This paper is a companion to "Business activity patterns: A
new model for collaborative business applications," which also
appears in this issue of the IBM Systems Journal. (1)
Today's tools provide little support for team members working
together on a collaborative process. E-mail is the predominant
communication tool used today, and it has been overused for purposes
other than simple communication, such as exchanging files, scheduling
meetings, and archiving data. (2) Using e-mail to manage activities has
many drawbacks. For example, it can be difficult to determine the
current status of an activity which is managed by e-mail, and if people
join an ongoing activity, it can be difficult to bring them up to speed
with other team members.
At the other end of the spectrum are formal
business-process-workflow systems. These systems direct processes and
the people involved in them, but are overly rigid for most everyday
business activities. (3,4) A middle ground between e-mail and workflow
systems would better suit many collaborative activities.
The goal of the Unified Activity Management (UAM) project at IBM
Research is to define a new model for collaborative work based on a
shared semantic representation of collaborative activities. (5-7)
"Activities" as used here refers to a digital schema-based
representation that describes the properties of a collaborative work
project and semantically relates the people, artifacts, tools, events,
and other elements which are involved in carrying out the project.
Examples of activities include organizing a large event or conference,
responding to a request for proposals, and resolving a trouble ticket
(a mechanism used in an organization to detect, report, and resolve a
problem). The activity model and how it is used to support business
applications are described in depth in a companion paper in this issue.
(1)
The UAM approach
The objective of the UAM project (8) is to design a system that
supports collaborative work processes, with multiple people coordinating
their work in order to accomplish a shared goal. Our work is based on
the assumption that there is a great potential benefit in supporting the
non-structured aspects of everyday business activities, those that are
not managed by workflow processes and existing corporate applications.
These kinds of activities are often managed by using handwritten notes,
e-mail, telephone conversations, and other informal means. This
objective has led to a number of choices in how activities are
represented.
First, we believe that activity representations should have
semantics and structure. For instance, each activity has a creator, a
title, a description, and a set of people involved in its execution,
each with a potentially different role (participant, observer, etc.).
Activities may have resources associated with them, such as Web pages or
word-processing documents; resources may be of different types, such as
a reference document or an output of the activity. We hypothesize that
formalizing the activity structure explicitly enables the participants
in the activity to see how the different parts relate to each other and
to more easily track the current status of the activity. In the section
"Unified Activity ontology," we describe our representation of
activities.
Second, we believe that activities are fundamentally composed of
metadata, as opposed to content. Activities serve as the glue that joins
individual items of content created and managed in word processors,
spreadsheets, e-mail, and Web applications. Rather than reinventing each
of these business applications in a new, monolithic application, we take
the position that activities should provide a framework for collecting
all of these items and presenting them in a single, unified view. As a
result, we have developed a model that we call "activities as
service": a lightweight Web service infrastructure for creating,
managing, and querying activity data. We have used this infrastructure
to develop Web-based activity management systems. More important,
however, we believe that activity data is most useful when presented
within the context of the tools and applications people already use.
This paper describes our representation of activities and presents
the Wax system, a Web service framework for activities that leverages a
semantic representation of activity. Wax takes advantage of emerging
technologies such as lightweight (REST [Representational State
Transfer]) Web services, RSS (Rich Site Summary), and the semantic Web
to provide access to activity-related data as a service. We present the
results we obtained in using the Wax system to manage two large business
activities and discuss which features of our design were most helpful to
the participants as they used the system.
Related work
Previous approaches to supporting collaborative tasks generally
fall within the categories of workflow systems or personal information
managers (PIMs).
Formal workflow systems, such as the Coordinator, (9) frequently assume
fixed roles for users and a fixed pattern for actions, and they are
characterized by a rigid specification of the processes to be executed.
Furthermore, workflows
tend to work as independent entities, having little integration with the
rest of the computing environment. A more flexible workflow is described
in Reference 10, wherein end users can modify the process. Our system
goes even further by dispensing with the process model altogether.
The Task Manager (11) is the earliest system of which we are aware
that is based on shared representation of tasks that are malleable and
that relate people and resources. A later system that is even closer to
our approach in using an early semantic network representation is
described in Reference 12.
Shared workspaces, such as the Groove system (13) and Lotus* Notes*
TeamRooms, provide shared access to documents. These systems tend to
be difficult to use for simple, lightweight activities, and it is
unclear how they might integrate ad hoc activity with more formal
business processes or workflows.
PIMs aim at improving personal productivity by organizing
communications, contacts, and events related to an individual. They do
not support shared entities, and external interaction is handled through
messaging. In contrast, our system is centered around activities and
uses them to organize documents, people, and events.
More details about the integration of our system with business
processes are described in Reference 1 and Reference 4.
The remainder of this paper is structured as follows. We begin by
introducing a semantic representation of activity, based on the Resource
Description Framework (RDF), and we describe the ontology used to
represent activities and their properties. We then present the Web
service APIs (application programming interfaces) that we have defined
to provide access to activity data from Web applications and third-party
extensions. Next, we present the user interfaces and client plug-ins
that we have developed, which let users interact with activities.
Finally, we report on the results of two case studies in which the Wax
system was used. Our results indicate that the participants found having
an activity management system to be extremely useful and confirm our
hypothesis that a structured activity representation brings value to
activity management. We conclude with a discussion of directions for
future work.
EXPLICITLY REPRESENTING ACTIVITIES
The goal of activity management is to help users be more productive
by organizing the work they do around the concept of activities. To
help users manage activities, the system must represent them in a
consistent way. This representation should capture the essential
semantics of an activity: the links, relationships, and resources that
differentiate it from other activities.
It is important to distinguish between the typical representation
of real-world activities in the minds of the people involved and
explicit activity representations. Real-world activities are often
implicit (or tacit); people simply perform activities without any
representation of them. Real-world activities can also be deliberately
driven toward a more or less well-articulated objective, as proposed by
Activity Theory. (14) In contrast, real-world activities can also have
explicit representations, such as activity descriptions in some medium,
for example a plan written on a whiteboard. We propose that explicit
computational representations of activities (i.e., representations
enabling an activity to be processed with computational tools) are
useful for managing them. Explicit representations can be more or less
elaborate; it is our intention to support fluid transitions between
various levels of elaboration, based on people's estimates of the
costs and benefits of creating them.
Explicit activity representations can be formal or informal.
Informal representations place no constraints on how the activity is
represented; it may be written down as a textual description or may
consist of scribbles on a Post-It ** note. The goal of our work is to
provide a unified activity representation, which captures the common
properties of activities in a standardized representation so that
activities can be shared and managed by different systems. To achieve
this goal, we require activity representations to follow a formal
vocabulary so that they can be processed with computational tools.
In analyzing real-world activities, we found a large amount of
variability in what is needed to represent different kinds of
activities. For example, an independent consultant may want each of his
activities to include a property denoting the client for which the work
is being done and the billing rate for each client. On the other hand, a
programmer might want to annotate each activity with a defect report
number and the sections of source code that are relevant to the
activity. As a result, activities should be represented as objects with
a large number of optional fields that cannot be predicted. This has
several important implications for the design of the API and the data
model.
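As an illustration only, the idea of an object with unpredictable optional fields can be sketched in Java as a small set of fixed fields plus an open property map. The class name OpenActivity and the property names ("client", "billingRate", "defectReport") are hypothetical and are not part of the system described in this paper.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch: an activity with one fixed field plus open-ended properties. */
class OpenActivity {
    private final String title;
    // Optional, domain-specific properties keyed by an illustrative property name.
    private final Map<String, String> properties = new LinkedHashMap<>();

    OpenActivity(String title) {
        this.title = title;
    }

    String title() {
        return title;
    }

    /** Attach or overwrite an optional property; no schema change is needed. */
    OpenActivity set(String property, String value) {
        properties.put(property, value);
        return this;
    }

    /** Look up an optional property; an absent property is not an error. */
    Optional<String> get(String property) {
        return Optional.ofNullable(properties.get(property));
    }

    public static void main(String[] args) {
        // A consultant and a programmer annotate activities differently.
        OpenActivity consulting = new OpenActivity("Quarterly audit")
                .set("client", "Example Corp")
                .set("billingRate", "150");
        OpenActivity bugFix = new OpenActivity("Fix login failure")
                .set("defectReport", "12345");

        System.out.println(consulting.get("client").orElse("(none)"));
        System.out.println(bugFix.get("client").orElse("(none)"));
    }
}
```

The design choice shown here is that callers must always be prepared for a property to be missing, which is one of the API implications of an open data model.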
Increasing explicitness and formality puts a burden on the user
that must be counterbalanced by some expected payoff. The first
incentive to move to explicit representation is sharing: a
representation of an activity becomes a communication artifact shared by
more than one person or group. The second incentive is the automatic
support provided by the system. Our activity system strives to follow a
principle of incremental benefit: the more the user invests in
constructing a formal representation of an activity, the more support
that user can get from the system, in proportion to the additional
effort invested.
Activity representations in the mind of the user do not require any
support, and informal representations are not particularly problematic;
but the formal representation of activities is quite a challenging
problem. If we truly want to encompass the breadth of human activities
with an activity management tool, it must deal with a difficult
representation problem. Different kinds of activities should be
represented in a common "language" when possible, but the
representation must be extensible to support new types of activities and
functionalities. It is not desirable that the user be required to
specify an activity "type" in advance--the user may not have
decided on an activity type, and the activity may change considerably
during its lifetime; for example, starting as an e-mail or a "to
do" entry and evolving into a multi-person project. What is needed
is a fluid representation that enables activities to change and acquire
new properties.
The representation problem is further complicated by the
requirement to represent not just activities, but potentially all kinds
of domain-specific objects related to them (people, documents, files,
calendar appointments, Web sites, orders, etc.). We came to realize that
the network of relationships that binds other objects to an activity is
one of the most important features of an activity. We could use URLs
(Uniform Resource Locators) or other identifiers to represent these
objects, but then we would be severely limited in what we could do with
them or the kind of queries that would be possible. Ideally, we would
like to associate selected metadata with those objects in order to
support queries within the activity system, for example, displaying all
activities with a calendar appointment occurring today. The challenge of
the task is to represent highly variable objects and their relationships
to potentially any other object.
We have investigated the use of semantic Web (15) technologies to
provide a consistent, standards-based environment for the representation
of activities. RDF, (16) a key component of the semantic Web, provides a
representation flexible enough to support our generalization of
activities by enabling relationships from multiple ontologies and data
sources to be viewed as one coherent data structure: namely, a graph.
RDF comes with a data model to express binary relationships, query
languages (RDQL (17) and SPARQL (18)), several implementations of
storage repositories, a file format (RDF/XML), an ontology language (OWL
(19)), and an inference model. As part of our investigation, we have
developed a unified activity ontology, based on OWL, that defines the
generic concept of collaborative activity and provides a common set of
properties to ensure a level of consistency and uniformity for all types
of activities. This ontology can be easily extended to new relationships
and properties, thus enabling activities to be customized for particular
end users without necessitating a system redesign.
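To make the graph of binary relationships concrete, the following Java sketch stores (subject, predicate, object) statements in memory and answers a query of the kind mentioned earlier (activities with a calendar appointment on a given day). It is a simplified stand-in, not the Jena API or the Wax implementation; the class TripleGraph and the predicate names "hasEvent" and "startDate" are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a graph of binary relationships. */
class TripleGraph {
    record Triple(String subject, String predicate, String object) {}

    private final List<Triple> triples = new ArrayList<>();

    void add(String subject, String predicate, String object) {
        triples.add(new Triple(subject, predicate, object));
    }

    /** Return the triples matching a pattern; a null argument acts as a wildcard. */
    List<Triple> match(String subject, String predicate, String object) {
        List<Triple> result = new ArrayList<>();
        for (Triple t : triples) {
            if ((subject == null || subject.equals(t.subject()))
                    && (predicate == null || predicate.equals(t.predicate()))
                    && (object == null || object.equals(t.object()))) {
                result.add(t);
            }
        }
        return result;
    }

    /** Activities linked to at least one event that starts on the given date. */
    List<String> activitiesWithEventOn(String date) {
        List<String> activities = new ArrayList<>();
        for (Triple event : match(null, "startDate", date)) {
            for (Triple link : match(null, "hasEvent", event.subject())) {
                if (!activities.contains(link.subject())) {
                    activities.add(link.subject());
                }
            }
        }
        return activities;
    }
}
```

Because every fact is a separate statement, new predicates can be added without changing the storage structure; the query only works, however, if event metadata is present in the graph rather than hidden behind an opaque identifier.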
Unified Activity ontology
The Unified Activity ontology defines a few fundamental objects:
(1) activities, (2) actors (people or software agents), (3) events
(calendar entries), and (4) resources (files and URLs). The ontology
builds on the Dublin Core, (20) Friend-Of-A-Friend, (21) and iCalendar
(22) ontologies to describe standardized properties such as titles,
descriptions, and e-mail addresses.
Table 1 summarizes the key properties of the Unified Activity
ontology. The basic entity is the activity, characterized by a few
descriptive properties (title, description, result) and several
relational properties that connect this activity to other objects, such
as actors and resources.
An actor represents a person or software agent in the activity system. The basic
description of an actor includes a name and an e-mail address. An actor
must be involved in an activity with a particular role; we have defined
several roles, such as participant, observer, committed, and doer. (23)
Events represent the time-centric features of an activity. We use
the iCalendar standard to represent properties of events, such as the
title, description, and start and end dates.
Activities may include resources, which represent external
artifacts (Web pages, files) that are related to the activity. Each
resource is described by a unique identifier (URI [Uniform Resource
Identifier]) and a label. Resources may be related to activities in a
variety of roles; for example, a document may be a reference document
(which is consumed in the process of completing the activity) or an
output document (which is produced as a result of the activity).
Activities may be decomposed into subactivities, which are
activities on which the "parent" activity depends in some way.
The subactivity relation provides a way to organize the structure of an
activity and define the breakdown of the work involved in completing it.
In keeping with the RDF graph model, the subactivity relationship is one
that links a parent activity and a child activity. Because the child
activity is not contained in a single parent activity, an activity can
be a child to multiple parent activities.
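The consequence of treating the subactivity relation as a link rather than as containment can be sketched as follows. This is a minimal Java illustration under the assumption that activities are identified by plain strings; the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: subactivity links form a graph, not a containment tree. */
class SubactivityGraph {
    // Maps each parent activity to the activities it links to as children.
    private final Map<String, Set<String>> children = new LinkedHashMap<>();

    /** Record that 'child' is a subactivity of 'parent'. */
    void addSubactivity(String parent, String child) {
        children.computeIfAbsent(parent, k -> new LinkedHashSet<>()).add(child);
    }

    Set<String> childrenOf(String parent) {
        return children.getOrDefault(parent, Set.of());
    }

    /** A child is linked, not contained, so it may have several parents. */
    Set<String> parentsOf(String child) {
        Set<String> parents = new LinkedHashSet<>();
        for (Map.Entry<String, Set<String>> entry : children.entrySet()) {
            if (entry.getValue().contains(child)) {
                parents.add(entry.getKey());
            }
        }
        return parents;
    }
}
```

For example, a "book venue" activity could be linked from both a conference activity and an offsite-meeting activity, and a change to it would be visible from either parent.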
We also support the concept of an activity pattern, which is a
special type of activity that is designed for reuse. Patterns are
well-suited for capturing the best practices for conducting an activity;
if the activity needs to be repeated, one can create an instance of the
pattern and customize it for the new activity. Examples of activity
patterns include planning a meeting, running a software project, and
hiring a new employee. Patterns are explained more in the companion
paper by Moody et al., (1) and one of the case studies later in this
paper illustrates the use of an activity pattern.
RDF as a development environment
Our RDF environment uses the Jena toolkit from HP Labs, which is
licensed under an Apache-like open-source license. From Jena, our
project uses an RDF API, SPARQL and RDQL query processors, an HTTP
(Hypertext Transfer Protocol) RDF API known as Joseki, and a graph API
for persistence. We have augmented the Jena toolkit with a JSP**
(JavaServer Pages**) (24)
tag library that facilitates Web development, and we have augmented the
graph API to enable integration with remote data sources at the RDF
level.
Activity data stored as RDF can be accessed through the JSP tag
library by using constructs such as:
Title:
which retrieves the Dublin Core "title" property from a
node identified as "focus" and prints it in a dynamic Web
page. Activity data can also be accessed through the Jena Java ** API
using constructs such as:
    Resource focus = model.getResource(focusuri);
    System.out.println("Title: " +
        focus.getProperty(DC.title).getString());
which performs the same function as the JSP example. The other way
to access activity data is by using the Joseki HTTP API, which enables
subsets of the RDF graph to be queried. The results are returned in an
RDF serialization format such as RDF/XML.
For modifying RDF data, we provide a higher-level abstraction than
simply adding and removing statements from the model. The
Unified Activity ontology specifies the data model but not the set of
operations that can be performed on that data. Using the "raw"
Java API to make changes can result in data graphs that are inconsistent
with the activity ontology. To ensure consistency, we have developed an
abstraction called a "command" that transforms the RDF data
graph in a specific manner. A command, which is executed in a logged,
reversible transaction, ensures that the resulting model continues to be
consistent with relevant ontologies. While we have prototyped a
scripting language encoded in RDF for developing compound commands, our
deployed system uses plain Java code to implement commands. Examples of
commands include creating a new activity, modifying a person's
involvement, and deleting an event.
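The command idea can be illustrated with a minimal Java sketch in which each command validates its input, changes the graph, and can be reversed from a log. This is not the deployed Wax code: the types below, the predicate names, and the consistency rule (an activity needs a non-blank title and must not already exist) are assumptions chosen for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of logged, reversible commands over a set of statements. */
class ActivityCommands {
    record Triple(String subject, String predicate, String object) {}

    /** A change to the graph that knows how to reverse itself. */
    interface Command {
        void execute(Set<Triple> graph);
        void undo(Set<Triple> graph);
    }

    /** Creates an activity together with the statements this sketch requires. */
    static final class CreateActivity implements Command {
        private final String id;
        private final String title;
        private final String creator;

        CreateActivity(String id, String title, String creator) {
            this.id = id;
            this.title = title;
            this.creator = creator;
        }

        private List<Triple> statements() {
            return List.of(
                    new Triple(id, "type", "Activity"),
                    new Triple(id, "title", title),
                    new Triple(id, "hasCreator", creator));
        }

        @Override
        public void execute(Set<Triple> graph) {
            // Validate before touching the graph so a failure leaves it unchanged.
            if (title == null || title.isBlank()) {
                throw new IllegalArgumentException("an activity needs a title");
            }
            if (graph.contains(new Triple(id, "type", "Activity"))) {
                throw new IllegalStateException("activity already exists: " + id);
            }
            graph.addAll(statements());
        }

        @Override
        public void undo(Set<Triple> graph) {
            graph.removeAll(statements());
        }
    }

    /** Runs commands against one graph and logs them for later reversal. */
    static final class CommandLog {
        private final Set<Triple> graph = new LinkedHashSet<>();
        private final Deque<Command> log = new ArrayDeque<>();

        void run(Command command) {
            command.execute(graph); // only logged if execution succeeded
            log.push(command);
        }

        boolean undoLast() {
            if (log.isEmpty()) {
                return false;
            }
            log.pop().undo(graph);
            return true;
        }

        int size() {
            return graph.size();
        }

        boolean contains(String subject, String predicate, String object) {
            return graph.contains(new Triple(subject, predicate, object));
        }
    }
}
```

Routing every modification through such commands means that callers never add or remove individual statements, so the rules that keep the graph consistent live in one place.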
The Jena Graph API provides access to one of the most compelling
features of RDF: the ability to create a composite graph from different
data sources, including virtual and logical ones. Because nodes and
relationships in the RDF graph are identified by URIs, it is
straightforward to build a composite graph by overlaying multiple graphs
on top of each other. We have exploited this feature to build a
framework for dynamically integrating external data sources into the
core RDF model. For example, we have integrated our enterprise directory
into the activity model, which means that activities can reference the
name, e-mail address, or job title of anyone involved in the activity
automatically, without having to explicitly add these properties to the
model. This mechanism is what enables semantic-level integration, and
theoretically enables any data source to be mapped into the Unified
Activity schema while preserving its own semantics or leveraging other
ontologies.
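The overlay idea can be sketched in plain Java as a union over several sources that all describe nodes by the same URI. This is a simplified illustration and not the Jena Graph API; the Source interface, the helper methods, and the example URIs and predicates used with it are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: a composite graph built by overlaying several sources. */
class CompositeGraph {
    record Triple(String subject, String predicate, String object) {}

    /** Any source able to return the statements about a given subject URI. */
    interface Source {
        Set<Triple> about(String subjectUri);
    }

    private final List<Source> sources = new ArrayList<>();

    /** Add another source to the overlay; returns this graph for chaining. */
    CompositeGraph overlay(Source source) {
        sources.add(source);
        return this;
    }

    /** Union of the statements from all sources; shared URIs align the data. */
    Set<Triple> about(String subjectUri) {
        Set<Triple> merged = new LinkedHashSet<>();
        for (Source source : sources) {
            merged.addAll(source.about(subjectUri));
        }
        return merged;
    }

    /** Convenience source backed by a fixed list of statements. */
    static Source of(Triple... triples) {
        List<Triple> all = List.of(triples);
        return subjectUri -> {
            Set<Triple> result = new LinkedHashSet<>();
            for (Triple t : all) {
                if (t.subject().equals(subjectUri)) {
                    result.add(t);
                }
            }
            return result;
        };
    }
}
```

In this sketch an activity store and a directory could each be wrapped as a Source; a lookup for a person's URI then returns the involvement statements and the directory attributes together, without copying the directory data into the activity store.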
Queries of the composite graph are known as "federated
queries" because they relate data from different data sources
without moving all of that data into a central index. Our framework
provides some support, particularly caching and prefetching, to improve
the performance of these federated queries, but does not resolve all the
performance problems intrinsic to this configuration.
We have also built an access control model on top of our graph
framework. This model works by post-filtering the results of each read,
based on the identity of the user and the policies encoded in the RDF
data. Users who do not have access to data simply do not see it. Calls
to modify the repository can be checked at this level as well. Because
it relies on post-filtering, this access control implementation has
limited performance. In a sample worst-case
scenario, someone may search for all activities containing the term
"the" in a system with 100,000 activities; 50,000 activities
may contain that term, but the user may only have access to 2 of them.
The system will retrieve 50,000 activities and filter out 49,998 of them
from the display, resulting in potentially poor performance.
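The cost of post-filtering can be seen in a small Java sketch that separates the number of activities retrieved from the number the user is allowed to see. The Activity and Result types and the per-activity reader set are assumptions for this illustration; the system described here encodes its policies in the RDF data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of access control applied after the query has run. */
class PostFilterSearch {
    record Activity(String id, String title, Set<String> readers) {}

    /** How many activities were fetched versus how many survive the filter. */
    record Result(int retrieved, List<Activity> visible) {}

    static Result search(List<Activity> all, String term, String user) {
        // Step 1: run the query without regard to who is asking.
        List<Activity> matches = new ArrayList<>();
        for (Activity activity : all) {
            if (activity.title().contains(term)) {
                matches.add(activity);
            }
        }
        // Step 2: discard every match that the user is not allowed to read.
        List<Activity> visible = new ArrayList<>();
        for (Activity activity : matches) {
            if (activity.readers().contains(user)) {
                visible.add(activity);
            }
        }
        return new Result(matches.size(), visible);
    }
}
```

The work done in step 1 is proportional to the number of matches rather than to the number of visible results, which is the source of the performance problem in the worst case.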
Discussion
We consider the role of RDF in activities an open question. Beyond
the superficial (but real) costs of its unfamiliar representations and
API, as well as performance issues, we have found the lack of support
for complex relationships to be the largest problem with RDF as the data
model for activities. RDF encodes binary assertions like activity1
hasCreator person1, which can be either present or not. It does not have
a simple way to express an assertion like "activity1 involves
person1 as of January 15, 2006 according to person2 and with involvement
level of 'responsible'."
RDF offers three approaches, each with some variations, for expressing
information about the relationship between "activity1" and
"person1". The first approach uses what we
call a "relationship node" in which an intermediate node is
interjected to express this relationship. All of the extra information
about the relationship can be attached to the middle node
"activity1-person1-relationship", which is between the nodes
"activity1" and "person1". This relationship node
can be either a blank node (meaning it has no identifying URI) or a
regular URI node. If it is blank, then it must be referred to by the
pair of "activity1" and "person1". Moreover, queries
and retrievals require traversing this extra node, which complicates
matters for developers and impacts performance.
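The relationship-node approach can be sketched in Java as follows. The predicate names ("hasInvolvement", "involves", "level", "since", "assertedBy") are hypothetical and are not taken from the Unified Activity ontology; only the node name follows the example in the text.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the relationship-node encoding. */
class RelationshipNode {
    record Triple(String subject, String predicate, String object) {}

    /** Encode one involvement, with its qualifiers, through an intermediate node. */
    static List<Triple> involvement(String activity, String person,
                                    String level, String since, String assertedBy) {
        String node = activity + "-" + person + "-relationship";
        return List.of(
                new Triple(activity, "hasInvolvement", node),
                new Triple(node, "involves", person),
                // The extra information hangs off the intermediate node.
                new Triple(node, "level", level),
                new Triple(node, "since", since),
                new Triple(node, "assertedBy", assertedBy));
    }

    /** Finding the people involved now takes two hops instead of one. */
    static List<String> peopleInvolved(List<Triple> graph, String activity) {
        List<String> people = new ArrayList<>();
        for (Triple link : graph) {
            if (!link.subject().equals(activity)
                    || !link.predicate().equals("hasInvolvement")) {
                continue;
            }
            for (Triple t : graph) {
                if (t.subject().equals(link.object())
                        && t.predicate().equals("involves")) {
                    people.add(t.object());
                }
            }
        }
        return people;
    }
}
```

A single qualified assertion becomes five statements, and even the simplest question (who is involved in this activity?) requires traversing the intermediate node.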
The second approach is known as "reification". This is
similar to the previous approach in that another node is introduced to
store the extra relationship information, but differs in that the extra
node in this case is peripheral and the data can be accessed without
knowledge about its existence. There are several ways to implement
reification with different implications for the API and performance. We
have observed that these implementations can become complex and can
significantly hinder performance.
The third approach is to use multiple relationships to link
together the same two nodes. For example, one might have the assertions
activity1 involves person1 and activity1 hasResponsible person1. This
approach can be used to encode ordering by including in the model
statements such as: activity1 hasInvolvementOrdering seq1, followed by
seq1 _1 person1, seq1 _2 person2, etc. The disadvantage of this approach
is that the multiple related assertions must be kept consistent, and
having many paths between two objects seems to violate the principle of
keeping a data model simple.
Rather than select a single approach, we use all three of these
approaches in our current ontology. This state is confusing for
developers. Worse, however, is the fact that we intend the RDF
representation to be the logical and complete representation for
activities. As such, we cannot conceal these approaches, but rather must
expose them in the API, query language, and activity representation.
ACTIVITIES AS A WEB SERVICE
In order to realize our vision of having activities integrate
multiple applications, we have provided a lightweight REST (25)
interface to activity data. We have designed two levels of REST APIs to
interact with the server. (26) The lower-level interface (known as the
"RDF-level API") operates directly on RDF data structures. It
is not activity-specific and gives complete freedom to clients. The
higher-level interface (known as the "activity-specific API")
provides simplified access for clients who simply want to perform
standard operations on activities.
While RDF provides a compelling set of features, the
activity-specific API has enabled us to explore different ways of making
activity data accessible in the context of other applications such as
e-mail clients or Web browsers. In fact, we have found that there is
often a decision to be made when integrating activities and other
applications: namely, whether (1) the application data should be mapped
into the activity model or (2) the activity service should expose its
data to a plug-in or extension for the other application.
For example, in the past we have chosen the first of these options
when integrating e-mail into activities by mapping e-mail into a
messaging ontology and implementing a limited e-mail client as part of
an activity-management user interface (UI). Later in this paper we will
present an alternative approach that extends an open-source mail
application with activity data accessed through the Wax activity
service. Both approaches allow users to see activities and related
messages alongside each other. The advantage of the former approach is
the uniform data model; the advantage of the latter approach is the
reuse of an existing e-mail client application. As e-mail clients are
complex and heavily used and have idiosyncrasies that users come to rely
on, the latter approach is much more effective in this case. Moreover,
when bringing activity data to client plug-ins, we have also found that
an XML-based syntax appears easier and more familiar for plug-in
developers than RDF, and that the need for flexibility in this context
is limited.
By providing both RDF and activity-specific interfaces, we can
continue to explore the flexible data model of RDF, while hiding the
decisions we make about representation from the particular XML
serializations that we expose to client applications. We anticipate,
however, that much of the innovation with activity integration will be
in the form of bringing activity data into the context of existing tools
and applications, and that the activity-specific API will be the most
widely used for this purpose. All of the client plug-ins we present
later make use of the activity-specific API.
RDF-level interface
The RDF-level interface provides access to the RDF data model
directly. The interface consists of three methods. The first method
enables the querying of the RDF database by using the standard RDQL or
SPARQL protocols. We have currently implemented