In the remainder of this article, we will examine how the latest
version of the UML standard, UML 2, has been adjusted to meet the needs
of MDD. First, we examine the forces that led to the revision of the
original standard. This is followed by a summary of the major new
language capabilities. For convenience, they have been grouped into five
major categories of changes. Each of these is then described in a
section of its own. The article concludes with a view of current and
anticipated developments related to UML.
THE RATIONALE BEHIND UML 2
UML 2 is the first major revision of the UML standard, following a
series of lesser revisions. (7,8) Why was it necessary to revise UML?
The original UML standard was designed mainly with the
traditional development process in mind: the model was primarily a means
for documenting and communicating high-level design ideas. This did not
require a precise modeling language. Nonetheless, a growing number of
software architects wanted their UML models to be precise specifications
that could serve as formal blueprints to be faithfully realized by the
corresponding software implementation. Any ambiguity in such models
could lead to misinterpretations and invalid realizations. This created
pressure to define the semantics of UML much more precisely.
Simultaneously, many programmers were beginning to see the benefits of
more abstract graphical representations of their code, representations
that were shorn of the noise of programming-language syntax and more
clearly rendered its essence. For example, a graph-based rendering of a
class hierarchy that shows relationships between classes visually is
generally more easily understood than the corresponding textual
representation. This quickly led to the requirement to allow the code to
be manipulated in either graphical or textual form, whichever happened
to be more convenient at the time. Therefore, it was necessary to define
very precisely the formal relationship between the graphics and the code
and also the semantics of UML diagrams.
Both tool vendors and users responded to this pressure by defining
individual specializations of UML. Unfortunately, these custom variants
differed from case to case and from project to project, often based on
dubious or invalid interpretations of the underlying UML concepts. This
threatened to lead to the same kind of fragmentation that the original
standard was intended to eliminate. A new, more precise version of the
standard was clearly necessary to reduce the ambiguities of the original
standard. In addition, a more capable and more clearly defined mechanism
was required to support domain-specific specializations of UML.
Whereas the pressure towards MDD was the primary motivator for UML
2, another key factor was the need to model important new technologies
that had emerged since the first release of the standard, such as
Web-based applications and service-oriented architectures. Although all
of these could be represented by appropriate combinations of existing
UML 1 concepts, there were obvious benefits to providing more direct
ways of modeling these capabilities.
Finally, although we still lack a sound and systematic theory of
modeling language design, much has been learned about suitable ways of
defining, structuring, and using such languages. For example, new
theories of meta-modeling and of model transformations have emerged over
the past 10 years, which need to be incorporated into UML to ensure its
applicability and longevity. Although UML might end up being the
equivalent of FORTRAN in the domain of software modeling languages, it
is worth recalling that FORTRAN is still an active language, almost 50
years after its inception.
WHAT IS NEW IN UML 2
The new developments in UML 2 can be grouped into the following
five major categories, listed in decreasing order of significance:
1. A significantly higher level of precision in the definition of
the language--This is a result of the need to support the higher levels
of automation required for MDD. Automation implies the elimination of
ambiguity and imprecision from models (and, hence, from the modeling
language) so that they can be transformed and analyzed by specialized
computer programs.
2. An improved language organization--This is characterized by a
modularity that not only makes the language more approachable to new
users but also facilitates inter-working between tools.
3. Significant improvements in the ability to model large-scale
software systems--Some modern software applications represent the
integration of existing stand-alone applications into more complex
systems of systems. This is a trend that will likely continue, resulting
in ever more complex systems. To support such trends, flexible new
hierarchical capabilities were added to the language to support software
modeling at arbitrary levels of complexity.
4. Improved support for domain-based specialization--Practical
experience with UML demonstrated the value of its extension mechanisms.
These were consolidated and refined to allow simpler and more precise
refinements of the base language.
5. Overall consolidation, rationalization, and clarification of
various modeling concepts resulting in a simplified and more consistent
language--This involved consolidation of concepts, removal of redundant
concepts, refinement of definitions, and the addition of clarifications
and examples.
Each of these categories is described individually
below.
INCREASED PRECISION OF LANGUAGE DEFINITION
Most early software modeling languages were defined informally,
with little attention paid to precision. More often than not, modeling
concepts were explained using imprecise and informal natural language.
This was deemed sufficient at the time because the majority of modeling
languages were used either for documentation or for what Martin Fowler
refers to as design "sketching". (13) The idea was to convey
the essential properties of a design, leaving detail to be worked out
during implementation.
This, however, often led to confusion because models expressed in
such languages could be--and often were--interpreted differently by
different individuals. Furthermore, unless the question of model
interpretation was explicitly discussed up front, such differences could
remain undetected, to be unmasked only in the later phases of
development when the cost of fixing the resulting problems was much
greater.
In contrast to most other modeling languages of the time, the first
standardized definition of UML was specified using a metamodel in order
to minimize ambiguity. A metamodel is a model that defines the
characteristics of each UML modeling concept and its relationships to
other modeling concepts. The metamodel was defined using what is, in
essence, an elementary subset of UML called the Meta-Object Facility
(MOF), consisting primarily of
concepts defined in UML class diagrams and supplemented with a set of
formal constraints written in the Object Constraint Language (OCL). This
combination represented a formal specification of the abstract syntax of
UML (in contrast to its concrete syntax or notation). The abstract
syntax is the set of rules that can be used to determine whether a given
UML model is well formed. For example, such rules would allow us to
determine that a model in which two UML classes are joined by a state
machine transition is illegal.
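A well-formedness rule of this kind can be made concrete with a small sketch. The Python below is an illustration only: the classes `UmlClass`, `State`, and `Transition` and the function `transition_violations` are hypothetical stand-ins invented here, not part of the UML metamodel or of OCL. The sketch assumes a simplified rule that both ends of a transition must be states.

```python
from dataclasses import dataclass
from typing import List


# Hypothetical, heavily simplified metaclasses (assumed for illustration).
@dataclass(frozen=True)
class UmlClass:
    name: str


@dataclass(frozen=True)
class State:
    name: str


@dataclass(frozen=True)
class Transition:
    source: object
    target: object


def transition_violations(transition: Transition) -> List[str]:
    """Return the violated well-formedness rules (empty if well formed)."""
    problems = []
    # Assumed rule: each end of a transition must be a State.
    for role, end in (("source", transition.source), ("target", transition.target)):
        if not isinstance(end, State):
            problems.append(f"{role} must be a State, got {type(end).__name__}")
    return problems


legal = Transition(State("Idle"), State("Running"))
illegal = Transition(UmlClass("Customer"), UmlClass("Order"))

print(transition_violations(legal))
print(transition_violations(illegal))
```

The check looks only at the structure of the model, not at its notation or its meaning, which is the sense in which abstract syntax rules decide whether a model is well formed.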
Nonetheless, the degree of precision used in this initial UML
metamodel proved insufficient to support the full potential behind MDD
(see, for example, the discussion in Reference 14). In particular, the
specification of the semantics, or meaning, of the UML modeling concepts
remained inadequate for MDD-oriented activities such as automatic code
generation or formal verification.
Consequently, the degree of precision used in the definition of UML
2 was increased significantly. This was achieved by the following means:
• A major refactoring of the language metamodel--The metamodel of
UML, specified using the MOF language, (12) defines the formal rules to
which a well-formed (i.e., syntactically correct) UML model must adhere.
For UML 2, the core of this metamodel was broken up into a set of
fine-grained low-level modeling concepts and patterns that are, in most
cases, too rudimentary or too abstract to be used directly in modeling
software applications. However, their simplicity makes it
relatively easy to be precise about their semantics and the
corresponding well-formedness rules. These finer-grained concepts are
then combined to produce the more complex user-level modeling concepts.
For instance, in UML 1, the notion of ownership (i.e., elements owning
other elements), the concept of namespaces (named collections of
uniquely named elements), and the concept of classifier (elements that
can be categorized according to their features) were all inextricably
bound into a single semantically complex notion. (This also meant that
it was impossible to use any one of these without implying the other
two.) In the new UML 2 metamodel, these concepts were separated, and
the syntax and semantics of each were defined independently.
• Extended and more precise semantics descriptions--Defining the
semantics of the UML 1 modeling concepts was problematic in a number of
ways. The level of description was highly uneven, with some areas having
extensive and detailed descriptions (e.g., state machines), whereas
others had little or no explanation. The UML 2 specification puts much
more emphasis on the semantics, particularly in the key area of
basic behavioral dynamics (see below). (A more detailed discussion of
the semantics of UML can be found in Reference 15.)
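The separation of ownership, namespace, and classifier described in the first point above can be illustrated with a minimal sketch. All names in the Python below (`Ownership`, `Namespace`, `Classifier`, `Package`, `UmlClass`) are hypothetical and chosen for illustration. The sketch uses plain object composition to show fine-grained concepts being combined into user-level ones, and it is not meant to mirror how the UML 2 metamodel itself is organized.

```python
from typing import Dict, List


class Ownership:
    """Ownership only: records which elements an element owns."""

    def __init__(self) -> None:
        self.owned: List[object] = []

    def own(self, element: object) -> None:
        self.owned.append(element)


class Namespace:
    """Naming only: a collection of uniquely named members."""

    def __init__(self) -> None:
        self.members: Dict[str, object] = {}

    def add(self, name: str, element: object) -> None:
        if name in self.members:
            raise ValueError(f"duplicate name in namespace: {name}")
        self.members[name] = element


class Classifier:
    """Classification only: describes elements by their features."""

    def __init__(self) -> None:
        self.features: List[str] = []

    def add_feature(self, feature: str) -> None:
        self.features.append(feature)


class Package:
    """User-level concept using ownership and naming, but not classification."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.ownership = Ownership()
        self.namespace = Namespace()

    def add_member(self, name: str, element: object) -> None:
        self.namespace.add(name, element)  # rejects duplicates before owning
        self.ownership.own(element)


class UmlClass:
    """User-level concept that combines all three fine-grained concepts."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.ownership = Ownership()
        self.namespace = Namespace()
        self.classifier = Classifier()

    def add_attribute(self, name: str) -> None:
        # For brevity, an attribute is represented by its name string.
        self.namespace.add(name, name)
        self.ownership.own(name)
        self.classifier.add_feature(name)


model = Package("billing")
invoice = UmlClass("Invoice")
invoice.add_attribute("total")
model.add_member(invoice.name, invoice)
```

Because each small class has a single responsibility, its rules (for example, name uniqueness in `Namespace`) can be stated and checked in isolation, and `Package` can use naming and ownership without also implying classification.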
COPYRIGHT 2006 All Rights
Reserved. Reproduced with permission of the copyright holder. Further reproduction or distribution is prohibited without permission.
NOTE: All illustrations and photos have been removed from this article.