Businesses are embracing an XML future to prepare for
changing times.
by Brown, Alex
XML has just celebrated its tenth birthday, yet while that period
has seen a rapid expansion of XML's technical capabilities, its
transition from the technical to the business realm is still in its
youth. There is little doubt that technologists have been proved right
in their evangelising of XML as "the future"; however, there
are still many lessons to be learned from businesses so that XML can
fulfil its potential and slot smoothly into our digital futures.
Today's changing times put ever more pressure on businesses to
be able to respond quickly, to respond well, and to respond profitably.
Increasingly, companies rely upon fast, efficient movement of data,
whether in e-commerce exchanges with others or when processing data
through their own internal workflows. Whenever data is exchanged,
however, it is vital that it is both consistent and accurate.
No longer just the subject of backroom technical activity, data is
becoming a hot topic for boardroom discussion. With dramatic examples
of identity theft and lost data increasingly in the news, it has
become a priority in many companies. In an information age, data is a
valuable company asset and its use (and abuse) must be strictly
monitored and controlled. Looking ahead, it will not be enough simply
"not to fail"--smart companies which optimise use of their
data will be afforded opportunities at the expense of less savvy
competitors.
Adoption of e-commerce
All organisations that hold sensitive personal data must be
vigilant for identity theft, while industries such as pharmaceuticals
or aerospace need to be very clear on the validity of their data, since
the consequences of error can be catastrophic.
Other industries, such as retail, financial services,
advertising and petrochemicals, are fast adopting e-commerce as
the preferred way to conduct business transactions. Orders, invoices,
supplier and product information messages all rely upon the integrity of
their underlying data to ensure smooth exchange. If the data is poor, it
will always take more time to resolve the issues and consequently will
be more expensive.
XML--fast becoming an orthodoxy
Clearly, the way that data is stored within an organisation
matters for data governance. Data must be agile,
accessible and useful for the long term. XML has established itself as
the most popular format for high volume data storage, and has over
recent years grown from being adopted by many content providers and
publishers, to become the universal language underpinning the Web and
modern application software. Even commodity applications like
Microsoft Office today store their files in an XML format.
XML initiated the revolution when it was published in 1998: it
popularised structured markup and introduced the concept of well-formed
documents. It is an incredibly simple, well-documented, straightforward,
vendor-independent data format, and has lowered the barriers to
entry: any tool that can read text files can display an XML document.
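To make the idea of a well-formed document concrete, the following is a minimal sketch in Python using only the standard library parser. The function name and the two sample documents are invented for illustration; the point is simply that a parser accepts a document whose tags nest and close properly and rejects one whose tags do not.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Raised for unclosed tags, mismatched tags, stray characters, etc.
        return False


# Illustrative samples (not taken from any real data set).
good = "<book><title>Example</title></book>"
bad = "<book><title>Example</book>"  # <title> is never closed

print(is_well_formed(good))
print(is_well_formed(bad))
```

Well-formedness is only the lowest bar: it says the document can be read, not that what it says is correct.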
Longevity for data is an issue. Reading older word processing files
can already be problematic (it is even difficult to find machines with
floppy drives today!). What companies strive for is a universal
standard for storing data, assurance that data is stored in the correct
way, the ability to audit exactly what data they have, and protection
of that data for its long term use.
Flexible and portable
XML is considered today the most portable and flexible document
format since the ASCII file, supporting all human languages. Indeed it
has been so widely adopted that it is even a standard for storing the
data in TV remote controls. It has been standardised by the World Wide
Web Consortium (W3C) as a format for computer documents.
XML is flexible enough to be customised for domains as diverse as
web sites, e-commerce and voice mail systems. However, while it
represents many opportunities for organisations, it is not a panacea and
there has, inevitably, also been considerable hype around XML. XML is
not a programming language, network transport protocol or database. On
its own it has no intelligence and no meaning. It is a format for data,
and as such, to ensure business benefit is derived from any data, there
is a need for validation of the meaning of that data, and for managing
the correctness of an organisation's entire XML data holding--in
other words, XML data governance.
XML Data Governance
Data Governance ensures that the correct schemas, systems,
personnel, procedures, practices and feedback are in place for
well-managed XML data. IT governance is developing in many areas
already, but is really in its infancy for XML processing.
For commercial publishers, a group who have been at the forefront
of XML adoption, data quality is directly related to the value of
the service being provided. For many publishers it is important to
verify the quality of any intellectual content coming into, and flowing
out from, the organisation. It is well known that bad data causes
failures more often than the programmatic communication between
systems does.
There are products on the market today that address the issue of
the technical state of data for digital content. Such products can
validate data and minimise the subsequent problems that may occur
through exchange. Validating the accuracy of data also provides a
publisher's customers with an assurance of quality on approved
material being supplied.
Publishers' value
Cambridge University Press (CUP), for example, has taken this
validation one step further. The organisation has developed an
application that acts as a 'data firewall'. CUP has managed to
reduce its participation in the supply chain process by developing
validation rules that are trusted enough to provide assurances that
content passes a quality threshold. At this point, CUP routes documents
directly from its suppliers to their onward destinations.
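The general shape of such a 'data firewall' can be sketched in a few lines of Python. This is an illustration of the idea only, not a description of CUP's application: the element names, the single rule, the threshold and the two routing labels are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def count_issues(xml_text):
    """Count rule violations; return None if the text is not well-formed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None
    issues = 0
    # Hypothetical rule: every <article> must carry a non-empty <title>.
    for article in root.iter("article"):
        title = article.findtext("title", default="").strip()
        if not title:
            issues += 1
    return issues


def route(xml_text, max_issues=0):
    """Decide where a supplied document goes next (labels are illustrative)."""
    issues = count_issues(xml_text)
    if issues is None or issues > max_issues:
        return "return-to-supplier"
    return "forward"


print(route("<issue><article><title>A</title></article></issue>"))
print(route("<issue><article><title/></article></issue>"))
```

The design choice worth noting is that the decision is automatic: once the rules are trusted, nobody at the publisher needs to inspect a document that passes.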
This application has been developed over time, but has clear
business benefits to all parties. Suppliers can address issues early in
the production process and the publisher gets timely throughput of
content that would not otherwise have been possible. Faster, better and
cheaper--this is the mantra of the successful business.
Griffin Brown Digital Publishing's product, XML-Probe, has been
used to implement some public online services to promote the use of
modern validation. These have to date focused on publishing and include
a service for ONIX for Books, an e-commerce data transmission standard
for the publishing industry, through which potential users can get a
feel for the type of checks that can be done using such validation
technologies.
Unique brands
However, industry level support is not enough for real-world
applications. In reality each content producer, each industry, will have
different standards, which represent that organisation's unique
brand offering. For a publisher these may, for example, be reflected in
certain house style and editorial practices; for a research organisation
it may be reflected in the richness of the data it can provide.
To accommodate a company's unique data sets, a validation
product must support different rule sets. The validation tool then
highlights violations of these rules, according to pre-set constraints.
A rule may govern the way in which a date is written, for example, or
how an author's name is recorded. Validation might be done at
different stages of content production: document creation, review,
classification and publication. The next stage might then be for the
tool to communicate automatically with a cleansing tool.
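A house rule set of this kind can be sketched as a simple table of patterns, one per element. The example below is a minimal illustration in Python using the standard library; the element names (`pubdate`, `author`), the two rules and the sample record are all hypothetical, and a real validation product would offer far richer checks.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical house rules: element name -> (required pattern, message).
HOUSE_RULES = {
    "pubdate": (re.compile(r"\d{4}-\d{2}-\d{2}"),
                "date must be written as YYYY-MM-DD"),
    "author": (re.compile(r"[^,]+, [^,]+"),
               "author must be recorded as 'Surname, Forename'"),
}


def validate(xml_text, rules):
    """Return a list of messages, one per element that breaks a rule."""
    root = ET.fromstring(xml_text)
    violations = []
    for tag, (pattern, message) in rules.items():
        for element in root.iter(tag):
            value = (element.text or "").strip()
            if not pattern.fullmatch(value):
                violations.append(f"<{tag}> '{value}': {message}")
    return violations


# Invented sample record: one author and the date break the house rules.
record = (
    "<record>"
    "<author>Brown, Alex</author>"
    "<author>Alex Brown</author>"
    "<pubdate>12 March 2008</pubdate>"
    "</record>"
)

for violation in validate(record, HOUSE_RULES):
    print(violation)
```

Because the rules live in a table rather than in the checking code, a different organisation, or a different stage of production, can be served by swapping in a different rule set.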
Data--the most valuable asset
There is no doubt that in the digital age, content is king. To
this it might be added "and quality is queen", since, in
practice, the value of any data is defined by its usefulness. The
confidence of users always relies on quality assurance.
For organisations that rely upon data to support their core
business, it is important that they store data in a high quality, agile
format. Those publishers who were slow to move from print publishing and
embrace the web have quickly seen their businesses assailed by more
nimble competitors. Content providers that looked to the future saw that
digital content must be adaptable to be supplied online, and today data
must be made available to hand-held devices such as mobile phones and
BlackBerrys.
Similarly, businesses that rely upon data access for engineers,
field workers and mobile employees need to ensure that their data is
stored in a universal format that is easily understood and read by a
wide range of people and devices. Choosing a non-proprietary technology
such as XML is the future. Forward-looking companies will also already
be assessing the risks of low data quality and proactively addressing
issues of data governance and quality.
www.griffinbrown.co.uk
Alex Brown, Griffin Brown Digital Publishing
COPYRIGHT 2008 A.P. Publications
Ltd. Reproduced with permission of the copyright holder. Further reproduction or distribution is prohibited without permission.