UK National Archives, Microsoft working to access old
file formats.
by Swartz, Nikki
British Library research suggests Europe loses 3 billion euros each
year in business value because of inaccessible data.
According to BBC News, the UK National Archives, which holds 900
years of written material, has more than 580 terabytes of data--equal to
580,000 encyclopedias--in older file formats that are no longer
commercially available, leaving that information inaccessible.
"If you put paper on shelves, it's pretty certain it is
going to be there in a hundred years," said Natalie Ceeney, chief
executive of the UK National Archives. "If you stored something on
a floppy disc just three or four years ago, you'd have a hard time
finding a modern computer capable of opening it."
The growing problem of accessing old digital file formats is a
"ticking time bomb," according to Ceeney. Speaking at the
launch of a partnership with Microsoft to ensure the Archives could read
old formats, Ceeney said society faced the possibility of "losing
years of critical knowledge" because modern PCs could not always
open old file formats.
Microsoft's UK chief Gordon Frazer said that, unless more work
is done to ensure that legacy file formats can be read and edited in the
future, we face a "digital dark hole."
Ceeney said some digital documents held by the National Archives
have already been lost forever because the programs that are able to
read them no longer exist.
According to the BBC report, the root cause of the problem is the
range of proprietary file formats that proliferated during the early
digital revolution. Technology companies, including Microsoft, used file
formats that were incompatible with software from rival firms--and also
incompatible among different versions of the same program.
Frazer said Microsoft has since shifted its position on file
formats. "Historically within the IT industry, the prevailing trend
was for proprietary file formats," he told the BBC. "We have
worked very hard to embrace open standards, specifically in the area of
file formats."
He said Microsoft's new document file format, Open XML, is
used to save files from programs such as Word, Excel, and PowerPoint,
and it is an open international standard under independent control.
But critics have questioned Microsoft's decision to create its
own new standard rather than adopt a rival system, the Open Document
Format. Microsoft offers a tool that can translate between the two formats.
The agreement between the National Archives and Microsoft focuses
on the use of virtualization, according to the BBC report. The Archives
will be able to open older files in the formats in which they were
originally saved by running emulated versions of older Windows
operating systems on modern computers.
For example, if a Word document was saved using Office 97 under
Windows 95, then the National Archives will be able to open that
document by emulating the older operating system and software on a
modern machine.
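As a rough illustration of that lookup, the Python sketch below maps a
file's original application to the environment that would need to be
emulated. The catalogue, its keys, and the function name are hypothetical
and are not taken from the BBC report or from the Archives' actual
system; only the pairing of Office 97 with Windows 95 comes from the
example above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LegacyEnvironment:
    """An operating system and application pair to run under emulation."""
    operating_system: str
    application: str


# Hypothetical catalogue: (kind of file, software it was saved with)
# mapped to the preserved environment able to open it.
ENVIRONMENTS = {
    ("word-document", "Office 97"): LegacyEnvironment("Windows 95", "Office 97"),
}


def choose_environment(file_kind: str, saved_with: str) -> Optional[LegacyEnvironment]:
    """Return the environment to emulate, or None if none is catalogued."""
    return ENVIRONMENTS.get((file_kind, saved_with))
```

A file with no catalogued environment returns None, which corresponds to
the case Ceeney describes: a document whose reading software no longer
exists.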
According to Ceeney, older file formats present a bigger challenge
than outdated forms of media, such as floppy discs of various sizes and
punch cards.
"The media it is stored in is not relevant," she
explained. "Back-up is important, but back-up is not
preservation."
The British Library and National Archives are members of the
Planets project, which brings together European national libraries and
archives and technology companies to address digital preservation
challenges.
COPYRIGHT 2007 Association of Records Managers &
Administrators (ARMA). Reproduced with permission of the copyright
holder. Further reproduction or distribution is prohibited without
permission.
Copyright 2007 Gale, Cengage Learning. All rights
reserved. Gale Group is a Thomson Corporation Company.