Destination: quantum
by Roger Llewellyn
In an age where the undisputed appeal of speed seems to be as
critical in the datacentre as it is on the Formula One track, Roger
Llewellyn, CEO of BI and data warehousing solutions provider Kognitio,
looks at the options businesses have when it comes to analysing data.
Distinguished businessman Jack Welch once said: 'An
organisation's ability to learn, and translate that learning into
action rapidly, is the ultimate competitive business advantage.'
This is probably why most businesses today spend time and money
researching their customers, replacing the 'average white male,
employed, between the ages of 20 and 25' model they have targeted until
now with a much more detailed profile. They have gathered an extensive
amount of information on their customers by storing data related to
purchases (e.g. time, location and nature of sale, response to a
particular promotion), so extracting value from this information
should be a fairly straightforward and fast process. Well, it might be
straightforward (in certain cases) but it is not necessarily fast.
Despite obvious differences in size, weight, and look,
today's computers still rely on what is fundamentally the same
technology their predecessors used 30 years ago, namely the same
mathematical processes, albeit with much better performance,
portability, and disk and memory capacities. So while today's machines
are lighter, faster, and smaller, they need to be 'reinvented' if
they are to be truly different. This is because major breakthroughs in
technology tend to happen when a radically different approach is taken;
for example, it is widely expected that quantum computers, which tackle
information from an entirely new angle, i.e. the atomic level, will make
a massive difference to the way we analyse data because they could
perform tasks that are beyond the practical reach of today's
computers.
Database technology has recently undergone a similarly radical
change; while for many years it relied on indexes in order to categorise
and then find information, today a new breed of solutions is taking
hold. These are the "Relational Database Management Systems"
(RDBMS) that offer very high levels of performance without the need for
indexing or pre-partitioning of the data. The advantage? Users enjoy
flexible and unconstrained access to the data, as well as significantly
lower system set-up and maintenance costs.
Let us take the following example: you have an extensive clothing
catalogue with 2,000 pages of items including men's trousers,
ladies' handbags and children's shoes among others and you
want to find all men's red shirts with white buttons and
contrasting cuffs. With an index-based database the system would work
through the data first to find men's red shirts, then again
to find those with white buttons, and then finally to select those with
contrasting cuffs. If on the other hand you have an RDBMS that does away
with indexes, the system will read the entire catalogue once and provide
you with the answer. You might think that reading the entire catalogue
is a waste of time but this is where the biggest change was made; these
systems are so fast that they can scan every item in less time than it
takes the index-based database to sort through the information.
Sophisticated algorithms and extensive use of memory-based processing
allow these databases to scan each individual entry (or row) for a
match to the query being processed so quickly that there is no need to
build an index. In addition, the elimination of indexes dramatically
reduces overall storage requirements; typically 60%-80% of a traditional
database implementation will be index storage, none of which is required
in the new systems. And finally, the absence of indexes also reduces
overall load times, as the index-build phase is eliminated.
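The catalogue example can be sketched in a few lines of Python. This is only a toy illustration, not how any particular product works: the Item fields, the sample rows and both function names are assumptions made up for the sketch. It shows that three successive passes and one combined pass return the same shirts; it says nothing about which is faster on real hardware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    # Hypothetical catalogue columns, invented for this sketch.
    category: str
    colour: str
    buttons: str
    contrasting_cuffs: bool


# Hypothetical sample rows standing in for the 2,000-page catalogue.
CATALOGUE = [
    Item("mens_shirt", "red", "white", True),
    Item("mens_shirt", "red", "white", False),
    Item("mens_shirt", "blue", "white", True),
    Item("ladies_handbag", "red", "none", False),
    Item("mens_shirt", "red", "black", True),
    Item("childrens_shoes", "white", "none", False),
]


def staged_lookup(rows):
    """Narrow the data in three passes, one per criterion."""
    red_shirts = [r for r in rows
                  if r.category == "mens_shirt" and r.colour == "red"]
    white_buttons = [r for r in red_shirts if r.buttons == "white"]
    return [r for r in white_buttons if r.contrasting_cuffs]


def single_scan(rows):
    """Read every row once, testing all criteria against each row."""
    return [
        r for r in rows
        if r.category == "mens_shirt"
        and r.colour == "red"
        and r.buttons == "white"
        and r.contrasting_cuffs
    ]


matches = single_scan(CATALOGUE)
```

Both functions pick out the same single row here. The single scan needs no auxiliary structure to be built or stored beforehand, which is the trade-off the paragraph above describes.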
But data is not static because it tends to grow; more products are
introduced, seasonal promotions are kicked off, multi-vendor bundles are
launched. So what happens when the volume of data to sieve through
grows? Scan capacity grows with it. This is because this
type of RDBMS relies on massively parallel processing (MPP), which
scales the system in direct proportion to the amount of information
stored. So, for example, a one-blade system will scan 100 million rows
per second, a ten-blade system will scan one billion rows per second,
and so on. This high degree of query scalability means that as data
volumes increase and blades are added to match, query times will remain
constant. This is another similarity
between this new breed of databases and quantum computers. As Professor
Artur Ekert of the University of Oxford said "It [quantum
computing] is like massively parallel processing but in one piece of
hardware." And the data analysis challenge is so great that we find
another connection between these high-performing databases and quantum
computers, namely that one of the key uses for the latter is going to be
the search of vast databases.
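The blade arithmetic quoted above can be worked through in a short Python sketch. It assumes perfectly linear scaling, which is the article's claim rather than a measured result, and the table sizes are hypothetical; only the figure of 100 million rows per second per blade comes from the text.

```python
# Per-blade scan rate quoted in the article.
ROWS_PER_SECOND_PER_BLADE = 100_000_000


def scan_rate(blades: int) -> int:
    """Rows scanned per second, assuming rate grows linearly with blades."""
    return blades * ROWS_PER_SECOND_PER_BLADE


def full_scan_seconds(total_rows: int, blades: int) -> float:
    """Time to read every row once at the assumed linear scan rate."""
    return total_rows / scan_rate(blades)


# Hypothetical table sizes: ten times the data on ten times the blades.
small = full_scan_seconds(1_000_000_000, blades=1)
large = full_scan_seconds(10_000_000_000, blades=10)
```

Under that assumption both scans take the same time, ten seconds each. This is the sense in which query time stays constant as data volumes and blade counts grow together.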
So if 1,600 internet users took eight months and an astonishing
amount of computing power to factor RSA-129 (a 129-digit number) into two
primes, while a quantum computer could crack it in a few seconds, how
much faster is the new variety of databases compared to the traditional
index-based ones? Recent comparisons of our WX2 analytical
database solution against Oracle database implementations carried out by
end users showed queries executing between ten and 60 times
faster. And there is a bonus: these benchmarks were carried out using
hardware platforms costing 50%-70% less than the original Oracle
platform. WX2 was able to satisfy 2,000 queries per day as
opposed to 60 or 70.
The results clearly speak for themselves. And while the elimination
of indexes cannot yet compete with the astonishing gap in performance we
could one day have between a classic supercomputer (billions of years)
and a quantum computer (one year), the feedback is certainly very
positive: "I got my answer back in six seconds. My jaw hit the
ground!" said Jim Lewis, senior research associate at the Cambridge
Astronomical Survey Unit, Cambridge University Institute of Astronomy.
Roger Llewellyn, Kognitio.
www.kognitio.com
COPYRIGHT 2008 A.P. Publications
Ltd. Reproduced with permission of the copyright holder. Further reproduction or distribution is prohibited without permission.