Why Bad Data Could Cost Entrepreneurs Millions
The one constant is that the only thing worse than not having any data at all is drawing the wrong conclusions from bad data
Being an entrepreneur entails taking risks, including financial ones. But one financial risk many entrepreneurs fail to avoid stems from ignorance of the hidden costs of bad data. As more firms, including startups, become data-driven and build business models that depend on data (artificial intelligence firms, for example), poor-quality data will increasingly become a systemic problem.
The data that entrepreneurs use can cover anything and come from anywhere. For example, ride-hailing apps purchase location data on a regular basis to keep their maps up to date and accurate; marketing firms purchase location data on consumer travel patterns so they can sell out-of-home advertising space; and a new healthcare startup might purchase data on health patterns and trends. The data that these firms purchase comes from a data economy that currently lacks transparency. This lack of transparency increases the likelihood of bad data making its way into a company’s decision-making process.
The High Cost
Research has shown that bad data costs businesses, on average, 30 per cent or more of their revenue. Research firm Gartner has found that the average cost of poor data quality to businesses amounts to anywhere between $9.7 million and $14.2 million annually. At the macro level, bad data is estimated to cost the US economy more than $3 trillion per year. In other words, bad data is bad for business.
For startups in sectors like Internet services, transportation, and data analytics, which have been found to have the highest rates of cash burn, losses from poor-quality data are an unacceptable additional cost. Average burn rates are already high before we factor in these hidden costs: for pre-seed startups in the US, they are just under $18,000 per month. That number grows with the company: $75,000 for seed-round companies, just under $400,000 for Series A, $500,000 for Series B, and $900,000 for later-stage startups.
Today, good quality data is so valuable, and so hard to come by, that startups which own large amounts of transparent, proprietary data of excellent provenance can secure nine-figure valuations on the back of it.
The problem really starts and ends with the data economy. Broadly put, the data economy is the production, flow, purchase, and sale of data. Data is created by data producers, such as ride-sharing apps, social media networks, telcos, banks, and a whole range of private enterprises. It is then stored anonymously in data storage centres and often purchased by a third party that seeks to use the data for its own, separate business purposes.
Middlemen and data aggregators buy and sell all this data on data marketplaces. But such high volumes of data are bought and sold every day, passing through so many levels and changing hands so many times, that it can become hard to verify which data is original and which has been tampered with along the way.
Add to this the problem of non-transparent and anonymous sourcing, and you have a recipe for rampant bad data on a global scale. “Click farms” exist precisely because they can launder their false data through the data economy. That data ends up being purchased by legitimate businesses, which then go on to make decisions often worth hundreds of millions of dollars.
Where It Hurts
Ultimately, bad data can hurt entrepreneurs in different ways. It may mean they launch a store on a street that doesn’t have the footfall and demographics promised, resulting in lower-than-expected revenue. Alternatively, it could mean failing to optimize an iOS or Android app for new-user conversions because the data is leading the programming team to the wrong conclusions, something that could go on for weeks or even months. The one constant is that the only thing worse than not having any data at all is drawing the wrong conclusions from bad data.
The current data supply chain also incurs a hidden opportunity cost. The data economy is chaotic and disorganized, with disparate sources of data making their way into AI algorithms and business decision-making. Without the ability to properly map and organize this data, entrepreneurs will find it hard to build innovative solutions and alternative business models.
For example, we all use different apps in our daily lives, from ride-hailing apps that get us from point A to point B to food delivery apps. Individually, each app's data may provide insights into a user's food preferences or travel habits. Together, though, this data would provide a much more complete picture of the user and spur innovative services and solutions. Yet integrating and mapping this data remains extremely difficult in the current data economy.
So, the next time your startup fails or your end-of-year cash flow is too deep in the red, don’t just question your individual business strategy or product; look at your underlying data assumptions as well. As with any good education, it may not always be pleasant, it may be surprising, it may even be downright depressing, but it could also save you millions.