The Data Liability Most Business Leaders Don’t Know They Have — Until It’s Too Late
As AI systems grow increasingly dependent on historical data to function accurately, unrecoverable historical data has become a liability that belongs on the C-suite agenda.
Opinions expressed by Entrepreneur contributors are their own.
Key Takeaways
- Companies can no longer treat data as endlessly renewable. We’re facing a “data liability gap”: the difference between the data you think you can access and what you can actually recover in a usable format.
- AI systems depend on complete historical datasets to learn and correct their mistakes, so lost or corrupted data can lead to flawed or incorrect conclusions.
- Many executives assume cloud availability equals data protection. In reality, cloud providers run the service, but partners and customers still own data protection and recovery.
Over the past several years, the corporate world has operated on the assumption that data is endlessly renewable. Storage has been treated as a utility and bandwidth as something that will always be there. Backup has been viewed as a kind of insurance policy. The rise of artificial intelligence has exposed the flaw in these assumptions. As companies rush to adopt AI and predictive analytics, the cost of missing data is becoming far more serious.
We are currently facing a “data liability gap,” which is the difference between the data a company thinks it can access and what it can actually recover in a usable format. Because AI systems depend heavily on historical data to learn and correct their own mistakes, permanent data loss is no longer just an operational hazard; it is serious enough that it may have to be disclosed in year-end reports. If data is lost through negligence, the staff responsible could lose their jobs because of the reputational risk to the business.
For years, the C-suite treated data protection as little more than disaster recovery: getting systems back online as quickly as possible after primary equipment went down. The Recovery Time Objective (RTO) put speed before anything else. Its main aim was to get the servers back up and running.
AI has changed the game completely. What matters to an AI system is not only how long your systems stay online but how complete your historical data is. An AI model will run into severe problems if records from the company’s first five years of existence turn out to have been destroyed or corrupted. Its predictive algorithms will lack the history they need to draw sound conclusions. In the worst case, it will produce misleading or simply wrong conclusions.
Unrecoverable data could cost you heavily
Many CFOs will agree that data is the essential raw material of AI, and that data integrity is what keeps that material usable. A manufacturing company would treat it as a major event if it discovered that even a small amount of the raw materials in its warehouse had been destroyed. There would be a serious investigation and an adjustment to the company’s overall value.
2025 research by ExaGrid with Enterprise Strategy Group found that a mere 1% of organizations are able to recover all of their data after a ransomware attack.
However, when companies discover that important data from 2020 has been corrupted beyond repair, the response may be something like “it’s a pity, but we have to move on.” This happens even though the information in that data could have had immense long-term value for the company.
Cyberattacks are not the only cause of data loss. An estimated 30.2% of organizations using Microsoft 365 lost data in 2025, a 17.2% increase from 2024. The causes included accidental deletions and departing employees failing to hand over data properly.
Why “shared responsibility” is not a good stance
The “availability myth” is a mistaken belief that unfortunately still guides many executives today: the assumption that data is safe simply because the cloud service storing it is readily available. Grant Crough, Founder and CISO at LEAP Strategy, put it well: “Microsoft runs the service, but partners and customers still own data protection and recovery.”
Companies that misunderstand the shared responsibility model have suffered serious data loss. Microsoft’s infrastructure is typically designed to protect businesses against hardware failure, not against errors caused by users. When ransomware hits, the encrypted files can overwrite every synced copy in a SharePoint library.
The only reliable protection against this is an independent backup that follows the 3-2-1 rule: three copies of the data, on two different media types, with one copy kept off-site. Many leaders wrongly believe that Microsoft provides this.
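For teams that want to turn the 3-2-1 rule into something they can check, a minimal Python sketch might look like the following. The `BackupCopy` record, its field names and the sample inventory are all hypothetical and stand in for whatever your own backup inventory actually contains.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class BackupCopy:
    """One copy of a dataset (hypothetical record, not a vendor API)."""
    name: str
    media_type: str   # e.g. "cloud-saas", "disk", "tape"
    offsite: bool     # True if stored away from the primary location


def satisfies_3_2_1(copies: Iterable[BackupCopy]) -> bool:
    """Return True if the copies meet the 3-2-1 rule:
    at least three copies, on at least two media types,
    with at least one copy off-site."""
    copies = list(copies)
    media_types = {c.media_type for c in copies}
    has_offsite = any(c.offsite for c in copies)
    return len(copies) >= 3 and len(media_types) >= 2 and has_offsite


# Illustrative inventory: the production copy counts as one of the three.
inventory = [
    BackupCopy("production tenant", "cloud-saas", offsite=False),
    BackupCopy("local backup appliance", "disk", offsite=False),
    BackupCopy("independent off-site backup", "object-storage", offsite=True),
]
print(satisfies_3_2_1(inventory))
```

A check like this only confirms that the copies exist on paper. It says nothing about whether any of them can actually be restored.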
What the C-Suite must do going forward
For a long time, data management has been left to the server room and the IT team. That needs to change, and the boardroom needs to take more responsibility. The C-suite needs to focus on keeping data recoverable over the long term rather than mainly on getting back online after a disaster.
For instance, leaders must be able to say what percentage of their data can be restored to a good state and whether their backups have copies of their own that can withstand a determined attack. If no one can answer these questions, that in itself reveals a serious weakness in the business. As the AI race continues, the winners will not be those with the most data; they will be those who have built the most durable protection for their data.
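The percentage of data that can be restored to a good state is something a team can measure during a restore test. Here is a rough Python sketch, assuming you recorded a SHA-256 checksum for each record when it was first backed up. The function names and the sample records are hypothetical.

```python
import hashlib
from typing import Mapping


def sha256_hex(data: bytes) -> str:
    """Checksum used to decide whether a restored record is intact."""
    return hashlib.sha256(data).hexdigest()


def restorable_percentage(
    expected_checksums: Mapping[str, str],
    restored: Mapping[str, bytes],
) -> float:
    """Share of records (0-100) that came back from a restore test
    with the checksum recorded at backup time. Missing or altered
    records count as unrecoverable."""
    if not expected_checksums:
        raise ValueError("no records to verify")
    good = sum(
        1
        for key, checksum in expected_checksums.items()
        if key in restored and sha256_hex(restored[key]) == checksum
    )
    return 100.0 * good / len(expected_checksums)


# Illustrative restore test: four records expected, one comes back
# corrupted and one does not come back at all.
expected = {
    "orders-2020": sha256_hex(b"orders"),
    "invoices-2020": sha256_hex(b"invoices"),
    "contracts-2020": sha256_hex(b"contracts"),
    "payroll-2020": sha256_hex(b"payroll"),
}
restored_records = {
    "orders-2020": b"orders",
    "invoices-2020": b"corrupted bytes",
    "contracts-2020": b"contracts",
}
print(restorable_percentage(expected, restored_records))
```

The gap between 100 and the number this reports is one simple way to put a figure on the “data liability gap” for the records you tested.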