What Fraudsters and 'Black Swans' Have in Common, How AI Can Mitigate the Effects of Both
Artificial intelligence (AI) can make predictions to help solve issues as complex as money laundering, but how can we build an AI system that works in the real world, where data is dynamic and the goalposts shift constantly?
Opinions expressed by Entrepreneur contributors are their own.
Artificial intelligence (AI) uses historical data to predict the future. For example, in the case of fraud, historical fraud activity can be used to predict new fraud in real time. Currently, AI is commonly used in anti-money laundering (AML) to track historical data and flag activity that is anomalous relative to the normal distribution of past behavior.
In addition to looking at the facts and drawing conclusions, as a human can, AI can digest large amounts of data and combine it into a single model that can both draw conclusions and make predictions.
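To make the idea of flagging anomalies against historical data concrete, here is a minimal sketch in Python. It assumes a deliberately simple rule: a new transaction amount is suspicious if it lies more than a chosen number of standard deviations from the historical mean. The function name `flag_anomalies`, the threshold of 3.0 and the sample amounts are all illustrative assumptions, not a description of any production AML system, which would typically combine many more signals.

```python
from statistics import mean, stdev


def flag_anomalies(history, new_amounts, threshold=3.0):
    """Return the new amounts that deviate strongly from historical behavior.

    history: past transaction amounts treated as "normal" (hypothetical data).
    threshold: number of standard deviations that counts as anomalous
               (an assumed value, to be tuned per use case).
    """
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation
    if sigma == 0:
        # All historical values are identical: anything different is unusual.
        return [x for x in new_amounts if x != mu]
    return [x for x in new_amounts if abs(x - mu) / sigma > threshold]


# Hypothetical historical amounts and three incoming transactions.
history = [100, 102, 98, 101, 99, 100, 103, 97]
suspicious = flag_anomalies(history, [101, 150, 95])
print(suspicious)
```

With this sample data, only the amount of 150 is flagged. Note that the rule depends entirely on the historical data staying representative, which is exactly the assumption that breaks when data shifts.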
AI, in other words, involves generalization en masse using a sophisticated algorithm that can predict outputs consistent with the historical data.
What about dynamic data?
Predicting based on a set of historical data can work well for certain purposes, but it very much depends on an ideal world where all data is consistent. We know this isn't always the case.
In a theoretical environment or a lab, data is static. In the real world, it tends to be dynamic. If we're thinking about AI as explained above, problems are caused when data shifts and changes — a common occurrence in any real-world business environment.
What happens when data shifts?
If there was ever an example of circumstances changing, the past 18 months have been it.
Hindsight is a wonderful thing: What if we'd known that a severe pandemic was going to hit? How would it have impacted insurance and loan risk, for example? How would it have impacted production models based on millions of data points from 2015 to 2019? Clearly, 2020 caused a huge anomaly, which is often referred to in data as a Black Swan: an "unknown unknown" that, despite the best preparations and the most sophisticated data models, could not have been fully predicted.
This has impacted many of the processes we took for granted. It's all very well to use something like natural language processing (NLP) to sort through customer-service emails, but what about a new influx of emails that concern Covid-19, an issue that has not historically been dealt with or even mentioned?
However, Black Swans like a global pandemic are not the only thing that can dramatically affect the business environment. Fraud is changing and evolving all the time, as fraudsters try to attack from different angles and learn new techniques every day.
When it comes to AI applications that are as mission-critical as fraud detection, solutions must fulfill three key criteria: stability, sustainability and delivery as a system.
Stability
Stability is the word on everyone's lips in 2021 as businesses try to make sure they can stay resilient after a decidedly "unstable" year and adapt their operations to withstand the challenges of the new normal. This is no different in the world of machine learning.
In machine learning, stability is all about how various challenges can be dealt with. While a classic application will be able to take inputs and predict outputs, a truly stable system can do so in spite of environmental factors such as errors or typos in the data, or even bias. A stable system will also be able to note when this isn't happening properly, for whatever reason, and alert us humans.
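One way a system can notice that its inputs no longer look like its training data, and alert a human, is to compare the distribution of a feature in production against a reference sample. The sketch below uses the population stability index (PSI), a swapped-in technique chosen for illustration rather than something the article prescribes. The function names, the ten equal-width bins and the 0.25 alert threshold (a commonly quoted rule of thumb) are assumptions.

```python
import math


def population_stability_index(reference, current, bins=10, eps=1e-6):
    """Compare two samples of one numeric feature; larger means more drift."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if reference is constant

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values to edge bins
            counts[idx] += 1
        # eps keeps empty bins from producing log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


def drift_alert(reference, current, threshold=0.25):
    """Return True when drift exceeds the (assumed) alert threshold."""
    return population_stability_index(reference, current) > threshold


# Hypothetical feature values: training-time sample vs. a shifted production sample.
training_sample = [float(v) for v in range(100)]
shifted_sample = [v + 50.0 for v in training_sample]

print(drift_alert(training_sample, training_sample))  # no drift against itself
print(drift_alert(training_sample, shifted_sample))   # large shift triggers the alert
```

In a real deployment the alert would feed a monitoring dashboard or paging system; here it simply returns a boolean so the logic stays easy to inspect.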
Robustness issues can often rear their heads in the production process: building your proof of concept is a far cry from productizing a stable solution in the real world. Having a clear understanding of the data, as well as possible drifts and changes, is a must. The development team can then validate the robustness of the model as early as possible.
An AI solution can be thought of almost as a living, breathing entity. You can't just build, deploy and move on; it requires constant attention and maintenance over time. Since machine learning is data-driven, it's important to understand that data is dynamic and will change over time. When this happens, the solution needs to be able to adapt. A model that cannot be changed will become irrelevant very quickly and won't be sustainable.
Sustainability
Problems with changing the model are usually related to research. Engineers use data to train a model, but where the solution is not already known or settled, they need to do research. It's important to have a thorough process in which you investigate different directions before the problem is solved, and this should be repeated in production as the data changes to ensure sustainability.
Building a system
As already discussed, both stability and sustainability are critical. They involve various challenges, but these can be overcome with the right investment in fundamentals. However, both elements can only work if the leadership team (e.g., the CIO, AI/ML leader) addresses the production process in the right way.
Currently, there is a huge gap between building a model with a great group of researchers and productizing a valuable AI system. To get to that stage, a system must be developed that includes the ability to monitor and retrain models in production, compare models, collect user feedback, cut through the noise in the data input, mitigate bias and more.
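The "compare models" capability can be sketched as a simple champion/challenger check: a retrained model replaces the one in production only if it does measurably better on recent labeled data. Everything below is a hypothetical illustration. The function names, the `min_gain` margin, the toy threshold "models" and the sample data are assumptions. Accuracy is used only for brevity; for rare events such as fraud, metrics like precision and recall are usually more informative.

```python
def accuracy(model, examples):
    """Fraction of (features, label) pairs the model predicts correctly."""
    correct = sum(1 for features, label in examples if model(features) == label)
    return correct / len(examples)


def choose_model(champion, challenger, recent_examples, min_gain=0.01):
    """Pick the challenger only if it beats the champion by at least min_gain."""
    champion_score = accuracy(champion, recent_examples)
    challenger_score = accuracy(challenger, recent_examples)
    if challenger_score - champion_score >= min_gain:
        return "challenger", challenger_score
    return "champion", champion_score


# Toy stand-ins for real models: flag a transaction as fraud above a fixed amount.
def current_model(amount):
    return amount > 1000


def retrained_model(amount):
    return amount > 500


# Hypothetical recent transactions with confirmed labels (True means fraud).
recent = [(200, False), (600, True), (800, True), (1500, True), (300, False)]

winner, score = choose_model(current_model, retrained_model, recent)
print(winner, score)
```

On this toy data, the production model misses the two mid-sized fraudulent transactions, so the retrained model is selected. A fuller system would run this comparison on a schedule and log the outcome alongside user feedback.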
Kicking AI into high gear
Now, 2021 will be a huge year for AI adoption, especially in a financial-services system that aims to be robust against unexpected events like a pandemic and the ever-growing threat of fraud.
Currently, we're seeing a lot of interesting proofs of concept and even some early adopters achieving ROI from AI. This year, the technology, experience and talent involved in extracting true value from AI will reach critical mass.
There will be a huge and visible distinction between companies that have built the right strategy, teams, tools and relationships with external vendors, and those that fail to adopt the approach of ensuring AI models work as a "stable, sustainable system."