Best Practices for Achieving Success with Custom Modeling and Machine Learning
One of the most common reasons organizations fail to realize significant improvements in risk management after implementing custom modeling solutions can be described with one phrase: Junk in. Junk out. This article discusses best practices for data management, along with other factors that have been shown to improve the performance of custom modeling and machine learning or artificial intelligence.
It’s not just breadth of data, but also quality of data, that is important. One of the biggest misconceptions about machine learning (ML) and artificial intelligence (AI) is that you can just flip a switch and let the technology work its magic. Even if models are self-training, they rely on the data that feeds them being properly labeled and defined. This is a significant operational undertaking that requires diligent post-transaction management to identify when we got it right and when we got it wrong.
Think of the Confusion Matrix. The true outcome of a transaction combines the predicted outcome, which determines whether we accept or decline the order, with the actual outcome. It may be months before we realize an accepted order was in fact missed fraud. Sales insults (false positives) can be very difficult to identify, and many if not most of them remain permanently unrecognized.
A common pitfall is that organizations focus almost entirely on the true positives (fraud caught), which is what models are optimized to catch. Models also need to be trained to reduce false positives (sales insults) and false negatives (missed fraud). This requires post-transaction operational management, such as updating databases and records when missed fraud is identified. Identifying sales insults is an imperfect science, but methods include recognizing when a customer re-attempts a transaction and the result is a legitimate sale, or when a customer complains by calling customer service or posting on social media and the complaint can be definitively linked to a declined order.
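The bookkeeping described above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the function name confusion_counts, the (declined, is_fraud) pair layout, and the use of None for an outcome that is not yet known are all assumptions made for the example.

```python
from collections import Counter


def confusion_counts(orders):
    """Tally confusion-matrix cells from (declined, is_fraud) pairs.

    declined  -- the model's decision on the order (True = declined)
    is_fraud  -- the confirmed outcome, or None while it is still unknown
    """
    counts = Counter()
    for declined, is_fraud in orders:
        if is_fraud is None:
            counts["unknown"] += 1          # outcome not yet resolved
        elif declined and is_fraud:
            counts["true_positive"] += 1    # fraud caught
        elif declined and not is_fraud:
            counts["false_positive"] += 1   # sales insult
        elif not declined and is_fraud:
            counts["false_negative"] += 1   # missed fraud
        else:
            counts["true_negative"] += 1    # good order accepted
    return counts


# Hypothetical sample: one order per cell, plus one unresolved decline.
orders = [
    (True, True),
    (True, False),
    (False, True),
    (False, False),
    (False, False),
    (True, None),
]
counts = confusion_counts(orders)
```

The "unknown" bucket is the point of the sketch: a declined order with no confirmed outcome is neither a fraud catch nor a sales insult until post-transaction work resolves it, and in practice that bucket tends to be large.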
Understand that models have a shelf life. Fraud is a moving target. Fraudsters adjust their patterns and behavior in response to models that catch them. Models tend to experience some level of decay, becoming less effective over time. Much of this decay can be corrected within the model, whether by supervised machine learning or by the call-and-response iterations of risk management experts who adapt their models to changing behaviors. Organizations should leverage both.
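One simple way to notice decay is to compare a recent performance figure against a baseline taken when the model was deployed. The sketch below uses recall (the share of confirmed fraud the model caught) as that figure. The function names, the sample counts, and the 0.10 tolerance are assumptions chosen for illustration, not recommended values.

```python
def recall(caught, missed):
    """Share of confirmed fraud that the model caught."""
    total = caught + missed
    return caught / total if total else 0.0


def has_decayed(baseline_recall, recent_recall, tolerance=0.10):
    """Flag decay when recall drops more than `tolerance` (absolute) below baseline."""
    return (baseline_recall - recent_recall) > tolerance


# Hypothetical counts: at launch vs. the most recent review window.
baseline = recall(caught=90, missed=10)
recent = recall(caught=70, missed=30)
needs_review = has_decayed(baseline, recent)
```

Because missed fraud can take months to surface, the recent window in a check like this should only include orders old enough for their outcomes to have been confirmed.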
Fraud models primarily reliant on ML will eventually reach a point where they need to be scrapped and rebuilt. Incremental adjustments will extend a model’s shelf life but will not make it immortal. Thinking back to the confusion matrix, organizations have a very limited view (if any) of the actual outcome of the orders they reject. While an organization may be able to identify some sales insults, it’s likely just a small sliver of them. It seems somewhat counterintuitive, but we need to let models fail sometimes so they can actually learn from their mistakes. Once a model is established and entrenched, it can continue to cause false positives with no reliable way to tell that it is even doing something wrong.
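One possible reading of "letting models fail" is to accept a small random sample of orders the model would otherwise decline, so that their real outcomes can be observed and fed back as labels. The sketch below assumes that interpretation. The decide function, the score threshold, and the explore_rate parameter are hypothetical names, and whether such a sample is acceptable at all depends on the organization's risk appetite.

```python
import random


def decide(score, threshold, explore_rate, rng):
    """Return (decision, is_exploration) for one order.

    score         -- model risk score, higher means riskier (assumed scale)
    threshold     -- scores at or above this would normally be declined
    explore_rate  -- fraction of would-be declines to accept anyway
    rng           -- a random.Random instance, passed in for reproducibility
    """
    if score < threshold:
        return "accept", False
    if rng.random() < explore_rate:
        # Accepted only to learn the true outcome; tag it for later review.
        return "accept", True
    return "decline", False


rng = random.Random(42)
low_risk = decide(score=0.2, threshold=0.8, explore_rate=0.01, rng=rng)
never_explore = decide(score=0.9, threshold=0.8, explore_rate=0.0, rng=rng)
always_explore = decide(score=0.9, threshold=0.8, explore_rate=1.0, rng=rng)
```

Tagging exploration orders separately keeps them from being mistaken for ordinary accepts when outcomes are later reconciled.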
More models are better than one. One global model that applies to all product categories and regions where an organization is doing business is destined to fail. At the geographical level, organizations must consider that the data available about customers in one country or region can be vastly different from what is available about customers in another. Models should leverage a wide range of risk signals derived from a variety of risk management techniques. Not only will the available techniques and data points differ from country to country, but the value and meaning of the signals they provide can differ as well. In short, an organization needs region-specific models, but country-specific models are even better.
The same can be said for a merchant or organization that sells a variety of goods and services. Digital goods and gift cards need to be treated differently than apparel, which in turn needs to be treated differently than electronics. Digital goods don’t require a shipping address. Some product categories and SKUs are bigger targets for fraud. These differences need to be accounted for in a way that doesn’t muddle risk-decisioning for other categories and products.
To that end, customer-centric models should be differentiated as well. Models should treat a brand-new customer, a recent repeat buyer, and a long-term customer who has built trust and rapport differently. These segments should be siloed so that the actions of never-before-seen buyers don’t make the models stricter against those who are known and trusted. Relying on multiple models doesn’t just make them more effective at preventing fraud, but also at reducing false positives.
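The segmentation described above amounts to routing each order to the most specific model available. The following is a rough sketch under stated assumptions: the tier names, the 365-day cutoff, the (geography, category, tier) key layout, and the placeholder model names are all invented for illustration, and strings stand in for trained model objects.

```python
def customer_tier(prior_orders, days_since_first_order):
    """Bucket a customer by history. The cutoffs are illustrative assumptions."""
    if prior_orders == 0:
        return "new"
    if days_since_first_order >= 365:
        return "long_term"
    return "repeat"


def select_model(models, country, region, category, tier):
    """Return the most specific model available: country first, then region."""
    for geography in (country, region):
        key = (geography, category, tier)
        if key in models:
            return models[key]
    # No global fallback on purpose: a missing segment should be noticed.
    raise KeyError(f"no model for {country}/{region}, {category}, {tier}")


# Placeholder strings stand in for trained model objects.
models = {
    ("DE", "digital_goods", "new"): "de_digital_new",
    ("EU", "digital_goods", "new"): "eu_digital_new",
    ("EU", "apparel", "long_term"): "eu_apparel_long_term",
}

tier = customer_tier(prior_orders=0, days_since_first_order=0)
chosen = select_model(models, "DE", "EU", "digital_goods", tier)    # country-level match
fallback = select_model(models, "FR", "EU", "digital_goods", tier)  # falls back to region
```

Keeping each segment under its own key is what provides the siloing: retraining the model for new customers does not touch the one that scores long-term customers.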
Resist the singularity. One day our robot overlords may punish me for writing this, but for now heed the advice. Do not rely solely on ML or AI. The most successful models rely on the human element. This is why most modeling vendors emphasize that their ML solutions are supervised. We recently published an article discussing the need for human supervision and intervention in cybersecurity and transactional risk modeling contexts while providing recent data examples that support why.
Of course, humans are imperfect too. Models should be managed by a team, not just one person, so that different perspectives can add value and catch human bias that has crept into the models.
The most effective modeling systems mesh ML and human intervention while leveraging bi-directional feedback. The AI identifies data interaction points a person would never have conceived, but the risk management experts must still perform their due diligence to ensure each such feature or data interaction point is relevant and will provide uplift without creating problems. On the other side of the coin, humans can identify patterns and risk signals based on real-world experience, as well as iterative improvements added in response to fraud events and changing patterns over time. These human-defined features are typically the foundation for building models, but modeling analytics should be leveraged to validate that these features are performing as planned, while AI or ML can identify how to tweak and improve them.