Martin Langosch, Senior Consultant here at Business Data Partners, provides his view on how Data Science can be applied to enhance the traditional technologies used to combat money laundering.
In 2018, The Independent reported that more than £90bn a year is estimated to be laundered through the UK. As a result, the Money Laundering, Terrorist Financing and Transfer of Funds (Information on the Payer) Regulations 2017 have become a primary focus for financial institutions tackling these crimes. As the amount of data we produce increases in tandem with the adoption of digital currencies, criminal organisations are able to apply increasingly sophisticated methods to introduce illegal money into the economy and finance organised crime or terrorism. Failure by financial service providers to implement solid anti-money laundering (AML) systems can lead to heavy fines: UBS Financial was fined $15m for AML failures, and the FCA imposed a fine of nearly £900k and restrictions on Canara Bank for AML system failings. Financial service providers are turning to technical solutions implemented through Data Science to tackle the new wave of money-laundering tactics. By organising the vast amount of data and producing a complete view of how that data moves, financial service institutions are better equipped to manage the threat of money laundering. But how is this achieved?
What is Money Laundering?
The United Kingdom Government describes money laundering as “exchanging money or assets that were obtained criminally for money or other assets that are ‘clean’. The clean money or assets do not have an obvious link with any criminal activity. Money laundering also includes money that’s used to fund terrorism, however it’s obtained.”

Figure 1
Traditionally, money laundering has been described as a three-stage process:
- Placement: criminally derived funds are introduced into the financial system.
- Layering: the money is ‘washed’, so that the identifiers of its source and ownership are lost.
- Integration: ‘clean’ money is re-introduced into the economy.
(Source: Wilmington Risk & Compliance)
Conventional ways to fight Money Laundering
Traditional AML systems are based on rules engines that help companies to identify suspicious activity. These rule-based systems create alerts and flag transactions as suspicious based on pre-determined rules. Flagged transactions are then manually checked to determine whether a suspicious activity report (SAR) needs to be raised. Once a file of alerts has been logged, specialists are brought in to investigate each report and determine whether it is a true case of money laundering or a false positive. A false positive is a case categorised as suspicious in which the investigation finds only legitimate activity.
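As a rough illustration of the rule-based approach described above, the sketch below flags a transaction as soon as any pre-determined rule fires. The `Transaction` fields, the rule names, the £5,000 threshold and the placeholder country code are all assumptions made for this example and are not taken from any real AML product.

```python
from dataclasses import dataclass

# Hypothetical values chosen for illustration only.
LARGE_AMOUNT_THRESHOLD = 5_000
HIGH_RISK_COUNTRIES = {"XX"}  # placeholder country code


@dataclass
class Transaction:
    tx_id: str
    amount: float
    country: str


# Each rule is a name plus a predicate that returns True when the rule fires.
RULES = {
    "large_amount": lambda tx: tx.amount >= LARGE_AMOUNT_THRESHOLD,
    "high_risk_country": lambda tx: tx.country in HIGH_RISK_COUNTRIES,
}


def triggered_rules(tx: Transaction) -> list[str]:
    """Return the names of all rules that this transaction triggers."""
    return [name for name, rule in RULES.items() if rule(tx)]


def is_flagged(tx: Transaction) -> bool:
    """Binary outcome: the transaction goes to manual review, or it does not."""
    return bool(triggered_rules(tx))
```

Every flagged transaction lands in the same manual review queue, whether it triggered one rule or several, which is what makes the outcome binary.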
The issue with relying on a rule-based system to tackle money laundering is that these traditional systems are rigid and cannot adapt to continuously changing data. Additionally, alerts include a great number of false positives, which are not identified as such until much later in the process. The identification of suspicious transactions is binary, resulting in a True or False classification (money laundering case or legitimate). Dealing with this high number of false positives is time-consuming and costly for financial service providers. The goal, therefore, is to reduce the number of false positives without relaxing the rules, so that real cases of money laundering do not slip through undetected. This is where Data Science can help.
How can Data Science improve AML solutions?

Figure 2
Data Science combines three main areas (see Figure 2): mathematics and statistics, technology, and business understanding. By selecting elements of each area and employing best-practice algorithms, technologies, and relevant business practices, Data Scientists are able to substantially improve current AML systems.
Unlike traditional rule-based AML software, a Data Science approach combines common toolkits, such as R and Python, with the most reliable Machine Learning models and data processing technologies available today.
Data Engineers are trained to combine multiple data sources of varying structures and types to provide comprehensive data-sets to Data Scientists. From these data-sets, Data Scientists will apply Machine Learning techniques, in this instance utilising unsupervised machine learning to see patterns in data and find clusters of suspicious activity. This is very difficult to achieve with traditional SQL queries alone.
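To make the idea of finding clusters without labels more concrete, here is a minimal sketch of k-means clustering written in plain Python. The two features per account (average payment size in £k and payments per day), the sample values and the farthest-point initialisation are assumptions chosen for illustration; a real project would more likely use a tested library implementation on properly scaled features.

```python
import math
from collections import Counter

Point = tuple[float, ...]

# Hypothetical accounts: (average payment in £k, payments per day).
ACCOUNTS: list[Point] = [
    (1.0, 2.0), (1.2, 1.8), (0.9, 2.2), (1.1, 2.1), (1.0, 1.9),
    (4.9, 9.0), (4.8, 9.5), (4.9, 8.7),
]


def initial_centroids(points: list[Point], k: int) -> list[Point]:
    """Deterministic start: first point, then the point farthest from all chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points: list[Point], k: int, iterations: int = 10):
    """Return (centroids, labels) after a fixed number of k-means iterations."""
    centroids = initial_centroids(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, labels


def smallest_cluster(labels: list[int]) -> list[int]:
    """Indices of the points in the least populated cluster."""
    counts = Counter(labels)
    rare = min(counts, key=counts.get)
    return [i for i, label in enumerate(labels) if label == rare]
```

Calling `kmeans(ACCOUNTS, k=2)` and then `smallest_cluster(labels)` singles out the small group of accounts that behaves differently from the rest. That group is a candidate for closer review, not proof of wrongdoing.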
What are the benefits of using Data Science in AML systems?
- Rather than a binary True or False classification, Data Scientists can create suspicion scores to reduce false positives and concentrate manual investigation on high-risk alerts. This saves both time and money.
- Data Pipelines can be built to break down data silos, combining available data sources to create a better picture of the overall transaction.
- Data Science processes can be captured in R or Python scripts, which can be stored and executed on Big Data technology or in the Cloud to enable at-scale processing.
- Machine Learning Models adapt to new data, are very flexible, and reduce the need for human input once trained correctly.
- Traditional tools like SQL are good for data warehousing and BI, but not for pattern recognition and fraud detection; here we need Data Science languages like R and Python.
- Model reconciliation and regular monitoring ensure that changing data does not degrade model quality and that everything keeps working within expected boundaries.
- Model interpretation is key for us to avoid “Black-Box” approaches.
- Reinforcement Learning enables the system to correct its processing rules and reprocess previous data with ease.
- The markup capabilities of R or Jupyter Notebooks can additionally be used to combine code and documentation, resulting in better reporting.
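The suspicion scores mentioned in the first benefit above can be sketched as follows. The indicator names, their weights (points out of 100) and the cut-off of 50 are hypothetical values picked for this example; in practice the weights would typically come from a trained and validated model rather than being set by hand.

```python
from dataclasses import dataclass, field

# Hypothetical indicator weights, in points out of 100.
INDICATOR_WEIGHTS = {
    "just_below_threshold": 35,
    "high_risk_country": 30,
    "rapid_movement": 20,
    "new_beneficiary": 15,
}


@dataclass
class Alert:
    alert_id: str
    indicators: set[str] = field(default_factory=set)


def suspicion_score(alert: Alert) -> int:
    """Sum the weights of the indicators present; unknown indicators count as 0."""
    return sum(INDICATOR_WEIGHTS.get(name, 0) for name in alert.indicators)


def prioritise(alerts: list[Alert], min_score: int = 50) -> list[Alert]:
    """Keep alerts at or above min_score, highest score first."""
    high_risk = [a for a in alerts if suspicion_score(a) >= min_score]
    return sorted(high_risk, key=suspicion_score, reverse=True)
```

Instead of sending every alert to an investigator, `prioritise` returns only the higher-scoring alerts in descending order, so manual effort goes to the riskiest cases first.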
Data Science AML solution in action
Benford’s Law is widely used in fraud detection, so it is no surprise that auditors rely on it often. It is a mathematical law describing how often each leading digit occurs in organically grown, non-manipulated data sets: the probability of a leading digit follows a logarithmic distribution, so small leading digits occur far more often than large ones.
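For readers who want the numbers, Benford’s Law gives the probability of a leading digit d as log10(1 + 1/d). The short sketch below computes that expected distribution; the function name `benford_probability` is simply a label chosen for this example.

```python
import math


def benford_probability(digit: int) -> float:
    """Expected share of values whose leading digit is `digit` (1 to 9)."""
    if not 1 <= digit <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / digit)


# Print the expected share for each leading digit.
for digit in range(1, 10):
    print(f"{digit}: {benford_probability(digit):.1%}")
```

The nine probabilities sum to 1, with roughly 30.1% for the digit 1 falling to about 4.6% for the digit 9.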
According to this law (see Figure 3), the digit “1” appears as the leading digit about 30% of the time, followed by “2” at roughly 18%, and so on down to “9”. Data Scientists can apply this theory to suspected money laundering transactions where breaching a particular threshold would raise suspicion or scrutiny.

Figure 3
As a result, criminals would likely break up larger payments into many small transactions. In this example, a threshold of £5,000 could be avoided by only making payments of up to £4,999. But splitting up payments in this way creates an unusual and unnatural pattern of amounts, which can be flagged by using Benford’s Law.

Figure 4
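One way to turn this check into code is to compare the observed leading digits of a batch of payments with the Benford distribution using a chi-square statistic. This is a minimal sketch under stated assumptions: amounts are in whole pounds or more (values below £1 are skipped), the function names are invented for this example, and 15.507 is the standard chi-square critical value for 8 degrees of freedom at the 5% level.

```python
import math
from collections import Counter

# Chi-square critical value: 8 degrees of freedom, 5% significance level.
CHI2_CRITICAL_8DF_5PCT = 15.507


def leading_digit(amount: float) -> int:
    """First digit of the whole-pound part; assumes amount >= 1."""
    return int(str(int(amount))[0])


def benford_chi_square(amounts: list[float]) -> float:
    """Chi-square distance between observed leading digits and Benford's Law."""
    digits = [leading_digit(a) for a in amounts if a >= 1]
    if not digits:
        raise ValueError("need at least one amount of 1 or more")
    counts = Counter(digits)
    total = len(digits)
    statistic = 0.0
    for digit in range(1, 10):
        expected = total * math.log10(1 + 1 / digit)
        statistic += (counts[digit] - expected) ** 2 / expected
    return statistic


def deviates_from_benford(amounts: list[float]) -> bool:
    """True when the deviation is larger than chance alone would usually explain."""
    return benford_chi_square(amounts) > CHI2_CRITICAL_8DF_5PCT
```

A batch containing many payments just under £5,000 has far more amounts starting with “4” than Benford’s Law expects, which pushes the statistic well above the critical value. A flagged batch is a prompt for investigation rather than a verdict.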
One limitation of Benford’s Law is that it can be inaccurate on low volumes of data. However, Data Science can help here. Because Data Scientists are able to merge various data sources with different data types across many data silos, the volume of data can be kept above the level needed for a reliable result. This is just one example of how statistics, embedded in Data Science, can help to improve AML systems.
The Future of Anti-Money Laundering is Data Science
By 2017, banks globally had paid $321 billion in fines since 2008 for an abundance of regulatory failings, from money laundering to market manipulation and terrorist financing, according to Boston Consulting Group data reported by Bloomberg. With growing amounts of data, reliance on online transactions, and increasing digital-currency adoption, banks must address money laundering not only to stop it, but also to avoid fines. Data Science is an ideal addition to traditional AML systems to halt this threat where it stands. Not only is an AML Data Science solution more comprehensive and accurate, but the reduction of false positives will also save money and human resources. A better understanding and a clear overview of the attributes and paths that identify money laundering will be extremely valuable for your AML experts. Findings can be applied to other parts of the business, including online banking and marketing, resulting in a clear picture of the transactions your business manages.
Business Data Partners is a Data Management and Analytics consultancy with extensive experience in AML environments within UK financial institutions. If you would like to speak to us about our experience and how we can help you, get in touch today.