When people talk about transactional data, they usually mean financial transactions. More generally, however, transactional data describe an event that changes something at a specific point in time: each record has a temporal dimension and a numerical value, and relates to one or more objects. This article discusses anomaly detection on transactional data, using the financial transactions of a TAZI partner (vendor invoices from a specific period). The project aims to identify anomalous points in the financial data and to support a hard-coded, rule-based system built on business knowledge.

Before describing the data, it is essential to understand anomaly detection and how it is used to detect fraudulent transactions in the financial industry. Anomaly detection typically refers to the process of identifying outliers in data that consist largely of normal data points. It is therefore critical to find transactions that were generated by different flows and that show different transactional patterns from the normal data. As machine learning algorithms that focus on anomalies have progressed, anomaly detection use cases in the financial industry have become increasingly common.

Looking at the project data, we see invoices from different vendors across the vendor lifecycle. Each invoice transaction has vendor-related features (vendor terms, vendor sites, vendor creation), invoice-term-related features (invoice date, invoice term, invoice due date), and payment-related features (payment amount, remaining amount).

At this point, it is worth explaining the rule-based detection mechanism, which is built on business knowledge. This detection system runs on the raw transactional data. From a business perspective, it applies the three controls listed below.

  • Invoice due date/payment date consistency (Rule 1)
  • Vendor term/invoice term consistency (Rule 2)
  • Invoice term/due date consistency (Rule 3)
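
The three controls above could be sketched roughly as follows. The article does not spell out what "consistency" means for each rule or what the data schema looks like, so the field names and the exact conditions here are assumptions made only for illustration: Rule 1 is read as "payment should not come after the due date", Rule 2 as "the invoice term should match the vendor's term", and Rule 3 as "the due date should equal the invoice date plus the invoice term".

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class Invoice:
    # Hypothetical field names; the real schema is not given in the article.
    invoice_date: date
    due_date: date
    payment_date: date
    invoice_term_days: int
    vendor_term_days: int


def violated_rules(inv: Invoice) -> List[str]:
    """Return the names of the (assumed) rules that this invoice breaks."""
    broken = []
    # Rule 1 (assumed meaning): payment should not come after the due date.
    if inv.payment_date > inv.due_date:
        broken.append("rule_1")
    # Rule 2 (assumed meaning): invoice term should match the vendor's term.
    if inv.invoice_term_days != inv.vendor_term_days:
        broken.append("rule_2")
    # Rule 3 (assumed meaning): due date should be invoice date plus the term.
    if inv.due_date != inv.invoice_date + timedelta(days=inv.invoice_term_days):
        broken.append("rule_3")
    return broken
```

Checks of this kind are easy to audit, but they only catch what someone has already thought to write down as a rule.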

Besides the rule-based detection mechanism, we built a machine-learning-based anomaly detection system using the TAZI platform. During this process, we engineered detailed new features from the raw transactional data to strengthen the model: aggregated behavioral features that cover the whole vendor lifecycle in the system, derived from the payment and invoice dates and the payment amount. After the feature engineering phase, TAZI ran the related algorithms holistically to detect points that show different patterns.
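
To make the idea of aggregated behavioral features more concrete, here is a minimal sketch. It is not TAZI's actual feature set or algorithm, neither of which is described in this article. It assumes each transaction is a tuple of (vendor id, invoice date, payment date, amount), builds per-vendor averages and spreads of payment delay and amount, and uses a plain z-score as a stand-in for a real anomaly detection model. All function and key names are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def vendor_features(transactions):
    """Aggregate per-vendor behaviour.

    `transactions` is an iterable of (vendor_id, invoice_date, payment_date, amount)
    tuples, where the dates are datetime.date objects (assumed layout).
    """
    delays = defaultdict(list)
    amounts = defaultdict(list)
    for vendor_id, invoice_date, payment_date, amount in transactions:
        # Days between invoice and payment for this transaction.
        delays[vendor_id].append((payment_date - invoice_date).days)
        amounts[vendor_id].append(amount)
    return {
        v: {
            "mean_delay_days": mean(delays[v]),
            "std_delay_days": pstdev(delays[v]),
            "mean_amount": mean(amounts[v]),
            "std_amount": pstdev(amounts[v]),
        }
        for v in delays
    }


def delay_zscore(features, vendor_id, invoice_date, payment_date):
    """How unusual a payment delay is relative to the vendor's own history."""
    f = features[vendor_id]
    if f["std_delay_days"] == 0:
        return 0.0  # No variation in the history, so no meaningful score.
    delay = (payment_date - invoice_date).days
    return (delay - f["mean_delay_days"]) / f["std_delay_days"]
```

A transaction with a large absolute z-score pays unusually early or late for that particular vendor, even if it breaks none of the hard-coded rules.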

In this project, TAZI’s anomaly detection proved successful by suggesting 5770 additional data points for review that Rule 1 had not detected. Similarly, the TAZI anomaly detection system caught 4121 data points that do not show normal patterns for Rule 2 and 2547 for Rule 3.

This means that TAZI builds anomaly detection systems that work directly alongside rule-based systems; it is not a competitor to rule-based anomaly detection in the financial industry. The project results show that TAZI supports business knowledge in anomaly detection rather than replacing it.