Fraud Detection Framework

Developed a Predictive Fraud Detection framework combining Time Series Analysis, Rule-Based Heuristics, and K-Means Clustering, enabling a major telecom client to identify and eliminate 50% of the fraudulent clone accounts in its network and preventing an annual revenue loss of approximately $7 million.

Introduction

The client is the advanced analytics team of a US-based Fortune 80 telecom giant with a customer base of more than 30M. The presence of clones* in the network causes significant revenue loss, network congestion, and customer dissatisfaction for the organization.
The client had a legacy (heuristic, rule-based) model in place to detect clones. It had delivered significant value by disabling illegitimate devices, but it did not detect newer IPv6 clones and Perfect Clones (clones that are undetectable by clone and geographic features). Moreover, the model did not consider Time Series features.

Problem Statement

Create a Time Series based heuristic model and use its output as additional features for the existing model so that it can detect Perfect Clones.
The detected clones would then undergo a planned disruption of their telecom service to reduce their population in the network.

Solution Approach

Implemented a Last In, First Out (LIFO) based Time Series model on the ADS (at MAC, CMTS & Timestamp level) to flag each MAC & CMTS instance as Clone or Legitimate.
Enhanced the existing heuristic-based model (which used payment information, geographic features & clone features as input) with the Time Series features to create a reliable scoring system.
Further enhanced the scoring system by applying K-Means clustering as an additional layer of validation on top of the current detection model.
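
The first step above can be illustrated with a minimal sketch. The write-up does not spell out the actual LIFO rule, so everything below is an assumption made for illustration: the function name `flag_instances`, the one-hour `window`, and the rule itself (the most recent sighting of a MAC is popped first and its CMTS is treated as the legitimate anchor; a sighting on a different CMTS shortly before an anchor sighting is flagged as a Clone).

```python
from collections import defaultdict
from datetime import datetime, timedelta


def flag_instances(records, window=timedelta(hours=1)):
    """Flag each (MAC, CMTS) pair as "Clone" or "Legitimate".

    records: iterable of (mac, cmts, timestamp) rows.
    The rule and the window are hypothetical, not the client's actual logic.
    """
    by_mac = defaultdict(list)
    for mac, cmts, ts in records:
        by_mac[mac].append((ts, cmts))

    flags = {}
    for mac, events in by_mac.items():
        stack = sorted(events)  # oldest at the bottom, newest on top
        # Last in, first out: the newest sighting defines the anchor CMTS.
        anchor_ts, anchor_cmts = stack.pop()
        flags[(mac, anchor_cmts)] = "Legitimate"
        while stack:
            ts, cmts = stack.pop()
            if cmts == anchor_cmts:
                anchor_ts = ts  # nearest later anchor sighting seen so far
            elif anchor_ts - ts <= window:
                # Same MAC on two CMTSs within a short span: assumed clone.
                flags[(mac, cmts)] = "Clone"
            else:
                # Far apart in time (e.g. a relocation): keep any earlier Clone flag.
                flags.setdefault((mac, cmts), "Legitimate")
    return flags
```

Flags of this kind could then be passed to the heuristic scoring model as extra features; how the real model weighted them is not stated in the source.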

Constraints

Limited time series data: Data was available for only 1 year (i.e., 2020). Hence, instances that could not be classified from this history were flagged as Unknown, Disqualified, or Investigate, based on set rules.
Risk of false positives: Disrupting the service of a legitimate customer who had been wrongly flagged as a clone was an additional risk.
Conflicting flags: In a few instances, the set rule flagged one occurrence of a MAC as a Clone and another occurrence of the same MAC as a Legitimate. This was solved by adding a confidence level flag, where all such instances were given a “Low” confidence level.
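
The confidence flag described above can be sketched as follows. This is an illustrative sketch only: the function name `add_confidence`, the row layout, and the "High" label for non-conflicting MACs are assumptions; the source states only that conflicting instances received "Low".

```python
from collections import defaultdict


def add_confidence(flagged):
    """Attach a confidence level to each flagged occurrence.

    flagged: list of (mac, occurrence_id, flag) rows, where flag is
    "Clone" or "Legitimate". Returns rows with a fourth confidence field.
    """
    # Collect the distinct flags seen for every MAC.
    flags_per_mac = defaultdict(set)
    for mac, _, flag in flagged:
        flags_per_mac[mac].add(flag)

    # A MAC with more than one distinct flag is internally inconsistent,
    # so every one of its occurrences is marked "Low".
    # "High" for the rest is a hypothetical label.
    return [
        (mac, occ, flag, "Low" if len(flags_per_mac[mac]) > 1 else "High")
        for mac, occ, flag in flagged
    ]
```

Marking every occurrence of a conflicting MAC as "Low", rather than only one of them, keeps such a MAC out of any automated disruption until it has been reviewed, which is one way to limit the false-positive risk noted above.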

Impact

Identified 55k additional clones, preventing an annual revenue loss of around $7M.
Prevented extra indirect costs such as infrastructure, operational and overhead expenses.
Improved network performance by decreasing network congestion, leading to enhanced customer satisfaction.

Interesting Fact: Physical door knocks were conducted in collaboration with the local police department at the addresses of 4 MACs that were mapped to an unusually high number of CMTSs.

*Clones: Devices that use copied MAC (Media Access Control) addresses, typically copied by hackers, to access telecom services without paying.