safe-graph / DGFraud-TF2

Licence: Apache-2.0 license
A Deep Graph-based Toolbox for Fraud Detection in TensorFlow 2.X





Introduction | Useful Resources | Installation | Datasets | User Guide | Implemented Models | How to Contribute

Introduction

DGFraud-TF2 is a Graph Neural Network (GNN) based toolbox for fraud detection. It is the TensorFlow 2.X version of DGFraud, which is implemented in TF 1.X. It integrates the implementation and comparison of state-of-the-art GNN-based fraud detection models. An introduction to the implemented models can be found here.

We welcome contributions to this repo, such as adding new fraud detectors and extending the features of the toolbox.

If you use the toolbox in your project, please cite the paper below and the algorithms you used:

CIKM'20 (PDF)

@inproceedings{dou2020enhancing,
  title={Enhancing Graph Neural Network-based Fraud Detectors against Camouflaged Fraudsters},
  author={Dou, Yingtong and Liu, Zhiwei and Sun, Li and Deng, Yutong and Peng, Hao and Yu, Philip S},
  booktitle={Proceedings of the 29th ACM International Conference on Information and Knowledge Management (CIKM'20)},
  year={2020}
}

Useful Resources

Installation

git clone https://github.com/safe-graph/DGFraud-TF2.git
cd DGFraud-TF2
python setup.py install

Requirements

* python>=3.6
* tensorflow>=2.0
* numpy>=1.16.4
* scipy>=1.2.0
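
To check an existing environment against these minimums, a small script along the following lines may help. It is only a sketch: the helper names parse_version and check_requirements are hypothetical and not part of the toolbox, the version parsing is deliberately naive (it keeps the leading digits of each dot-separated component), and importlib.metadata needs Python 3.8 or newer.

```python
import re
import sys
from importlib import metadata  # standard library, Python 3.8+

# Minimum versions copied from the Requirements list above.
MINIMUMS = {
    "tensorflow": (2, 0),
    "numpy": (1, 16, 4),
    "scipy": (1, 2, 0),
}


def parse_version(text):
    """Turn '1.16.4' or '2.4.0rc1' into a tuple of ints, padded to 3 parts."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'post1'
        parts.append(int(match.group()))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check_requirements(minimums=MINIMUMS):
    """Return {package: (installed_version_or_None, meets_minimum)}."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        report[name] = (installed, parse_version(installed) >= minimum)
    return report


print("python", sys.version.split()[0], sys.version_info >= (3, 6))
for package, (installed, ok) in check_requirements().items():
    print(package, installed, "OK" if ok else "missing or too old")
```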

Datasets

DBLP

We use the pre-processed DBLP dataset from Jhy1993/HAN. You can run FdGars, Player2Vec, GeniePath and GEM on the DBLP dataset. Unzip the archive before using the dataset:

cd dataset
unzip DBLP4057_GAT_with_idx_tra200_val_800.zip

Example dataset

We implement example graphs for SemiGNN, GAS and GEM in data_loader.py, because those models require unique graph structures or node types that cannot be found in open-source datasets.

Yelp dataset

For GraphConsis and GraphSAGE, we preprocessed the Yelp Spam Review Dataset with reviews as nodes and three relations as edges.

The dataset, in .mat format, is located at /dataset/YelpChi.zip. The .mat file includes:

  • net_rur, net_rtr, net_rsr: three sparse matrices representing the three homogeneous graphs defined in the GraphConsis paper;
  • features: a sparse matrix of 32-dimensional handcrafted features;
  • label: a numpy array with the ground truth of nodes. 1 represents spam and 0 represents benign.
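
As a sketch of how these fields could be read, the snippet below uses scipy.io.loadmat. The key names come from the list above; the function name load_yelpchi and the path dataset/YelpChi.mat (the assumed result of unzipping the archive) are illustrative assumptions and not part of the toolbox.

```python
import numpy as np
import scipy.sparse as sp
from scipy.io import loadmat

# Keys of the three relation graphs, as listed in this README.
RELATION_KEYS = ("net_rur", "net_rtr", "net_rsr")


def load_yelpchi(path):
    """Read relation graphs, features and labels from a YelpChi-style .mat file."""
    data = loadmat(path)
    # Each relation is a sparse adjacency matrix over the review nodes.
    relations = {key: sp.csr_matrix(data[key]) for key in RELATION_KEYS}
    features = sp.csr_matrix(data["features"])
    # loadmat returns arrays with at least two dimensions, so flatten the labels.
    labels = np.asarray(data["label"]).ravel()
    return relations, features, labels


# Example usage, assuming the archive was unzipped to dataset/YelpChi.mat:
# relations, features, labels = load_yelpchi("dataset/YelpChi.mat")
# print(features.shape, int(labels.sum()), "reviews labelled as spam")
```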

The YelpChi data preprocessing details can be found in our CIKM'20 paper. To get the complete metadata of the Yelp dataset, please email [email protected].

User Guide

Running the example code

You can find the implemented models in the algorithms directory. For example, you can run Player2Vec using:

python Player2Vec_main.py 

You can specify parameters for models when running the code.

Running on your datasets

Have a look at the load_data_dblp() function in utils/utils.py for an example.

To use your own data, you have to provide:

  • adjacency matrices or adjlists (for GAS);
  • a feature matrix;
  • a label matrix.

Then split the feature matrix and label matrix into training data and testing data.
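
The inputs and the split can be sketched with NumPy and SciPy as follows. This is a minimal illustration under assumptions, not the toolbox's own loader: the helper names build_adjacency and split_indices, the toy edge list, and the 60/40 split ratio are all hypothetical. See load_data_dblp() for the format each model actually expects.

```python
import numpy as np
import scipy.sparse as sp


def build_adjacency(edges, num_nodes):
    """Build a symmetric sparse adjacency matrix from (source, target) pairs."""
    rows, cols = zip(*edges)
    values = np.ones(len(edges), dtype=np.float32)
    adj = sp.coo_matrix((values, (rows, cols)), shape=(num_nodes, num_nodes)).tocsr()
    # Mirror the edges so the graph is undirected, then reset every stored value to 1.
    adj = (adj + adj.T).tocsr()
    adj.data[:] = 1.0
    return adj


def split_indices(num_nodes, train_ratio=0.6, seed=0):
    """Shuffle node indices and cut them into training and testing sets."""
    rng = np.random.RandomState(seed)
    order = rng.permutation(num_nodes)
    cut = int(train_ratio * num_nodes)
    return order[:cut], order[cut:]


# Toy graph: 5 nodes on a path, 3 features per node, binary labels.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
adj = build_adjacency(edges, num_nodes=5)
features = np.arange(15, dtype=np.float32).reshape(5, 3)
labels = np.array([0, 1, 0, 0, 1])

# Index the feature and label matrices with the same node split.
train_idx, test_idx = split_indices(num_nodes=5)
x_train, y_train = features[train_idx], labels[train_idx]
x_test, y_test = features[test_idx], labels[test_idx]
print(adj.shape, x_train.shape, x_test.shape)
```

Splitting by node index rather than slicing the matrices directly keeps features and labels aligned, and the same indices can be reused as masks over the full graph.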

You can specify a dataset as follows:

python xx_main.py --dataset your_dataset 

or by editing xx_main.py
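
For illustration, a --dataset flag like the one above is commonly wired up with argparse. The sketch below is an assumption about how such a script could parse it, not a copy of the toolbox's code; the parse_args helper and the default value are hypothetical, and other model parameters would be added in the same way.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; only --dataset is taken from this README."""
    parser = argparse.ArgumentParser(description="Run a fraud detection model.")
    parser.add_argument(
        "--dataset",
        type=str,
        default="dblp",  # assumed default, chosen for this sketch only
        help="name of the dataset to load",
    )
    return parser.parse_args(argv)


# Passing an explicit list mimics: python xx_main.py --dataset your_dataset
args = parse_args(["--dataset", "your_dataset"])
print("Selected dataset:", args.dataset)
```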

The structure of code

The repository is organized as follows:

  • algorithms/ contains the implemented models and the corresponding example code;
  • layers/ contains all GNN layers used by implemented models;
  • dataset/ contains the necessary dataset files;
  • utils/ contains:
    • functions for loading and splitting the data (data_loader.py);
    • various utilities (utils.py).

Implemented Models

Model Source

| Model | Paper | Venue | Reference |
| --- | --- | --- | --- |
| SemiGNN | A Semi-supervised Graph Attentive Network for Financial Fraud Detection | ICDM 2019 | BibTex |
| Player2Vec | Key Player Identification in Underground Forums over Attributed Heterogeneous Information Network Embedding Framework | CIKM 2019 | BibTex |
| GAS | Spam Review Detection with Graph Convolutional Networks | CIKM 2019 | BibTex |
| FdGars | FdGars: Fraudster Detection via Graph Convolutional Networks in Online App Review System | WWW 2019 | BibTex |
| GeniePath | GeniePath: Graph Neural Networks with Adaptive Receptive Paths | AAAI 2019 | BibTex |
| GEM | Heterogeneous Graph Neural Networks for Malicious Account Detection | CIKM 2018 | BibTex |
| GraphSAGE | Inductive Representation Learning on Large Graphs | NIPS 2017 | BibTex |
| GraphConsis | Alleviating the Inconsistency Problem of Applying Graph Neural Network to Fraud Detection | SIGIR 2020 | BibTex |
| HACUD | Cash-Out User Detection Based on Attributed Heterogeneous Information Network with a Hierarchical Attention Mechanism | AAAI 2019 | BibTex |

Model Comparison

| Model | Application | Graph Type | Base Model |
| --- | --- | --- | --- |
| SemiGNN | Financial Fraud | Heterogeneous | GAT, LINE, DeepWalk |
| Player2Vec | Cyber Criminal | Heterogeneous | GAT, GCN |
| GAS | Opinion Fraud | Heterogeneous | GCN, GAT |
| FdGars | Opinion Fraud | Homogeneous | GCN |
| GeniePath | Financial Fraud | Homogeneous | GAT |
| GEM | Financial Fraud | Heterogeneous | GCN |
| GraphSAGE | Opinion Fraud | Homogeneous | GraphSAGE |
| GraphConsis | Opinion Fraud | Heterogeneous | GraphSAGE |
| HACUD | Financial Fraud | Heterogeneous | GAT |
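
When choosing a model for your own data, the comparison above can be treated as a lookup table. The snippet below simply transcribes the table into a Python dictionary and filters it; the names MODELS and find_models are hypothetical helpers for this sketch and do not exist in the toolbox.

```python
# (application, graph type, base models), transcribed from the table above.
MODELS = {
    "SemiGNN": ("Financial Fraud", "Heterogeneous", ("GAT", "LINE", "DeepWalk")),
    "Player2Vec": ("Cyber Criminal", "Heterogeneous", ("GAT", "GCN")),
    "GAS": ("Opinion Fraud", "Heterogeneous", ("GCN", "GAT")),
    "FdGars": ("Opinion Fraud", "Homogeneous", ("GCN",)),
    "GeniePath": ("Financial Fraud", "Homogeneous", ("GAT",)),
    "GEM": ("Financial Fraud", "Heterogeneous", ("GCN",)),
    "GraphSAGE": ("Opinion Fraud", "Homogeneous", ("GraphSAGE",)),
    "GraphConsis": ("Opinion Fraud", "Heterogeneous", ("GraphSAGE",)),
    "HACUD": ("Financial Fraud", "Heterogeneous", ("GAT",)),
}


def find_models(application=None, graph_type=None):
    """Return the names of models matching the given filters, in table order."""
    matches = []
    for name, (app, graph, _base_models) in MODELS.items():
        if application is not None and app != application:
            continue
        if graph_type is not None and graph != graph_type:
            continue
        matches.append(name)
    return matches


print(find_models(graph_type="Homogeneous"))
# ['FdGars', 'GeniePath', 'GraphSAGE']
print(find_models(application="Opinion Fraud", graph_type="Heterogeneous"))
# ['GAS', 'GraphConsis']
```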

How to Contribute

You are welcome to contribute to this open-source toolbox. Currently, you can create a PR or email [email protected] with inquiries.
