AvinashSingh786 / Fraud-Analysis

Licence: MIT license
Insurance fraud claims analysis project

Fraud Analysis

A project on fraud analysis of insurance claims.

Requirements

- Python3
    - mimesis     4.1.2 (elizabeth is no longer available)
    - faker       5.0.1
    - matplotlib  3.2.0
    - numpy       1.19.1
    - pandas      1.1.5
    - sklearn     0.0
    - pydotplus   2.0.2
    - openpyxl    3.0.5
- Graphviz
    - Windows -> https://graphviz.org/download/
    - Unix -> sudo apt-get install graphviz

Note: This repository isn't maintained often; use it at your own discretion. Error handling was not a priority when scripting.

Installation

pip install -r requirements.txt

After the required packages are installed you can simply run the scripts from the phases below.

Phases

Create Database

Create the database with meaningful random data.

python createDatabase.py
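
As a rough illustration of what "meaningful random data" could look like, here is a minimal sketch. It does not reproduce createDatabase.py: the column names, value ranges and the ~10% fraud rate are all assumptions, and it uses the standard-library random module in place of the faker/mimesis packages listed in the requirements so that it runs without extra dependencies.

```python
import random

# Hypothetical claim categories; the real script may use different ones.
CLAIM_TYPES = ["auto", "home", "health"]


def make_claims(n, seed=0):
    """Return n synthetic claim records as a list of dicts."""
    rng = random.Random(seed)  # seeded so repeated runs give the same data
    claims = []
    for claim_id in range(1, n + 1):
        claims.append({
            "claim_id": claim_id,
            "age": rng.randint(18, 90),
            "claim_type": rng.choice(CLAIM_TYPES),
            "claim_amount": round(rng.uniform(100.0, 50000.0), 2),
            "fraudulent": rng.random() < 0.1,  # assumed fraud rate of about 10%
        })
    return claims


claims = make_claims(1000)
```

Seeding the generator keeps the dataset reproducible, which helps when comparing results across the later phases.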

Cleaning the data

Removes invalid data/rows that do not meet defined criteria and fills in missing data using samples or a mean/median.

python cleaning.py
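
The two steps described above (dropping rows that fail a criterion, then filling gaps) might look roughly like this in pandas. This is only a sketch: the columns, the 18 to 100 age rule and the choice of the median are assumptions, not the rules used in cleaning.py.

```python
import pandas as pd

# Hypothetical raw data with one invalid age and two missing values.
raw = pd.DataFrame({
    "age": [34, -5, 51, None, 42],
    "claim_amount": [1200.0, 800.0, None, 300.0, 950.0],
})

# Keep rows whose age is either missing (filled later) or within 18..100.
valid_age = raw["age"].isna() | raw["age"].between(18, 100)
cleaned = raw[valid_age].copy()

# Fill the remaining gaps with each column's median.
cleaned["age"] = cleaned["age"].fillna(cleaned["age"].median())
cleaned["claim_amount"] = cleaned["claim_amount"].fillna(
    cleaned["claim_amount"].median()
)
```

The median is less sensitive to extreme claim amounts than the mean, which is one reason to prefer it for skewed columns.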

EDA

Exploratory Data Analysis - understand the data and its data types, then use summary statistics and graphs to examine the distribution, correlation, anomalies and outliers of the data.

python eda.py
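
A minimal sketch of the statistics side of this phase, assuming a small hypothetical table with age and claim_amount columns. It shows summary statistics, a correlation matrix and the common 1.5 * IQR rule for flagging outliers; eda.py may use different checks, and the plotting part is omitted here.

```python
import pandas as pd

# Hypothetical sample; the last claim amount is deliberately extreme.
df = pd.DataFrame({
    "age": [25, 32, 47, 51, 38, 29, 60],
    "claim_amount": [900, 1100, 1500, 1600, 1200, 1000, 25000],
})

summary = df.describe()   # count, mean, std, min, quartiles, max per column
correlation = df.corr()   # pairwise Pearson correlation between columns

# Flag values more than 1.5 * IQR outside the quartiles as outliers.
q1 = df["claim_amount"].quantile(0.25)
q3 = df["claim_amount"].quantile(0.75)
iqr = q3 - q1
is_outlier = (df["claim_amount"] < q1 - 1.5 * iqr) | (
    df["claim_amount"] > q3 + 1.5 * iqr
)
outliers = df[is_outlier]
```

With this sample, only the 25000 claim falls outside the IQR fences.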

PPDM

Privacy Preserving Data Mining - suppression, generalization, anatomization, perturbation, categorization and k-anonymity are applied to preserve privacy, so that sensitive attributes cannot identify a person without access to the entire dataset. This makes the data safer if it is leaked, because it becomes harder to impersonate someone.

python ppdm.py
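
To make three of these techniques concrete, the sketch below suppresses a direct identifier, generalizes two quasi-identifiers and then measures k-anonymity. The records, field names and generalization rules (ten-year age bands, two-character postcode prefixes) are invented for illustration and are not taken from ppdm.py.

```python
from collections import Counter

# Hypothetical records with fictional people.
records = [
    {"name": "Alice", "age": 34, "postcode": "2001", "claim_amount": 1200},
    {"name": "Bob", "age": 37, "postcode": "2005", "claim_amount": 800},
    {"name": "Carol", "age": 52, "postcode": "4001", "claim_amount": 300},
    {"name": "Dan", "age": 58, "postcode": "4012", "claim_amount": 950},
]


def anonymise(record):
    """Suppress the name and generalise age and postcode."""
    decade = record["age"] // 10 * 10
    return {
        # "name" is suppressed by simply not copying it over.
        "age_band": f"{decade}-{decade + 9}",
        "postcode": record["postcode"][:2] + "**",
        "claim_amount": record["claim_amount"],  # sensitive value kept as-is
    }


def k_anonymity(rows, quasi_identifiers):
    """Return the size of the smallest group sharing the same quasi-identifiers."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values())


released = [anonymise(r) for r in records]
k = k_anonymity(released, ["age_band", "postcode"])
```

Here every combination of age band and postcode prefix is shared by two records, so the released table is 2-anonymous with respect to those two fields.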

Machine learning

Machine learning was used to detect fraudulent insurance claims. A simple decision tree classifier was trained with a 70/30 train/test ratio. The accuracy of the prediction was ~99% with 73117 training elements and 18280 testing elements. The tree can be seen in insurance.pdf.

python machine_learning.py
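
A minimal sketch of the approach described above, using scikit-learn's DecisionTreeClassifier with a 70/30 split. The data here is a synthetic stand-in generated with make_classification (an assumption made only so the example runs on its own); the real script trains on the claims produced by the earlier phases, and its features and hyperparameters may differ.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the claims table (about 10% positives).
X, y = make_classification(
    n_samples=1000, n_features=8, weights=[0.9, 0.1], random_state=0
)

# 70/30 train/test split, stratified so both parts keep the class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)

# Fraction of test claims classified correctly.
accuracy = accuracy_score(y_test, clf.predict(X_test))
```

Because fraud is usually rare, accuracy alone can look high even when few fraudulent claims are caught, so it may be worth also checking precision and recall for the fraud class.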