Stream-AD / Midas

Licence: apache-2.0
Anomaly Detection on Dynamic (time-evolving) Graphs in Real-time and Streaming manner. Detecting intrusions (DoS and DDoS attacks), frauds, fake rating anomalies.

Projects that are alternatives of or similar to Midas

MStream
Anomaly Detection on Time-Evolving Streams in Real-time. Detecting intrusions (DoS and DDoS attacks), frauds, fake rating anomalies.
Stars: ✭ 68 (-88.49%)
Mutual labels:  intrusion-detection, anomaly-detection
Pysad
Streaming Anomaly Detection Framework in Python (Outlier Detection for Streaming Data)
Stars: ✭ 87 (-85.28%)
Mutual labels:  anomaly-detection, intrusion-detection
Deep Svdd Pytorch
A PyTorch implementation of the Deep SVDD anomaly detection method
Stars: ✭ 320 (-45.85%)
Mutual labels:  anomaly-detection
Flightsim
A utility to generate malicious network traffic and evaluate controls
Stars: ✭ 525 (-11.17%)
Mutual labels:  intrusion-detection
Curve
An Integrated Experimental Platform for time series data anomaly detection.
Stars: ✭ 408 (-30.96%)
Mutual labels:  anomaly-detection
Luminaire
Luminaire is a python package that provides ML driven solutions for monitoring time series data.
Stars: ✭ 316 (-46.53%)
Mutual labels:  anomaly-detection
Anomaly Detection Resources
Anomaly detection related books, papers, videos, and toolboxes
Stars: ✭ 5,306 (+797.8%)
Mutual labels:  anomaly-detection
Wazuh Ruleset
Wazuh - Ruleset
Stars: ✭ 305 (-48.39%)
Mutual labels:  intrusion-detection
Deep Learning For Hackers
Machine Learning tutorials with TensorFlow 2 and Keras in Python (Jupyter notebooks included) - (LSTMs, Hyperparameter tuning, Data preprocessing, Bias-variance tradeoff, Anomaly Detection, Autoencoders, Time Series Forecasting, Object Detection, Sentiment Analysis, Intent Recognition with BERT)
Stars: ✭ 586 (-0.85%)
Mutual labels:  anomaly-detection
Outlier Exposure
Deep Anomaly Detection with Outlier Exposure (ICLR 2019)
Stars: ✭ 343 (-41.96%)
Mutual labels:  anomaly-detection
Agentsmith Hids
By Kprobe technology Open Source Host-based Intrusion Detection System(HIDS), from E_Bwill.
Stars: ✭ 513 (-13.2%)
Mutual labels:  intrusion-detection
Credit Card Fraud Detection Using Autoencoders In Keras
iPython notebook and pre-trained model that shows how to build deep Autoencoder in Keras for Anomaly Detection in credit card transactions data
Stars: ✭ 337 (-42.98%)
Mutual labels:  anomaly-detection
Keras Anomaly Detection
Anomaly detection implemented in Keras
Stars: ✭ 335 (-43.32%)
Mutual labels:  anomaly-detection
Wdbgark
WinDBG Anti-RootKit Extension
Stars: ✭ 450 (-23.86%)
Mutual labels:  anomaly-detection
Osquery
SQL powered operating system instrumentation, monitoring, and analytics.
Stars: ✭ 18,475 (+3026.06%)
Mutual labels:  intrusion-detection
Loghub
A large collection of system log datasets for AI-powered log analytics
Stars: ✭ 551 (-6.77%)
Mutual labels:  anomaly-detection
Ano pred cvpr2018
Official implementation of Paper Future Frame Prediction for Anomaly Detection -- A New Baseline, CVPR 2018
Stars: ✭ 309 (-47.72%)
Mutual labels:  anomaly-detection
Ossec Hids
OSSEC is an Open Source Host-based Intrusion Detection System that performs log analysis, file integrity checking, policy monitoring, rootkit detection, real-time alerting and active response.
Stars: ✭ 3,580 (+505.75%)
Mutual labels:  intrusion-detection
Maltrail
Malicious traffic detection system
Stars: ✭ 4,296 (+626.9%)
Mutual labels:  intrusion-detection
Telemanom
A framework for using LSTMs to detect anomalies in multivariate time series data. Includes spacecraft anomaly data and experiments from the Mars Science Laboratory and SMAP missions.
Stars: ✭ 589 (-0.34%)
Mutual labels:  anomaly-detection

MIDAS

C++ implementation of

The old implementation lives in a separate branch, OldImplementation. Consider it archived: it is unlikely to receive feature updates.

Features

  • Finds Anomalies in Dynamic/Time-Evolving Graph: (Intrusion Detection, Fake Ratings, Financial Fraud)
  • Detects Microcluster Anomalies (suddenly arriving groups of suspiciously similar edges e.g. DoS attack)
  • Theoretical Guarantees on False Positive Probability
  • Constant Memory (independent of graph size)
  • Constant Update Time (real-time anomaly detection to minimize harm)
  • Up to 55% more accurate and 929 times faster than state-of-the-art approaches
  • Experiments are performed using the following datasets:

Demo

If you use Windows:

  1. Open a Visual Studio developer command prompt (the build uses its toolchain)
  2. cd to the project root MIDAS/
  3. cmake -DCMAKE_BUILD_TYPE=Release -GNinja -S . -B build/release
  4. cmake --build build/release --target Demo
  5. cd to MIDAS/build/release/
  6. .\Demo.exe

If you use Linux/macOS:

  1. Open a terminal
  2. cd to the project root MIDAS/
  3. cmake -DCMAKE_BUILD_TYPE=Release -S . -B build/release
  4. cmake --build build/release --target Demo
  5. cd to MIDAS/build/release/
  6. ./Demo

The demo runs on MIDAS/data/DARPA/darpa_processed.csv, which has 4.5M records, with the filtering core (MIDAS-F).

The scores are exported to MIDAS/temp/Score.txt; a higher score means a more anomalous record.

All file paths are absolute and "hardcoded" by CMake. Even so, we suggest NOT launching the executable by double-clicking it.

Requirements

Core

  • C++11
  • C++ standard libraries

Demo (with the experimental ROC-AUC implementation)

  • C++ standard libraries

Demo (with the sklearn ROC-AUC implementation)

  • Python 3 (MIDAS/util/EvaluateScore.py)
    • pandas: I/O
    • scikit-learn: Compute ROC-AUC

Experiment

  • (Optional) Intel TBB: Parallelization
  • (Optional) OpenMP: Parallelization

Other python utility scripts

  • Python 3
    • pandas
    • scikit-learn

Customization

Switch to sklearn ROC-AUC Implementation

In MIDAS/example/Demo.cpp, comment out the section "Evaluate scores (experimental)", then uncomment the sections "Write output scores" and "Evaluate scores".

Different CMS Size / Decay Factor / Threshold

These are arguments to the cores' constructors, which are called at MIDAS/example/Demo.cpp:67-69.
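To give a feel for what the CMS size and the decay factor control, here is a minimal count-min sketch in C++11. It is an illustration only, not the class shipped in MIDAS/src: the name ToyCountMinSketch, its methods, and the hash function are all hypothetical. The point it shows is that memory is fixed by the two size arguments, and that a decay factor scales every counter down so old edges fade.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative count-min sketch (CMS); NOT the class used in MIDAS/src.
// numRow hash rows x numColumn buckets: memory is fixed at construction,
// no matter how many distinct edges the stream contains. numRow must be >= 1.
class ToyCountMinSketch {
public:
    ToyCountMinSketch(std::size_t numRow, std::size_t numColumn)
        : numRow_(numRow), numColumn_(numColumn), cells_(numRow * numColumn, 0.0f) {}

    // Count one occurrence of the edge (source, destination).
    void Add(std::uint32_t source, std::uint32_t destination) {
        for (std::size_t row = 0; row < numRow_; ++row)
            cells_[Index(row, source, destination)] += 1.0f;
    }

    // Estimated count: the minimum over rows. Collisions can only inflate
    // a bucket, so the minimum is the least-inflated estimate.
    float Estimate(std::uint32_t source, std::uint32_t destination) const {
        float best = cells_[Index(0, source, destination)];
        for (std::size_t row = 1; row < numRow_; ++row)
            best = std::min(best, cells_[Index(row, source, destination)]);
        return best;
    }

    // Scale every counter by `factor` (e.g. 0.5) so that old edges fade out.
    void Decay(float factor) {
        for (float& cell : cells_) cell *= factor;
    }

    std::size_t CellCount() const { return cells_.size(); }

private:
    // Hypothetical per-row hash: mixes the edge with a row-dependent seed.
    std::size_t Index(std::size_t row, std::uint32_t source, std::uint32_t destination) const {
        std::uint64_t h = (static_cast<std::uint64_t>(source) << 32) | destination;
        h ^= 0x9E3779B97F4A7C15ULL * (row + 1);
        h ^= h >> 33;
        h *= 0xFF51AFD7ED558CCDULL;
        h ^= h >> 33;
        return row * numColumn_ + static_cast<std::size_t>(h % numColumn_);
    }

    std::size_t numRow_;
    std::size_t numColumn_;
    std::vector<float> cells_;
};
```

In this toy version, more columns mean fewer collisions at the cost of more memory, and a smaller decay factor forgets past edges faster.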

Switch Cores

Cores are instantiated at MIDAS/example/Demo.cpp:67-69; uncomment the one you want to use.

Custom Dataset + Demo.cpp

You need to prepare three files:

  • Meta file
    • Only includes an integer N, the number of records in the dataset
    • Use its path for pathMeta
    • E.g. MIDAS/data/DARPA/darpa_shape.txt
  • Data file
    • A header-less csv format file of shape [N,3]
    • Columns are sources, destinations, timestamps
    • Use its path for pathData
    • E.g. MIDAS/data/DARPA/darpa_processed.csv
  • Label file
    • A header-less csv format file of shape [N,1]
    • The corresponding label for data records
      • 0 means normal record
      • 1 means anomalous record
    • Use its path for pathGroundTruth
    • E.g. MIDAS/data/DARPA/darpa_ground_truth.csv
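The three file layouts above can be read with a few lines of standard C++11. The sketch below is not taken from Demo.cpp; the Edge struct and the ReadMeta, ReadData and ReadLabels helpers are hypothetical names used only to make the expected formats concrete. They take std::istream so they work on files and in-memory strings alike.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// One record of the data file (hypothetical helper type).
struct Edge {
    int source;
    int destination;
    int timestamp;
};

// Meta file: contains only the integer N.
inline std::size_t ReadMeta(std::istream& in) {
    std::size_t n = 0;
    in >> n;
    return n;
}

// Data file: header-less CSV, one "source,destination,timestamp" per line.
// Lines that do not match this shape are skipped.
inline std::vector<Edge> ReadData(std::istream& in, std::size_t n) {
    std::vector<Edge> edges;
    edges.reserve(n);
    std::string line;
    while (edges.size() < n && std::getline(in, line)) {
        std::istringstream fields(line);
        Edge e{};
        char comma1 = 0;
        char comma2 = 0;
        if (fields >> e.source >> comma1 >> e.destination >> comma2 >> e.timestamp &&
            comma1 == ',' && comma2 == ',') {
            edges.push_back(e);
        }
    }
    return edges;
}

// Label file: header-less CSV with one 0 (normal) or 1 (anomalous) per line.
inline std::vector<int> ReadLabels(std::istream& in, std::size_t n) {
    std::vector<int> labels;
    labels.reserve(n);
    int label = 0;
    while (labels.size() < n && in >> label) labels.push_back(label);
    return labels;
}
```

To use it on real files, open each path with std::ifstream (from <fstream>) and pass the stream in. It is worth checking that both returned vectors have exactly N elements before scoring.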

Custom Dataset + Custom Runner

  1. Include the header MIDAS/src/NormalCore.hpp, MIDAS/src/RelationalCore.hpp or MIDAS/src/FilteringCore.hpp
  2. Instantiate cores with required parameters
  3. Call operator() on each data record; it returns the anomaly score for that record
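The three steps above might look like the following sketch. It assumes that a core's operator() takes a record as (source, destination, timestamp), matching the columns of the data file, and returns a numeric score; check the headers for the exact signature and constructor parameters. ToyCore is a hypothetical stand-in so that the example compiles without the MIDAS headers, and its scoring has nothing to do with MIDAS. In a real runner you would include one of the headers listed in step 1 and construct that core instead.

```cpp
#include <vector>

// One input record (hypothetical helper type).
struct Record {
    int source;
    int destination;
    int timestamp;
};

// Feeds records to the core one by one, in stream order, and collects the
// score returned for each. Works with any callable of the assumed shape
// core(source, destination, timestamp) -> score.
template <typename Core>
std::vector<float> ScoreStream(Core& core, const std::vector<Record>& records) {
    std::vector<float> scores;
    scores.reserve(records.size());
    for (const Record& r : records)
        scores.push_back(static_cast<float>(core(r.source, r.destination, r.timestamp)));
    return scores;
}

// Hypothetical stand-in core, NOT MIDAS scoring: it returns how many times
// the same edge has repeated consecutively within the same timestamp.
struct ToyCore {
    int lastSource = -1;
    int lastDestination = -1;
    int lastTimestamp = -1;
    int repeats = 0;

    float operator()(int source, int destination, int timestamp) {
        const bool same = source == lastSource && destination == lastDestination &&
                          timestamp == lastTimestamp;
        repeats = same ? repeats + 1 : 1;
        lastSource = source;
        lastDestination = destination;
        lastTimestamp = timestamp;
        return static_cast<float>(repeats);
    }
};
```

Because the core is passed by reference and called once per record, its internal state carries over from one record to the next, which is what a streaming detector needs.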

Other Files

example/

Experiment.cpp

The code we used for experiments.
It will try to use Intel TBB or OpenMP for parallelization.
Comment out all but one runner function call in main(), because most results are exported to MIDAS/temp/Experiment.csv together with many intermediate files.

Reproducible.cpp

Similar to Demo.cpp, but with all random parameters hardcoded, so it always produces the same result.
It lets other developers and us test whether implementations in other languages produce acceptable results.

util/

DeleteTempFile.py, EvaluateScore.py and ReproduceROC.py will show their usage and a short description when executed without any argument.

AUROC.hpp

Experimental ROC-AUC implementation in C++11. More info at this repo.
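For readers unfamiliar with the metric, here is one common way to compute ROC-AUC in C++11: the rank-sum (Mann-Whitney U) identity with average ranks for tied scores. This is a sketch under our own assumptions, not the code in AUROC.hpp; the function name RocAuc and its return convention for degenerate input are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// ROC-AUC via the rank-sum (Mann-Whitney U) identity.
// labels: 1 = anomalous, 0 = normal; a higher score means more anomalous.
// Returns -1.0 when the inputs differ in length or one class is empty,
// since the AUC is undefined in those cases.
inline double RocAuc(const std::vector<float>& scores, const std::vector<int>& labels) {
    const std::size_t n = scores.size();
    if (n != labels.size()) return -1.0;

    // Indices sorted by ascending score.
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), static_cast<std::size_t>(0));
    std::sort(order.begin(), order.end(),
              [&](std::size_t a, std::size_t b) { return scores[a] < scores[b]; });

    double positiveRankSum = 0.0;
    std::size_t positives = 0;
    for (std::size_t i = 0; i < n;) {
        // Find the group [i, j) of tied scores; they share the average rank.
        std::size_t j = i;
        while (j < n && scores[order[j]] == scores[order[i]]) ++j;
        const double averageRank = 0.5 * static_cast<double>(i + 1 + j);  // 1-based ranks i+1..j
        for (std::size_t k = i; k < j; ++k) {
            if (labels[order[k]] == 1) {
                positiveRankSum += averageRank;
                ++positives;
            }
        }
        i = j;
    }

    const std::size_t negatives = n - positives;
    if (positives == 0 || negatives == 0) return -1.0;

    const double p = static_cast<double>(positives);
    const double u = positiveRankSum - 0.5 * p * (p + 1.0);
    return u / (p * static_cast<double>(negatives));
}
```

The result is the probability that a randomly chosen anomalous record scores higher than a randomly chosen normal one, counting ties as one half.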

PreprocessData.py

The code to process the raw dataset into an easy-to-read format.
Datasets are always assumed to be in a folder in MIDAS/data/.
It can process the following dataset(s):

  • DARPA/darpa_original.csv -> DARPA/darpa_processed.csv, DARPA/darpa_ground_truth.csv, DARPA/darpa_shape.txt

In Other Languages

  1. Python: Rui Liu's MIDAS.Python, Ritesh Kumar's pyMIDAS
  2. Python (pybind): Wong Mun Hou's MIDAS
  3. Golang: Steve Tan's midas
  4. Ruby: Andrew Kane's midas
  5. Rust: Scott Steele's midas_rs
  6. R: Tobias Heidler's MIDASwrappeR
  7. Java: Joshua Tokle's MIDAS-Java
  8. Julia: Ashrya Agrawal's MIDAS.jl

Online Coverage

  1. ACM TechNews
  2. AIhub
  3. Hacker News
  4. KDnuggets
  5. Microsoft
  6. Towards Data Science

Citation

If you use this code for your research, please consider citing our arXiv preprint

@misc{bhatia2020realtime,
    title={Real-Time Streaming Anomaly Detection in Dynamic Graphs},
    author={Siddharth Bhatia and Rui Liu and Bryan Hooi and Minji Yoon and Kijung Shin and Christos Faloutsos},
    year={2020},
    eprint={2009.08452},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

or our AAAI paper

@inproceedings{bhatia2020midas,
    title="MIDAS: Microcluster-Based Detector of Anomalies in Edge Streams",
    author="Siddharth {Bhatia} and Bryan {Hooi} and Minji {Yoon} and Kijung {Shin} and Christos {Faloutsos}",
    booktitle="AAAI 2020 : The Thirty-Fourth AAAI Conference on Artificial Intelligence",
    year="2020"
}