NetManAIOps / Omnianomaly

License: MIT
KDD 2019: Robust Anomaly Detection for Multivariate Time Series through Stochastic Recurrent Neural Network

OmniAnomaly

Anomaly Detection for Multivariate Time Series through Modeling Temporal Dependence of Stochastic Variables

OmniAnomaly is a stochastic recurrent neural network model that combines a Gated Recurrent Unit (GRU) with a Variational Auto-Encoder (VAE). Its core idea is to learn the normal patterns of multivariate time series and to use the reconstruction probability to judge whether an observation is anomalous.
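To make the idea of a reconstruction probability concrete, here is a minimal NumPy sketch. It assumes a decoder that outputs a per-dimension Gaussian mean and standard deviation for each time step; `recon_mean`, `recon_std`, the toy data, and the threshold value are all hypothetical stand-ins, not outputs or settings of the actual model.

```python
import numpy as np

def gaussian_log_prob(x, mean, std):
    """Log-density of x under a diagonal Gaussian, summed over dimensions."""
    var = std ** 2
    per_dim = -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)
    return per_dim.sum(axis=-1)

# Toy data: 5 time steps, 3 dimensions; the last step deviates strongly.
x = np.array([[0.0, 0.1, -0.1],
              [0.1, 0.0, 0.0],
              [-0.1, 0.1, 0.1],
              [0.0, -0.1, 0.0],
              [3.0, -3.0, 3.0]])
recon_mean = np.zeros_like(x)      # hypothetical decoder means
recon_std = np.full_like(x, 0.5)   # hypothetical decoder standard deviations

# A low log-probability means the model reconstructs the point poorly.
scores = gaussian_log_prob(x, recon_mean, recon_std)
threshold = -10.0                  # illustrative value only
is_anomaly = scores < threshold
print(is_anomaly)
```

In this sketch, a point is flagged when its log-probability under the reconstruction falls below the threshold, so only the last time step is marked anomalous.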

Getting Started

Clone the repo

git clone https://github.com/smallcowbaby/OmniAnomaly && cd OmniAnomaly

Get data

SMD (Server Machine Dataset) is in folder ServerMachineDataset.

You can get the public datasets (SMAP and MSL) using:

wget https://s3-us-west-2.amazonaws.com/telemanom/data.zip && unzip data.zip && rm data.zip

cd data && wget https://raw.githubusercontent.com/khundman/telemanom/master/labeled_anomalies.csv

Install dependencies (with Python 3.5 or 3.6)

(virtualenv is recommended)

pip install -r requirements.txt

Preprocess the data

python data_preprocess.py <dataset>

where <dataset> is one of SMAP, MSL or SMD.

Run the code

python main.py

If you want to change the default configuration, you can edit ExpConfig in main.py or override its values with command-line arguments. For example:

python main.py --dataset='MSL' --max_epoch=20
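The general pattern of overriding class-level defaults from the command line can be sketched with the standard library's argparse. This is an illustration of the technique only, not the mechanism main.py uses; the `ExpConfig` below is a hypothetical two-field stand-in, its default values are made up, and `parse_overrides` is a helper invented for this example.

```python
import argparse

class ExpConfig:
    # Hypothetical defaults for illustration; the real ExpConfig has more fields.
    dataset = "SMAP"
    max_epoch = 10

def parse_overrides(config, argv):
    """Expose every public attribute of config as a --flag and apply overrides."""
    parser = argparse.ArgumentParser()
    for name in dir(config):
        if name.startswith("_"):
            continue
        default = getattr(config, name)
        if callable(default):
            continue
        # Reuse the default's type so "--max_epoch=20" is parsed as an int.
        parser.add_argument("--" + name, type=type(default), default=default)
    args = parser.parse_args(argv)
    for name, value in vars(args).items():
        setattr(config, name, value)
    return config

# The shell strips the quotes in --dataset='MSL', so argv holds --dataset=MSL.
config = parse_overrides(ExpConfig(), ["--dataset=MSL", "--max_epoch=20"])
print(config.dataset, config.max_epoch)
```

Any flag that is not passed keeps the default defined on the class.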

Data

Dataset Information

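| Dataset name | Number of entities | Number of dimensions | Training set size | Testing set size | Anomaly ratio (%) |
|--------------|--------------------|----------------------|-------------------|------------------|-------------------|
| SMAP         | 55                 | 25                   | 135183            | 427617           | 13.13             |
| MSL          | 27                 | 55                   | 58317             | 73729            | 10.72             |
| SMD          | 28                 | 38                   | 708405            | 708420           | 4.16              |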

SMAP and MSL

SMAP (Soil Moisture Active Passive satellite) and MSL (Mars Science Laboratory rover) are two public datasets from NASA.

For more details, see: https://github.com/khundman/telemanom

SMD

SMD (Server Machine Dataset) is a new 5-week-long dataset that we collected from a large Internet company. It contains 3 groups of entities, each named machine-<group_index>-<index>.
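As a small illustration of the naming scheme, the sketch below splits an entity name into its two indices. It assumes both indices are plain decimal integers; `parse_entity_name` is a hypothetical helper and not part of the repository.

```python
import re

# Assumed format: machine-<group_index>-<index>, both indices decimal integers.
ENTITY_PATTERN = re.compile(r"^machine-(\d+)-(\d+)$")

def parse_entity_name(name):
    """Return (group_index, index) for a name like 'machine-1-3'."""
    match = ENTITY_PATTERN.match(name)
    if match is None:
        raise ValueError("unexpected entity name: " + name)
    return int(match.group(1)), int(match.group(2))

group_index, index = parse_entity_name("machine-1-3")
print(group_index, index)
```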

SMD is made up of data from 28 different machines, and the 28 subsets should be trained and tested separately. Each subset is divided into two parts of equal length, one for training and one for testing. We provide labels indicating whether a point is an anomaly and which dimensions contribute to each anomaly.

Thus SMD is made up of the following parts:

  • train: The first half of the dataset.
  • test: The second half of the dataset.
  • test_label: The labels of the test set. They denote whether a point is an anomaly.
  • interpretation_label: The lists of dimensions that contribute to each anomaly.
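The relationship between these parts can be sketched with synthetic data. The repository already ships the split files, so this only illustrates the layout. The array contents, the anomaly segment, and the in-memory form of `interpretation_label` are assumptions made for the example; only the 38 dimensions come from the table above.

```python
import numpy as np

def split_in_half(series):
    """Split a (time, dimension) array into first and second halves."""
    half = len(series) // 2
    return series[:half], series[half:]

rng = np.random.default_rng(0)
series = rng.normal(size=(1000, 38))   # synthetic stand-in for one machine
train, test = split_in_half(series)

# One 0/1 label per test point (hypothetical anomaly segment at steps 100-119).
test_label = np.zeros(len(test), dtype=int)
test_label[100:120] = 1

# Hypothetical in-memory form: (start, end) of a segment and the dimensions involved.
interpretation_label = [((100, 120), [0, 5, 7])]

print(train.shape, test.shape, int(test_label.sum()))
```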

Processing

With the default configuration, main.py follows these steps:

  • Train the model on the training set and validate at a fixed frequency. Early stopping is applied by default.
  • Test the model on both the training set and the testing set, and save the anomaly scores in train_score.pkl and test_score.pkl.
  • Find the best F1 score on the testing set and print the results.
  • Initialize the POT (peaks-over-threshold) model on train_score to find the anomaly-score threshold, and use this threshold to predict on the testing set.
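The best-F1 step can be sketched as a brute-force search over candidate thresholds. This is a simplified illustration, not the repository's evaluation code. It assumes that a lower score means a more anomalous point (as with a reconstruction log-probability), and the function names, the toy scores, and the number of candidates are all hypothetical.

```python
import numpy as np

def f1_at_threshold(scores, labels, threshold):
    """F1 when points with score below the threshold are predicted anomalous."""
    pred = scores < threshold
    tp = np.sum(pred & (labels == 1))
    fp = np.sum(pred & (labels == 0))
    fn = np.sum(~pred & (labels == 1))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return float(2 * precision * recall / (precision + recall))

def best_f1_search(scores, labels, num_steps=100):
    """Try evenly spaced thresholds and return (best F1, threshold)."""
    candidates = np.linspace(scores.min(), scores.max(), num_steps)
    f1_values = [f1_at_threshold(scores, labels, t) for t in candidates]
    best = int(np.argmax(f1_values))
    return f1_values[best], float(candidates[best])

# Toy scores: the three labeled anomalies have much lower scores.
scores = np.array([-1.0, -0.8, -1.2, -9.0, -9.5, -0.9, -1.1, -8.5])
labels = np.array([0, 0, 0, 1, 1, 0, 0, 1])
best_f1, best_threshold = best_f1_search(scores, labels)
print(best_f1, best_threshold)
```

Because this search uses the test labels, it gives an upper bound on F1. The POT step instead picks a threshold from the training scores alone.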

Training loss

The figures below show the training loss of our model on MSL and SMAP, indicating that the model converges well on these two datasets.

[Figures: training loss on MSL and SMAP]
