Y-oHr-N / kenchi

License: BSD-3-Clause
A scikit-learn compatible library for anomaly detection

Projects that are alternatives of or similar to kenchi

Awesome Ts Anomaly Detection
List of tools & datasets for anomaly detection on time-series data.
Stars: ✭ 2,027 (+5530.56%)
Mutual labels:  data-mining, outlier-detection, anomaly-detection
Pyod
A Python Toolbox for Scalable Outlier Detection (Anomaly Detection)
Stars: ✭ 5,083 (+14019.44%)
Mutual labels:  data-mining, outlier-detection, anomaly-detection
Anomaly Detection Resources
Anomaly detection related books, papers, videos, and toolboxes
Stars: ✭ 5,306 (+14638.89%)
Mutual labels:  data-mining, outlier-detection, anomaly-detection
ADRepository-Anomaly-detection-datasets
ADRepository: Real-world anomaly detection datasets
Stars: ✭ 77 (+113.89%)
Mutual labels:  outlier-detection, anomaly-detection, novelty-detection
TextClassification
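Text classification of Sina News articles built on scikit-learn. The dataset contains 1,000,000 documents in 10 classes, split 1:1 between the training and test sets. The classifiers are SVM and Bayes, with Bayes serving as the baseline.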
Stars: ✭ 86 (+138.89%)
Mutual labels:  data-mining, scikit-learn
imbalanced-ensemble
Class-imbalanced / Long-tailed ensemble learning in Python. Modular, flexible, and extensible.
Stars: ✭ 199 (+452.78%)
Mutual labels:  data-mining, scikit-learn
multiscorer
A module for allowing the use of multiple metric functions in scikit's cross_val_score
Stars: ✭ 21 (-41.67%)
Mutual labels:  data-mining, scikit-learn
deviation-network
Source code of the KDD19 paper "Deep anomaly detection with deviation networks", weakly/partially supervised anomaly detection, few-shot anomaly detection
Stars: ✭ 94 (+161.11%)
Mutual labels:  outlier-detection, anomaly-detection
PracticalMachineLearning
A collection of ML-related material including notebooks, code, and a curated list of useful resources such as books and software. Almost everything mentioned here is free (as in speech, not as in free food) or open source.
Stars: ✭ 60 (+66.67%)
Mutual labels:  data-mining, scikit-learn
Sktime
A unified framework for machine learning with time series
Stars: ✭ 4,741 (+13069.44%)
Mutual labels:  data-mining, scikit-learn
Matrixprofile
A Python 3 library making time series data mining tasks, utilizing matrix profile algorithms, accessible to everyone.
Stars: ✭ 141 (+291.67%)
Mutual labels:  data-mining, anomaly-detection
Suod
(MLSys '21) An Acceleration System for Large-scale Unsupervised Heterogeneous Outlier Detection (Anomaly Detection)
Stars: ✭ 245 (+580.56%)
Mutual labels:  data-mining, anomaly-detection
A-Detector
⭐ An anomaly-based intrusion detection system.
Stars: ✭ 69 (+91.67%)
Mutual labels:  scikit-learn, anomaly-detection
outliertree
(Python, R, C++) Explainable outlier/anomaly detection through decision tree conditioning
Stars: ✭ 40 (+11.11%)
Mutual labels:  outlier-detection, anomaly-detection
Algorithmic-Trading
Algorithmic trading using machine learning.
Stars: ✭ 102 (+183.33%)
Mutual labels:  data-mining, scikit-learn
DCSO
Supplementary material for KDD 2018 workshop "DCSO: Dynamic Combination of Detector Scores for Outlier Ensembles"
Stars: ✭ 20 (-44.44%)
Mutual labels:  outlier-detection, anomaly-detection
XGBOD
Supplementary material for IJCNN paper "XGBOD: Improving Supervised Outlier Detection with Unsupervised Representation Learning"
Stars: ✭ 59 (+63.89%)
Mutual labels:  outlier-detection, anomaly-detection
Orange3
🍊 📊 💡 Orange: Interactive data analysis
Stars: ✭ 3,152 (+8655.56%)
Mutual labels:  data-mining, scikit-learn
Model Describer
model-describer : Making machine learning interpretable to humans
Stars: ✭ 22 (-38.89%)
Mutual labels:  data-mining, scikit-learn
Python Machine Learning Book
The "Python Machine Learning (1st edition)" book code repository and info resource
Stars: ✭ 11,428 (+31644.44%)
Mutual labels:  data-mining, scikit-learn

kenchi

This is a scikit-learn compatible library for anomaly detection.

Dependencies

Installation

You can install via pip

pip install kenchi

or conda.

conda install -c y_ohr_n kenchi
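
After either command, one way to confirm that the package is visible to the current interpreter is to query its distribution metadata. This is a minimal sketch using only the standard library (Python 3.8+). The helper name `installed_version` is hypothetical, and the distribution name `kenchi` is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("kenchi")
    print("kenchi is not installed" if found is None else f"kenchi {found}")
```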

Algorithms

  • Outlier detection
    1. FastABOD [8]
    2. LOF [2] (scikit-learn wrapper)
    3. KNN [1], [12]
    4. OneTimeSampling [14]
    5. HBOS [5]
  • Novelty detection
    1. OCSVM [13] (scikit-learn wrapper)
    2. MiniBatchKMeans
    3. IForest [10] (scikit-learn wrapper)
    4. PCA
    5. GMM (scikit-learn wrapper)
    6. KDE [11] (scikit-learn wrapper)
    7. SparseStructureLearning [6]
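
To make the two categories concrete, the sketch below scores samples by the distance to their k-th nearest neighbour, the ranking criterion described in [12]. It is a pure-Python illustration and not kenchi's implementation. The function names are hypothetical, and the split into "outlier" and "novelty" scoring is an assumption that follows the usual scikit-learn terminology: outlier detection scores the training samples themselves, while novelty detection scores unseen samples against the training set.

```python
import math


def kth_neighbor_distance(point, reference, k, skip_self=False):
    """Distance from `point` to its k-th nearest neighbour in `reference`."""
    dists = sorted(math.dist(point, other) for other in reference)
    if skip_self:
        # `point` is itself a member of `reference`; drop its zero self-distance.
        dists = dists[1:]
    return dists[k - 1]


def outlier_scores(X, k=2):
    """Outlier detection: score each training sample against the others."""
    return [kth_neighbor_distance(x, X, k, skip_self=True) for x in X]


def novelty_scores(X_train, X_new, k=2):
    """Novelty detection: score unseen samples against the training set."""
    return [kth_neighbor_distance(x, X_train, k) for x in X_new]


if __name__ == "__main__":
    X_train = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
    # The isolated sample (5, 5) receives the largest score.
    print(outlier_scores(X_train))
    # The far-away query (9, 9) scores higher than the central one.
    print(novelty_scores(X_train, [(0.5, 0.5), (9.0, 9.0)]))
```

A larger score means the sample sits farther from its neighbours. Turning scores into labels requires a threshold, which this sketch leaves out.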

Examples

import matplotlib.pyplot as plt
import numpy as np
from kenchi.datasets import load_pima
from kenchi.outlier_detection import (
    FastABOD, IForest, KDE, KNN, LOF, MiniBatchKMeans, OCSVM, PCA
)
from kenchi.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

np.random.seed(0)

scaler = StandardScaler()

detectors = [
    FastABOD(novelty=True, n_jobs=-1), OCSVM(),
    MiniBatchKMeans(), LOF(novelty=True, n_jobs=-1),
    KNN(novelty=True, n_jobs=-1), IForest(n_jobs=-1),
    PCA(), KDE()
]

# Load the Pima Indians diabetes dataset.
X, y = load_pima(return_X_y=True)
X_train, X_test, _, y_test = train_test_split(X, y)

# Get the current Axes instance
ax = plt.gca()

for det in detectors:
    # Fit the model according to the given training data
    pipeline = make_pipeline(scaler, det).fit(X_train)

    # Plot the Receiver Operating Characteristic (ROC) curve
    pipeline.plot_roc_curve(X_test, y_test, ax=ax)

# Display the figure
plt.show()

Resulting figure: https://raw.githubusercontent.com/HazureChi/kenchi/master/docs/images/readme.png
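
Each ROC curve drawn by the script can be condensed into a single number, the area under the curve (AUC), which equals the probability that a randomly chosen anomaly receives a higher score than a randomly chosen normal sample. The sketch below computes it by direct pairwise comparison in pure Python. The function name `roc_auc` and the label encoding (1 for anomalies, 0 for normal samples) are assumptions made for illustration and are not taken from kenchi.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    `y_true` holds 1 for anomalies and 0 for normal samples (an assumed
    encoding for this sketch); higher `scores` mean "more anomalous".
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # Count anomaly/normal pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Three of the four anomaly/normal pairs are ordered correctly.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The pairwise form costs O(n_pos * n_neg) comparisons, which is fine for a dataset of this size. A sort-based computation would scale better on large test sets.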

References

[1] Angiulli, F., and Pizzuti, C., "Fast outlier detection in high dimensional spaces," In Proceedings of PKDD, pp. 15-27, 2002.
[2] Breunig, M. M., Kriegel, H.-P., Ng, R. T., and Sander, J., "LOF: identifying density-based local outliers," In Proceedings of SIGMOD, pp. 93-104, 2000.
[3] Dua, D., and Karra Taniskidou, E., "UCI Machine Learning Repository," 2017.
[4] Goix, N., "How to evaluate the quality of unsupervised anomaly detection algorithms?" In ICML Anomaly Detection Workshop, 2016.
[5] Goldstein, M., and Dengel, A., "Histogram-based outlier score (HBOS): A fast unsupervised anomaly detection algorithm," KI: Poster and Demo Track, pp. 59-63, 2012.
[6] Ide, T., Lozano, C., Abe, N., and Liu, Y., "Proximity-based anomaly detection using sparse structure learning," In Proceedings of SDM, pp. 97-108, 2009.
[7] Kriegel, H.-P., Kröger, P., Schubert, E., and Zimek, A., "Interpreting and unifying outlier scores," In Proceedings of SDM, pp. 13-24, 2011.
[8] Kriegel, H.-P., Schubert, M., and Zimek, A., "Angle-based outlier detection in high-dimensional data," In Proceedings of SIGKDD, pp. 444-452, 2008.
[9] Lee, W. S., and Liu, B., "Learning with positive and unlabeled examples using weighted Logistic Regression," In Proceedings of ICML, pp. 448-455, 2003.
[10] Liu, F. T., Ting, K. M., and Zhou, Z.-H., "Isolation forest," In Proceedings of ICDM, pp. 413-422, 2008.
[11] Parzen, E., "On estimation of a probability density function and mode," Ann. Math. Statist., 33(3), pp. 1065-1076, 1962.
[12] Ramaswamy, S., Rastogi, R., and Shim, K., "Efficient algorithms for mining outliers from large data sets," In Proceedings of SIGMOD, pp. 427-438, 2000.
[13] Schölkopf, B., Platt, J. C., Shawe-Taylor, J. C., Smola, A. J., and Williamson, R. C., "Estimating the Support of a High-Dimensional Distribution," Neural Computation, 13(7), pp. 1443-1471, 2001.
[14] Sugiyama, M., and Borgwardt, K., "Rapid distance-based outlier detection via sampling," Advances in NIPS, pp. 467-475, 2013.