c-bata / Outlier Utils

Licence: mit
=============
outlier-utils
=============

.. image:: https://travis-ci.org/c-bata/outlier-utils.svg?branch=master
    :target: https://travis-ci.org/c-bata/outlier-utils

Utility library for detecting and removing outliers from normally distributed datasets using the Smirnov-Grubbs_ test.

Requirements
------------

  • Python_ (versions 2.7, 3.4 and 3.5)
  • SciPy_
  • NumPy_

Overview
--------

Both the two-sided and the one-sided versions of the test are supported. The two-sided test looks for outliers at both ends of the dataset, whereas the one-sided tests consider only the minimum or only the maximum value. The test is applied repeatedly: each detected outlier is removed and the test is run again on the remaining data, until no further outlier is found.

The output can be adapted to several use cases. By default, the outlier-free data is returned, but the test can also return the outliers themselves or their indices in the original dataset.
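
The repeated removal described above can be illustrated with a minimal sketch of the two-sided test. This is not the library's implementation: the names ``grubbs_critical_value`` and ``remove_outliers_two_sided`` are hypothetical, and the code assumes only NumPy and SciPy together with the textbook Grubbs critical value derived from the Student's t distribution.

::

    import numpy as np
    from scipy import stats


    def grubbs_critical_value(n, alpha):
        """Two-sided Grubbs critical value for a sample of size n (assumed formula)."""
        # Upper critical value of the t distribution with n - 2 degrees of freedom
        t = stats.t.isf(alpha / (2 * n), n - 2)
        return (n - 1) / np.sqrt(n) * np.sqrt(t ** 2 / (n - 2 + t ** 2))


    def remove_outliers_two_sided(values, alpha=0.05):
        """Hypothetical helper: drop one outlier per pass until none is detected."""
        data = np.asarray(values, dtype=float)
        while data.size > 2:  # the statistic needs at least three points
            deviations = np.abs(data - data.mean())
            idx = int(np.argmax(deviations))  # most extreme value on either side
            std = data.std(ddof=1)  # sample standard deviation
            if std == 0:
                break  # all values identical, nothing to test
            g = deviations[idx] / std
            if g <= grubbs_critical_value(data.size, alpha):
                break  # most extreme value is not significant, stop
            data = np.delete(data, idx)
        return data

With the sample data used in the examples, ``remove_outliers_two_sided([1, 8, 9, 10, 9])`` drops the value 1 on the first pass and stops on the second, leaving 8, 9, 10 and 9.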

Examples
--------

  • Two-sided Grubbs test with a Pandas series input

::

    >>> from outliers import smirnov_grubbs as grubbs
    >>> import pandas as pd
    >>> data = pd.Series([1, 8, 9, 10, 9])
    >>> grubbs.test(data, alpha=0.05)
    1     8
    2     9
    3    10
    4     9
    dtype: int64

  • Two-sided Grubbs test with a NumPy array input

::

    >>> import numpy as np
    >>> data = np.array([1, 8, 9, 10, 9])
    >>> grubbs.test(data, alpha=0.05)
    array([ 8,  9, 10,  9])

  • One-sided (min) test returning outlier indices

::

    >>> grubbs.min_test_indices([8, 9, 10, 1, 9], alpha=0.05)
    [3]

  • One-sided (max) tests returning outliers

::

    >>> grubbs.max_test_outliers([8, 9, 10, 1, 9], alpha=0.05)
    []
    >>> grubbs.max_test_outliers([8, 9, 10, 50, 9], alpha=0.05)
    [50]
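
To show how a one-sided test can report indices in the original dataset, here is a rough sketch under the same assumptions as the textbook Grubbs procedure. It is not taken from the library: ``one_sided_critical_value`` and ``one_sided_outlier_indices`` are hypothetical names, and the ``side`` parameter is an illustrative choice.

::

    import numpy as np
    from scipy import stats


    def one_sided_critical_value(n, alpha):
        """One-sided Grubbs critical value for a sample of size n (assumed formula)."""
        t = stats.t.isf(alpha / n, n - 2)
        return (n - 1) / np.sqrt(n) * np.sqrt(t ** 2 / (n - 2 + t ** 2))


    def one_sided_outlier_indices(values, alpha=0.05, side="max"):
        """Hypothetical helper: indices of detected outliers in the original input."""
        data = np.asarray(values, dtype=float)
        remaining = list(range(data.size))  # original indices not yet removed
        found = []
        while len(remaining) > 2:
            sample = data[remaining]
            std = sample.std(ddof=1)
            if std == 0:
                break  # all remaining values identical
            if side == "max":
                pos = int(np.argmax(sample))
                g = (sample[pos] - sample.mean()) / std
            else:
                pos = int(np.argmin(sample))
                g = (sample.mean() - sample[pos]) / std
            if g <= one_sided_critical_value(len(remaining), alpha):
                break  # the extreme value is not significant
            found.append(remaining.pop(pos))  # record the original index
        return found

Under these assumptions, ``one_sided_outlier_indices([8, 9, 10, 1, 9], side="min")`` flags index 3, in line with the ``min_test_indices`` example above. The outlier values themselves can be recovered by indexing the input with the returned positions.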

.. _Smirnov-Grubbs: https://en.wikipedia.org/wiki/Grubbs%27_test_for_outliers
.. _SciPy: https://www.scipy.org/
.. _NumPy: http://www.numpy.org/
.. _Python: https://www.python.org/

License
-------

This software is licensed under the MIT License.
