sheriferson / Simplestatistics

License: MIT
🎲 Simple statistical functions implemented in readable Python.

simplestatistics

simple-statistics for Python.

simplestatistics is compatible with Python 3.

Version 0.4.0 was the last version that did not use Python 3-specific features. Going forward, simplestatistics will adopt Python 3 features (e.g., type hints).

Installation

Install the current PyPI release:

pip install simplestatistics

Or install the development version from GitHub:

pip install git+https://github.com/sheriferson/simplestatistics

Usage

>>> import simplestatistics as ss
>>> ss.mean([1, 2, 3])
2.0
>>> ss.t_test([1, 2, 2.4, 3, 0.9], 2)
-0.3461277235039042
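
For readers who want to see where the t_test value comes from, here is a minimal stand-alone sketch of the textbook one-sample t statistic. The helper name one_sample_t is hypothetical and this is not the package's own source; it only assumes the usual formula t = (mean - mu) / (s / sqrt(n)) with the sample standard deviation s.

```python
import math

def one_sample_t(data, mu):
    """Hypothetical helper: textbook one-sample t statistic."""
    n = len(data)
    mean = sum(data) / n
    # Sample variance divides by n - 1, not n.
    variance = sum((x - mean) ** 2 for x in data) / (n - 1)
    standard_error = math.sqrt(variance) / math.sqrt(n)
    return (mean - mu) / standard_error

print(one_sample_t([1, 2, 2.4, 3, 0.9], 2))  # approximately -0.3461
```

The result agrees with the ss.t_test output shown above to within floating-point rounding.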

Documentation

You can read the documentation online.

Or you can generate it yourself:

From inside the simplestatistics/ directory, run:

make html

Documentation will be generated in _build/html/.

Tests

If you want coverage reports, you need to have coverage installed:

pip install coverage
nosetests --with-coverage --cover-package=simplestatistics --with-doctest

Otherwise, to just run the tests:

nosetests --with-doctest

The code follows PEP 8 guidelines, as checked by pylint, with the following checks disabled:

  • invalid-name
  • len-as-condition
  • superfluous-parens
  • unidiomatic-typecheck

To lint the code, make sure you have pylint installed (pip install pylint), cd into the simplestatistics/statistics directory, then run:

pylint -d 'invalid-name, len-as-condition, superfluous-parens, unidiomatic-typecheck' *.py

Functions and examples

Descriptive statistics

Function   Example
Min        min([-3, 0, 3])
Max        max([1, 2, 3])
Sum        sum([1, 2, 3.5])
Quantiles  quantile([3, 6, 7, 8, 8, 9, 10, 13, 15, 16, 20], [0.25, 0.75])
Product    product([1.25, 2.75], [2.5, 3.40])

Measures of central tendency

Function          Example
Mean              mean([1, 2, 3])
Median            median([10, 2, -5, -1])
Mode              mode([2, 1, 3, 2, 1])
Geometric mean    geometric_mean([1, 10])
Harmonic mean     harmonic_mean([1, 2, 4])
Root mean square  root_mean_square([1, -1, 1, -1])
Add to mean       add_to_mean(40, 4, (10, 12))
Skewness          skew([1, 2, 5])
Kurtosis          kurtosis([1, 2, 3, 4, 5])
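
As an illustration of what some of these measures compute, here is a plain-Python sketch. These are stand-alone re-implementations written for this README, not the package's source. The argument order of add_to_mean is an assumption read off the example above (current mean, number of values behind it, new values).

```python
import math

def geometric_mean(data):
    # n-th root of the product of the values (values assumed positive).
    product = 1.0
    for x in data:
        product *= x
    return product ** (1 / len(data))

def harmonic_mean(data):
    # n divided by the sum of reciprocals (values assumed non-zero).
    return len(data) / sum(1 / x for x in data)

def root_mean_square(data):
    # Square root of the mean of the squared values.
    return math.sqrt(sum(x * x for x in data) / len(data))

def add_to_mean(current_mean, n, new_values):
    # Update a mean of n values with extra values, without the original data.
    total = current_mean * n + sum(new_values)
    return total / (n + len(new_values))

print(geometric_mean([1, 10]))           # approximately 3.1623
print(harmonic_mean([1, 2, 4]))          # approximately 1.7143
print(root_mean_square([1, -1, 1, -1]))  # 1.0
print(add_to_mean(40, 4, (10, 12)))      # approximately 30.3333
```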

Measures of dispersion

Function                                          Example
Sample and population variance                    variance([1, 2, 3], sample = True)
Sample and population standard deviation          standard_deviation([1, 2, 3], sample = True)
Sample and population coefficient of variation    coefficient_of_variation([1, 2, 3], sample = True)
Interquartile range                               interquartile_range([1, 3, 5, 7])
Sum of Nth power deviations                       sum_nth_power_deviations([-1, 0, 2, 4], 3)
Sample and population standard scores (z-scores)  z_scores([-2, -1, 0, 1, 2], sample = True)
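
The sample flag in the examples above switches between the sample and population forms. A minimal sketch of the usual convention follows, assuming sample = True divides by n - 1 and sample = False divides by n. It is written independently of the package's code.

```python
import math

def variance(data, sample=True):
    n = len(data)
    mean = sum(data) / n
    squared_deviations = sum((x - mean) ** 2 for x in data)
    # Sample variance divides by n - 1, population variance by n.
    return squared_deviations / (n - 1 if sample else n)

def standard_deviation(data, sample=True):
    return math.sqrt(variance(data, sample))

def z_scores(data, sample=True):
    # Distance of each value from the mean, in standard deviations.
    mean = sum(data) / len(data)
    sd = standard_deviation(data, sample)
    return [(x - mean) / sd for x in data]

print(variance([1, 2, 3], sample=True))   # 1.0
print(variance([1, 2, 3], sample=False))  # approximately 0.6667
```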

Linear regression

Function                                   Example
Simple linear regression                   linear_regression([1, 2, 3, 4, 5], [4, 4.5, 5.5, 5.3, 6])
Linear regression line function generator  linear_regression_line([.5, 9.5])([1, 2, 3])
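
To show the idea behind these two functions, here is a rough sketch of an ordinary least-squares fit and a line-function generator. The names fit_line and line_function are hypothetical, and the return order (slope, intercept) is an assumption of this sketch, not a statement about the package's API.

```python
def fit_line(x, y):
    """Hypothetical helper: (slope, intercept) of the least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope is the sum of cross-deviations over the sum of squared x-deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def line_function(slope, intercept):
    # Returns a function that evaluates the line at a single x value.
    return lambda x: slope * x + intercept

slope, intercept = fit_line([1, 2, 3, 4, 5], [4, 4.5, 5.5, 5.3, 6])
print(slope, intercept)           # approximately 0.48 and 3.62
print(line_function(0.5, 9.5)(2))  # 10.5
```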

Similarity

Function     Example
Correlation  correlate([2, 1, 0, -1, -2, -3, -4, -5], [0, 1, 1, 2, 3, 2, 4, 5])
Covariance   covariance([1,2,3,4,5,6], [6,5,4,3,2,1])
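
A plain-Python sketch of sample covariance and Pearson correlation, for reference. It assumes the sample (n - 1) form and is not taken from the package's source. The name correlation is used here only for the sketch.

```python
import math

def covariance(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample covariance: average cross-deviation with an n - 1 denominator.
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def correlation(x, y):
    # Pearson correlation: covariance scaled by both standard deviations.
    sd_x = math.sqrt(covariance(x, x))
    sd_y = math.sqrt(covariance(y, y))
    return covariance(x, y) / (sd_x * sd_y)

print(covariance([1, 2, 3, 4, 5, 6], [6, 5, 4, 3, 2, 1]))   # -3.5
print(correlation([1, 2, 3, 4, 5, 6], [6, 5, 4, 3, 2, 1]))  # approximately -1.0
```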

Distributions

Function                        Example
Factorial                       factorial(20) or factorial([1, 5, 20])
Choose                          choose(5, 3)
Normal distribution             normal(4, 8, 2) or normal([1, 4], 8, 2)
Binomial distribution           binomial(4, 12, 0.2) or binomial([3,4,5], 12, 0.5)
Bernoulli distribution          bernoulli(0.25)
Poisson distribution            poisson(3, [0, 1, 2, 3])
Gamma function                  gamma_function([1, 2, 3, 4, 5])
Beta distribution               beta([.1, .2, .3], 5, 2)
One-sample t-test               t_test([1, 2, 3, 4, 5, 6], 3.385)
Chi-squared distribution table  chi_squared_dist_table(k = 10, p = .01)
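
As a small worked example of how factorial, choose, and the binomial distribution relate, here is a stand-alone sketch. The name binomial_pmf is hypothetical, and reading binomial(4, 12, 0.2) as "4 successes in 12 trials with success probability 0.2" is an assumption about the argument order.

```python
def factorial(n):
    # Product of the integers 1..n; factorial(0) is 1.
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result

def choose(n, k):
    # Number of ways to pick k items out of n, ignoring order.
    return factorial(n) // (factorial(k) * factorial(n - k))

def binomial_pmf(k, n, p):
    # Probability of exactly k successes in n independent trials.
    return choose(n, k) * p ** k * (1 - p) ** (n - k)

print(choose(5, 3))               # 10
print(binomial_pmf(4, 12, 0.2))   # approximately 0.1329
```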

Classifiers

Function                   Example
Naive Bayesian classifier  See documentation for examples of how to train and classify.
Perceptron                 See documentation for examples of how to train and classify.

Errors

Function              Example
Gauss error function  error_function(1)

Hyperbolic functions

Function  Example
sinh      sinh(2)
cosh      cosh(2.5)
tanh      tanh(.2)
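
For reference, the hyperbolic functions can be written in terms of the exponential function. The sketch below uses math.exp and the standard definitions. It is an illustration only and not necessarily how the package implements them.

```python
import math

def sinh(x):
    # sinh(x) = (e^x - e^-x) / 2
    return (math.exp(x) - math.exp(-x)) / 2

def cosh(x):
    # cosh(x) = (e^x + e^-x) / 2
    return (math.exp(x) + math.exp(-x)) / 2

def tanh(x):
    # tanh(x) = sinh(x) / cosh(x)
    return sinh(x) / cosh(x)

print(sinh(2), cosh(2.5), tanh(.2))
```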

Spirit and rules

  • Everything should be implemented in raw, organic, locally sourced Python.
  • Use libraries only if you have to, and only when they are unrelated to the math/statistics. For example, from functools import reduce makes reduce available to those using Python 3. That's okay, because it's about making Python work and not about making the stats easier.
  • It's okay to use operators and functions if they correspond to regular calculator buttons. For example, all calculators have a built-in square root function, so there is no need to implement that ourselves; we can use math.sqrt(). Anything beyond that, like mean or median, we have to write ourselves.
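
The rules above can be sketched as follows: reduce is imported only to make Python work, while the statistics themselves are written by hand. This is an illustrative sketch in the spirit of the rules, not code copied from the package.

```python
from functools import reduce  # allowed: about Python, not about the stats

def product(data):
    # Multiply all values together, starting from 1.
    return reduce(lambda acc, x: acc * x, data, 1)

def median(data):
    # Middle value of the sorted data; average the two middle values
    # when the number of values is even.
    ordered = sorted(data)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

print(product([1, 2, 3, 4]))    # 24
print(median([10, 2, -5, -1]))  # 0.5
```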

Pull requests are welcome!

Contributors
