erdogant / distfit

Licence: other
distfit is a Python library for probability density fitting.

Read the Medium Blog for more information

distfit is a Python package for probability density fitting of univariate distributions for random variables. Given a random variable as input, distfit can find the best fit among parametric, non-parametric, and discrete distributions.

  • For the parametric approach, the distfit library can determine the best fit across 89 theoretical distributions. To score the fit, one of the goodness-of-fit statistics can be used, such as RSS/SSE, Wasserstein, Kolmogorov-Smirnov (KS), or Energy. After finding the best-fitting theoretical distribution, the loc, scale, and arg parameters are returned, such as the mean and standard deviation for a normal distribution.

  • For the non-parametric approach, the distfit library contains two methods, the quantile and percentile method. Both methods assume that the data does not follow a specific probability distribution. In the case of the quantile method, the quantiles of the data are modeled whereas for the percentile method, the percentiles are modeled.

  • If the dataset contains discrete values, the distfit library offers an option for discrete fitting. The best fit is then derived using the binomial distribution.
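
To make the RSS/SSE score from the parametric approach concrete, here is a minimal sketch that uses only the Python standard library. It illustrates the general idea of comparing a histogram density with a candidate PDF and is not distfit's implementation. The helper names `histogram_density` and `rss_score`, the bin count, and the two candidate distributions are assumptions chosen for this example.

```python
import random
from statistics import NormalDist, fmean, stdev

def histogram_density(data, bins=50):
    """Return (bin_centers, densities) of an equal-width histogram."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        # Clamp the maximum value into the last bin.
        idx = min(int((x - lo) / width), bins - 1)
        counts[idx] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    densities = [c / (len(data) * width) for c in counts]
    return centers, densities

def rss_score(centers, densities, pdf):
    """Residual sum of squares between the empirical density and a candidate PDF."""
    return sum((d - pdf(c)) ** 2 for c, d in zip(centers, densities))

random.seed(0)
X = [random.gauss(0, 2) for _ in range(5000)]
centers, densities = histogram_density(X)

# Candidate 1: normal distribution with loc/scale estimated from the data.
normal = NormalDist(fmean(X), stdev(X))

# Candidate 2: uniform distribution over the observed range.
lo, hi = min(X), max(X)
def uniform_pdf(x):
    return 1.0 / (hi - lo)

scores = {
    "norm": rss_score(centers, densities, normal.pdf),
    "uniform": rss_score(centers, densities, uniform_pdf),
}
best = min(scores, key=scores.get)
print(scores, "best:", best)
```

The candidate with the lowest score is treated as the best fit, which is the same selection rule that the RSS values in the log output further down reflect.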

⭐️ Star this repo if you like it ⭐️

Documentation pages

On the documentation pages you can find detailed information about the distfit library with many examples.

Installation

Install distfit from PyPI
pip install distfit
Install from GitHub source (beta version)
pip install git+https://github.com/erdogant/distfit
Check version
import distfit
print(distfit.__version__)
The following functions are available after installation:
# Import library
from distfit import distfit

dfit = distfit()        # Initialize 
dfit.fit_transform(X)   # Fit distributions on empirical data X
dfit.predict(y)         # Predict the probability of the response variables
dfit.plot()             # Plot the best fitted distribution (y is included if prediction is made)

Examples

Example: Quick start to find best fit for your input data
# [distfit] >INFO> fit
# [distfit] >INFO> transform
# [distfit] >INFO> [norm      ] [0.00 sec] [RSS: 0.00108326] [loc=-0.048 scale=1.997]
# [distfit] >INFO> [expon     ] [0.00 sec] [RSS: 0.404237] [loc=-6.897 scale=6.849]
# [distfit] >INFO> [pareto    ] [0.00 sec] [RSS: 0.404237] [loc=-536870918.897 scale=536870912.000]
# [distfit] >INFO> [dweibull  ] [0.06 sec] [RSS: 0.0115552] [loc=-0.031 scale=1.722]
# [distfit] >INFO> [t         ] [0.59 sec] [RSS: 0.00108349] [loc=-0.048 scale=1.997]
# [distfit] >INFO> [genextreme] [0.17 sec] [RSS: 0.00300806] [loc=-0.806 scale=1.979]
# [distfit] >INFO> [gamma     ] [0.05 sec] [RSS: 0.00108459] [loc=-1862.903 scale=0.002]
# [distfit] >INFO> [lognorm   ] [0.32 sec] [RSS: 0.00121597] [loc=-110.597 scale=110.530]
# [distfit] >INFO> [beta      ] [0.10 sec] [RSS: 0.00105629] [loc=-16.364 scale=32.869]
# [distfit] >INFO> [uniform   ] [0.00 sec] [RSS: 0.287339] [loc=-6.897 scale=14.437]
# [distfit] >INFO> [loggamma  ] [0.12 sec] [RSS: 0.00109042] [loc=-370.746 scale=55.722]
# [distfit] >INFO> Compute confidence intervals [parametric]
# [distfit] >INFO> Compute significance for 9 samples.
# [distfit] >INFO> Multiple test correction method applied: [fdr_bh].
# [distfit] >INFO> Create PDF plot for the parametric method.
# [distfit] >INFO> Mark 5 significant regions
# [distfit] >INFO> Estimated distribution: beta [loc:-16.364265, scale:32.868811]
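
The `fdr_bh` label in the log above refers to the Benjamini-Hochberg false discovery rate procedure. The sketch below shows that adjustment with the standard library only, for illustration. The function name `benjamini_hochberg` and the raw p-values are assumptions made for this example, and distfit may delegate this step to another package.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for a handful of query points.
raw = [0.001, 0.008, 0.039, 0.041, 0.6]
adj = benjamini_hochberg(raw)
significant = [p <= 0.05 for p in adj]
print(adj)
print(significant)
```

Note that two of the raw p-values fall below 0.05 on their own but no longer do after adjustment, which is the intended effect of correcting for multiple tests.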

Example: Plot summary of the tested distributions

After we have a fitted model, we can make some predictions using the theoretical distributions. After making some predictions, we can plot again but now the predictions are automatically included.
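
As a rough illustration of what a prediction against a fitted distribution involves, the sketch below scores new values against a normal distribution using only the standard library. The loc and scale are taken from the `norm` line of the quick-start log. The query values, the 0.05 threshold, the labels, and the `tail_probability` helper are assumptions for this example. It applies no multiple-test correction and does not reproduce the output format of distfit's `predict`.

```python
from statistics import NormalDist

# loc and scale as reported for the normal fit in the quick-start log.
fitted = NormalDist(mu=-0.048, sigma=1.997)

def tail_probability(value, dist):
    """Probability mass in the tail beyond `value`, on the nearer side."""
    cdf = dist.cdf(value)
    return min(cdf, 1.0 - cdf)

# Hypothetical new observations to score against the fitted model.
y = [-8, -6, 0, 1, 2, 3, 4, 5, 6]
alpha = 0.05  # assumed significance threshold

labels = []
for value in y:
    p = tail_probability(value, fitted)
    if p > alpha:
        label = "none"   # not unusual under the fitted model
    elif value > fitted.mean:
        label = "up"     # unusually large
    else:
        label = "down"   # unusually small
    labels.append(label)
    print(f"y={value:>3}  P={p:.4g}  {label}")
```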

Example: Make predictions using the fitted distribution

Example: Test for one specific distribution

The full list of distributions is listed here: https://erdogant.github.io/distfit/pages/html/Parametric.html

Example: Test for multiple distributions

The full list of distributions is listed here: https://erdogant.github.io/distfit/pages/html/Parametric.html

Example: Fit discrete distribution
from scipy.stats import binom
# Generate random numbers

# Set parameters for the test-case
n = 8
p = 0.5

# Generate 10000 samples of the distribution of (n, p)
X = binom(n, p).rvs(10000)
print(X)

# [5 1 4 5 5 6 2 4 6 5 4 4 4 7 3 4 4 2 3 3 4 4 5 1 3 2 7 4 5 2 3 4 3 3 2 3 5
#  4 6 7 6 2 4 3 3 5 3 5 3 4 4 4 7 5 4 5 3 4 3 3 4 3 3 6 3 3 5 4 4 2 3 2 5 7
#  5 4 8 3 4 3 5 4 3 5 5 2 5 6 7 4 5 5 5 4 4 3 4 5 6 2...]

# Import distfit
from distfit import distfit

# Initialize for discrete distribution fitting
dfit = distfit(method='discrete')

# Run distfit and determine whether we can recover the parameters from the data.
dfit.fit_transform(X)

# [distfit] >fit..
# [distfit] >transform..
# [distfit] >Fit using binomial distribution..
# [distfit] >[binomial] [SSE: 7.79] [n: 8] [p: 0.499959] [chi^2: 1.11]
# [distfit] >Compute confidence interval [discrete]

Example: Make predictions on unseen data for discrete distribution

Example: Generate samples based on the fitted distribution

Contributors

Setting up and maintaining distfit has been possible thanks to its users and contributors.

Citation

Please cite distfit in your publications if it is useful for your research. See the right-hand column of the GitHub repository page for citation information.

Maintainer

  • Erdogan Taskesen, github: erdogant
  • Contributions are welcome.
  • If you wish to buy me a Coffee for this work, it is very appreciated :)