All Projects → jsvine → Weightedcalcs

jsvine / Weightedcalcs

Licence: mit
Pandas-based utility to calculate weighted means, medians, distributions, standard deviations, and more.

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Weightedcalcs

Choochoo
Training Diary
Stars: ✭ 186 (+124.1%)
Mutual labels:  statistics, pandas
veridical-flow
Making it easier to build stable, trustworthy data-science pipelines.
Stars: ✭ 28 (-66.27%)
Mutual labels:  statistics, pandas
hdfe
No description or website provided.
Stars: ✭ 22 (-73.49%)
Mutual labels:  statistics, pandas
Algorithmic-Trading
I have been deeply interested in algorithmic trading and systematic trading algorithms. This Repository contains the code of what I have learnt on the way. It starts form some basic simple statistics and will lead up to complex machine learning algorithms.
Stars: ✭ 47 (-43.37%)
Mutual labels:  statistics, pandas
Dataframe Go
DataFrames for Go: For statistics, machine-learning, and data manipulation/exploration
Stars: ✭ 487 (+486.75%)
Mutual labels:  statistics, pandas
Machine Learning With Python
Practice and tutorial-style notebooks covering wide variety of machine learning techniques
Stars: ✭ 2,197 (+2546.99%)
Mutual labels:  statistics, pandas
Fecon236
Tools for financial economics. Curated wrapper over Python ecosystem. Source code for fecon235 Jupyter notebooks.
Stars: ✭ 72 (-13.25%)
Mutual labels:  statistics, pandas
Sweetviz
Visualize and compare datasets, target values and associations, with one line of code.
Stars: ✭ 1,851 (+2130.12%)
Mutual labels:  statistics, pandas
Stats Maths With Python
General statistics, mathematical programming, and numerical/scientific computing scripts and notebooks in Python
Stars: ✭ 381 (+359.04%)
Mutual labels:  statistics, pandas
fairlens
Identify bias and measure fairness of your data
Stars: ✭ 51 (-38.55%)
Mutual labels:  statistics, pandas
Data-Analyst-Nanodegree
Kai Sheng Teh - Udacity Data Analyst Nanodegree
Stars: ✭ 42 (-49.4%)
Mutual labels:  statistics, pandas
Fecon235
Notebooks for financial economics. Keywords: Jupyter notebook pandas Federal Reserve FRED Ferbus GDP CPI PCE inflation unemployment wage income debt Case-Shiller housing asset portfolio equities SPX bonds TIPS rates currency FX euro EUR USD JPY yen XAU gold Brent WTI oil Holt-Winters time-series forecasting statistics econometrics
Stars: ✭ 708 (+753.01%)
Mutual labels:  statistics, pandas
Pingouin
Statistical package in Python based on Pandas
Stars: ✭ 651 (+684.34%)
Mutual labels:  statistics, pandas
Pandas Profiling
Create HTML profiling reports from pandas DataFrame objects
Stars: ✭ 8,329 (+9934.94%)
Mutual labels:  statistics, pandas
Volume approximation
Practical volume computation and sampling in high dimensions
Stars: ✭ 75 (-9.64%)
Mutual labels:  statistics
Tsv Utils
eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.
Stars: ✭ 1,215 (+1363.86%)
Mutual labels:  statistics
Wikipediatrend
A convenience R package for getting Wikipedia article access statistics (and more).
Stars: ✭ 73 (-12.05%)
Mutual labels:  statistics
Locopy
locopy: Loading/Unloading to Redshift and Snowflake using Python.
Stars: ✭ 73 (-12.05%)
Mutual labels:  pandas
Bat.jl
A Bayesian Analysis Toolkit in Julia
Stars: ✭ 82 (-1.2%)
Mutual labels:  statistics
Superseriousstats
superseriousstats is a fast and efficient program to create statistics out of various types of chat logs
Stars: ✭ 78 (-6.02%)
Mutual labels:  statistics

Version Build status Code coverage Support Python versions

weightedcalcs

weightedcalcs is a pandas-based Python library for calculating weighted means, medians, standard deviations, and more.

Features

  • Plays well with pandas.
  • Support for weighted means, medians, quantiles, standard deviations, and distributions.
  • Support for grouped calculations, using DataFrameGroupBy objects.
  • Raises an error when your data contains null-values.
  • Full test coverage.

Installation

pip install weightedcalcs

Usage

Getting started

Every weighted calculation in weightedcalcs begins with an instance of the weightedcalcs.Calculator class. Calculator takes one argument: the name of your weighting variable. So if you're analyzing a survey where the weighting variable is called "resp_weight", you'd do this:

import weightedcalcs as wc
calc = wc.Calculator("resp_weight")

Types of calculations

Currently, weightedcalcs.Calculator supports the following calculations:

  • calc.mean(my_data, value_var): The weighted arithmetic average of value_var.
  • calc.quantile(my_data, value_var, q): The weighted quantile of value_var, where q is between 0 and 1.
  • calc.median(my_data, value_var): The weighted median of value_var, equivalent to .quantile(...) where q=0.5.
  • calc.std(my_data, value_var): The weighted standard deviation of value_var.
  • calc.distribution(my_data, value_var): The weighted proportions of value_var, interpreting value_var as categories.
  • calc.count(my_data): The weighted count of all observations, i.e., the total weight.
  • calc.sum(my_data, value_var): The weighted sum of value_var.

The obj parameter above should one of the following:

  • A pandas DataFrame object
  • A pandas DataFrame.groupby object
  • A plain Python dictionary where the keys are column names and the values are equal-length lists.

Basic example

Below is a basic example of using weightedcalcs to find what percentage of Wyoming residents are married, divorced, et cetera:

import pandas as pd
import weightedcalcs as wc

# Load the 2015 American Community Survey person-level responses for Wyoming
responses = pd.read_csv("examples/data/acs-2015-pums-wy-simple.csv")

# `PWGTP` is the weighting variable used in the ACS's person-level data
calc = wc.Calculator("PWGTP")

# Get the distribution of marriage-status responses
calc.distribution(responses, "marriage_status").round(3).sort_values(ascending=False)

# -- Output --
# marriage_status
# Married                                0.425
# Never married or under 15 years old    0.421
# Divorced                               0.097
# Widowed                                0.046
# Separated                              0.012
# Name: PWGTP, dtype: float64

More examples

See this notebook to see examples of other calculations, including grouped calculations.

Max Ghenis has created a version of the example notebook that can be run directly in your browser, via Google Colab.

Weightedcalcs in the wild

Other Python weighted-calculation libraries

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].