predictive-analytics-lab / Data Science Types

Licence: apache-2.0
Mypy stubs, i.e., type information, for numpy, pandas and matplotlib

Mypy type stubs for NumPy, pandas, and Matplotlib

Join the chat at https://gitter.im/data-science-types/community

⚠️ this project has mostly stopped development ⚠️

The pandas and NumPy teams are both integrating type stubs into their own codebases, and we don't see the point of competing with them.


This is a PEP 561-compliant stub-only package that provides type information for Matplotlib, NumPy and pandas. Once it is installed, the Mypy type checker (or pytype or PyCharm) can recognize the types in these packages.
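
PEP 561 names stub-only packages `<package>-stubs`, which is how a type checker finds them next to the real library. As a minimal sketch, the following lists such directories on a search path. The helper `find_stub_packages` is hypothetical and not part of this project, and it assumes the stubs are installed as plain directories (for example in `site-packages`):

```python
import sys
from pathlib import Path
from typing import Iterable, List, Optional


def find_stub_packages(search_paths: Optional[Iterable[str]] = None) -> List[str]:
    """Return names of packages that have a ``<name>-stubs`` directory.

    Hypothetical helper for illustration; scans ``sys.path`` by default.
    """
    paths = sys.path if search_paths is None else search_paths
    found = set()
    for entry in paths:
        root = Path(entry)
        if not root.is_dir():
            continue  # skip zip archives and entries that do not exist
        try:
            children = list(root.iterdir())
        except OSError:
            continue  # unreadable directory
        for child in children:
            if child.is_dir() and child.name.endswith("-stubs"):
                # Strip the PEP 561 suffix to recover the package name.
                found.add(child.name[: -len("-stubs")])
    return sorted(found)
```

Calling `find_stub_packages()` with no arguments scans the current interpreter's `sys.path`. If this package is installed, the result would be expected to include the libraries it covers.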

NOTE: This is a work in progress

Many functions are already typed, but much is still missing (NumPy and pandas are huge libraries), so Mypy will probably report that some function does not exist when in fact it does. If you encounter a missing function, we would be delighted to receive a PR. If you are unsure how to type a function, we can discuss it.

Installing

You can get this package from PyPI:

pip install data-science-types

To get the most up-to-date version, install it directly from GitHub:

pip install git+https://github.com/predictive-analytics-lab/data-science-types

Or clone the repository and, from its root directory, run:

pip install -e .
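
To confirm which version ended up in the environment, a small check with the standard library's `importlib.metadata` (Python 3.8+) can help. This is only a sketch; the function name `installed_version` is an illustrative choice, not something the project provides:

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "data-science-types") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None
```

`installed_version()` returns a version string when the package is installed and `None` otherwise.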

Examples

These are the kinds of things that can be checked:

Array creation

import numpy as np

arr1: np.ndarray[np.int64] = np.array([3, 7, 39, -3])  # OK
arr2: np.ndarray[np.int32] = np.array([3, 7, 39, -3])  # Type error
arr3: np.ndarray[np.int32] = np.array([3, 7, 39, -3], dtype=np.int32)  # OK
arr4: np.ndarray[float] = np.array([3, 7, 39, -3], dtype=float)  # Type error: the type of ndarray cannot be just "float"
arr5: np.ndarray[np.float64] = np.array([3, 7, 39, -3], dtype=float)  # OK
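
One way a stub can produce results like these is to make the array type generic over its dtype and overload the constructor on the `dtype` argument. The following is a toy sketch of that idea using only `typing`. The `Array` class, the `array` function and the dtype marker classes are stand-ins invented for illustration; they are neither NumPy nor this project's actual stub code:

```python
from typing import Generic, List, Type, TypeVar, overload


# Stand-in dtype markers; NOT the real NumPy scalar types.
class int32: ...
class int64: ...
class float64: ...


DT = TypeVar("DT")


class Array(Generic[DT]):
    """Toy container whose type parameter records the element dtype."""

    def __init__(self, data: List[float], dtype: type) -> None:
        self.data = data
        self.dtype = dtype


@overload
def array(data: List[int]) -> "Array[int64]": ...
@overload
def array(data: List[int], dtype: Type[DT]) -> "Array[DT]": ...
def array(data, dtype=None):
    # Without an explicit dtype, integers default to int64 in this sketch.
    return Array(list(data), int64 if dtype is None else dtype)


arr1: Array[int64] = array([3, 7, 39, -3])  # OK
arr3: Array[int32] = array([3, 7, 39, -3], dtype=int32)  # OK
# arr2: Array[int32] = array([3, 7, 39, -3])  # a type checker would reject this
```

The first overload fixes the default element type, while the second lets the checker carry the requested `dtype` through to the return type.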

Operations

import numpy as np

arr1: np.ndarray[np.int64] = np.array([3, 7, 39, -3])
arr2: np.ndarray[np.int64] = np.array([4, 12, 9, -1])

result1: np.ndarray[np.int64] = np.divide(arr1, arr2)  # Type error
result2: np.ndarray[np.float64] = np.divide(arr1, arr2)  # OK

compare: np.ndarray[np.bool_] = (arr1 == arr2)

Reductions

import numpy as np

arr: np.ndarray[np.float64] = np.array([[1.3, 0.7], [-43.0, 5.6]])

sum1: int = np.sum(arr)  # Type error
sum2: np.float64 = np.sum(arr)  # OK
sum3: float = np.sum(arr)  # Also OK: np.float64 is a subclass of float
sum4: np.ndarray[np.float64] = np.sum(arr, axis=0)  # OK

# the same works with np.max, np.min and np.prod
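
The difference between the scalar result and the array result comes from the `axis` argument, which a stub can express with overloads. Below is a toy sketch on nested lists. The function `total` is hypothetical and only mimics the shape of such a signature; it is not how NumPy or these stubs are implemented:

```python
from typing import List, Optional, Union, overload

Matrix = List[List[float]]


@overload
def total(arr: Matrix) -> float: ...
@overload
def total(arr: Matrix, axis: int) -> List[float]: ...
def total(arr: Matrix, axis: Optional[int] = None) -> Union[float, List[float]]:
    if axis is None:
        # No axis: reduce everything to a single scalar.
        return sum(sum(row) for row in arr)
    if axis == 0:
        # Axis 0: sum down each column.
        return [sum(col) for col in zip(*arr)]
    if axis == 1:
        # Axis 1: sum across each row.
        return [sum(row) for row in arr]
    raise ValueError("axis must be 0 or 1 for a 2-D input")


data: Matrix = [[1.5, 0.5], [-4.0, 6.0]]
scalar: float = total(data)  # OK
columns: List[float] = total(data, axis=0)  # OK
# wrong: int = total(data)  # a type checker would reject this
```

Because the overloads are selected by whether `axis` is passed, the checker knows the return type at each call site without running the code.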

Philosophy

The goal is not to recreate the APIs exactly. The main goal is to have useful checks on our code. Often the actual APIs in the libraries are more permissive than the type signatures in our stubs, but this is (usually) a feature and not a bug.

Contributing

We always welcome contributions. All pull requests are subject to CI checks. We check for compliance with Mypy and that the file formatting conforms to our Black specification.

You can install the dev dependencies with

pip install -e '.[dev]'

This also installs NumPy, pandas, and Matplotlib, which are needed to run the tests.

Running CI locally (recommended)

We include a script for running the CI checks that are triggered when a PR is opened. To test these out locally, you need to install the type stubs in your environment. Typically, you would do this with

pip install -e .

Then use the check_all.sh script to run all tests:

./check_all.sh

Below we describe how to run the various checks individually, but check_all.sh should be easier to use.

Checking compliance with Mypy

The settings for Mypy are specified in the mypy.ini file in the repository. Just running

mypy tests

from the base directory should take these settings into account. We enforce 0 Mypy errors.

Formatting with Black

We use Black to format the stub files. First install Black, then run

black .

from the base directory.

Pytest

python -m pytest -vv tests/

Flake8

flake8 *-stubs
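
For environments where a shell script is inconvenient, the four commands above can also be chained from Python. This is a sketch under stated assumptions: it is not the contents of check_all.sh, the function names `documented_checks` and `run_checks` are hypothetical, and it assumes each tool can be invoked as a module with `python -m`:

```python
import glob
import subprocess
import sys
from typing import List, Sequence, Tuple


def documented_checks() -> List[List[str]]:
    """Build the four commands listed above, run via the current interpreter."""
    # The shell expands *-stubs; here the pattern is expanded explicitly.
    # If nothing matches, flake8 falls back to checking the current directory.
    stub_dirs = sorted(glob.glob("*-stubs"))
    return [
        [sys.executable, "-m", "mypy", "tests"],
        [sys.executable, "-m", "black", "."],
        [sys.executable, "-m", "pytest", "-vv", "tests/"],
        [sys.executable, "-m", "flake8", *stub_dirs],
    ]


def run_checks(commands: Sequence[Sequence[str]]) -> List[Tuple[str, int]]:
    """Run every command in order; return (command line, exit code) per failure."""
    failures = []
    for cmd in commands:
        result = subprocess.run(list(cmd))
        if result.returncode != 0:
            failures.append((" ".join(cmd), result.returncode))
    return failures
```

From the repository root, `run_checks(documented_checks())` returns an empty list when every check passes. Unlike a script that stops at the first error, this version runs all checks and reports every failure together.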

License

Apache 2.0
