zhuzilin / Np_ml

Licence: mit
A tool library of classical machine learning algorithms with only numpy.

Projects that are alternatives of or similar to Np ml

Py
Repository to store sample python programs for python learning
Stars: ✭ 4,154 (+2207.78%)
Mutual labels:  numpy
Scientific Visualization Book
An open access book on scientific visualization using python and matplotlib
Stars: ✭ 6,336 (+3420%)
Mutual labels:  numpy
Pyhf
pure-Python HistFactory implementation with tensors and autodiff
Stars: ✭ 171 (-5%)
Mutual labels:  numpy
Ta
Technical Analysis Library using Pandas and Numpy
Stars: ✭ 2,649 (+1371.67%)
Mutual labels:  numpy
Micropython Ulab
a numpy-like fast vector module for micropython, circuitpython, and their derivatives
Stars: ✭ 166 (-7.78%)
Mutual labels:  numpy
Pysurvival
Open source package for Survival Analysis modeling
Stars: ✭ 169 (-6.11%)
Mutual labels:  numpy
Kalman Filter
Kalman Filter implementation in Python using Numpy only in 30 lines.
Stars: ✭ 161 (-10.56%)
Mutual labels:  numpy
Data Science Types
Mypy stubs, i.e., type information, for numpy, pandas and matplotlib
Stars: ✭ 180 (+0%)
Mutual labels:  numpy
Xarray
N-D labeled arrays and datasets in Python
Stars: ✭ 2,353 (+1207.22%)
Mutual labels:  numpy
Amadia
Astus' Mathematical Display Application : A GUI for Mathematics (Calculator, LaTeX Converter, Plotter, ... )
Stars: ✭ 172 (-4.44%)
Mutual labels:  numpy
Pytubes
A module for getting data into python from large data sources
Stars: ✭ 164 (-8.89%)
Mutual labels:  numpy
Funsor
Functional tensors for probabilistic programming
Stars: ✭ 164 (-8.89%)
Mutual labels:  numpy
Psi4numpy
Combining Psi4 and Numpy for education and development.
Stars: ✭ 170 (-5.56%)
Mutual labels:  numpy
Pywt
We're moving. Please visit https://github.com/PyWavelets
Stars: ✭ 161 (-10.56%)
Mutual labels:  numpy
Mars
Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and Python functions.
Stars: ✭ 2,308 (+1182.22%)
Mutual labels:  numpy
Mobulaop
A Simple & Flexible Cross Framework Operators Toolkit
Stars: ✭ 161 (-10.56%)
Mutual labels:  numpy
Panthera
Data-frames & arrays on Clojure
Stars: ✭ 168 (-6.67%)
Mutual labels:  numpy
Andrew Ng Notes
This is Andrew NG Coursera Handwritten Notes.
Stars: ✭ 180 (+0%)
Mutual labels:  numpy
Tensorflow Ml Nlp
Natural language processing starting with TensorFlow and machine learning (from logistic regression to a transformer chatbot)
Stars: ✭ 176 (-2.22%)
Mutual labels:  numpy
Ditching Excel For Python
Functionalities in Excel translated to Python
Stars: ✭ 172 (-4.44%)
Mutual labels:  numpy

NP_ML

Introduction

Classical machine learning algorithms implemented with pure numpy.

This repo is meant to help you understand ML algorithms instead of blindly using APIs.

Directory

Algorithm List

Classify

  • Perceptron

For the perceptron, the example uses the UCI/iris dataset. Since the basic perceptron is a binary classifier, the example uses only the versicolor and virginica samples. These two classes are not linearly separable, so the result may vary considerably from run to run.

Figure: versicolor and virginica. Hard to distinguish... Right?

Perceptron result on the Iris dataset.
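
As a rough illustration of the update rule behind the basic perceptron, here is a minimal numpy sketch. It is not the code from this repo: the function names, the learning rate and the toy data are assumptions made for the example, and the toy data is linearly separable (unlike versicolor vs. virginica) so that training stops cleanly.

```python
import numpy as np

def train_perceptron(X, y, lr=1.0, epochs=100):
    """Classic perceptron rule; y must contain the labels -1 and +1."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            # A sample is misclassified (or on the boundary) when
            # the signed margin is not strictly positive.
            if yi * (xi @ w + b) <= 0:
                w += lr * yi * xi
                b += lr * yi
                mistakes += 1
        if mistakes == 0:  # every sample is on the correct side
            break
    return w, b

def predict_perceptron(X, w, b):
    return np.where(X @ w + b >= 0, 1, -1)

# Hypothetical, linearly separable toy data (not the iris data).
X_toy = np.array([[2.0, 2.0], [3.0, 1.0], [-1.0, -2.0], [-2.0, -1.0]])
y_toy = np.array([1, 1, -1, -1])
w, b = train_perceptron(X_toy, y_toy)
print(predict_perceptron(X_toy, w, b))
```

On data that is not linearly separable the inner loop never reaches zero mistakes, so the weights depend on where training is cut off, which is one reason results on the two iris classes can differ between runs.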

  • K Nearest Neighbor (KNN)

For KNN, the example also used the UCI/iris dataset.

KNN result on the Iris dataset.
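
The idea of KNN fits in a few lines of numpy. The sketch below is an assumption-laden illustration, not the repo's implementation: it uses brute-force squared Euclidean distances and a plain majority vote, and the function name and toy data are made up for the example.

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k=3):
    """Brute-force KNN; y_train must hold non-negative integer labels."""
    # Pairwise squared Euclidean distances, shape (n_query, n_train).
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k closest training points for every query point.
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = y_train[nearest]  # shape (n_query, k)
    # Majority vote per query point.
    return np.array([np.bincount(row).argmax() for row in votes])

# Hypothetical toy data: two small groups of points.
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                    [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])
X_query = np.array([[0.5, 0.5], [5.5, 5.5]])
knn_pred = knn_predict(X_train, y_train, X_query, k=3)
print(knn_pred)
```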

  • Naive Bayes

For naive Bayes, the example uses the UCI/SMS Spam Collection Dataset for spam filtering.

This is the only example that uses nltk, and only for tokenizing. The output is listed below:

preprocessing data...
100%|#####################################################################| 5572/5572 [00:00<00:00, 8656.12it/s]
finish preprocessing data.

100%|#####################################################################| 1115/1115 [00:00<00:00, 55528.96it/s]
accuracy:  0.9757847533632287

We got 97.6% accuracy! That's nice!

We also tried two messages, a typical ham and a typical spam. The results are shown below.

example ham:
Po de :-):):-):-):-). No need job aha.
predict result:
ham

example spam:
u r a winner U ave been specially selected 2 receive £1000 cash or a 4* holiday (flights inc) speak to a 
live operator 2 claim 0871277810710p/min (18 )
predict result:
spam
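
For readers who want to see the mechanics, the following is a small sketch of a multinomial naive Bayes text classifier with Laplace smoothing. Which naive Bayes variant the repo uses is not stated above, so treat the variant, the function names and the tiny pre-tokenized corpus as assumptions; nltk is deliberately left out and the documents are already lists of tokens.

```python
import numpy as np

def train_nb(docs, labels, alpha=1.0):
    """docs: list of token lists; labels: list of class names."""
    vocab = sorted({w for d in docs for w in d})
    index = {w: i for i, w in enumerate(vocab)}
    classes = sorted(set(labels))
    counts = np.zeros((len(classes), len(vocab)))
    prior = np.zeros(len(classes))
    for doc, label in zip(docs, labels):
        c = classes.index(label)
        prior[c] += 1
        for w in doc:
            counts[c, index[w]] += 1
    log_prior = np.log(prior / prior.sum())
    # Laplace (add-alpha) smoothing avoids log(0) for unseen word/class pairs.
    totals = counts.sum(axis=1, keepdims=True) + alpha * len(vocab)
    log_lik = np.log((counts + alpha) / totals)
    return classes, index, log_prior, log_lik

def predict_nb(model, doc):
    classes, index, log_prior, log_lik = model
    score = log_prior.copy()
    for w in doc:
        if w in index:  # words never seen in training are ignored
            score += log_lik[:, index[w]]
    return classes[int(score.argmax())]

# Hypothetical miniature corpus, only to exercise the functions.
docs = [["win", "cash", "now"], ["claim", "cash", "prize"],
        ["see", "you", "later"], ["lunch", "later", "ok"]]
labels = ["spam", "spam", "ham", "ham"]
nb_model = train_nb(docs, labels)
print(predict_nb(nb_model, ["cash", "prize"]))
print(predict_nb(nb_model, ["see", "later"]))
```

Working in log space keeps the product of many small word probabilities from underflowing.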
  • Decision Tree

For the decision tree, the example uses the UCI/tic-tac-toe dataset. The input is the state of the 9 squares and the label is whether x wins.

tic tac toe.

Here, we use ID3 and CART to generate a one-level tree.

For the ID3, we have:

root
├── 4 == b : True
├── 4 == o : False
└── 4 == x : True
accuracy = 0.385

And for CART, we have:

root
├── 4 == o : False
└── 4 != o : True
accuracy = 0.718

In both trees, feature_4 is the state of the center square, so the center square clearly matters. The ID3 tree splits on every value of the feature, so it also has to give a result for 'b' (blank), which causes its low accuracy.
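
The difference between the two trees comes from how each algorithm splits a categorical feature. The sketch below uses the textbook criteria, information gain for ID3's multiway split and weighted Gini impurity for CART's binary split; whether the repo computes them exactly this way is an assumption, and the function names and toy data are hypothetical.

```python
import numpy as np

def entropy(y):
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def gini(y):
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return float(1.0 - (p ** 2).sum())

def id3_information_gain(x, y):
    """Gain of a multiway split: one branch per distinct value of x."""
    cond = sum((x == v).mean() * entropy(y[x == v]) for v in np.unique(x))
    return entropy(y) - cond

def cart_best_binary_split(x, y):
    """Best 'x == v' vs. 'x != v' split by weighted Gini impurity."""
    best = None
    for v in np.unique(x):
        mask = x == v
        if mask.all():  # all samples share this value, nothing to split
            continue
        score = mask.mean() * gini(y[mask]) + (~mask).mean() * gini(y[~mask])
        if best is None or score < best[1]:
            best = (v, score)
    return best

# Hypothetical toy feature with the same three values as a board square.
x_toy = np.array(["x", "x", "o", "o", "b", "b"])
y_toy = np.array([1, 1, 0, 0, 1, 0])
gain = id3_information_gain(x_toy, y_toy)
split = cart_best_binary_split(x_toy, y_toy)
print(gain, split)
```

In this toy data the 'b' branch is a coin flip, yet the multiway split still has to assign it a label, while the binary split can fold 'b' into the "not o" side.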

  • Random Forest
  • SVM
  • AdaBoost
  • HMM

Cluster

  • Kmeans

For kmeans, we use the make_blobs() function in sklearn to produce a toy dataset.

Kmeans result on the blob dataset.
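
A minimal numpy sketch of the usual k-means loop (Lloyd's algorithm) is shown here for orientation. It is not taken from the repo: the random initialization, the stopping rule, the function name and the hand-written toy points (used instead of make_blobs() to keep the block numpy-only) are all assumptions.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: nearest center for every point.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Update step: mean of each cluster (keep the old center if empty).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

# Hypothetical toy data: two well-separated groups of three points.
X_blobs = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
km_centers, km_labels = kmeans(X_blobs, k=2)
print(km_centers, km_labels)
```

Cluster indices are arbitrary, so only the grouping of the points is meaningful, not which group is called 0 or 1.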

  • Affinity Propagation

You can think of affinity propagation as a clustering algorithm that determines the number of clusters automatically.

Affinity propagation result on the blob dataset.

Manifold Learning

For manifold learning, all examples use the simple S-curve data to show the differences between algorithms.

S-curve data.

  • PCA

The most popular method for dimensionality reduction.

PCA visualization.
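
One common way to compute PCA with numpy alone is through the SVD of the centered data. The sketch below shows that route; the function name, the return values and the toy data are assumptions for illustration, and the repo may compute PCA differently (for example via the covariance matrix).

```python
import numpy as np

def pca(X, n_components):
    # Center the data; PCA directions are defined relative to the mean.
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    explained_var = s[:n_components] ** 2 / (len(X) - 1)
    return Xc @ components.T, components, explained_var

# Hypothetical toy data lying exactly on the line y = 2x.
X_line = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
Z, components, explained_var = pca(X_line, n_components=1)
print(components, explained_var)
```

The sign of each principal direction is arbitrary, so two correct implementations can return components that differ by a factor of -1.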

  • LLE

A manifold learning method using only local information.

LLE visualization.
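
To make "using only local information" concrete, here is a compact sketch of standard locally linear embedding in numpy: each point is reconstructed from its nearest neighbors, and the embedding keeps those reconstruction weights. The function name, the regularization constant and the hand-made S-shaped data are assumptions, not the repo's code.

```python
import numpy as np

def lle(X, n_neighbors, n_components, reg=1e-3):
    n = len(X)
    # Brute-force neighbor search on squared Euclidean distances.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    nbrs = np.argsort(d2, axis=1)[:, :n_neighbors]

    W = np.zeros((n, n))
    for i in range(n):
        Z = X[nbrs[i]] - X[i]          # neighbors relative to point i
        G = Z @ Z.T                    # local Gram matrix
        G += reg * np.trace(G) * np.eye(n_neighbors)  # keep G invertible
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, nbrs[i]] = w / w.sum()    # weights sum to one

    # The embedding is given by the bottom eigenvectors of (I - W)^T (I - W),
    # skipping the first one, which belongs to the eigenvalue closest to zero.
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, 1:n_components + 1], W

# Hypothetical S-shaped 3-D data generated with numpy only.
rng = np.random.default_rng(0)
t = rng.uniform(-1.5 * np.pi, 1.5 * np.pi, 200)
X_s = np.column_stack([np.sin(t),
                       rng.uniform(0.0, 2.0, 200),
                       np.sign(t) * (np.cos(t) - 1.0)])
Y_lle, W_lle = lle(X_s, n_neighbors=10, n_components=2)
print(Y_lle.shape)
```

The quality of the unrolled result depends strongly on n_neighbors and reg, so these values are starting points rather than recommendations.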

NLP

  • LDA

Time Series Analysis

  • AR

Usage

  • Installation

If you want to run the visual examples, install the package with:

  $ git clone https://github.com/zhuzilin/NP_ML
  $ cd NP_ML
  $ python setup.py install
  • Examples in section "Algorithm List"

Run the scripts in NP_ML/example/. For example:

  $ cd example/
  $ python affinity_propagation.py

(Mac/Linux users may run into issues with the data directory paths. If so, change them in the corresponding script.)

  • Examples for Statistical Learning Method(《统计学习方法》)

Run the scripts in NP_ML/example/StatisticalLearningMethod/. For example:

  $ cd example/StatisticalLearningMethod
  $ python adaboost.py

Reference

Classical ML algorithms were validated with the simple examples in Statistical Learning Method(《统计学习方法》).

Time series models were validated with examples from Bus 41202.

Something Else

Currently, this repo only implements algorithms that do not need gradient descent. Algorithms that do need it will go into another repo, where I will implement them with a framework like pytorch. Coming soon :)