yell / mnist-challenge

License: MIT
My solution to TUM's Machine Learning MNIST challenge 2016-2017 [winner]


ML MNIST Challenge

This contest was offered within TU Munich's course Machine Learning (IN2064).
The goal was to implement k-NN, a Neural Network, Logistic Regression and a Gaussian Process Classifier in Python from scratch, and to achieve the minimal average test error among these classifiers on the well-known MNIST dataset, without ensemble learning.

Results

k-NN (test error: 1.13%)
3-NN, Euclidean distance, uniform weights.
Preprocessing: feature vectors extracted from the NN.

k-NN2 (test error: 2.06%)
3-NN, Euclidean distance, uniform weights.
Preprocessing: augment the training data (×9) with random rotations, shifts, Gaussian blur and pixel dropout; apply PCA-35 whitening and multiply each feature vector by e^(11.6 · s), where s is the normalized explained variance of the respective principal axis (equivalent to applying PCA whitening with a correspondingly weighted Euclidean distance). A sketch of this preprocessing follows the table.

NN (test error: 1.04%)
MLP 784-1337-D(0.05)-911-D(0.1)-666-333-128-10 (D = dropout); hidden activations: LeakyReLU(0.01); output: softmax; loss: categorical cross-entropy; 1024 batches; 42 epochs; optimizer: Adam (learning rate 5 · 10⁻⁵, remaining parameters at the defaults from the paper). A Keras rendering of this architecture also follows the table.
Preprocessing: augment the training data (×5) with random rotations, shifts, Gaussian blur.

LogReg (test error: 1.01%)
32 batches; 91 epochs; L2 penalty, λ = 3.16 · 10⁻⁴; optimizer: Adam (learning rate 10⁻³, remaining parameters at the defaults from the paper).
Preprocessing: feature vectors extracted from the NN.

GPC (test error: 1.59%)
794 random data points used for training; σ_n = 0; RBF kernel (σ_f = 0.4217, γ = 1/(2l²) = 0.0008511); Newton iterations for the Laplace approximation until the change in log marginal likelihood ≤ 10⁻⁷; linear systems solved iteratively with CG at 10⁻⁷ tolerance; 2000 samples generated per test point for prediction.
Preprocessing: feature vectors extracted from the NN.
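
The k-NN2 preprocessing described above can be illustrated with a short NumPy sketch: PCA-35 whitening of the (augmented) data, followed by rescaling each whitened axis by e^(11.6 · s). This is only an illustration of the described idea, not the repo's code; in particular, normalizing s over the retained components is an assumption.

import numpy as np

def pca_whiten_and_weight(X_train, n_components=35, scale=11.6):
    # Fit PCA on the training data: center, then eigendecompose the covariance matrix.
    mean = X_train.mean(axis=0)
    cov = np.cov(X_train - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # s: normalized explained variance of each retained axis (assumed normalization).
    s = eigvals / eigvals.sum()
    weights = np.exp(scale * s)

    def transform(X):
        # Whiten (unit variance along principal axes), then weight each axis, so that
        # plain Euclidean distance on the output behaves like a correspondingly
        # weighted Euclidean distance on the whitened features.
        Z = (X - mean) @ eigvecs / np.sqrt(eigvals + 1e-12)
        return Z * weights

    return transform

# usage sketch:
# transform = pca_whiten_and_weight(X_train)
# X_train_knn, X_test_knn = transform(X_train), transform(X_test)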
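
The NN row is implemented from scratch in pure Python in this repo; purely as a reading aid for the architecture string, here is roughly the same model spelled out with Keras (an outside framework; batching, augmentation and training-loop details are omitted):

from tensorflow import keras
from tensorflow.keras import layers

# 784-1337-D(0.05)-911-D(0.1)-666-333-128-10, LeakyReLU(0.01) hidden activations,
# softmax output, categorical cross-entropy, Adam with learning rate 5e-5.
model = keras.Sequential([
    keras.Input(shape=(784,)),
    layers.Dense(1337), layers.LeakyReLU(alpha=0.01),
    layers.Dropout(0.05),
    layers.Dense(911), layers.LeakyReLU(alpha=0.01),
    layers.Dropout(0.1),
    layers.Dense(666), layers.LeakyReLU(alpha=0.01),
    layers.Dense(333), layers.LeakyReLU(alpha=0.01),
    layers.Dense(128), layers.LeakyReLU(alpha=0.01),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer=keras.optimizers.Adam(learning_rate=5e-5),
              loss="categorical_crossentropy",
              metrics=["accuracy"])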

Visualizations

Selected plots from the experiments (not reproduced here); these and more are available in experiments/plots/.

How to install

git clone https://github.com/yell/mnist-challenge
cd mnist-challenge/
pip install -r requirements.txt

After installation, tests can be run with:

make test

How to run

Check main.py to reproduce the training and testing of the final models:

usage: main.py [-h] [--load-nn] model

positional arguments:
  model       which model to run, {'gp', 'knn', 'knn-without-nn', 'logreg',
              'nn'}

optional arguments:
  -h, --help  show this help message and exit
  --load-nn   whether to use pretrained neural network, ignored if 'knn-
              without-nn' is used (default: False)
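
For example, using the model names listed in the usage message above, reproducing the logistic regression on top of the pretrained NN features and the k-NN variant that skips the NN feature extractor would look like:

python main.py logreg --load-nn
python main.py knn-without-nn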

Experiments

Check also this notebook to see what I've tried.
Note: the RBM + LogReg approach reached at most 91.8% test accuracy, because the RBM takes too long to train with the given pure-Python code and was therefore trained only on a small subset of the data (and still underfitted). However, with an RBM properly trained on the whole training set, this approach can reach 1.83% test error (see my Boltzmann machines project).
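
As an aside, the RBM + LogReg approach can be sketched with scikit-learn's BernoulliRBM and LogisticRegression; this is an outside-library illustration of the idea, not this repo's pure-Python implementation, and the hyperparameters below are placeholders:

# Illustrative RBM + LogReg pipeline using scikit-learn (placeholder settings,
# not the ones used in the experiments; the repo implements both parts from scratch).
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

rbm = BernoulliRBM(n_components=256, learning_rate=0.05, n_iter=20, random_state=0)
logreg = LogisticRegression(max_iter=1000)
model = Pipeline([("rbm", rbm), ("logreg", logreg)])

# X_train should hold pixel intensities scaled to [0, 1], y_train the digit labels:
# model.fit(X_train, y_train)
# print(model.score(X_test, y_test))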

Features

  • Apart from the specified algorithms, there are also PCA and RBM implementations
  • Most of the classes contain doctests so they are easy to understand
  • All randomness in algorithms or functions is reproducible (seeds)
  • Support of simple readable serialization (JSON)
  • There is also some infrastructure for model selection, feature selection, data augmentation, metrics, plots, etc.
  • Support for MNIST and Fashion-MNIST (both have the same file structure, so both can be loaded with the same routine; see the loading sketch after this list), though I haven't tried the latter yet
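
To illustrate why a single loading routine suffices for both datasets, here is a minimal sketch of an IDX reader, assuming the gzipped IDX files as distributed for MNIST and Fashion-MNIST (an illustration, not necessarily the loader used in this repo):

import gzip
import numpy as np

def load_idx_images(path):
    # IDX image file: 16-byte big-endian header (magic, count, rows, cols), then uint8 pixels.
    with gzip.open(path, "rb") as f:
        data = f.read()
    n, rows, cols = np.frombuffer(data, dtype=">i4", count=4)[1:]
    return np.frombuffer(data, dtype=np.uint8, offset=16).reshape(n, rows * cols)

def load_idx_labels(path):
    # IDX label file: 8-byte big-endian header (magic, count), then uint8 labels.
    with gzip.open(path, "rb") as f:
        data = f.read()
    return np.frombuffer(data, dtype=np.uint8, offset=8)

# Works identically for MNIST and Fashion-MNIST, e.g.:
# X = load_idx_images("train-images-idx3-ubyte.gz")
# y = load_idx_labels("train-labels-idx1-ubyte.gz")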

System

All computations and time measurements were made on a laptop with an i7-5500U CPU @ 2.40GHz × 4 and 12GB RAM.

Possible future work

Here is a list of what could also be tried for these particular 4 ML algorithms (either I didn't have time to check it, or it was forbidden by the rules, e.g. ensemble learning):

  • Model averaging for k-NN: train a group of k-NNs with different values of k (say, 2, 4, ..., 128) and average their predictions (see the sketch after this list);
  • More sophisticated metrics (say, from scipy.spatial.distance) for k-NN;
  • Weighting metrics according to some other functions of explained variance from PCA;
  • NCA;
  • Different kernels or compound kernels for k-NN;
  • Committee of MLPs, a CNN, a committee of CNNs, or more advanced NNs;
  • Unsupervised pretraining for MLP/CNN;
  • Different kernels or compound kernels for GPCs;
  • 10 one-vs-rest GPCs;
  • Use derivatives of the log marginal likelihood of the multiclass Laplace approximation w.r.t. kernel parameters for more efficient gradient-based optimization;
  • Model averaging for GPCs: train a collection of GPCs on different parts of the data and then average their predictions (or bagging);
  • IVM.
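
As a sketch of the first item (model averaging for k-NN), one could average the predicted class probabilities of k-NNs fitted with different k. Shown here with scikit-learn for brevity rather than the repo's own k-NN implementation; purely illustrative:

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def knn_average_predict(X_train, y_train, X_test, ks=(2, 4, 8, 16, 32, 64, 128)):
    # Fit one k-NN per value of k and average their class-probability estimates.
    probas = []
    for k in ks:
        clf = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
        probas.append(clf.predict_proba(X_test))
    mean_proba = np.mean(probas, axis=0)
    return clf.classes_[mean_proba.argmax(axis=1)]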