
OSSpk / Handwritten-Digits-Classification-Using-KNN-Multiclass_Perceptron-SVM

License: MIT license
🏆 A Comparative Study on Handwritten Digits Recognition using classifiers like K-Nearest Neighbours (K-NN), Multiclass Perceptron/Artificial Neural Network (ANN) and Support Vector Machine (SVM), discussing the pros and cons of each algorithm and providing comparison results in terms of the accuracy and efficiency of each algorithm.

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Handwritten-Digits-Classification-Using-KNN-Multiclass_Perceptron-SVM

100 Days Of Ml Code
100 Days of ML Coding
Stars: ✭ 33,641 (+79997.62%)
Mutual labels:  svm, machine-learning-algorithms, logistic-regression, support-vector-machines
Statistical-Learning-using-R
This is a Statistical Learning application which consists of various Machine Learning algorithms and their implementations in R, done by me, together with their in-depth interpretation. Documents and reports related to the below-mentioned techniques can be found on my Rpubs profile.
Stars: ✭ 27 (-35.71%)
Mutual labels:  machine-learning-algorithms, logistic-regression, supervised-machine-learning
Machine learning basics
Plain python implementations of basic machine learning algorithms
Stars: ✭ 3,557 (+8369.05%)
Mutual labels:  machine-learning-algorithms, logistic-regression, k-nearest-neighbours
GDLibrary
Matlab library for gradient descent algorithms: Version 1.0.1
Stars: ✭ 50 (+19.05%)
Mutual labels:  svm, machine-learning-algorithms, logistic-regression
Amazon-Fine-Food-Review
Machine learning algorithms such as KNN, Naive Bayes, Logistic Regression, SVM, Decision Trees, Random Forest, k-means and Truncated SVD on the Amazon Fine Food Review dataset
Stars: ✭ 28 (-33.33%)
Mutual labels:  svm, logistic-regression, knn
ml
Minimal implementations of classic machine learning algorithms
Stars: ✭ 130 (+209.52%)
Mutual labels:  svm, logistic-regression, knn
Heart disease prediction
Heart Disease prediction using 5 algorithms
Stars: ✭ 43 (+2.38%)
Mutual labels:  machine-learning-algorithms, logistic-regression, k-nearest-neighbours
Breast-Cancer-Scikitlearn
simple tutorial on Machine Learning with Scikitlearn
Stars: ✭ 33 (-21.43%)
Mutual labels:  svm, logistic-regression, knn
AIML-Projects
Projects I completed as a part of Great Learning's PGP - Artificial Intelligence and Machine Learning
Stars: ✭ 85 (+102.38%)
Mutual labels:  logistic-regression, support-vector-machines, supervised-machine-learning
ICC-2019-WC-prediction
Predicting the winner of 2019 cricket world cup using random forest algorithm
Stars: ✭ 41 (-2.38%)
Mutual labels:  logistic-regression, support-vector-machines
Machine-Learning-Models
In this repository I made some simple-to-complex methods in machine learning. Here I try to build template-style code.
Stars: ✭ 30 (-28.57%)
Mutual labels:  svm, logistic-regression
Clustering-Python
Python Clustering Algorithms
Stars: ✭ 23 (-45.24%)
Mutual labels:  machine-learning-algorithms, knn
Loan-Approval-Prediction
Loan Application Data Analysis
Stars: ✭ 61 (+45.24%)
Mutual labels:  logistic-regression, accuracy-analysis
Emotion-recognition-from-tweets
A comprehensive approach to recognizing emotion (sentiment) from a given tweet. Supervised machine learning.
Stars: ✭ 17 (-59.52%)
Mutual labels:  support-vector-machines, sigmoid-function
introduction-to-machine-learning
A document covering machine learning basics. 🤖📊
Stars: ✭ 17 (-59.52%)
Mutual labels:  svm, knn
AI Learning Hub
AI Learning Hub for Machine Learning, Deep Learning, Computer Vision and Statistics
Stars: ✭ 53 (+26.19%)
Mutual labels:  machine-learning-algorithms, perceptron-learning-algorithm
Fall-Detection-Dataset
FUKinect-Fall dataset was created using Kinect V1. The dataset includes walking, bending, sitting, squatting, lying and falling actions performed by 21 subjects between 19-72 years of age.
Stars: ✭ 16 (-61.9%)
Mutual labels:  svm, knn
Trajectory-Analysis-and-Classification-in-Python-Pandas-and-Scikit-Learn
Formed trajectories of sets of points. Experimented on finding similarities between trajectories based on DTW (Dynamic Time Warping) and LCSS (Longest Common SubSequence) algorithms. Modeled trajectories as strings based on a Grid representation. Benchmarked KNN, Random Forest, Logistic Regression classification algorithms to classify efficiently t…
Stars: ✭ 41 (-2.38%)
Mutual labels:  logistic-regression, knn
info-retrieval
Information Retrieval in High Dimensional Data (class deliverables)
Stars: ✭ 33 (-21.43%)
Mutual labels:  svm, logistic-regression
MachineLearning
Implementations of machine learning algorithm by Python 3
Stars: ✭ 16 (-61.9%)
Mutual labels:  machine-learning-algorithms, perceptron-learning-algorithm

🏆 A Comparative Study on Handwritten Digits Recognition using Classifiers like K-NN, Multiclass Perceptron and SVM


For the full report, refer to the file named Detailed Report.pdf.

Problem Statement

The task at hand is to classify handwritten digits using supervised machine learning methods. The digits belong to the classes 0 through 9.

“Given a query instance (a digit) in the form of an image, our machine learning model must correctly classify its appropriate class.”

Dataset

The MNIST Handwritten Digits dataset is used for this task. It contains images of digits taken from a variety of scanned documents, normalized in size and centered. Each image is a 28 by 28 pixel square (784 pixels in total). The dataset contains 60,000 images for model training and 10,000 images for the evaluation of the model.
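For reference, the dataset can be loaded in a few lines of Python. This is a minimal sketch using Keras's bundled loader; the repository's own scripts may read the raw MNIST files differently:

```python
# A minimal way to load MNIST (assumption: Keras's bundled loader is
# used here; the repository's own code may load the data differently).
from tensorflow.keras.datasets import mnist

(X_train, y_train), (X_test, y_test) = mnist.load_data()

print(X_train.shape)  # (60000, 28, 28): 60,000 training images
print(X_test.shape)   # (10000, 28, 28): 10,000 evaluation images

# Flatten each 28x28 image into a 784-dimensional vector, scaled to [0, 1].
X_train = X_train.reshape(len(X_train), 784) / 255.0
X_test = X_test.reshape(len(X_test), 784) / 255.0
```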

Methodology

We have used supervised machine learning models to predict the digits. Since this is a comparative study, we will first describe the K-Nearest Neighbors classifier as the baseline method, which will then be compared to the Multiclass Perceptron classifier and the SVM classifier.

1) K-Nearest Neighbors Classifier – Our Baseline Method

k-Nearest Neighbors (k-NN) is an algorithm (sketched in code below) which:

  • finds a group of k objects in the training set that are closest to the test object, and
  • bases the assignment of a label on the predominance of a class in this neighborhood.
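In code, the idea looks roughly like the following minimal NumPy sketch (an illustration of the algorithm, not the repository's actual implementation):

```python
import numpy as np

def knn_predict(X_train, y_train, x_query, k=5):
    """Classify one query instance by majority vote among its k nearest
    training examples (Euclidean distance)."""
    # Distance from the query instance to every training example.
    dists = np.linalg.norm(X_train - x_query, axis=1)
    # Indices of the k closest training examples.
    nearest = np.argsort(dists)[:k]
    # The predominant class label in that neighborhood wins.
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]
```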

When we used the K-NN method, the following pros and cons were observed:

Pros

  • K-NN executes quickly for small training datasets.
  • No assumptions about the data are needed, which is useful, for example, for nonlinear data.
  • It is a simple algorithm to explain and understand/interpret.
  • It is versatile: useful for both classification and regression.
  • The training phase is extremely quick, because nothing is actually learned; the training data is simply stored.

Cons

  • Computationally expensive, because the algorithm compares the test instance with every example in the training data before finalizing the label.
  • The value of K is not known in advance and must be chosen, typically via cross-validation (see the sketch after this list).
  • High memory requirement, because all the training data must be stored.
  • The prediction stage can be slow if the training data is large.
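As noted in the second con, a reasonable K is usually found by cross-validation. A minimal sketch with scikit-learn, assuming the flattened X_train/y_train arrays from the loading sketch above:

```python
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Search a small grid of K values with 3-fold cross-validation.
# Note: K-NN cross-validation over all 60,000 images is slow, so a
# 5,000-image subsample is used here purely for illustration.
search = GridSearchCV(KNeighborsClassifier(),
                      {"n_neighbors": [1, 3, 5, 7, 9]}, cv=3)
search.fit(X_train[:5000], y_train[:5000])
print(search.best_params_)  # e.g. {'n_neighbors': 3}
```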

2) Multiclass Perceptron Classifier:

A multiclass perceptron classifier can be built from multiple binary classifiers trained with a 1-vs-all strategy. In this strategy, the training labels for each binary perceptron are relabeled: for example, for the "2 vs. all" classifier, instances labeled 2 become the positive class and all the rest become the negative class. For a Sigmoid Unit the positive and negative labels are 1 and 0, while for Rosenblatt's perceptron they are 1 and -1, respectively.

Now all we have to do is train (learn the weights for) 10 classifiers separately and then feed the query instance to all of them. The label of the classifier with the highest confidence is then assigned to the query instance.
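A compact NumPy sketch of this 1-vs-all scheme, using Rosenblatt's update rule with 1/-1 labels (illustrative only; the repository's implementation, and its Sigmoid Unit variant, may differ):

```python
import numpy as np

def train_one_vs_all(X, y, n_classes=10, epochs=10, lr=1.0):
    """Train one Rosenblatt perceptron per class (1 vs. all).
    Returns an (n_classes, n_features + 1) weight matrix, bias included."""
    Xb = np.hstack([X, np.ones((len(X), 1))])   # append a bias feature
    W = np.zeros((n_classes, Xb.shape[1]))
    for c in range(n_classes):
        t = np.where(y == c, 1, -1)             # class c vs. the rest
        for _ in range(epochs):
            for xi, ti in zip(Xb, t):
                if ti * (W[c] @ xi) <= 0:       # misclassified?
                    W[c] += lr * ti * xi        # perceptron update rule
    return W

def predict(W, X):
    """Prediction is just a dot product: assign the label of the
    classifier with the highest activation (confidence)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return np.argmax(Xb @ W.T, axis=1)
```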

How Multiclass Perceptron mitigates the limitations of K-NN:

As we already discussed, K-NN stores all the training data, and when a new query instance arrives it compares that instance against all of the training data, which makes it expensive both computationally and memory-wise; no real learning is involved. The multiclass perceptron, on the other hand, spends some time in its learning phase, but once training is done the learned weights can be saved and reused. When a query instance arrives, the model only has to take the dot product of that instance with the learned weights and apply the activation function to produce the output.

  • The prediction phase is extremely fast compared to that of K-NN.
  • It is also far more efficient in terms of computation (during the prediction phase) and memory (it only has to store the weights instead of all the training data).

3) SVM Classifier using Histogram of Oriented Gradients (HOG) Features:

Just for comparison purposes, we have also used a third supervised machine learning technique, the Support Vector Machine classifier. This model is not implemented from scratch; it is imported directly from Python's scikit-learn module and used as-is.

For K-NN and the Multiclass Perceptron classifier, we trained our models directly on the raw images, rather than first computing features from each input image and training the model on those computed measurements/features.

A feature descriptor is a representation of an image that simplifies it by extracting useful information and throwing away extraneous information. Here we compute the Histogram of Oriented Gradients as features from the digit images and train the SVM classifier on them. The HOG descriptor technique counts occurrences of gradient orientations in localized portions of an image (the detection window).
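A minimal sketch of this pipeline, pairing scikit-image's hog descriptor with scikit-learn's SVC (the HOG parameters here are illustrative assumptions; the report may use different cell/block sizes or a different HOG implementation):

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC

def hog_features(images):
    """Compute a HOG descriptor for each raw 28x28 digit image."""
    return np.array([hog(img, orientations=9,
                         pixels_per_cell=(7, 7),
                         cells_per_block=(1, 1))
                     for img in images])

# images_train / images_test: the raw (N, 28, 28) MNIST arrays (i.e.
# the images before flattening); y_train / y_test as before.
clf = SVC()  # the SVM classifier, imported directly from scikit-learn
clf.fit(hog_features(images_train), y_train)
print(clf.score(hog_features(images_test), y_test))
```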

Analysis

Now for the final phase. After running the experiments with the different algorithms, the results are summarized below, first comparing the techniques on the basis of accuracy:

Accuracy (Performance):

When we compare the K-NN method with the Multiclass Perceptron and SVM on the basis of accuracy, its accuracy is similar to that of the other two classifiers, which means that despite its simplicity K-NN is a genuinely good classifier.

Prediction Time (Efficiency):

Our Observations:

One of the main limitations of K-NN was that it was computationally expensive. Its prediction time was long because, whenever a new query instance arrived, it had to compute the instance's similarity to all the training data, sort the neighbors accordingly, separate out the top k, and choose the label occurring most often among those top k. All of this takes a considerable amount of time.

For the Multiclass Perceptron classifier, by contrast, we observed that it mitigates this limitation: its prediction time is short because it only computes a dot product in the prediction phase. The majority of the time is spent just once, in the learning phase; after that, the model is ready to predict the test instances.
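One simple way to reproduce this comparison is to time a full prediction pass for each model. This hypothetical harness reuses the sketches above; the report's numbers come from its own timing code:

```python
import time

def time_prediction(name, predict_fn):
    """Time one prediction pass and report it."""
    start = time.perf_counter()
    predict_fn()
    print(f"{name}: {time.perf_counter() - start:.2f} s")

# K-NN is timed on a 100-image subsample because predicting the full
# test set example-by-example is very slow.
time_prediction("K-NN", lambda: [knn_predict(X_train, y_train, x)
                                 for x in X_test[:100]])
time_prediction("Multiclass Perceptron", lambda: predict(W, X_test))
time_prediction("SVM (HOG)", lambda: clf.predict(hog_features(images_test)))
```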

Results:

Conclusion:

When the times were measured for the prediction phases of K-NN, the Multiclass Perceptron and the SVM, the Multiclass Perceptron clearly stood out with the shortest prediction time, while K-NN, on the other side, took a long time to predict the test instances. Hence the Multiclass Perceptron clearly leaves K-NN behind in prediction-time efficiency, as well as in computational and memory load. Thus, it mitigates the limitations of our baseline method, K-NN.


How to Run Code

The code files are in running condition and are directly executable.

(To install all the necessary packages at once, install Anaconda)



Hey there, I'm Haris, Maker of Things

Creator of the Ultimate Facebook Scraper (one of the best tools for collecting Facebook data for research & analysis)


🌐 Connect

🤝 Consulting / Coaching

Stuck with some problem? Need help in solution development, guidance, training or capacity building? I am a Full Stack Engineer turned Project Manager with years of technical and leadership experience in a diverse range of technologies and domains. Let me know what problem you are facing at [email protected] and we can schedule a consultation meeting to help you get through it.

👨‍💻 Technical Skills & Expertise

  • Development of Web Applications, Mobile Applications, and Desktop Applications
  • Development of Machine Learning/Deep Learning models, and deployment
  • Web Scraping, Browser Automation, Python Scripting

❤️ Support / Donations

If you or your company use any of my projects, like what I’m doing or have benefited from my projects in any way then kindly consider backing my efforts.

For donations, you can follow these simple steps:

1) Free signup at TransferWise using this link: https://transferwise.com/invite/u/harism95. (Signing up through this link will save you from any transaction fee on the donation.)

2) Select the amount, e.g. $15, and choose the receiving/recipient's currency to be PKR. Multiple payment options are supported (credit card, debit card, wire transfer, etc.).

3) Then it will show my info as the recipient, select it. If my name isn't shown, then type my email [email protected] in recipients.

4) Choose the reason for the transfer that suits you best (in this case it could be 'General expenses'), and in the reference section you can mention 'Support'.

If you face any issue in sending donation then feel free to get in touch with me at [email protected]

Thank you for your contribution!

Author

You can get in touch with me on my LinkedIn Profile: LinkedIn Link

You can also follow my GitHub Profile to stay updated about my latest projects: GitHub Follow

If you liked the repo then kindly support it by giving it a star and share in your circles so more people can benefit from the effort.

Contributions Welcome


If you find any bug in the code or have any improvement in mind, feel free to open a pull request.

Issues


If you face any issue, you can create a new issue in the Issues Tab and I will be glad to help you out.

License

MIT

Copyright (c) 2018-present, harismuneer

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].