RobinSmits / KaggleBengaliAIHandwrittenGraphemeClassification

License: MIT
Some parts of my code for the Kaggle Bengali.AI Handwritten Grapheme Classification computer vision competition.

Programming Languages

  • Jupyter Notebook
  • Python

Projects that are alternatives of or similar to KaggleBengaliAIHandwrittenGraphemeClassification

Efficientnet
Implementation of EfficientNet model. Keras and TensorFlow Keras.
Stars: ✭ 1,920 (+7580%)
Mutual labels:  classification, efficientnet
AdaptiveRandomForest
Repository for the AdaptiveRandomForest algorithm implemented in MOA 2016-04
Stars: ✭ 28 (+12%)
Mutual labels:  classification
CNN-SoilTextureClassification
1-dimensional convolutional neural networks (CNN) for the classification of soil texture based on hyperspectral data
Stars: ✭ 35 (+40%)
Mutual labels:  classification
text2class
Multi-class text categorization using state-of-the-art pre-trained contextualized language models, e.g. BERT
Stars: ✭ 15 (-40%)
Mutual labels:  classification
COVID-CXNet
COVID-CXNet: Diagnosing COVID-19 in Frontal Chest X-ray Images using Deep Learning. Preprint available on arXiv: https://arxiv.org/abs/2006.13807
Stars: ✭ 48 (+92%)
Mutual labels:  classification
wymlp
tiny fast portable real-time deep neural network for regression and classification within 50 LOC.
Stars: ✭ 36 (+44%)
Mutual labels:  classification
stg
Python/R library for feature selection in neural nets. ("Feature selection using Stochastic Gates", ICML 2020)
Stars: ✭ 47 (+88%)
Mutual labels:  classification
Credit
An example project that predicts risk of credit card default using a Logistic Regression classifier and a 30,000 sample dataset.
Stars: ✭ 18 (-28%)
Mutual labels:  classification
ruimtehol
R package to Embed All the Things! using StarSpace
Stars: ✭ 95 (+280%)
Mutual labels:  classification
Python-Machine-Learning
Python Machine Learning Algorithms
Stars: ✭ 80 (+220%)
Mutual labels:  classification
projection-pursuit
An implementation of multivariate projection pursuit regression and univariate classification
Stars: ✭ 24 (-4%)
Mutual labels:  classification
shellnet
ShellNet: Efficient Point Cloud Convolutional Neural Networks using Concentric Shells Statistics
Stars: ✭ 80 (+220%)
Mutual labels:  classification
data-science-notes
Open-source project hosted at https://makeuseofdata.com to crowdsource a robust collection of notes related to data science (math, visualization, modeling, etc)
Stars: ✭ 52 (+108%)
Mutual labels:  classification
Skin-cancer-recoginition
Recognizing and localizing melanoma from other skin disease
Stars: ✭ 28 (+12%)
Mutual labels:  classification
CvT
This is an official implementation of CvT: Introducing Convolutions to Vision Transformers.
Stars: ✭ 262 (+948%)
Mutual labels:  classification
well-classified-examples-are-underestimated
Code for the AAAI 2022 publication "Well-classified Examples are Underestimated in Classification with Deep Neural Networks"
Stars: ✭ 21 (-16%)
Mutual labels:  classification
Combining-EfficientNet-and-Vision-Transformers-for-Video-Deepfake-Detection
Code for the video deepfake detection model from "Combining EfficientNet and Vision Transformers for Video Deepfake Detection", available on arXiv and submitted to ICIAP 2021.
Stars: ✭ 39 (+56%)
Mutual labels:  efficientnet
serverless-transformers-on-aws-lambda
Deploy transformers serverless on AWS Lambda
Stars: ✭ 100 (+300%)
Mutual labels:  classification
FineGrainedVisualRecognition
Fine grained visual recognition tensorflow baseline on CUB, Stanford Cars, Dogs, Aircrafts, and Flower102.
Stars: ✭ 19 (-24%)
Mutual labels:  classification
Cat-Dog-CNN-Classifier
Convolutional Neural Network to classify images as either cat or dog, along with attention heatmaps for localization. Written in Python with Keras.
Stars: ✭ 17 (-32%)
Mutual labels:  classification

Kaggle Bengali.AI Handwritten Grapheme Classification

In this repository you can find some of the code I used for the Kaggle Bengali.AI Handwritten Grapheme Classification competition.

Kernel

In the folder 'KaggleKernelEfficientNetB3' you can find the part of the code I used to train the models for my inference kernel as posted on Kaggle.

The model scored 0.9703 on the Public Leaderboard and 0.9182 on the Private Leaderboard.

To be able to train the model you first need to download the dataset from the Kaggle competition page.

The model consists of an EfficientNet B3 backbone pre-trained on ImageNet, followed by a generalized mean (GeM) pooling layer and a custom head. For image preprocessing I only invert, normalize and scale each image, nothing else; no form of augmentation is used.
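The sketch below illustrates that setup. It is a minimal approximation, not the exact repository code: it is written against tf.keras for self-containedness (the repository pins standalone Keras 2.3.1), the class counts 168/11/7 for grapheme root, vowel diacritic and consonant diacritic come from the competition data, and IMG_SIZE is a placeholder.

```python
import cv2
import numpy as np
import tensorflow as tf
import efficientnet.tfkeras as efn  # the 'efficientnet' PyPI package

IMG_SIZE = 128  # placeholder input size; the raw competition images are 137x236

def preprocess(img):
    """Invert, normalize and scale a grayscale grapheme image; no augmentation."""
    img = 255 - img                                # invert so the ink becomes bright
    img = cv2.resize(img, (IMG_SIZE, IMG_SIZE))    # scale to the model input size
    img = img.astype(np.float32) / 255.0           # normalize to [0, 1]
    return np.stack([img] * 3, axis=-1)            # replicate to 3 channels

class GeMPooling(tf.keras.layers.Layer):
    """Generalized mean pooling: average pooling at p=1, max pooling as p grows."""
    def __init__(self, p=3.0, eps=1e-6, **kwargs):
        super().__init__(**kwargs)
        self.p, self.eps = p, eps

    def call(self, x):
        x = tf.clip_by_value(x, self.eps, tf.reduce_max(x))
        return tf.pow(tf.reduce_mean(tf.pow(x, self.p), axis=[1, 2]), 1.0 / self.p)

def build_model():
    base = efn.EfficientNetB3(weights='imagenet', include_top=False,
                              input_shape=(IMG_SIZE, IMG_SIZE, 3))
    x = GeMPooling()(base.output)
    root = tf.keras.layers.Dense(168, activation='softmax', name='root')(x)
    vowel = tf.keras.layers.Dense(11, activation='softmax', name='vowel')(x)
    consonant = tf.keras.layers.Dense(7, activation='softmax', name='consonant')(x)
    return tf.keras.Model(base.input, [root, vowel, consonant])
```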

The code should run in any Python 3.6/3.7 environment. The major packages used were (an install command follows the list):

  • TensorFlow 2.1.0
  • Keras 2.3.1
  • efficientnet 1.0.0
  • opencv-python
  • iterative-stratification 0.1.6
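All of these are on PyPI, so an environment matching the pinned versions can presumably be set up with:

```
pip install tensorflow==2.1.0 keras==2.3.1 efficientnet==1.0.0 opencv-python iterative-stratification==0.1.6
```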

I've trained the model for 80 epochs and picked several model weight files to build an ensemble in the inference kernel (see the sketch after the weight-file list below).

I first tested the training setup by using 5/6 of the training data for training and 1/6 for validation. Based on the validation results and some leaderboard submissions I found that the highest-scoring epochs were between epochs 60 and 70. In the final training (as used in this code) I use a different split of the training data for every epoch. The downside of this is that validation no longer tells you everything; the major benefit, however, is that it increases the score by about 0.005 to 0.008 compared to using a fixed training set. This way it gets close to what a cross-validation ensemble would achieve, just without the extra training that would require.
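Here is a minimal sketch of such a per-epoch re-split using the iterative-stratification package listed above; the helper name and the one-hot label layout are my own illustration, not necessarily what the repository does.

```python
import numpy as np
from iterstrat.ml_stratifiers import MultilabelStratifiedShuffleSplit

def epoch_split(root, vowel, consonant, epoch, test_size=1/6):
    """Return a fresh stratified train/validation index split for one epoch.

    root/vowel/consonant are integer class-id arrays of equal length; they are
    one-hot encoded and concatenated so the split is stratified over all three
    targets jointly. Seeding with the epoch number yields a different split
    every epoch while staying reproducible.
    """
    y = np.concatenate([np.eye(168)[root],
                        np.eye(11)[vowel],
                        np.eye(7)[consonant]], axis=1)
    splitter = MultilabelStratifiedShuffleSplit(
        n_splits=1, test_size=test_size, random_state=epoch)
    train_idx, val_idx = next(splitter.split(np.zeros(len(y)), y))
    return train_idx, val_idx
```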

The model weight files as used in the inference kernel are available in the folder 'KaggleKernelEfficientNetB3/model_weights'. It contains the following 6 files (for each file, the leaderboard scores it achieves when used on its own to generate the submission):

  • Train1_model_57.h5 Public LB: 0.9668 - Private LB: 0.9132
  • Train1_model_59.h5 Public LB: 0.9681 - Private LB: 0.9151
  • Train1_model_64.h5 Public LB: 0.9679 - Private LB: 0.9167
  • Train1_model_66.h5 Public LB: 0.9685 - Private LB: 0.9157
  • Train1_model_68.h5 Public LB: 0.9691 - Private LB: 0.9167
  • Train1_model_70.h5 Public LB: 0.9700 - Private LB: 0.9174
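The inference kernel ensembles these checkpoints. A hypothetical sketch of how that can be done (the helper name is mine; the actual kernel may differ): average the softmax outputs of the checkpoints and take the argmax per target.

```python
import numpy as np

def ensemble_predict(model, weight_files, images):
    """Average the softmax outputs of several checkpoints, then argmax."""
    summed = None
    for wf in weight_files:
        model.load_weights(wf)              # reuse one graph, swap in the weights
        preds = model.predict(images)       # [root, vowel, consonant] prob arrays
        summed = preds if summed is None else [s + p for s, p in zip(summed, preds)]
    return [np.argmax(s, axis=1) for s in summed]

# Example usage with the six weight files listed above:
# model = build_model()  # from the sketch earlier in this README
# root, vowel, consonant = ensemble_predict(
#     model, ['Train1_model_%d.h5' % e for e in (57, 59, 64, 66, 68, 70)], X_test)
```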

To start training the model you need to use the train.py file and at least verify/modify the following values (a sketch of this configuration block follows the list):

  • DATA_DIR (the directory with the Bengali.AI dataset)
  • TRAIN_DIR (the directory where the generated training images will be stored)
  • GENERATE_IMAGES (whether the training images must be pre-generated; should be done on the initial run)
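A hypothetical excerpt of that configuration block; only the three names come from this README, and the values are placeholders you must change.

```python
DATA_DIR = '/path/to/bengaliai-cv19'   # directory with the downloaded Kaggle dataset
TRAIN_DIR = '/path/to/train_images'    # where the generated training images are stored
GENERATE_IMAGES = True                 # pre-generate the training images on the first run
```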

To run inference with the models you can use the Jupyter notebook 'keras-efficientnet-b3-training-inference.ipynb'. Note that you need to modify some of the paths to the model weight files.

Competition Final Submission

In the folder 'KaggleFinalSubmission' you can find the part of the code I used together with my Kaggle team to train the models for our final submission(s). In the final submissions we also used some SE-ResNeXt models in an ensemble. The same guidelines as mentioned above apply to this model.

The EfficientNet B3 model alone, however, already gave the same score as the multi-model ensemble: 0.9739 on the Public Leaderboard and 0.9393 on the Private Leaderboard.

With our submission we achieved 47th place out of 2,000+ participants.

Use the final-submission inference kernel and the provided model weights to try it out. As before, to train the model you first need to download the dataset from the Kaggle competition page.
