RobinSmits / KaggleBengaliAIHandwrittenGraphemeClassification

License: MIT
Some parts of my code for the Kaggle Bengali.AI Handwritten Grapheme Classification computer vision competition.

Programming Languages

  • Jupyter Notebook
  • Python

Projects that are alternatives of or similar to KaggleBengaliAIHandwrittenGraphemeClassification

Efficientnet
Implementation of EfficientNet model. Keras and TensorFlow Keras.
Stars: ✭ 1,920 (+7580%)
Mutual labels:  classification, efficientnet
AdaptiveRandomForest
Repository for the AdaptiveRandomForest algorithm implemented in MOA 2016-04
Stars: ✭ 28 (+12%)
Mutual labels:  classification
CNN-SoilTextureClassification
1-dimensional convolutional neural networks (CNN) for the classification of soil texture based on hyperspectral data
Stars: ✭ 35 (+40%)
Mutual labels:  classification
text2class
Multi-class text categorization using state-of-the-art pre-trained contextualized language models, e.g. BERT
Stars: ✭ 15 (-40%)
Mutual labels:  classification
COVID-CXNet
COVID-CXNet: Diagnosing COVID-19 in Frontal Chest X-ray Images using Deep Learning. Preprint available on arXiv: https://arxiv.org/abs/2006.13807
Stars: ✭ 48 (+92%)
Mutual labels:  classification
wymlp
tiny fast portable real-time deep neural network for regression and classification within 50 LOC.
Stars: ✭ 36 (+44%)
Mutual labels:  classification
stg
Python/R library for feature selection in neural nets. ("Feature selection using Stochastic Gates", ICML 2020)
Stars: ✭ 47 (+88%)
Mutual labels:  classification
Credit
An example project that predicts risk of credit card default using a Logistic Regression classifier and a 30,000 sample dataset.
Stars: ✭ 18 (-28%)
Mutual labels:  classification
ruimtehol
R package to Embed All the Things! using StarSpace
Stars: ✭ 95 (+280%)
Mutual labels:  classification
Python-Machine-Learning
Python Machine Learning Algorithms
Stars: ✭ 80 (+220%)
Mutual labels:  classification
projection-pursuit
An implementation of multivariate projection pursuit regression and univariate classification
Stars: ✭ 24 (-4%)
Mutual labels:  classification
shellnet
ShellNet: Efficient Point Cloud Convolutional Neural Networks using Concentric Shells Statistics
Stars: ✭ 80 (+220%)
Mutual labels:  classification
data-science-notes
Open-source project hosted at https://makeuseofdata.com to crowdsource a robust collection of notes related to data science (math, visualization, modeling, etc)
Stars: ✭ 52 (+108%)
Mutual labels:  classification
Skin-cancer-recoginition
Recognizing and localizing melanoma from other skin disease
Stars: ✭ 28 (+12%)
Mutual labels:  classification
CvT
This is an official implementation of CvT: Introducing Convolutions to Vision Transformers.
Stars: ✭ 262 (+948%)
Mutual labels:  classification
well-classified-examples-are-underestimated
Code for the AAAI 2022 publication "Well-classified Examples are Underestimated in Classification with Deep Neural Networks"
Stars: ✭ 21 (-16%)
Mutual labels:  classification
Combining-EfficientNet-and-Vision-Transformers-for-Video-Deepfake-Detection
Code for the video deepfake detection model from "Combining EfficientNet and Vision Transformers for Video Deepfake Detection", available on arXiv and submitted to ICIAP 2021.
Stars: ✭ 39 (+56%)
Mutual labels:  efficientnet
serverless-transformers-on-aws-lambda
Deploy transformers serverless on AWS Lambda
Stars: ✭ 100 (+300%)
Mutual labels:  classification
FineGrainedVisualRecognition
Fine grained visual recognition tensorflow baseline on CUB, Stanford Cars, Dogs, Aircrafts, and Flower102.
Stars: ✭ 19 (-24%)
Mutual labels:  classification
Cat-Dog-CNN-Classifier
Convolutional Neural Network to classify images as either cat or dog, along with attention heatmaps for localization. Written in Python with Keras.
Stars: ✭ 17 (-32%)
Mutual labels:  classification

Kaggle Bengali.AI Handwritten Grapheme Classification

In this repository you can find some of the code I used for the Kaggle Bengali.AI Handwritten Grapheme Classification competition.

Kernel

In the folder 'KaggleKernelEfficientNetB3' you can find the part of the code I used to train the models for my inference kernel as posted on Kaggle.

The model scored 0.9703 on the Public Leaderboard and 0.9182 on the Private Leaderboard.

To be able to train the model you first need to download the dataset from the Kaggle competition page.

The model consists of an EfficientNet B3 backbone pre-trained on ImageNet, followed by a generalized mean (GeM) pooling layer and a custom head. For image preprocessing I only invert, normalize and scale each image, nothing else; no form of augmentation is used.
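The sketch below illustrates that setup. It is a minimal approximation, not the exact repository code: it is written against tf.keras for self-containedness (the repository pins standalone Keras 2.3.1), the class counts 168/11/7 for grapheme root, vowel diacritic and consonant diacritic come from the competition data, and IMG_SIZE is a placeholder.

```python
import cv2
import numpy as np
import tensorflow as tf
import efficientnet.tfkeras as efn  # the 'efficientnet' PyPI package

IMG_SIZE = 128  # placeholder input size; the raw competition images are 137x236

def preprocess(img):
    """Invert, normalize and scale a grayscale grapheme image; no augmentation."""
    img = 255 - img                                # invert so the ink becomes bright
    img = cv2.resize(img, (IMG_SIZE, IMG_SIZE))    # scale to the model input size
    img = img.astype(np.float32) / 255.0           # normalize to [0, 1]
    return np.stack([img] * 3, axis=-1)            # replicate to 3 channels

class GeMPooling(tf.keras.layers.Layer):
    """Generalized mean pooling: average pooling at p=1, max pooling as p grows."""
    def __init__(self, p=3.0, eps=1e-6, **kwargs):
        super().__init__(**kwargs)
        self.p, self.eps = p, eps

    def call(self, x):
        x = tf.clip_by_value(x, self.eps, tf.reduce_max(x))
        return tf.pow(tf.reduce_mean(tf.pow(x, self.p), axis=[1, 2]), 1.0 / self.p)

def build_model():
    base = efn.EfficientNetB3(weights='imagenet', include_top=False,
                              input_shape=(IMG_SIZE, IMG_SIZE, 3))
    x = GeMPooling()(base.output)
    root = tf.keras.layers.Dense(168, activation='softmax', name='root')(x)
    vowel = tf.keras.layers.Dense(11, activation='softmax', name='vowel')(x)
    consonant = tf.keras.layers.Dense(7, activation='softmax', name='consonant')(x)
    return tf.keras.Model(base.input, [root, vowel, consonant])
```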

The code should run in any Python 3.6/3.7 environment. The major packages used were (an install command follows the list):

  • TensorFlow 2.1.0
  • Keras 2.3.1
  • efficientnet 1.0.0
  • opencv-python
  • iterative-stratification 0.1.6
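All of these are on PyPI, so an environment matching the pinned versions can presumably be set up with:

```
pip install tensorflow==2.1.0 keras==2.3.1 efficientnet==1.0.0 opencv-python iterative-stratification==0.1.6
```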

I've trained the model for 80 epochs and picked several model weight files to build an ensemble in the inference kernel (see the sketch after the weight-file list below).

I first tested the training setup by using 5/6 of the training data for training and 1/6 for validation. Based on the validation results and some leaderboard submissions I found that the highest-scoring epochs were between epochs 60 and 70. In the final training (as used in this code) I use a different split of the training data for every epoch. The downside of this is that validation no longer tells you everything; the major benefit, however, is that it increases the score by about 0.005 to 0.008 compared to using a fixed training set. This way it gets close to what a cross-validation ensemble would achieve, just without the extra training that would require.
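Here is a minimal sketch of such a per-epoch re-split using the iterative-stratification package listed above; the helper name and the one-hot label layout are my own illustration, not necessarily what the repository does.

```python
import numpy as np
from iterstrat.ml_stratifiers import MultilabelStratifiedShuffleSplit

def epoch_split(root, vowel, consonant, epoch, test_size=1/6):
    """Return a fresh stratified train/validation index split for one epoch.

    root/vowel/consonant are integer class-id arrays of equal length; they are
    one-hot encoded and concatenated so the split is stratified over all three
    targets jointly. Seeding with the epoch number yields a different split
    every epoch while staying reproducible.
    """
    y = np.concatenate([np.eye(168)[root],
                        np.eye(11)[vowel],
                        np.eye(7)[consonant]], axis=1)
    splitter = MultilabelStratifiedShuffleSplit(
        n_splits=1, test_size=test_size, random_state=epoch)
    train_idx, val_idx = next(splitter.split(np.zeros(len(y)), y))
    return train_idx, val_idx
```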

The model weight files as used in the inference kernel are available in the folder 'KaggleKernelEfficientNetB3/model_weights'. It contains the following 6 files (for each file, the leaderboard scores it achieves when used on its own to generate the submission):

  • Train1_model_57.h5 Public LB: 0.9668 - Private LB: 0.9132
  • Train1_model_59.h5 Public LB: 0.9681 - Private LB: 0.9151
  • Train1_model_64.h5 Public LB: 0.9679 - Private LB: 0.9167
  • Train1_model_66.h5 Public LB: 0.9685 - Private LB: 0.9157
  • Train1_model_68.h5 Public LB: 0.9691 - Private LB: 0.9167
  • Train1_model_70.h5 Public LB: 0.9700 - Private LB: 0.9174
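The inference kernel ensembles these checkpoints. A hypothetical sketch of how that can be done (the helper name is mine; the actual kernel may differ): average the softmax outputs of the checkpoints and take the argmax per target.

```python
import numpy as np

def ensemble_predict(model, weight_files, images):
    """Average the softmax outputs of several checkpoints, then argmax."""
    summed = None
    for wf in weight_files:
        model.load_weights(wf)              # reuse one graph, swap in the weights
        preds = model.predict(images)       # [root, vowel, consonant] prob arrays
        summed = preds if summed is None else [s + p for s, p in zip(summed, preds)]
    return [np.argmax(s, axis=1) for s in summed]

# Example usage with the six weight files listed above:
# model = build_model()  # from the sketch earlier in this README
# root, vowel, consonant = ensemble_predict(
#     model, ['Train1_model_%d.h5' % e for e in (57, 59, 64, 66, 68, 70)], X_test)
```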

To start training the model you need to use the train.py file and at least verify/modify the following values (a sketch of this configuration block follows the list):

  • DATA_DIR (the directory with the Bengali.AI dataset)
  • TRAIN_DIR (the directory where the generated training images will be stored)
  • GENERATE_IMAGES (whether the training images must be pre-generated; should be done on the initial run)
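A hypothetical excerpt of that configuration block; only the three names come from this README, and the values are placeholders you must change.

```python
DATA_DIR = '/path/to/bengaliai-cv19'   # directory with the downloaded Kaggle dataset
TRAIN_DIR = '/path/to/train_images'    # where the generated training images are stored
GENERATE_IMAGES = True                 # pre-generate the training images on the first run
```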

To run inference with the models you can use the Jupyter notebook 'keras-efficientnet-b3-training-inference.ipynb'. Note that you need to modify some of the paths to the model weight files.

Competition Final Submission

In the folder 'KaggleFinalSubmission' you can find the part of the code I used together with my Kaggle team to train the models for our final submission(s). In the final submissions we also used some SE-ResNeXt models in an ensemble. The same guidelines as mentioned above apply to this model.

The EfficientNet B3 model alone, however, already gave the same score as the multi-model ensemble: 0.9739 on the Public Leaderboard and 0.9393 on the Private Leaderboard.

With our submission we achieved 47th place out of 2,000+ participants.

Use the final-submission inference kernel and the provided model weights to try it out. As before, to train the model you first need to download the dataset from the Kaggle competition page.
