
Health Check ✔ is a Machine Learning Web Application made using Flask that can predict mainly three diseases i.e. Diabetes, Heart Disease, and Cancer.

Stars: ✭ 35 (-23.91%)

Mutual labels: kaggle

Text2gender

Predict the author's gender from their text.

Stars: ✭ 14 (-69.57%)

Mutual labels: text-classification

Kaggle Seizure Prediction

solution for the American Epilepsy Society Seizure Prediction Challenge

Stars: ✭ 44 (-4.35%)

Mutual labels: kaggle

Few Shot Text Classification

Few-shot binary text classification with Induction Networks and Word2Vec weights initialization

Stars: ✭ 32 (-30.43%)

Mutual labels: text-classification

Textcnn

TextCNN in TensorFlow 2.0.0 (mainly tf.keras).

Stars: ✭ 37 (-19.57%)

Mutual labels: text-classification

Nlp xiaojiang

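Natural language processing (NLP), Xiaojiang robot (retrieval-based chit-chat chatbot), BERT sentence vectors and similarity (Sentence Similarity), XLNET sentence vectors and similarity (text xlnet embedding), text classification, entity extraction (NER, bert+bilstm+crf), data augmentation (text augment, data enhance), paraphrase and synonym generation, sentence main-part extraction (mainpart), Chinese short-text similarity, text feature engineering, calling via keras-http-service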

Stars: ✭ 954 (+1973.91%)

Mutual labels: text-classification

Kaggle Dae

Kaggle's porto-seguro-safe-driver-prediction, Michael's solver

Stars: ✭ 29 (-36.96%)

Mutual labels: kaggle

Nlp Experiments In Pytorch

PyTorch repository for text categorization and NER experiments in Turkish and English.

Stars: ✭ 35 (-23.91%)

Mutual labels: text-classification

Text gcn

Graph Convolutional Networks for Text Classification. AAAI 2019

Stars: ✭ 945 (+1954.35%)

Mutual labels: text-classification

Ml Classify Text Js

Machine learning based text classification in JavaScript using n-grams and cosine similarity

Stars: ✭ 38 (-17.39%)

Mutual labels: text-classification

Keras Textclassification

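Chinese long-text classification, short-sentence classification, multi-label classification, and two-sentence similarity (Chinese Text Classification of Keras NLP, multi-label classify, or sentence classify, long or short); base classes for building character, word, and sentence embedding layers (embeddings) and network layers (graph); FastText, TextCNN, CharCNN, TextRNN, RCNN, DCNN, DPCNN, VDCNN, CRNN, Bert, Xlnet, Albert, Attention, DeepMoji, HAN, capsule network (CapsuleNet), Transformer-encode, Seq2seq, SWEM, LEAM, TextGCN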

Stars: ✭ 914 (+1886.96%)

Mutual labels: text-classification

Omnicat Bayes

Naive Bayes text classification implementation as an OmniCat classifier strategy. (#ruby #naivebayes)

Stars: ✭ 30 (-34.78%)

Mutual labels: text-classification

Tensorflow Sentiment Analysis On Amazon Reviews Data

Implementing different RNN models (LSTM,GRU) & Convolution models (Conv1D, Conv2D) on a subset of Amazon Reviews data with TensorFlow on Python 3. A sentiment analysis project.

Stars: ✭ 34 (-26.09%)

Mutual labels: text-classification

Kaggle Ndsb

Code for National Data Science Bowl. 10th place.

Stars: ✭ 45 (-2.17%)

Mutual labels: kaggle


BERT-toxicity-classification

This repo shows how to train a BERT model on Jigsaw Unintended Bias in Toxicity Classification
star me and I will keep updating the code
this repo is modified from Google's open-source code for BERT; thanks to Jon Mischo for the advice here

LB Score

2019-04-06: 0.91216
2019-04-07: 0.91455 (added a text cleaning method, reference here)

How to output predictions on the test data by fine-tuning the BERT model

prepare

download the pretrained model
download the data and unzip it into the input folder
split the train and dev data (for convenience I just typed this command; it is not recommended)

cat train.csv | tail -n 1000 > dev_1000.csv
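The command above takes the last 1000 lines of train.csv, so the dev file has no header row and its rows also stay in train.csv. A slightly more careful split could look like the sketch below. The function name `split_train_dev`, its parameters, and the output file names are made up for illustration, and whether the repo's csv handler expects a header row in the dev file is not stated, so `write_header` is left as an option.

```python
import csv
import random


def split_train_dev(train_path, train_out, dev_out,
                    dev_size=1000, seed=42, write_header=True):
    """Randomly move dev_size rows into a dev file; the rest stay in the train file.

    Hypothetical helper, not part of the repo. Loads the whole CSV into memory.
    """
    with open(train_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)   # first line holds the column names
        rows = list(reader)

    rng = random.Random(seed)   # fixed seed keeps the split reproducible
    rng.shuffle(rows)
    dev_rows, train_rows = rows[:dev_size], rows[dev_size:]

    for path, subset in ((train_out, train_rows), (dev_out, dev_rows)):
        with open(path, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            if write_header:
                writer.writerow(header)
            writer.writerows(subset)
    return len(train_rows), len(dev_rows)
```

Using the csv module instead of line-based tools also keeps comments that contain line breaks inside quoted fields intact. For the full training file the in-memory shuffle needs a fair amount of RAM.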

train model

run run_classifier.py

python run_classifier.py \
  --data_dir=input/ --vocab_file=uncased_L-12_H-768_A-12/vocab.txt \
  --bert_config_file=uncased_L-12_H-768_A-12/bert_config.json \
  --init_checkpoint=uncased_L-12_H-768_A-12/bert_model.ckpt \
  --task_name=toxic \
  --do_train=True \
  --do_eval=True \
  --do_predict=True \
  --output_dir=model_output/

the model will train for 10 epochs, but you can stop it earlier depending on your time
the checkpoints will be saved in model_output, along with the predictions on the test data (see model_output/test_result.tsv)

generate the submission

run encode.py
upload output/sub.csv to Kaggle
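What encode.py does is not shown here. As a rough sketch of this step, the code below pairs each test id with a predicted probability and writes a submission file. It assumes that test.csv has an `id` column, that the prediction file holds one tab-separated row of class probabilities per test example, that column 1 is the toxic class, and that the submission header is `id,prediction`. The function name `build_submission` is hypothetical.

```python
import csv


def build_submission(test_csv, result_tsv, out_csv, positive_col=1):
    """Write an id,prediction CSV from test ids and a TSV of class probabilities.

    Hypothetical helper; positive_col is an assumption about the TSV layout.
    """
    with open(test_csv, newline="", encoding="utf-8") as f:
        ids = [row["id"] for row in csv.DictReader(f)]

    with open(result_tsv, newline="", encoding="utf-8") as f:
        probs = [float(row[positive_col])
                 for row in csv.reader(f, delimiter="\t") if row]

    # Every test row must have exactly one prediction.
    if len(ids) != len(probs):
        raise ValueError("got %d ids but %d predictions" % (len(ids), len(probs)))

    with open(out_csv, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "prediction"])
        writer.writerows(zip(ids, probs))
    return len(ids)
```

A call such as `build_submission("input/test.csv", "model_output/test_result.tsv", "output/sub.csv")` would match the paths mentioned above, provided the output folder exists.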

What is different from the official code

add csv handler (line 243 in run_classifier.py)
add ToxicProcessor (line 264 in run_classifier.py)
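To give an idea of what such a processor might look like, here is a self-contained sketch. It is not the code from this repo. `InputExample` is a stand-in with the same four fields as the class in the official run_classifier.py, and the column names `comment_text` and `target`, the header row, and the 0.5 threshold for the toxic label are assumptions about the Jigsaw CSV files.

```python
import csv
import os
from collections import namedtuple

# Stand-in for InputExample from the official run_classifier.py.
InputExample = namedtuple("InputExample", ["guid", "text_a", "text_b", "label"])


class ToxicProcessor:
    """Illustrative processor for CSV files with a header row (not the repo's code)."""

    def get_labels(self):
        return ["0", "1"]

    def _read_csv(self, path):
        # csv.DictReader copes with commas and line breaks inside quoted comments.
        with open(path, newline="", encoding="utf-8") as f:
            return list(csv.DictReader(f))

    def get_train_examples(self, data_dir):
        rows = self._read_csv(os.path.join(data_dir, "train.csv"))
        return self._create_examples(rows, "train")

    def get_test_examples(self, data_dir):
        rows = self._read_csv(os.path.join(data_dir, "test.csv"))
        return self._create_examples(rows, "test")

    def _create_examples(self, rows, set_type):
        examples = []
        for i, row in enumerate(rows):
            # Assumed: target is a float in [0, 1]; rows without it get a dummy "0".
            target = row.get("target")
            is_toxic = target not in (None, "") and float(target) >= 0.5
            examples.append(InputExample(
                guid="%s-%d" % (set_type, i),
                text_a=row["comment_text"],
                text_b=None,
                label="1" if is_toxic else "0",
            ))
        return examples
```

A dev set could be read the same way, as long as the dev file also carries a header row.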

To do

text cleaning and OOV handling
CV (cross-validation)
average the predictions of different checkpoints
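The last item is not implemented in the repo yet. One simple way it could be done is an element-wise mean over the per-checkpoint probabilities, as in the sketch below. The function name `average_predictions` is made up, and each input list is assumed to hold one probability per test example in the same order.

```python
def average_predictions(prediction_lists):
    """Return the element-wise mean of several equally long lists of probabilities."""
    if not prediction_lists:
        raise ValueError("need at least one list of predictions")
    n = len(prediction_lists[0])
    if any(len(p) != n for p in prediction_lists):
        raise ValueError("all prediction lists must have the same length")
    # zip(*...) groups the i-th prediction of every checkpoint together.
    return [sum(vals) / len(prediction_lists) for vals in zip(*prediction_lists)]


# Two checkpoints, two test examples.
print(average_predictions([[0.0, 1.0], [1.0, 1.0]]))  # prints [0.5, 1.0]
```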

like this repo? you can buy me a cup of coffee


Stars: ✭ 46
