
ShaneTian / Textcnn

License: GPL-3.0
TextCNN implemented in TensorFlow 2.0.0 (mainly tf.keras).


Projects that are alternatives to or similar to Textcnn

Nlp In Practice
Starter code to solve real world text data problems. Includes: Gensim Word2Vec, phrase embeddings, Text Classification with Logistic Regression, word count with pyspark, simple text preprocessing, pre-trained embeddings and more.
Stars: ✭ 790 (+2035.14%)
Mutual labels:  text-classification
Text classification
All kinds of text classification models and more, built with deep learning.
Stars: ✭ 7,179 (+19302.7%)
Mutual labels:  text-classification
Nlp xiaojiang
Natural language processing (NLP) toolkit: XiaoJiang robot (retrieval-based chitchat chatbot), BERT sentence vectors and similarity, XLNet sentence vectors and similarity (text xlnet embedding), text classification, entity extraction (NER, BERT+BiLSTM+CRF), data augmentation (text augment, data enhance), synonym and paraphrase generation, sentence trunk extraction (mainpart), Chinese short-text similarity, text feature engineering, and a Keras HTTP service.
Stars: ✭ 954 (+2478.38%)
Mutual labels:  text-classification
Text Classification Benchmark
A text classification benchmark.
Stars: ✭ 18 (-51.35%)
Mutual labels:  text-classification
Bert language understanding
Pre-training of Deep Bidirectional Transformers for Language Understanding: pre-train TextCNN
Stars: ✭ 933 (+2421.62%)
Mutual labels:  text-classification
Keras Textclassification
Chinese text classification with Keras NLP: long-text and short-sentence classification, multi-label classification, and sentence-pair similarity; base classes for building embedding layers and network graphs; includes FastText, TextCNN, CharCNN, TextRNN, RCNN, DCNN, DPCNN, VDCNN, CRNN, BERT, XLNet, ALBERT, Attention, DeepMoji, HAN, CapsuleNet, Transformer-encoder, Seq2seq, SWEM, LEAM, and TextGCN.
Stars: ✭ 914 (+2370.27%)
Mutual labels:  text-classification
Tf Rnn Attention
Tensorflow implementation of attention mechanism for text classification tasks.
Stars: ✭ 735 (+1886.49%)
Mutual labels:  text-classification
Tensorflow Sentiment Analysis On Amazon Reviews Data
Implementing different RNN models (LSTM,GRU) & Convolution models (Conv1D, Conv2D) on a subset of Amazon Reviews data with TensorFlow on Python 3. A sentiment analysis project.
Stars: ✭ 34 (-8.11%)
Mutual labels:  text-classification
Nlp tensorflow project
NLP projects implemented with TensorFlow, e.g. classification, chatbot, NER, attention, QA, etc.
Stars: ✭ 27 (-27.03%)
Mutual labels:  text-classification
Omnicat Bayes
Naive Bayes text classification implementation as an OmniCat classifier strategy. (#ruby #naivebayes)
Stars: ✭ 30 (-18.92%)
Mutual labels:  text-classification
Text Mining
Text Mining in Python
Stars: ✭ 18 (-51.35%)
Mutual labels:  text-classification
Concise Ipython Notebooks For Deep Learning
Ipython Notebooks for solving problems like classification, segmentation, generation using latest Deep learning algorithms on different publicly available text and image data-sets.
Stars: ✭ 23 (-37.84%)
Mutual labels:  text-classification
Text gcn
Graph Convolutional Networks for Text Classification. AAAI 2019
Stars: ✭ 945 (+2454.05%)
Mutual labels:  text-classification
Chatbot cn
A chatbot for the finance and legal domains (with chitchat capability). Its main modules include information extraction, NLU, NLG, and a knowledge graph; the front end is integrated via Django, and RESTful interfaces for the NLP and KG modules are provided.
Stars: ✭ 791 (+2037.84%)
Mutual labels:  text-classification
Few Shot Text Classification
Few-shot binary text classification with Induction Networks and Word2Vec weights initialization
Stars: ✭ 32 (-13.51%)
Mutual labels:  text-classification
Lightnlp
A deep learning framework for natural language processing based on PyTorch and torchtext.
Stars: ✭ 739 (+1897.3%)
Mutual labels:  text-classification
Text2gender
Predict the author's gender from their text.
Stars: ✭ 14 (-62.16%)
Mutual labels:  text-classification
Nlp Experiments In Pytorch
PyTorch repository for text categorization and NER experiments in Turkish and English.
Stars: ✭ 35 (-5.41%)
Mutual labels:  text-classification
Easy Deep Learning With Allennlp
🔮Deep Learning for text made easy with AllenNLP
Stars: ✭ 32 (-13.51%)
Mutual labels:  text-classification
Cnn Question Classification Keras
Chinese Question Classifier (Keras Implementation) on BQuLD
Stars: ✭ 28 (-24.32%)
Mutual labels:  text-classification

TextCNN

TextCNN implemented in TensorFlow 2.0.0 (mainly tf.keras).

Software environments

  1. tensorflow-gpu 2.0.0-alpha0
  2. python 3.6.7
  3. pandas 0.24.2
  4. numpy 1.16.2

Data

  • Vocabulary size: 3407
  • Number of classes: 18
  • Train/Test split: 20351/2261

Model architecture

Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_data (InputLayer)         [(None, 128)]        0                                            
__________________________________________________________________________________________________
embedding (Embedding)           (None, 128, 512)     1744384     input_data[0][0]                 
__________________________________________________________________________________________________
add_channel (Reshape)           (None, 128, 512, 1)  0           embedding[0][0]                  
__________________________________________________________________________________________________
convolution_3 (Conv2D)          (None, 126, 1, 128)  196736      add_channel[0][0]                
__________________________________________________________________________________________________
convolution_4 (Conv2D)          (None, 125, 1, 128)  262272      add_channel[0][0]                
__________________________________________________________________________________________________
convolution_5 (Conv2D)          (None, 124, 1, 128)  327808      add_channel[0][0]                
__________________________________________________________________________________________________
max_pooling_3 (MaxPooling2D)    (None, 1, 1, 128)    0           convolution_3[0][0]              
__________________________________________________________________________________________________
max_pooling_4 (MaxPooling2D)    (None, 1, 1, 128)    0           convolution_4[0][0]              
__________________________________________________________________________________________________
max_pooling_5 (MaxPooling2D)    (None, 1, 1, 128)    0           convolution_5[0][0]              
__________________________________________________________________________________________________
concatenate (Concatenate)       (None, 1, 1, 384)    0           max_pooling_3[0][0]              
                                                                 max_pooling_4[0][0]              
                                                                 max_pooling_5[0][0]              
__________________________________________________________________________________________________
flatten (Flatten)               (None, 384)          0           concatenate[0][0]                
__________________________________________________________________________________________________
dropout (Dropout)               (None, 384)          0           flatten[0][0]                    
__________________________________________________________________________________________________
dense (Dense)                   (None, 18)           6930        dropout[0][0]                    
==================================================================================================
Total params: 2,538,130
Trainable params: 2,538,130
Non-trainable params: 0
__________________________________________________________________________________________________
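The Param # column in the summary above follows directly from the hyperparameters listed below (the vocabulary size of 3407 from the Data section, an embedding size of 512, filter sizes 3/4/5 with 128 filters each, and 18 classes). A quick arithmetic check in plain Python:

```python
# Reproduce the Param # column of the model summary above.
vocab_size, embed_size = 3407, 512
filter_sizes, num_filters = [3, 4, 5], 128
num_classes = 18

# Embedding: one 512-dim vector per vocabulary entry.
embedding = vocab_size * embed_size  # 1,744,384

# Each Conv2D kernel spans (k, embed_size) over 1 input channel, plus one
# bias per filter; pooling, concatenate, flatten and dropout are parameter-free.
convs = [k * embed_size * 1 * num_filters + num_filters for k in filter_sizes]

# The dense layer sees the 3 * 128 = 384 concatenated pooled features.
dense = len(filter_sizes) * num_filters * num_classes + num_classes  # 6,930

total = embedding + sum(convs) + dense
print(total)  # 2,538,130
```

The three conv terms come out to 196,736, 262,272, and 327,808, matching the summary line by line.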

Model parameters

  • Padding size: 128
  • Embedding size: 512
  • Num channel: 1
  • Filter size: [3, 4, 5]
  • Num filters: 128
  • Dropout rate: 0.5
  • Regularizers lambda: 0.01
  • Batch size: 64
  • Epochs: 10
  • Fraction validation: 0.05 (1018 samples)
  • Total parameters: 2,538,130

Run

Train result

Using 20,351 training samples, after 10 epochs:

Loss     Accuracy   Val loss   Val accuracy
0.1609   0.9683     0.3648     0.9185

Test result

Using 2,261 test samples:

Accuracy   Macro-Precision   Macro-Recall   Macro-F1
0.9363     0.9428            0.9310         0.9360

Images

  • Accuracy curve
  • Loss curve
  • Confusion matrix

Usage

usage: train.py [-h] [-t TEST_SAMPLE_PERCENTAGE] [-p PADDING_SIZE]
                [-e EMBED_SIZE] [-f FILTER_SIZES] [-n NUM_FILTERS]
                [-d DROPOUT_RATE] [-c NUM_CLASSES] [-l REGULARIZERS_LAMBDA]
                [-b BATCH_SIZE] [--epochs EPOCHS]
                [--fraction_validation FRACTION_VALIDATION]
                [--results_dir RESULTS_DIR]

This is the TextCNN train project.

optional arguments:
  -h, --help            show this help message and exit
  -t TEST_SAMPLE_PERCENTAGE, --test_sample_percentage TEST_SAMPLE_PERCENTAGE
                        The fraction of test data.(default=0.1)
  -p PADDING_SIZE, --padding_size PADDING_SIZE
                        Padding size of sentences.(default=128)
  -e EMBED_SIZE, --embed_size EMBED_SIZE
                        Word embedding size.(default=512)
  -f FILTER_SIZES, --filter_sizes FILTER_SIZES
                        Convolution kernel sizes.(default=3,4,5)
  -n NUM_FILTERS, --num_filters NUM_FILTERS
                        Number of each convolution kernel.(default=128)
  -d DROPOUT_RATE, --dropout_rate DROPOUT_RATE
                        Dropout rate in softmax layer.(default=0.5)
  -c NUM_CLASSES, --num_classes NUM_CLASSES
                        Number of target classes.(default=18)
  -l REGULARIZERS_LAMBDA, --regularizers_lambda REGULARIZERS_LAMBDA
                        L2 regulation parameter.(default=0.01)
  -b BATCH_SIZE, --batch_size BATCH_SIZE
                        Mini-Batch size.(default=64)
  --epochs EPOCHS       Number of epochs.(default=10)
  --fraction_validation FRACTION_VALIDATION
                        The fraction of validation.(default=0.05)
  --results_dir RESULTS_DIR
                        The results dir including log, model, vocabulary and
                        some images.(default=./results/)
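The train.py interface above is a standard argparse setup. A minimal sketch of how a subset of those flags could be declared (only a few flags shown; the full parser lives in the repository):

```python
import argparse

# Sketch of the train.py argument parser shown above (a subset of the flags).
parser = argparse.ArgumentParser(description="This is the TextCNN train project.")
parser.add_argument("-t", "--test_sample_percentage", type=float, default=0.1,
                    help="The fraction of test data.")
parser.add_argument("-p", "--padding_size", type=int, default=128,
                    help="Padding size of sentences.")
parser.add_argument("-f", "--filter_sizes", type=str, default="3,4,5",
                    help="Convolution kernel sizes, comma separated.")
parser.add_argument("-b", "--batch_size", type=int, default=64,
                    help="Mini-batch size.")
parser.add_argument("--epochs", type=int, default=10, help="Number of epochs.")

# Example invocation: override padding size and filter sizes.
args = parser.parse_args(["-p", "256", "-f", "2,3,4"])
filter_sizes = [int(k) for k in args.filter_sizes.split(",")]
print(args.padding_size, filter_sizes)  # 256 [2, 3, 4]
```

Note that --filter_sizes arrives as the string "3,4,5" and must be split into integers before building the model, as shown on the last lines.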
usage: test.py [-h] [-p PADDING_SIZE] [-c NUM_CLASSES] results_dir

This is the TextCNN test project.

positional arguments:
  results_dir           The results dir including log, model, vocabulary and
                        some images.

optional arguments:
  -h, --help            show this help message and exit
  -p PADDING_SIZE, --padding_size PADDING_SIZE
                        Padding size of sentences.(default=128)
  -c NUM_CLASSES, --num_classes NUM_CLASSES
                        Number of target classes.(default=18)

You need to know...

  1. You need to alter the load_data_and_write_to_file function in data_helper.py to match your data file;
  2. This code uses single-channel input; you could instead feed two embedding channels, one static and one dynamic, which may work better;
  3. The model is saved as an HDF5 file;
  4. TensorBoard is available.
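Point 1 above implies that your corpus must be tokenized and padded to the fixed input length (padding size 128) before training. A hypothetical sketch of that padding step, assuming pad id 0; the function name and signature are illustrative, not the repository's API:

```python
# Hypothetical preprocessing helper (not part of the repository): pad or
# truncate a list of token ids to padding_size, since the model expects a
# fixed-length (None, 128) input. Token id 0 is assumed to be the pad id.
def pad_sequence(token_ids, padding_size=128, pad_id=0):
    if len(token_ids) >= padding_size:
        return token_ids[:padding_size]       # truncate long sentences
    return token_ids + [pad_id] * (padding_size - len(token_ids))  # pad short ones

print(len(pad_sequence([5, 9, 2])))  # 128
```

Whatever replaces load_data_and_write_to_file must produce sequences of exactly this shape for every sample.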