
Jverma / cnn-text-classification-keras

Licence: other

Convolutional Neural Network for Text Classification in Keras


Labels

machine-learning deep-learning text-classification neural-networks convolutional-neural-networks

Projects that are alternatives of or similar to cnn-text-classification-keras

medical-diagnosis-cnn-rnn-rcnn

Uses RNN, CNN, and RCNN models respectively to diagnose diseases from patient descriptions

Stars: ✭ 39 (+178.57%)

Mutual labels: text-classification

support-tickets-classification

This case study shows how to create a model for text analysis and classification and deploy it as a web service in Azure cloud in order to automatically classify support tickets. This project is a proof of concept made by Microsoft (Commercial Software Engineering team) in collaboration with Endava http://endava.com/en

Stars: ✭ 142 (+914.29%)

Mutual labels: text-classification

Nodejs binding for fasttext representation and classification.

Stars: ✭ 39 (+178.57%)

Mutual labels: text-classification

text-classification-svm

The missing SVM-based text classification module implementing HanLP's interface

Stars: ✭ 46 (+228.57%)

Mutual labels: text-classification

fake-news-detection

This repo is a collection of AWESOME things about fake news detection, including papers, code, etc.

Stars: ✭ 34 (+142.86%)

Mutual labels: text-classification

BERT, LDA, and TFIDF based keyword extraction in Python

Stars: ✭ 33 (+135.71%)

Mutual labels: text-classification

synaptic-simple-trainer

A ready to go text classification trainer based on synaptic (https://github.com/cazala/synaptic)

Stars: ✭ 19 (+35.71%)

Mutual labels: text-classification

TextUnderstandingTsetlinMachine

Using the Tsetlin Machine to learn human-interpretable rules for high-accuracy text categorization with medical applications

Stars: ✭ 48 (+242.86%)

Mutual labels: text-classification

Filipino-Text-Benchmarks

Open-source benchmark datasets and pretrained transformer models in the Filipino language.

Stars: ✭ 22 (+57.14%)

Mutual labels: text-classification

Kaggle-Twitter-Sentiment-Analysis

Kaggle Twitter Sentiment Analysis Competition

Stars: ✭ 18 (+28.57%)

Mutual labels: text-classification

DaDengAndHisPython

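[WeChat official account: 大邓和他的python (Da Deng and His Python)] Quick introduction to Python syntax: https://www.bilibili.com/video/av44384851; quick introduction to Python web scraping: https://www.bilibili.com/video/av72010301; contact email: [email protected]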

Stars: ✭ 59 (+321.43%)

Mutual labels: text-classification

Evidence-based Explanation Dataset (AACL-IJCNLP 2020)

Stars: ✭ 16 (+14.29%)

Mutual labels: text-classification

OpenTC is a text classification engine using several algorithms in machine learning

Stars: ✭ 27 (+92.86%)

Mutual labels: text-classification

Binary-Text-Classification-Doc2vec-SVM

A Python implementation of a binary text classifier using Doc2Vec and SVM

Stars: ✭ 16 (+14.29%)

Mutual labels: text-classification

Code for paper "Hierarchical Text Classification with Reinforced Label Assignment" EMNLP 2019

Stars: ✭ 116 (+728.57%)

Mutual labels: text-classification

Multi-class text categorization using state-of-the-art pre-trained contextualized language models, e.g. BERT

Stars: ✭ 15 (+7.14%)

Mutual labels: text-classification

policy-data-analyzer

Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.

Stars: ✭ 22 (+57.14%)

Mutual labels: text-classification

Lbl2Vec learns jointly embedded label, document and word vectors to retrieve documents with predefined topics from an unlabeled document corpus.

Stars: ✭ 25 (+78.57%)

Mutual labels: text-classification

[CIKM 2018] Weakly-Supervised Neural Text Classification

Stars: ✭ 67 (+378.57%)

Mutual labels: text-classification

Machine Learning (EE 5184) in NTU

Stars: ✭ 66 (+371.43%)

Mutual labels: text-classification


cnn-text-classification-keras


This is a Keras implementation of Yoon Kim's paper Convolutional Neural Networks for Sentence Classification, extended so that the code also works with GloVe and fastText vectors.

Requirements:

numpy
keras
cPickle (part of the Python 2 standard library)

Usage:

Download the pre-trained Google word2vec word embedding vectors as a binary file from here
Pre-process the text data

from text_processing_util import TextProcessing

tp = TextProcessing(texts, labels, EMBEDDING_DIM, MAX_SEQUENCE_LENGTH, MAX_NB_WORDS, VALIDATION_SPLIT)

where

- texts: a list of sentences.
- labels: a list of labels corresponding to the sentences in the list texts.
- MAX_SEQUENCE_LENGTH: maximum sentence length to consider; longer sentences are truncated to this length (default is 50).
- MAX_NB_WORDS: maximum number of words to be used in the model (default is 10000).
- EMBEDDING_DIM: dimension of the word vectors (default is 300 for google word2vec).
- VALIDATION_SPLIT: fraction of data to be used for validation (default is 0.2).

Split into training and validation data.

x_train, y_train, x_val, y_val, word_index = tp.preprocess()
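To make the parameters above concrete, here is a minimal sketch of the kind of preprocessing they describe: cap the vocabulary at MAX_NB_WORDS, truncate or pad each sentence to MAX_SEQUENCE_LENGTH, and hold out a VALIDATION_SPLIT fraction. This is not the code in text_processing_util. The function names are hypothetical, and whitespace tokenization, right-padding with zeros, and an unshuffled split are all assumptions made for illustration.

```python
from collections import Counter


def build_word_index(texts, max_nb_words):
    """Map the most frequent words to integer ids starting at 1 (0 is padding)."""
    counts = Counter(word for text in texts for word in text.lower().split())
    most_common = counts.most_common(max_nb_words)
    return {word: i + 1 for i, (word, _) in enumerate(most_common)}


def to_padded_sequence(text, word_index, max_sequence_length):
    """Convert a sentence to ids, truncate it, then right-pad with zeros."""
    ids = [word_index[w] for w in text.lower().split() if w in word_index]
    ids = ids[:max_sequence_length]
    return ids + [0] * (max_sequence_length - len(ids))


def split_train_val(xs, ys, validation_split):
    """Hold out the last fraction of the data for validation (no shuffling)."""
    n_val = int(len(xs) * validation_split)
    cut = len(xs) - n_val
    return xs[:cut], ys[:cut], xs[cut:], ys[cut:]


# Toy data standing in for real sentences and labels.
texts = ["a good movie", "a bad movie", "good good fun", "bad plot", "fun movie"]
labels = [1, 0, 1, 0, 1]

word_index = build_word_index(texts, max_nb_words=10000)
x = [to_padded_sequence(t, word_index, max_sequence_length=4) for t in texts]
x_train, y_train, x_val, y_val = split_train_val(x, labels, validation_split=0.2)
```

With five toy sentences and a split of 0.2, four rows go to training and one to validation, and every row has exactly four ids.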

Build the embeddings index.

embeddings_index = tp.build_embedding_index_from_word2vec(path_to_wordvec_file, word_index)
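An index of pre-trained vectors like this is commonly turned into an embedding matrix in which row i holds the vector for the word with id i. The sketch below assumes embeddings_index behaves like a dict from word to a sequence of floats, which may differ from what build_embedding_index_from_word2vec actually returns; build_embedding_matrix is a hypothetical helper, not part of this project.

```python
def build_embedding_matrix(word_index, embeddings_index, embedding_dim):
    """Return a list of rows; row i is the vector for the word with id i.

    Row 0 (padding) and words missing from embeddings_index stay all-zero.
    """
    matrix = [[0.0] * embedding_dim for _ in range(len(word_index) + 1)]
    for word, i in word_index.items():
        vector = embeddings_index.get(word)
        if vector is not None:
            matrix[i] = list(vector)
    return matrix


# Toy 3-dimensional vectors standing in for 300-dimensional word2vec vectors.
toy_word_index = {"good": 1, "movie": 2, "zxqv": 3}
toy_embeddings_index = {"good": [0.1, 0.2, 0.3], "movie": [0.4, 0.5, 0.6]}

embedding_matrix = build_embedding_matrix(toy_word_index, toy_embeddings_index, 3)
```

Leaving out-of-vocabulary rows at zero is only one option; random initialization is another common choice.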

Serialize the data after the processing.

import cPickle  # Python 2 standard library; use pickle on Python 3

with open('tokenization_and_embedding.p', 'wb') as f:
    cPickle.dump([word_index, embeddings_index], f)
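The serialized file can be read back in a later session so that tokenization and embedding lookup do not have to be repeated. The following round-trip sketch uses Python 3's pickle module, toy data, and a temporary directory; all of these are assumptions for illustration rather than part of the project.

```python
import os
import pickle
import tempfile

# Toy stand-ins for the real word_index and embeddings_index.
word_index = {"good": 1, "movie": 2}
embeddings_index = {"good": [0.1, 0.2], "movie": [0.3, 0.4]}

path = os.path.join(tempfile.mkdtemp(), "tokenization_and_embedding.p")

# Write both objects as a single list, mirroring the dump call above.
with open(path, "wb") as f:
    pickle.dump([word_index, embeddings_index], f)

# Read them back in the same order they were written.
with open(path, "rb") as f:
    loaded_word_index, loaded_embeddings_index = pickle.load(f)
```

Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.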

Get labels index.

labels_index = tp.labels_index

Build the CNN model

from text_cnn import kimCNN

model = kimCNN(EMBEDDING_DIM, MAX_SEQUENCE_LENGTH, MAX_NB_WORDS, embeddings_index, word_index, labels_index=labels_index)
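For readers unfamiliar with the architecture in Kim's paper, the core operation is a convolution over windows of consecutive word vectors followed by max-over-time pooling, with one pooled value per filter. The sketch below illustrates that operation in plain Python on toy data. It says nothing about how kimCNN in text_cnn is implemented; the function names, filter weights, and 2-dimensional embeddings are hypothetical.

```python
def conv_feature_map(sentence, filt, bias=0.0):
    """Slide a filter spanning len(filt) words over the sentence.

    sentence: list of word vectors; filt: list of h weight vectors.
    Returns one ReLU-activated feature per window position.
    """
    h = len(filt)
    features = []
    for start in range(len(sentence) - h + 1):
        window = sentence[start:start + h]
        total = bias + sum(
            w * x
            for w_vec, x_vec in zip(filt, window)
            for w, x in zip(w_vec, x_vec)
        )
        features.append(max(0.0, total))  # ReLU
    return features


def sentence_features(sentence, filters):
    """Max-over-time pooling: keep the largest feature from each filter.

    Assumes the sentence is at least as long as the widest filter.
    """
    return [max(conv_feature_map(sentence, f)) for f in filters]


# A 5-word sentence with 2-dimensional toy embeddings.
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [2.0, 0.0]]
filters = [
    [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]],              # width 3
    [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]],  # width 4
]
pooled = sentence_features(sentence, filters)  # [3.0, 2.0]
```

In the paper, the pooled values from filters of several widths are concatenated, passed through dropout, and fed to a softmax layer that produces the class probabilities.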

Fit the model

model.fit(x=x_train, y=y_train, batch_size=50, epochs=25, validation_data=(x_val, y_val))

For a detailed example, see example.py. It uses the same example as Kim's paper and the original Theano code.



Stars: ✭ 14
