bigboNed3 / Chinese_ulmfit

Chinese ULMFiT for sentiment analysis and text classification

Chinese ULMFiT

Universal Language Model Fine-tuning for Text Classification

Download the pretrained model

Create the virtual environment (optionally configure the Tsinghua conda mirror first)

conda env create -f env.yml

Extract the Chinese Wikipedia corpus

python -m gensim.scripts.segment_wiki -i -f /data/zhwiki-latest-pages-articles.xml.bz2 -o tmp/wiki2018-11-14.json.gz
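To sanity-check the extracted dump before segmenting it, a small reader like the one below may help. It is a minimal sketch that assumes the output follows gensim's documented `segment_wiki` format: one JSON object per line, with `title`, `section_titles`, and `section_texts` keys (the `-i` flag should additionally add an `interlinks` key). The helper name `iter_articles` is hypothetical and not part of this repository.

```python
import gzip
import json


def iter_articles(path):
    """Yield (title, text) pairs from a segment_wiki JSON-lines dump.

    Assumes one JSON object per line with 'title' and 'section_texts' keys.
    """
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            article = json.loads(line)
            # Join the per-section strings into a single article body.
            text = "\n".join(article["section_texts"])
            yield article["title"], text
```

For example, `next(iter_articles("tmp/wiki2018-11-14.json.gz"))` would return the first article's title and text without loading the whole file into memory.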

Segment the Wikipedia corpus into words

python preprocessing.py segment-wiki --input_file=tmp/wiki2018-11-14.json.gz --output_file=tmp/wiki2018-11-14.words.pkl
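Chinese text has no spaces between words, so this step has to split each sentence into words before a vocabulary can be built. This README does not say which segmenter `preprocessing.py` uses. The sketch below only illustrates the idea with a swapped-in technique, forward maximum matching over a toy dictionary. The function name and the dictionary are hypothetical.

```python
def forward_max_match(text, vocab, max_len=4):
    """Greedy left-to-right segmentation (illustrative, not the repo's method)."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                words.append(piece)
                i += size
                break
    return words


# Toy dictionary, assumed purely for illustration.
toy_vocab = {"汽车", "发动机", "很", "安静"}
segmented = forward_max_match("汽车发动机很安静", toy_vocab)
print(segmented)  # ['汽车', '发动机', '很', '安静']
```

A production segmenter handles ambiguity and unknown words far better than this greedy rule, which is why a dedicated library is normally used for this step.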

Segment the domain corpus into words

python preprocessing.py segment-csv --input_file=data/ch_auto.csv --output_file=tmp/ch_auto.words.pkl --label_file=tmp/ch_auto.labels.npy

Tokenize the Wikipedia corpus

python preprocessing.py tokenize --input_file=tmp/wiki2018-11-14.words.pkl --output_file=tmp/wiki2018-11-14.ids.npy --mapping_file=tmp/wiki2018-11-14.mapping.pkl

Tokenize the domain corpus

python preprocessing.py tokenize --input_file=tmp/ch_auto.words.pkl --output_file=tmp/ch_auto.ids.npy --mapping_file=tmp/ch_auto.mapping.pkl
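Judging by the file names, the two `tokenize` commands turn segmented words into integer IDs (`*.ids.npy`) and save the word-to-ID mapping (`*.mapping.pkl`). The following is a minimal sketch of that idea, not the repository's code. The special tokens, the `max_vocab` and `min_freq` defaults, and both function names are assumptions.

```python
from collections import Counter

UNK, PAD = "_unk_", "_pad_"  # hypothetical special tokens


def build_mapping(docs, max_vocab=60000, min_freq=2):
    """Build index-to-word and word-to-index tables from segmented documents."""
    counts = Counter(word for doc in docs for word in doc)
    # Keep the most frequent words that occur at least min_freq times.
    itos = [UNK, PAD] + [w for w, c in counts.most_common(max_vocab) if c >= min_freq]
    stoi = {w: i for i, w in enumerate(itos)}
    return itos, stoi


def numericalize(docs, stoi):
    """Replace every word with its ID; unseen or rare words map to UNK."""
    unk_id = stoi[UNK]
    return [[stoi.get(w, unk_id) for w in doc] for doc in docs]


docs = [["汽车", "很", "安静"], ["汽车", "很", "省油"]]
itos, stoi = build_mapping(docs)
ids = numericalize(docs, stoi)
print(itos)  # ['_unk_', '_pad_', '汽车', '很']
print(ids)   # [[2, 3, 0], [2, 3, 0]]
```

Words below the frequency cutoff share the single unknown ID, which keeps the vocabulary and therefore the embedding matrix at a manageable size.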

Pretraining

python pretraining.py --input_file=tmp/wiki2018-11-14.ids.npy --mapping_file=tmp/wiki2018-11-14.mapping.pkl --dir_path=tmp

Fine-tuning

python finetuning.py --input_file=tmp/ch_auto.ids.npy --mapping_file=tmp/ch_auto.mapping.pkl --pretrain_model_file=tmp/models/wiki2018-11-14.h5 --pretrain_mapping_file=tmp/wiki2018-11-14.mapping.pkl --dir_path=tmp --model_id=ch_auto
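The fine-tuning command takes both the domain mapping and the pretraining mapping. One plausible reason, and an assumption here rather than something this README states, is that the pretrained embedding rows must be realigned to the domain vocabulary, as in the original ULMFiT recipe: words shared by both vocabularies reuse their pretrained vector, and words new to the domain start from the mean pretrained vector. A minimal pure-Python sketch follows; the function and variable names are hypothetical.

```python
def remap_embeddings(pretrained_rows, pretrained_itos, target_itos):
    """Reorder pretrained embedding rows to follow the target vocabulary."""
    pretrained_index = {w: i for i, w in enumerate(pretrained_itos)}
    dim = len(pretrained_rows[0])
    # Mean embedding, used for words the pretrained model never saw.
    mean_row = [
        sum(row[d] for row in pretrained_rows) / len(pretrained_rows)
        for d in range(dim)
    ]
    remapped = []
    for word in target_itos:
        idx = pretrained_index.get(word)
        if idx is not None:
            remapped.append(list(pretrained_rows[idx]))  # copy the known vector
        else:
            remapped.append(list(mean_row))  # unseen word: start from the mean
    return remapped


rows = [[0.0, 0.0], [2.0, 4.0]]
new_rows = remap_embeddings(rows, ["a", "b"], ["b", "c"])
print(new_rows)  # [[2.0, 4.0], [1.0, 2.0]]
```

In a real model the same realignment would be applied to the embedding weight matrix (and to a tied decoder, if there is one) before training on the domain corpus.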

Train the classifier

python3 train_classifier.py --id_file=tmp/ch_auto.ids.npy --label_file=tmp/ch_auto.labels.npy --mapping_file=tmp/ch_auto.mapping.pkl --encoder_file=ch_auto_enc

Testing

python3 predicting.py --mapping_file=tmp/ch_auto.mapping.pkl --classifier_filename=tmp/models/classifier_1.h5 --num_class=2