zedom1 / XLNet_embbeding

Licence: MIT License
Using XLNet as an embedding layer in Keras

Programming Languages

python

Projects that are alternatives or similar to XLNet_embbeding

mandrake
Mandrake 🌿/👨‍🔬🦆 – Fast visualisation of the population structure of pathogens using Stochastic Cluster Embedding
Stars: ✭ 29 (-9.37%)
Mutual labels:  embedding
AnnA Anki neuronal Appendix
Using machine learning on your anki collection to enhance the scheduling via semantic clustering and semantic similarity
Stars: ✭ 39 (+21.88%)
Mutual labels:  embedding
RolX
An alternative implementation of Recursive Feature and Role Extraction (KDD11 & KDD12)
Stars: ✭ 52 (+62.5%)
Mutual labels:  embedding
tf retrieval baseline
A Tensorflow retrieval (space embedding) baseline. Metric learning baseline on CUB and Stanford Online Products.
Stars: ✭ 39 (+21.88%)
Mutual labels:  embedding
playing with vae
Comparing FC VAE / FCN VAE / PCA / UMAP on MNIST / FMNIST
Stars: ✭ 53 (+65.63%)
Mutual labels:  embedding
text-classification-cn
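Chinese text classification in practice, based on the Sogou news corpus, using traditional machine learning methods as well as pre-trained models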
Stars: ✭ 81 (+153.13%)
Mutual labels:  embedding
DREML
PyTorch implementation of Deep Randomized Ensembles for Metric Learning(ECCV2018)
Stars: ✭ 67 (+109.38%)
Mutual labels:  embedding
FSCNMF
An implementation of "Fusing Structure and Content via Non-negative Matrix Factorization for Embedding Information Networks".
Stars: ✭ 16 (-50%)
Mutual labels:  embedding
NLP-paper
🎨 🎨 NLP (natural language processing) tutorial 🎨🎨 https://dataxujing.github.io/NLP-paper/
Stars: ✭ 23 (-28.12%)
Mutual labels:  xlnet
walklets
A lightweight implementation of Walklets from "Don't Walk Skip! Online Learning of Multi-scale Network Embeddings" (ASONAM 2017).
Stars: ✭ 94 (+193.75%)
Mutual labels:  embedding
KGReasoning
Multi-Hop Logical Reasoning in Knowledge Graphs
Stars: ✭ 197 (+515.63%)
Mutual labels:  embedding
exembed
Go Embed experiments
Stars: ✭ 27 (-15.62%)
Mutual labels:  embedding
BERT-embedding
A simple wrapper class for extracting features(embedding) and comparing them using BERT in TensorFlow
Stars: ✭ 24 (-25%)
Mutual labels:  embedding
OpenANE
OpenANE: the first Open source framework specialized in Attributed Network Embedding. The related paper was accepted by Neurocomputing. https://doi.org/10.1016/j.neucom.2020.05.080
Stars: ✭ 39 (+21.88%)
Mutual labels:  embedding
pymde
Minimum-distortion embedding with PyTorch
Stars: ✭ 420 (+1212.5%)
Mutual labels:  embedding
event-embedding-multitask
*SEM 2018: Learning Distributed Event Representations with a Multi-Task Approach
Stars: ✭ 22 (-31.25%)
Mutual labels:  embedding
Text-Summarization
Abstractive and Extractive Text summarization using Transformers.
Stars: ✭ 38 (+18.75%)
Mutual labels:  xlnet
Bert-text-classification
This shows how to fine-tune the BERT language model and use PyTorch-transformers for text classification
Stars: ✭ 54 (+68.75%)
Mutual labels:  xlnet
Embedding
Summary of embedding model code and study notes
Stars: ✭ 25 (-21.87%)
Mutual labels:  embedding
nodebb-plugin-ns-embed
Embed media and rich content in posts: YouTube, Vimeo, Twitch and more.
Stars: ✭ 27 (-15.62%)
Mutual labels:  embedding

XLNet Embedding

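A Keras wrapper that uses XLNet as an embedding layer. The output of one or more XLNet layers can be extracted as features as needed, and a custom network (such as FastText) can be built on top.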
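To make the idea of "taking the output of one or more layers as features" concrete, here is a minimal sketch in plain Python. It does not use the repository's code or Keras: `select_layers` and `mean_pool` are hypothetical helpers, and the per-layer outputs are assumed to be already available as `(seq_len, hidden_size)` matrices. The chosen layers are concatenated along the feature axis, then averaged over tokens, which is the kind of pooling a FastText-style head applies before its classifier.

```python
from typing import List, Sequence

Matrix = List[List[float]]  # one layer's output: (seq_len, hidden_size)


def select_layers(layer_outputs: Sequence[Matrix], indices: Sequence[int]) -> Matrix:
    """Concatenate the chosen layers along the feature axis (hypothetical helper)."""
    if not indices:
        raise ValueError("indices must not be empty")
    chosen = [layer_outputs[i] for i in indices]
    seq_len = len(chosen[0])
    if any(len(layer) != seq_len for layer in chosen):
        raise ValueError("all layers must have the same sequence length")
    # For each token position, join that token's vectors from every chosen layer.
    return [[value for layer in chosen for value in layer[t]] for t in range(seq_len)]


def mean_pool(features: Matrix) -> List[float]:
    """Average token vectors into one sentence vector (FastText-style pooling)."""
    if not features:
        raise ValueError("features must not be empty")
    count = len(features)
    return [sum(column) / count for column in zip(*features)]
```

Negative indices work as in ordinary Python indexing, so `select_layers(outputs, [-1])` picks only the last layer.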

Usage:

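  1. Download an XLNet model: https://github.com/ymcui/Chinese-PreTrained-XLNet

  2. Download the code and extract the XLNet model into the code directory

  3. Prepare the training data and place it in the data directory

  4. Modify the configuration and the network

  5. Train / test / predict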
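Before training, it can help to confirm that steps 2 and 3 left the files where the code expects them. The sketch below is a hypothetical pre-flight check and is not part of the repository. The model directory name and the file names you pass in are assumptions; adjust them to whatever the downloaded archive actually contains.

```python
import os
from typing import List, Sequence


def missing_paths(
    code_dir: str,
    model_dir: str,
    model_files: Sequence[str],
    data_dir: str = "data",
) -> List[str]:
    """Return the expected paths that do not exist yet (hypothetical helper)."""
    expected = [os.path.join(code_dir, model_dir, name) for name in model_files]
    expected.append(os.path.join(code_dir, data_dir))
    return [path for path in expected if not os.path.exists(path)]
```

For example, `missing_paths(".", "xlnet_model", ["xlnet_config.json", "spiece.model"])` returns an empty list when everything is in place; the directory and file names in this call are illustrative only.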

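Code overview

The demo's default task is text classification; for other tasks, modify demo.py yourself.

Frequently modified functions:

get_config(): model and XLNet configuration

process_data(): text loading and preprocessing

create_model(): adds your own network structure after XLNet; the default is FastText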
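As an illustration of what the first two functions might look like, here is a minimal sketch. It is not the code in demo.py: every configuration key, default value, and the tab-separated `label<TAB>text` input format are assumptions made for the example.

```python
from typing import Dict, List, TextIO, Tuple


def get_config() -> Dict[str, object]:
    """Hypothetical configuration; all keys and values are illustrative."""
    return {
        "model_dir": "xlnet_model",  # assumed location of the extracted XLNet model
        "data_dir": "data",          # directory holding the training data
        "max_len": 64,               # maximum sequence length in tokens
        "batch_size": 16,
        "epochs": 3,
        "layer_indices": [-1],       # which XLNet layers to use as features
    }


def process_data(handle: TextIO) -> Tuple[List[str], List[int], Dict[str, int]]:
    """Read 'label<TAB>text' lines (assumed format) into texts and integer labels."""
    texts: List[str] = []
    labels: List[int] = []
    label_to_id: Dict[str, int] = {}
    for line_number, line in enumerate(handle, start=1):
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        label, separator, text = line.partition("\t")
        if not separator:
            raise ValueError(f"line {line_number}: expected 'label<TAB>text'")
        # Assign ids in order of first appearance.
        label_id = label_to_id.setdefault(label.strip(), len(label_to_id))
        texts.append(text.strip())
        labels.append(label_id)
    return texts, labels, label_to_id
```

Taking an open file handle rather than a path keeps the reader easy to test and leaves the choice of encoding to the caller.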

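Occasionally modified functions:

train(): trains the model; the optimizer, callbacks, and so on can be changed here

test(): loads the saved trained model and evaluates it using classification_report and accuracy_score; adapt this for other tasks

predict(): loads the model, runs prediction, and saves the results to a file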
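The evaluation and prediction steps can be sketched without any model at all. The helpers below are hypothetical stand-ins, not the repository's code: `accuracy` computes by hand the fraction of correct predictions (the quantity accuracy_score reports), and `save_predictions` writes one `label<TAB>text` line per input, which is an assumed output format.

```python
from typing import List, Sequence


def argmax_labels(probabilities: Sequence[Sequence[float]], id_to_label: Sequence[str]) -> List[str]:
    """Turn per-class scores into label names by taking the highest-scoring class."""
    return [id_to_label[max(range(len(row)), key=row.__getitem__)] for row in probabilities]


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of positions where the prediction equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def save_predictions(texts: Sequence[str], labels: Sequence[str], path: str) -> None:
    """Write one 'label<TAB>text' line per example (assumed output format)."""
    if len(texts) != len(labels):
        raise ValueError("texts and labels must have the same length")
    with open(path, "w", encoding="utf-8") as handle:
        for text, label in zip(texts, labels):
            handle.write(f"{label}\t{text}\n")
```

For tasks other than classification, only `argmax_labels` and the metric would need to change; the file-writing step stays the same.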

Functions not recommended for modification:

encode_data(): encodes the input

init(): initializes parameters
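For readers curious about what encoding the input typically involves, here is a rough sketch of the padding step only. It is a hypothetical helper, not the repository's encode_data(): it assumes the text has already been converted to token ids by a tokenizer, and the left-padding convention and the pad id of 0 are assumptions.

```python
from typing import List, Sequence, Tuple


def pad_and_mask(token_ids: Sequence[int], max_len: int, pad_id: int = 0) -> Tuple[List[int], List[int]]:
    """Truncate to max_len, left-pad with pad_id, and return (ids, mask).

    The mask holds 1 for real tokens and 0 for padding positions.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    ids = list(token_ids)[:max_len]       # keep at most max_len tokens
    padding = max_len - len(ids)          # number of pad positions needed
    return [pad_id] * padding + ids, [0] * padding + [1] * len(ids)
```

Fixed-length inputs like these are what allow examples of different lengths to be batched together.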

References / Acknowledgements

  1. Chinese-PreTrained-XLNet (ymcui) https://github.com/ymcui/Chinese-PreTrained-XLNet
  2. keras-xlnet (CyberZHG) https://github.com/CyberZHG/keras-xlnet
  3. Keras-TextClassification (yongzhuo) https://github.com/yongzhuo/Keras-TextClassification
  4. xlnet (zihangdai) https://github.com/zihangdai/xlnet