bamtercelboo / pytorch_Joint-Word-Segmentation-and-POS-Tagging

Licence: Apache-2.0 license
Paper: A Simple and Effective Neural Model for Joint Word Segmentation and POS Tagging

JointPS

A PyTorch re-implementation of the paper "A Simple and Effective Neural Model for Joint Word Segmentation and POS Tagging".

The C++ code is available at [zhangmeishan/NNTranJSTagger].

The PyTorch 0.3.1 version of the code is released here: [PyTorch-0.3.1].

Requirements

pip3 install -r requirements.txt
Python  == 3.6  
PyTorch == 1.0.1

Usage

Modify the config file; see the Config directory for details.
Train:
(1) sh run_train_p.sh
(2) python -u main.py --config ./Config/config.cfg --device cuda:0 --train -p
    [device: "cpu", "cuda:0", "cuda:1", ......]
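
The flags in the training command can be illustrated with a small argparse sketch. This is a hypothetical reconstruction, not the actual contents of main.py: `build_parser` and `device_type` are names invented here, and the meaning of `-p` is not documented in this README, so it is modelled as a plain boolean flag. The sketch mainly shows that `--device` takes one of the listed device strings and that `--train` must be passed as a separate argument.

```python
import argparse
import re

# Hypothetical re-creation of the command-line interface; the real main.py may differ.
DEVICE_PATTERN = re.compile(r"^(cpu|cuda:\d+)$")


def device_type(value: str) -> str:
    """Accept "cpu" or "cuda:<index>", the forms listed in the usage notes."""
    if not DEVICE_PATTERN.match(value):
        raise argparse.ArgumentTypeError(f"invalid device: {value!r}")
    return value


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="JointPS command line (sketch)")
    parser.add_argument("--config", default="./Config/config.cfg",
                        help="path to the config file")
    parser.add_argument("--device", type=device_type, default="cpu",
                        help='"cpu", "cuda:0", "cuda:1", ...')
    parser.add_argument("--train", action="store_true", help="run training")
    # The purpose of -p is an unknown here; it is treated as an on/off switch.
    parser.add_argument("-p", action="store_true", dest="p",
                        help="flag used by the training command")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--config", "./Config/config.cfg", "--device", "cuda:0", "--train", "-p"]
    )
    print(args.config, args.device, args.train, args.p)
```

Under this sketch, a device value such as `cuda:0--train` (no space before `--train`) is rejected by `device_type` instead of being silently accepted as a device name.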

Config

optimizer: Adam
lr: 0.001
dropout: 0.25
embed_char_dim: 200
embed_bichar_dim: 200
rnn_dim: 200
rnn_hidden_dim: 200
pos_dim: 100
oov: avg 
Refer to the config.cfg file for more details.
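
As a rough illustration of how these settings could be read from an INI-style file, here is a minimal sketch using Python's standard `configparser`. The section names `[Model]` and `[Optimizer]`, the sample text, and the helper `load_hyperparams` are assumptions made for this example; the actual layout of config.cfg may differ.

```python
import configparser

# Hypothetical layout: the section names are assumptions, not taken from config.cfg.
SAMPLE_CFG = """
[Model]
embed_char_dim = 200
embed_bichar_dim = 200
rnn_dim = 200
rnn_hidden_dim = 200
pos_dim = 100
dropout = 0.25
oov = avg

[Optimizer]
optimizer = Adam
lr = 0.001
"""


def load_hyperparams(text: str) -> dict:
    """Parse the config text and convert each value to its natural type."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    model = parser["Model"]
    optim = parser["Optimizer"]
    return {
        "optimizer": optim.get("optimizer"),
        "lr": optim.getfloat("lr"),
        "dropout": model.getfloat("dropout"),
        "embed_char_dim": model.getint("embed_char_dim"),
        "embed_bichar_dim": model.getint("embed_bichar_dim"),
        "rnn_dim": model.getint("rnn_dim"),
        "rnn_hidden_dim": model.getint("rnn_hidden_dim"),
        "pos_dim": model.getint("pos_dim"),
        "oov": model.get("oov"),
    }


if __name__ == "__main__":
    for key, value in load_hyperparams(SAMPLE_CFG).items():
        print(f"{key}: {value}")
```

Converting values at load time (`getint`, `getfloat`) keeps type errors close to the config file rather than deep inside model construction.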

Network Structure

Performance

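| Model | CTB5 SEG | CTB5 POS | CTB6 SEG | CTB6 POS | CTB7 SEG | CTB7 POS | PKU SEG | PKU POS | NCC SEG | NCC POS |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Our Model (No External Embeddings) | 97.69 | 94.16 | 95.37 | 90.83 | 95.32 | 90.25 | 95.22 | 92.62 | 93.97 | 89.47 |
| Our Model (Basic Embeddings) | 97.93 | 94.44 | 95.78 | 91.79 | 95.77 | 91.12 | 95.82 | 93.42 | 94.52 | 89.82 |
| Our Model (Word-context Embeddings) | 98.50 | 94.95 | 96.36 | 92.51 | 96.25 | 91.87 | 96.35 | 94.14 | 95.30 | 90.42 |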
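
The README does not say how the SEG and POS scores are computed. The sketch below assumes the common convention for this task: F1 over exact word spans for SEG, and F1 over word spans with a matching tag for POS. The function names and the toy sentence are made up for illustration and are not part of this repository's evaluation code.

```python
from typing import List, Set, Tuple

Word = Tuple[str, str]         # (surface form, POS tag)
Span = Tuple[int, int, str]    # (start offset, end offset, POS tag)


def to_spans(words: List[Word]) -> List[Span]:
    """Convert a tagged word sequence into character-offset spans."""
    spans = []
    start = 0
    for form, tag in words:
        end = start + len(form)
        spans.append((start, end, tag))
        start = end
    return spans


def f1(gold: Set, pred: Set) -> float:
    """Harmonic mean of precision and recall over two sets of items."""
    correct = len(gold & pred)
    if correct == 0:
        return 0.0
    precision = correct / len(pred)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


def seg_pos_f1(gold_words: List[Word], pred_words: List[Word]) -> Tuple[float, float]:
    """Return (SEG F1, POS F1) under the assumed span-matching convention."""
    gold = to_spans(gold_words)
    pred = to_spans(pred_words)
    # SEG ignores the tag; POS requires both the span and the tag to match.
    seg = f1({(s, e) for s, e, _ in gold}, {(s, e) for s, e, _ in pred})
    pos = f1(set(gold), set(pred))
    return seg, pos


if __name__ == "__main__":
    gold = [("我", "PN"), ("喜欢", "VV"), ("北京", "NR")]
    pred = [("我", "PN"), ("喜欢", "NN"), ("北", "NR"), ("京", "NR")]
    seg, pos = seg_pos_f1(gold, pred)
    print(f"SEG F1 = {seg:.4f}, POS F1 = {pos:.4f}")
```

Under this convention a word can only count as correctly tagged if it is also correctly segmented, so the POS score can never exceed the SEG score, which matches the pattern in the table.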

Cite

@Article{zhang2018jointposseg,  
  author    = {Zhang, Meishan and Yu, Nan and Fu, Guohong},  
  title     = {A Simple and Effective Neural Model for Joint Word Segmentation and POS Tagging},  
  journal   = {IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)},  
  year      = {2018},  
  volume    = {26},  
  number    = {9},
  pages     = {1528--1538},
  publisher = {IEEE Press},
}

Question

  • If you have any questions, you can open an issue or email bamtercelboo@{gmail.com, 163.com}.

  • If you have any good suggestions, you can open a PR or email me.
