jiesutd / Latticelstm

Chinese NER using Lattice LSTM. Code for ACL 2018 paper.


Chinese NER Using Lattice LSTM

Lattice LSTM for Chinese NER: a character-based LSTM that takes lattice (word) embeddings as additional input.
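To make the lattice idea concrete, the sketch below finds every lexicon word that matches a span of characters in a sentence. It is a standalone illustration written for Python 3, not code from this repository; the function name `match_lattice_words`, the `max_word_len` limit, and the toy lexicon are all assumptions made for the example.

```python
def match_lattice_words(chars, lexicon, max_word_len=4):
    """Return (start, end, word) for each lexicon word found in the sentence.

    `end` is exclusive. Single characters are skipped because the
    character sequence itself already covers them.
    """
    matches = []
    n = len(chars)
    for start in range(n):
        # Try every multi-character span that begins at `start`.
        for end in range(start + 2, min(n, start + max_word_len) + 1):
            word = "".join(chars[start:end])
            if word in lexicon:
                matches.append((start, end, word))
    return matches


# Hypothetical toy lexicon; in practice it would come from the word embeddings.
lexicon = {"南京", "南京市", "市长", "长江", "长江大桥", "大桥"}
sentence = list("南京市长江大桥")
lattice = match_lattice_words(sentence, lexicon)
for start, end, word in lattice:
    print(start, end, word)
```

Each match corresponds to one word path in the lattice that runs alongside the character sequence. Overlapping matches such as 市长 and 长江 are all kept, so no word segmentation decision has to be made up front.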

Models and results are described in our ACL 2018 paper, Chinese NER Using Lattice LSTM. The model achieves a 93.18% F1 score on the MSRA dataset, the state-of-the-art result on the Chinese NER task as reported in the paper.
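For readers unfamiliar with the metric, here is a minimal sketch of an entity-level F1 computation. It assumes the usual exact-match definition, where a predicted entity counts only if its boundaries and type both match a gold entity; the evaluation code used for the paper may differ in details. The function name `entity_f1` and the span tuples are hypothetical.

```python
def entity_f1(gold_spans, pred_spans):
    """Return (precision, recall, f1) over exact-match entity spans."""
    gold, pred = set(gold_spans), set(pred_spans)
    correct = len(gold & pred)
    precision = correct / float(len(pred)) if pred else 0.0
    recall = correct / float(len(gold)) if gold else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Spans are (start, end_inclusive, type); the second prediction has a wrong boundary.
gold = [(0, 1, "LOC"), (3, 5, "PER")]
pred = [(0, 1, "LOC"), (3, 4, "PER")]
precision, recall, f1 = entity_f1(gold, pred)
print(precision, recall, f1)
```

When scoring a whole corpus, each span would also need a sentence index so that identical offsets in different sentences are not merged by the set.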

Details will be updated soon.

Requirements:

Python: 2.7
PyTorch: 0.3.0

(For PyTorch 0.3.1, please refer to issue #8 for a slight modification.)

Input format:

CoNLL format (BIOES tag scheme preferred), with one character and its label per line. Sentences are separated by a blank line.

美	B-LOC
国	E-LOC
的	O
华	B-PER
莱	I-PER
士	E-PER

我	O
跟	O
他	O
谈	O
笑	O
风	O
生	O 
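The format above can be read with a few lines of code. The sketch below groups lines into sentences and decodes BIOES labels into entity spans. It is a standalone Python 3 illustration, not the repository's data loader; `read_conll` and `bioes_spans` are hypothetical names, and the decoder assumes well-formed tags (it does not check that the type on an `E-` tag matches the opening `B-` tag).

```python
def read_conll(lines):
    """Group 'char<whitespace>label' lines into sentences split on blank lines."""
    sentences, chars, labels = [], [], []
    for line in lines:
        line = line.strip()
        if not line:
            # A blank line closes the current sentence, if any.
            if chars:
                sentences.append((chars, labels))
                chars, labels = [], []
            continue
        parts = line.split()
        chars.append(parts[0])
        labels.append(parts[-1])
    if chars:
        sentences.append((chars, labels))
    return sentences


def bioes_spans(labels):
    """Decode BIOES labels into (start, end_inclusive, type) spans."""
    spans, start = [], None
    for i, label in enumerate(labels):
        prefix, _, etype = label.partition("-")
        if prefix == "S":
            spans.append((i, i, etype))
            start = None
        elif prefix == "B":
            start = i
        elif prefix == "E" and start is not None:
            spans.append((start, i, etype))
            start = None
        elif prefix == "O":
            start = None
    return spans


sample = "美\tB-LOC\n国\tE-LOC\n的\tO\n华\tB-PER\n莱\tI-PER\n士\tE-PER\n\n我\tO\n跟\tO\n他\tO\n"
sentences = read_conll(sample.splitlines())
for chars, labels in sentences:
    print("".join(chars), bioes_spans(labels))
```

For the first sentence this yields a LOC span over 美国 and a PER span over 华莱士; the second sentence contains no entities.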

Pretrained Embeddings:

The pretrained character and word embeddings are the same as those used in the baseline of RichWordSegmentor.

Character embeddings (gigaword_chn.all.a2b.uni.ite50.vec): Google Drive or Baidu Pan

Word(Lattice) embeddings (ctb.50d.vec): Google Drive or Baidu Pan
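As a rough guide to working with these files, the sketch below parses a whitespace-separated text embedding format, one token per line followed by its vector. This layout is an assumption (it is the common word2vec-style text format), as is the dimension of 50 suggested by the file names; `load_text_embeddings` is a hypothetical helper, not the repository's loader.

```python
def load_text_embeddings(lines, dim=50):
    """Parse 'token v1 v2 ... v_dim' lines into a dict of float lists."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split()
        if len(parts) != dim + 1:
            # Skips blank lines and a possible 'vocab_size dim' header line.
            continue
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


# Tiny in-memory example with 3-dimensional vectors and a header line.
demo_lines = ["2 3", "中 0.1 0.2 0.3", "国 0.4 0.5 0.6"]
demo = load_text_embeddings(demo_lines, dim=3)
print(sorted(demo.keys()))

# With a real file (path assumed from the run instructions):
# import io
# with io.open("data/ctb.50d.vec", encoding="utf-8") as f:
#     word_vectors = load_text_embeddings(f, dim=50)
```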

How to run the code?

  1. Download the character embeddings and word embeddings and put them in the data folder.
  2. Modify run_main.py or run_demo.py to point to your train/dev/test files.
  3. Run sh run_main.py or sh run_demo.py.

Resume NER data

Crawled from Sina Finance, the dataset includes the resumes of senior executives from companies listed on the Chinese stock market. Details can be found in our paper.

Cite:

Please cite our ACL 2018 paper:

@inproceedings{zhang2018chinese,
 title={Chinese NER Using Lattice LSTM},
 author={Yue Zhang and Jie Yang},
 booktitle={Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL)},
 year={2018}
}