
yaleimeng / NER_corpus_chinese

Licence: MIT License
Chinese corpora for NER (named entity recognition), all in one place

Projects that are alternatives of or similar to NER corpus chinese

SynLSTM-for-NER
Code and models for the paper titled "Better Feature Integration for Named Entity Recognition", NAACL 2021.
Stars: ✭ 26 (-74.51%)
Mutual labels:  named-entity-recognition, ner
CrossNER
CrossNER: Evaluating Cross-Domain Named Entity Recognition (AAAI-2021)
Stars: ✭ 87 (-14.71%)
Mutual labels:  named-entity-recognition, ner
scikitcrf NER
Python library for custom entity recognition using Sklearn CRF
Stars: ✭ 17 (-83.33%)
Mutual labels:  named-entity-recognition, ner
presidio-research
This package features data-science related tasks for developing new recognizers for Presidio. It is used for the evaluation of the entire system, as well as for evaluating specific PII recognizers or PII detection models.
Stars: ✭ 62 (-39.22%)
Mutual labels:  named-entity-recognition, ner
mitie-ruby
Named-entity recognition for Ruby
Stars: ✭ 77 (-24.51%)
Mutual labels:  named-entity-recognition, ner
NER-and-Linking-of-Ancient-and-Historic-Places
An NER tool for ancient place names based on Pleiades and Spacy.
Stars: ✭ 26 (-74.51%)
Mutual labels:  named-entity-recognition, ner
simple NER
simple rule based named entity recognition
Stars: ✭ 29 (-71.57%)
Mutual labels:  named-entity-recognition, ner
Ner Bert Pytorch
PyTorch solution of named entity recognition task Using Google AI's pre-trained BERT model.
Stars: ✭ 249 (+144.12%)
Mutual labels:  named-entity-recognition, ner
ner-d
Python module for Named Entity Recognition (NER) using natural language processing.
Stars: ✭ 14 (-86.27%)
Mutual labels:  named-entity-recognition, ner
korean ner tagging challenge
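KU_NERDY by 이동엽 and 임희석 (gold prize, 2017 Korean Language Information Processing System Competition), Conference on Hangul and Korean Language Information Processing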
Stars: ✭ 30 (-70.59%)
Mutual labels:  named-entity-recognition, ner
neural name tagging
Code for "Reliability-aware Dynamic Feature Composition for Name Tagging" (ACL2019)
Stars: ✭ 39 (-61.76%)
Mutual labels:  named-entity-recognition, ner
deep-atrous-ner
Deep-Atrous-CNN-NER: Word level model for Named Entity Recognition
Stars: ✭ 35 (-65.69%)
Mutual labels:  named-entity-recognition, ner
PhoNER COVID19
COVID-19 Named Entity Recognition for Vietnamese (NAACL 2021)
Stars: ✭ 55 (-46.08%)
Mutual labels:  named-entity-recognition, ner
TweebankNLP
[LREC 2022] An off-the-shelf pre-trained Tweet NLP Toolkit (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Tweebank-NER dataset
Stars: ✭ 84 (-17.65%)
Mutual labels:  named-entity-recognition, ner
KoBERT-NER
NER Task with KoBERT (with Naver NLP Challenge dataset)
Stars: ✭ 76 (-25.49%)
Mutual labels:  named-entity-recognition, ner
molminer
Python library and command-line tool for extracting compounds from scientific literature. Written in Python.
Stars: ✭ 38 (-62.75%)
Mutual labels:  named-entity-recognition, ner
Pytorch Bert Crf Ner
Korean named entity recognizer built with KoBERT and a CRF (BERT+CRF based Named Entity Recognition model for Korean)
Stars: ✭ 236 (+131.37%)
Mutual labels:  named-entity-recognition, ner
Bert ner
Ner with Bert
Stars: ✭ 240 (+135.29%)
Mutual labels:  named-entity-recognition, ner
anonymization-api
How to build and deploy an anonymization API with FastAPI
Stars: ✭ 51 (-50%)
Mutual labels:  named-entity-recognition, ner
lingvo--Ner-ru
Named entity recognition (NER) in Russian texts
Stars: ✭ 38 (-62.75%)
Mutual labels:  named-entity-recognition, ner

NER_corpus_chinese

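Available Chinese NER corpora

The corpora most widely circulated online at present are the following:

  • People's Daily, 1998 edition: this is actually a word-segmentation training corpus. Treating its /t, /nr, /ns, and /nt tags as entity labels makes it usable for NER.
  • MSRA corpus: annotated in BIO format with three entity types: person names, place names, and organization names.
  • BosonNLP corpus: 2,000 paragraphs annotated with 6 entity types. Besides time and the 3 common types, it also covers company names and product names. It is small, though, at only a little over 1 MB.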
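
As a rough illustration of treating the People's Daily /t, /nr, /ns, and /nt tags as entity labels, the sketch below converts one line of whitespace-separated `word/tag` tokens into character-level BIO labels. The function name `pos_line_to_bio`, the label names (TIME, PER, LOC, ORG), and the sample sentence are all assumptions made for this example and are not taken from the corpus or this repository. The real corpus also contains bracketed compound entities and person names split across tokens, which this sketch does not handle.

```python
# Assumed mapping from POS tags to entity types; the label names are illustrative.
TAG_TO_ENTITY = {"t": "TIME", "nr": "PER", "ns": "LOC", "nt": "ORG"}

def pos_line_to_bio(line):
    """Convert a 'word/tag word/tag ...' line into (character, BIO label) pairs."""
    pairs = []
    for token in line.split():
        word, sep, tag = token.rpartition("/")
        if not sep or not word:
            continue  # skip tokens that are not in word/tag form
        entity = TAG_TO_ENTITY.get(tag)
        for i, ch in enumerate(word):
            if entity is None:
                pairs.append((ch, "O"))
            else:
                prefix = "B-" if i == 0 else "I-"
                pairs.append((ch, prefix + entity))
    return pairs

# Made-up sample line in word/tag form, not an excerpt from the corpus.
for ch, label in pos_line_to_bio("张三/nr 在/p 北京/ns 工作/v"):
    print(ch, label)
```

Each token is labeled independently here, so adjacent tokens with the same tag become separate entities. Merging them would be a reasonable refinement for a real preprocessing script.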

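Other corpora that may be useful for research:

  • People's Daily, 2014 edition: the annotation format differs considerably from the 1998 edition. Part-of-speech tags are finer-grained, and entity annotations can be nested. It is much larger, at roughly 17.5 million characters, but needs fairly complex preprocessing.
  • Unnamed corpus: annotated in BIO format with three entity types: person names, place names, and organization names. Over 1.3 million characters.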
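
For the BIO-annotated corpora mentioned above, a common first step is to recover entity spans from the per-character labels. The following is a minimal sketch, assuming labels of the form `B-X`, `I-X`, and `O`. The helper name `bio_to_spans`, the label names PER and LOC, and the sample data are hypothetical, and the actual tag names in a given corpus file may differ.

```python
def bio_to_spans(chars, labels):
    """Return (text, entity_type, start, end) tuples decoded from BIO labels."""
    spans = []
    start, etype = None, None
    # A trailing "O" sentinel flushes an entity that runs to the end of the input.
    for i, label in enumerate(list(labels) + ["O"]):
        continues = label.startswith("I-") and etype == label[2:]
        if start is not None and not continues:
            spans.append(("".join(chars[start:i]), etype, start, i))
            start, etype = None, None
        if label.startswith("B-"):
            start, etype = i, label[2:]
    return spans

# Made-up example data, not taken from any of the corpora.
chars = list("张三在北京")
labels = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC"]
print(bio_to_spans(chars, labels))
```

An `I-` label that does not follow a matching `B-` is ignored in this sketch. Some tools instead treat it as the start of a new entity, so the choice should match how the corpus was annotated.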

Vertical-domain NER corpora that should not be publicly distributed:

  • CCKS2017 electronic medical record entity annotation: