NCRF++, a neural sequence labeling toolkit. Easy to apply to any sequence labeling task (e.g. NER, POS tagging, segmentation). It includes character LSTM/CNN, word LSTM/CNN, and softmax/CRF components.

Stars: ✭ 1,767 (+1740.63%)

Mutual labels: natural-language-processing, named-entity-recognition, ner

Awesome Hungarian Nlp

A curated list of NLP resources for Hungarian

Stars: ✭ 121 (+26.04%)

Mutual labels: dataset, natural-language-processing, named-entity-recognition

Spacy Streamlit

👑 spaCy building blocks and visualizers for Streamlit apps

Stars: ✭ 360 (+275%)

Mutual labels: natural-language-processing, named-entity-recognition, ner

Turkish Bert Nlp Pipeline

BERT-base NLP pipeline for Turkish: NER, sentiment analysis, question answering, etc.

Stars: ✭ 85 (-11.46%)

Mutual labels: natural-language-processing, named-entity-recognition, ner

Spacy Lookup

Named Entity Recognition based on dictionaries

Stars: ✭ 212 (+120.83%)

Mutual labels: natural-language-processing, named-entity-recognition, ner

Chatbot ner

chatbot_ner: Named Entity Recognition for chatbots.

Stars: ✭ 273 (+184.38%)

Mutual labels: natural-language-processing, named-entity-recognition, ner

Vncorenlp

A Vietnamese natural language processing toolkit (NAACL 2018)

Stars: ✭ 354 (+268.75%)

Mutual labels: natural-language-processing, named-entity-recognition, ner

Cluener2020

CLUENER2020: Chinese Fine-Grained Named Entity Recognition

Stars: ✭ 689 (+617.71%)

Mutual labels: dataset, named-entity-recognition, ner

Entity Recognition Datasets

A collection of corpora for named entity recognition (NER) and entity recognition tasks. These annotated datasets cover a variety of languages, domains and entity types.

Stars: ✭ 891 (+828.13%)

Mutual labels: natural-language-processing, named-entity-recognition, ner

Named entity recognition

Chinese named entity recognition, with concrete implementations of several models: HMM, CRF, BiLSTM, and BiLSTM+CRF

Stars: ✭ 995 (+936.46%)

Mutual labels: named-entity-recognition, ner

Jointre

End-to-end neural relation extraction using deep biaffine attention (ECIR 2019)

Stars: ✭ 41 (-57.29%)

Mutual labels: named-entity-recognition, ner

Pytreebank

😡😇 Stanford Sentiment Treebank loader in Python

Stars: ✭ 93 (-3.12%)

Mutual labels: dataset, natural-language-processing

Mtnt

Code for the collection and analysis of the MTNT dataset

Stars: ✭ 48 (-50%)

Mutual labels: dataset, natural-language-processing

Understanding Financial Reports Using Natural Language Processing

Investigate how mutual funds leverage credit derivatives by studying their routine filings to the SEC using NLP techniques 📈🤑

Stars: ✭ 36 (-62.5%)

Mutual labels: natural-language-processing, named-entity-recognition

Nagisa Tutorial Pycon2019

Code for the PyCon JP 2019 talk "Japanese Natural Language Processing with Python: Real-World Text Analysis via Sequence Labeling"

Stars: ✭ 46 (-52.08%)

Mutual labels: natural-language-processing, named-entity-recognition

Corenlp

Stanford CoreNLP: A Java suite of core NLP tools.

Stars: ✭ 8,248 (+8491.67%)

Mutual labels: natural-language-processing, named-entity-recognition

Iob2corpus

Japanese IOB2 tagged corpus for Named Entity Recognition.

Stars: ✭ 51 (-46.87%)

Mutual labels: natural-language-processing, named-entity-recognition

Coarij

Corpus of Annual Reports in Japan

Stars: ✭ 55 (-42.71%)

Mutual labels: dataset, natural-language-processing

BOND

This repo contains our code and pre-processed distantly/weakly labeled data for the paper "BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision" (KDD 2020).

Benchmark

The results (entity-level F1 score) are summarized as follows:

Method	CoNLL03	Tweet	OntoNote5.0	Webpage	Wikigold
Full Supervision	91.21	52.19	86.20	72.39	86.43
Previous SOTA	76.00	26.10	67.69	51.39	47.54
BOND	81.48	48.01	68.35	65.74	60.07

Full Supervision: RoBERTa fine-tuning/BiLSTM-CRF
Previous SOTA: BiLSTM-CRF/AutoNER/LR-CRF/KALM/CONNET
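
For readers unfamiliar with the metric, here is a minimal sketch of how an entity-level F1 score can be computed from BIO-tagged sequences. It is an illustration only, not the evaluation code used in this repo; the helper names `extract_spans` and `entity_f1` are hypothetical, and the sketch assumes a lenient convention where an `I-` tag without a preceding `B-` tag opens a new entity.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive)
Span = Tuple[str, int, int]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence (hypothetical helper)."""
    spans: Set[Span] = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and tag[2:] == etype
        if start is not None and not continues:
            spans.add((etype, start, i))
            start, etype = None, None
        # "B-" always opens a span; a stray "I-" opens one too (assumption).
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, etype = i, tag[2:]
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Micro-averaged F1 over exact (type, start, end) span matches."""
    tp = n_gold = n_pred = 0
    for g, p in zip(gold, pred):
        gold_spans, pred_spans = extract_spans(g), extract_spans(p)
        tp += len(gold_spans & pred_spans)
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Toy example: one of two predicted entities matches the gold annotation.
gold = [["B-PER", "I-PER", "O", "B-LOC"]]
pred = [["B-PER", "I-PER", "O", "B-ORG"]]
print(entity_f1(gold, pred))
```

An entity counts as correct only when both its boundaries and its type match exactly, which is why a mistyped span in the toy example contributes to neither precision nor recall.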

Data

We release five open-domain distantly/weakly labeled NER datasets here: dataset. For gazetteer information and the distant label generation code, please email [email protected] directly.

Environment

Python 3.7, PyTorch 1.3, Hugging Face Transformers v2.3.0.

Training & Evaluation

We provide training scripts for all five open-domain distantly/weakly labeled NER datasets in scripts. For example, to run BOND training and evaluation on CoNLL03:

cd BOND
./scripts/conll_self_training.sh

To run Stage I training and evaluation on CoNLL03:

cd BOND
./scripts/conll_baseline.sh

The test results (entity-level F1 score) are summarized as follows:

Method	CoNLL03	Tweet	OntoNote5.0	Webpage	Wikigold
Stage I	75.61	46.61	68.11	59.11	52.15
BOND	81.48	48.01	68.35	65.74	60.07
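
To make the comparison easier to read, the per-dataset gain of BOND over Stage I can be worked out directly from the table above. The snippet below is only a small arithmetic sketch: the numbers are copied from the table, and the helper name `f1_gains` is hypothetical.

```python
# Test F1 scores copied from the table above.
datasets = ["CoNLL03", "Tweet", "OntoNote5.0", "Webpage", "Wikigold"]
stage1 = [75.61, 46.61, 68.11, 59.11, 52.15]
bond = [81.48, 48.01, 68.35, 65.74, 60.07]


def f1_gains(before, after):
    """Absolute F1 difference (after - before), rounded to two decimals."""
    return [round(a - b, 2) for b, a in zip(before, after)]


gains = dict(zip(datasets, f1_gains(stage1, bond)))
for name, gain in gains.items():
    print(f"{name}: +{gain:.2f} F1")
```

The differences range from 0.24 points on OntoNote5.0 to 7.92 points on Wikigold.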

Citation

Please cite the following paper if you are using our datasets/tool. Thanks!

@inproceedings{liang2020bond,
  title={BOND: Bert-Assisted Open-Domain Named Entity Recognition with Distant Supervision},
  author={Liang, Chen and Yu, Yue and Jiang, Haoming and Er, Siawpeng and Wang, Ruijia and Zhao, Tuo and Zhang, Chao},
  booktitle={ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
  year={2020}
}
