cliang1453 / Bond

Licence: apache-2.0
BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision


BOND

This repo contains our code and pre-processed distantly/weakly labeled data for the paper BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision (KDD 2020).

[Figure: BOND framework]

Benchmark

The results (entity-level F1 score) are summarized as follows:

| Method | CoNLL03 | Tweet | OntoNote5.0 | Webpage | Wikigold |
|---|---|---|---|---|---|
| Full Supervision | 91.21 | 52.19 | 86.20 | 72.39 | 86.43 |
| Previous SOTA | 76.00 | 26.10 | 67.69 | 51.39 | 47.54 |
| BOND | 81.48 | 48.01 | 68.35 | 65.74 | 60.07 |

  • Full Supervision: RoBERTa fine-tuning / BiLSTM-CRF
  • Previous SOTA: BiLSTM-CRF / AutoNER / LR-CRF / KALM / CONNET
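
For readers unfamiliar with the metric, here is a minimal sketch of how an entity-level F1 score is typically computed. It assumes BIO-style tags and exact matching on both entity type and boundaries (the usual CoNLL convention); the function names `extract_spans` and `entity_f1` are illustrative and are not taken from this repo, whose evaluation code may differ in detail.

```python
from typing import List, Set, Tuple

Span = Tuple[str, int, int]  # (entity type, start index, end index exclusive)


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            spans.add((label, start, i))  # close the span that was open
        if tag == "O":
            start, label = None, None
        else:
            # "B-X" opens a span; a stray "I-X" is treated as an opening tag too
            start, label = i, tag[2:]
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Micro-averaged F1 over exactly matching (type, start, end) spans."""
    n_gold = n_pred = n_correct = 0
    for gold_tags, pred_tags in zip(gold, pred):
        g, p = extract_spans(gold_tags), extract_spans(pred_tags)
        n_gold += len(g)
        n_pred += len(p)
        n_correct += len(g & p)
    if n_correct == 0:
        return 0.0
    precision = n_correct / n_pred
    recall = n_correct / n_gold
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    gold = [["B-PER", "I-PER", "O", "B-LOC"]]
    pred = [["B-PER", "I-PER", "O", "O"]]  # misses the LOC entity
    print(f"{100 * entity_f1(gold, pred):.2f}")
```

Under this definition a prediction with a wrong boundary counts as both a false positive and a false negative, which is why entity-level scores are stricter than token-level accuracy. The tables in this README report the score as a percentage.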

Data

We release five open-domain distantly/weakly labeled NER datasets here: dataset. For gazetteer information and the distant label generation code, please email [email protected] directly.

Environment

Python 3.7, PyTorch 1.3, Hugging Face Transformers v2.3.0.
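
Before running the scripts, it may help to confirm that the local environment matches these versions. The sketch below is only a convenience check and is not part of the repo: the `EXPECTED` table and the helper names are assumptions, and it compares only the major and minor version numbers.

```python
from typing import Dict, Tuple

# Versions stated in this README, reduced to "major.minor" targets (an assumption).
EXPECTED = {"python": "3.7", "torch": "1.3", "transformers": "2.3"}


def major_minor(version: str) -> Tuple[int, int]:
    """Parse the leading 'major.minor' of a version string such as '1.3.1+cu92'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def check(installed: Dict[str, str]) -> Dict[str, bool]:
    """Return, per package found in `installed`, whether major.minor matches."""
    return {
        name: major_minor(installed[name]) == major_minor(wanted)
        for name, wanted in EXPECTED.items()
        if name in installed
    }


if __name__ == "__main__":
    import platform

    installed = {"python": platform.python_version()}
    try:
        import torch
        import transformers

        installed["torch"] = torch.__version__
        installed["transformers"] = transformers.__version__
    except ImportError:
        pass  # packages that are not installed are simply left out of the report

    for name, ok in check(installed).items():
        print(f"{name:<13} {installed[name]:<12} {'ok' if ok else 'mismatch'}")
```

A mismatch does not necessarily mean the code will fail, but newer Transformers releases changed several APIs, so matching the listed versions is the safest starting point.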

Training & Evaluation

We provide training scripts for all five open-domain distantly/weakly labeled NER datasets in scripts. For example, to run BOND training and evaluation on CoNLL03:

cd BOND
./scripts/conll_self_training.sh

To run Stage I training and evaluation on CoNLL03:

cd BOND
./scripts/conll_baseline.sh

The test results (entity-level F1 score) are summarized as follows:

| Method | CoNLL03 | Tweet | OntoNote5.0 | Webpage | Wikigold |
|---|---|---|---|---|---|
| Stage I | 75.61 | 46.61 | 68.11 | 59.11 | 52.15 |
| BOND | 81.48 | 48.01 | 68.35 | 65.74 | 60.07 |
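
To make the comparison easier to read, the snippet below computes the absolute F1 gain of the full BOND pipeline over Stage I for each dataset. The numbers are copied from the table above; the variable names are only illustrative.

```python
# Entity-level F1 scores copied from the Stage I vs. BOND table above.
DATASETS = ["CoNLL03", "Tweet", "OntoNote5.0", "Webpage", "Wikigold"]
STAGE_I = [75.61, 46.61, 68.11, 59.11, 52.15]
BOND = [81.48, 48.01, 68.35, 65.74, 60.07]

# Absolute gain in F1 points, rounded to the table's two-decimal precision.
gains = {
    name: round(bond - stage_one, 2)
    for name, stage_one, bond in zip(DATASETS, STAGE_I, BOND)
}

for name, gain in gains.items():
    print(f"{name:<12} +{gain:.2f}")
```

The gain is largest on Wikigold (7.92 points) and smallest on OntoNote5.0 (0.24 points).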

Citation

Please cite the following paper if you are using our datasets/tool. Thanks!

@inproceedings{liang2020bond,
  title={BOND: Bert-Assisted Open-Domain Named Entity Recognition with Distant Supervision},
  author={Liang, Chen and Yu, Yue and Jiang, Haoming and Er, Siawpeng and Wang, Ruijia and Zhao, Tuo and Zhang, Chao},
  booktitle={ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
  year={2020}
}