monologg / KoBERT-NER

License: Apache-2.0

NER Task with KoBERT (with Naver NLP Challenge dataset)

Labels

nlp named-entity-recognition ner kobert distilkobert

KoBERT-NER

Korean Named Entity Recognition (NER) task using KoBERT
Implemented with the 🤗Huggingface Transformers🤗 library

Dependencies

torch==1.4.0
transformers==2.10.0
seqeval>=0.0.12

Dataset

Uses the NER dataset from the Naver NLP Challenge 2018 (Github link)
Because that dataset provides only a train set, the test set was split off from the train set. (Data link)
- Train (81,000) / Test (9,000)
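
The README does not say how the test set was carved out of the original train set. As a rough illustration only, a reproducible random hold-out could look like the sketch below. The function name `split_train_test`, the fixed seed, and the random (rather than sequential) selection are all assumptions, not the author's procedure.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_train_test(
    examples: Sequence[T], test_size: int, seed: int = 42
) -> Tuple[List[T], List[T]]:
    """Hold out `test_size` randomly chosen examples; keep original order in both parts.

    Hypothetical helper for illustration; not taken from the KoBERT-NER repository.
    """
    if not 0 <= test_size <= len(examples):
        raise ValueError("test_size must be between 0 and len(examples)")
    indices = list(range(len(examples)))
    # A seeded generator makes the split reproducible across runs.
    random.Random(seed).shuffle(indices)
    test_idx = set(indices[:test_size])
    train = [ex for i, ex in enumerate(examples) if i not in test_idx]
    test = [ex for i, ex in enumerate(examples) if i in test_idx]
    return train, test


# Placeholder data with the same total size as the dataset described above.
sentences = [f"sentence-{i}" for i in range(90_000)]
train_set, test_set = split_train_test(sentences, test_size=9_000)
```

With 90,000 placeholder sentences and `test_size=9_000`, this yields the same 81,000 / 9,000 proportions listed above.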

How to use KoBERT on Huggingface Transformers Library

The original KoBERT has been adapted so that it can be used directly from the transformers library.
- Since transformers v2.2.2, individuals can upload and download their own models directly through transformers.
To use the tokenizer, import KoBertTokenizer from tokenization_kobert.py.

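from transformers import BertModel
# tokenization_kobert.py must be importable (for example, in the working directory)
from tokenization_kobert import KoBertTokenizer

# Load the KoBERT weights and the matching tokenizer
model = BertModel.from_pretrained('monologg/kobert')
tokenizer = KoBertTokenizer.from_pretrained('monologg/kobert')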

Usage

$ python3 main.py --model_type kobert --do_train --do_eval

With the --write_pred option, the predictions made during evaluation are saved to the preds folder.

Prediction

$ python3 predict.py --input_file {INPUT_FILE_PATH} --output_file {OUTPUT_FILE_PATH} --model_dir {SAVED_CKPT_PATH}

Results

Model	Slot F1 (%)
KoBERT	86.11
DistilKoBERT	84.13
Bert-Multilingual	84.20
CNN-BiLSTM-CRF	74.57
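
For readers unfamiliar with the metric, slot F1 is usually computed over whole entity spans rather than individual tokens: a prediction counts only if both the span boundaries and the entity type match. The repository lists seqeval as a dependency for this; the dependency-free sketch below only illustrates the idea. It assumes standard `B-XXX` / `I-XXX` / `O` tags (the dataset's actual tag format may differ), ignores `I-` tags that do not continue an open span, and the names `extract_spans` and `slot_f1` are hypothetical.

```python
from typing import List, Set, Tuple

# (sentence index, entity type, start position, end position)
Span = Tuple[int, str, int, int]


def extract_spans(sentences: List[List[str]]) -> Set[Span]:
    """Collect entity spans from BIO-style tags such as B-PER / I-PER / O."""
    spans: Set[Span] = set()
    for sent_idx, tags in enumerate(sentences):
        start, ent_type = None, None
        # A trailing "O" sentinel closes any span still open at sentence end.
        for pos, tag in enumerate(list(tags) + ["O"]):
            prefix, _, label = tag.partition("-")
            continues = prefix == "I" and label == ent_type
            if start is not None and not continues:
                spans.add((sent_idx, ent_type, start, pos - 1))
                start, ent_type = None, None
            if prefix == "B":
                start, ent_type = pos, label
    return spans


def slot_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Span-level F1: a predicted span is correct only on an exact match."""
    gold_spans, pred_spans = extract_spans(gold), extract_spans(pred)
    correct = len(gold_spans & pred_spans)
    if correct == 0:
        return 0.0
    precision = correct / len(pred_spans)
    recall = correct / len(gold_spans)
    return 2 * precision * recall / (precision + recall)


# The person span matches; the second entity has the wrong type.
gold_tags = [["B-PER", "I-PER", "O", "B-LOC"]]
pred_tags = [["B-PER", "I-PER", "O", "B-ORG"]]
score = slot_f1(gold_tags, pred_tags)
```

In this toy case one of two predicted spans is correct and one of two gold spans is recovered, so precision and recall are both 0.5 and the F1 is 0.5.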
