
v-mipeng / Lexiconner

Licence: apache-2.0
Lexicon-based Named Entity Recognition

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives to or similar to Lexiconner

Bi Lstm Crf Ner Tf2.0
Named Entity Recognition (NER) using a Bi-LSTM-CRF model implemented in TensorFlow 2.0+
Stars: ✭ 93 (-8.82%)
Mutual labels:  ner
Bond
BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision
Stars: ✭ 96 (-5.88%)
Mutual labels:  ner
Text predictor
Char-level RNN LSTM text generator📄.
Stars: ✭ 99 (-2.94%)
Mutual labels:  ai
Webots
Webots Robot Simulator
Stars: ✭ 1,324 (+1198.04%)
Mutual labels:  ai
Frigate
NVR with realtime local object detection for IP cameras
Stars: ✭ 1,329 (+1202.94%)
Mutual labels:  ai
Ikbt
A python package to solve robot arm inverse kinematics in symbolic form
Stars: ✭ 97 (-4.9%)
Mutual labels:  ai
Micromlp
A micro neural network multilayer perceptron for MicroPython (used on ESP32 and Pycom modules)
Stars: ✭ 92 (-9.8%)
Mutual labels:  ai
Etagger
reference tensorflow code for named entity tagging
Stars: ✭ 100 (-1.96%)
Mutual labels:  ner
Blurr
Data transformations for the ML era
Stars: ✭ 96 (-5.88%)
Mutual labels:  ai
Atlasnetv2
This repository contains the source code for the paper AtlasNet V2 - Learning Elementary Structures.
Stars: ✭ 99 (-2.94%)
Mutual labels:  ai
Ai Study
A comprehensive collection of AI study materials, covering machine learning (ML) basics, deep learning (DL) basics, computer vision (CV), natural language processing (NLP), recommender systems, speech recognition, graph neural networks, and algorithm-engineer interview questions.
Stars: ✭ 93 (-8.82%)
Mutual labels:  ai
Tensor Safe
A Haskell framework to define valid deep learning models and export them to other frameworks like TensorFlow JS or Keras.
Stars: ✭ 96 (-5.88%)
Mutual labels:  ai
Helix theory
Theory of Helix: an "entropy-reduction machine theory" (usable for building AGI, complex systems, and more).
Stars: ✭ 98 (-3.92%)
Mutual labels:  ai
Emojiintelligence
Neural Network built in Apple Playground using Swift
Stars: ✭ 1,323 (+1197.06%)
Mutual labels:  ai
Monkeys
A strongly-typed genetic programming framework for Python
Stars: ✭ 98 (-3.92%)
Mutual labels:  ai
Latticelstm
Chinese NER using Lattice LSTM. Code for ACL 2018 paper.
Stars: ✭ 1,318 (+1192.16%)
Mutual labels:  ner
Happy Transformer
A package built on top of Hugging Face's transformer library that makes it easy to utilize state-of-the-art NLP models
Stars: ✭ 97 (-4.9%)
Mutual labels:  ai
Opencage
A modding toolkit for Alien: Isolation that covers a wide range of game content and configurations.
Stars: ✭ 98 (-3.92%)
Mutual labels:  ai
Dopamine
Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.
Stars: ✭ 9,681 (+9391.18%)
Mutual labels:  ai
Objectron
Objectron is a dataset of short, object-centric video clips. In addition, the videos also contain AR session metadata including camera poses, sparse point-clouds and planes. In each video, the camera moves around and above the object and captures it from different views. Each object is annotated with a 3D bounding box. The 3D bounding box describes the object’s position, orientation, and dimensions. The dataset contains about 15K annotated video clips and 4M annotated images in the following categories: bikes, books, bottles, cameras, cereal boxes, chairs, cups, laptops, and shoes
Stars: ✭ 1,352 (+1225.49%)
Mutual labels:  ai

LexiconNER

This is the implementation of "Distantly Supervised Named Entity Recognition using Positive-Unlabeled Learning", published at ACL 2019. The highlight of this work is that it performs NER using only entity dictionaries, without any labeled data.

We also recently published another work related to Chinese NER, designed to augment Chinese NER with lexicons. Its highlight is high computational efficiency while achieving comparable or better performance than existing methods. The source code of that work, and a link to its associated paper, are available at LexiconAugmentedNER.

Set up and run

Download the pre-trained GloVe embeddings, glove.6B.100d.txt.

Environment

Python 3.6.4, PyTorch 1.1.0, CUDA 8.0
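
A minimal setup sketch, assuming a conda workflow and that glove.6B.100d.txt sits in the repository root (the README does not pin a location):

# create the environment with the versions listed above
conda create -n lexiconner python=3.6.4
conda activate lexiconner
pip install torch==1.1.0
# fetch and unpack the 100-dimensional GloVe vectors
wget http://nlp.stanford.edu/data/glove.6B.zip
unzip glove.6B.zip glove.6B.100d.txt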

Instructions for running the code

Phase one: train the bnPU model

Training

Print the parameters:

python feature_pu_model.py -h

optional arguments:
  -h, --help            show this help message and exit
  --lr LR               learning rate
  --beta BETA           beta of pu learning (default 0.0)
  --gamma GAMMA         gamma of pu learning (default 1.0)
  --drop_out DROP_OUT   dropout rate
  --m M                 class balance rate
  --flag FLAG           entity type (PER/LOC/ORG/MISC)
  --dataset DATASET     name of the dataset
  --batch_size BATCH_SIZE
                        batch size for training and testing
  --print_time PRINT_TIME
                        epochs for printing results
  --pert PERT           percentage of data used for training
  --type TYPE           pu learning type (bnpu/bpu/upu)

e.g., to train on the PER type of the conll2003 dataset:

python feature_pu_model.py --dataset conll2003 --flag PER
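
For intuition on --beta and --gamma: in the generic non-negative PU risk estimator (Kiryo et al., 2017), beta bounds how far the estimated risk on unlabeled data may go below zero, and gamma scales the corrective gradient step taken when that bound is crossed. The sketch below illustrates that generic estimator in PyTorch, under assumed names; it is not the repository's code.

import torch
import torch.nn.functional as F

def nn_pu_risk(pos_logits, unl_logits, prior, beta=0.0, gamma=1.0):
    # pos_logits: model scores on dictionary-matched (positive) tokens
    # unl_logits: model scores on unlabeled tokens
    # prior: estimated positive-class prior
    bce = F.binary_cross_entropy_with_logits
    risk_pos = prior * bce(pos_logits, torch.ones_like(pos_logits))  # pi * R_p^+
    risk_neg = bce(unl_logits, torch.zeros_like(unl_logits)) \
        - prior * bce(pos_logits, torch.zeros_like(pos_logits))      # R_u^- - pi * R_p^-
    if risk_neg.item() < -beta:
        # the negative-risk estimate fell below -beta:
        # take a corrective (ascent) step, scaled by gamma
        return -gamma * risk_neg
    return risk_pos + risk_neg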

Evaluating

python feature_pu_model_evl.py --model saved_model/bnpu_conll2003_PER_lr_0.0001_prior_0.3_beta_0.0_gamma_1.0_percent_1.0 --flag PER --dataset conll2003 --output 1

Replace the model name with the one saved during your training run.

python final_evl.py 

This aggregates the final result over all entity types. Remember to set the filenames to the output files produced during evaluation, for instance as in the sketch below.
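
As a sketch, one way to evaluate all four entity types and then aggregate; the saved-model names are placeholders following the pattern above, and the prior in each name depends on your own training runs:

# evaluate each entity type, writing results to file (--output 1)
for flag in PER LOC ORG MISC; do
  python feature_pu_model_evl.py \
    --model saved_model/bnpu_conll2003_${flag}_lr_0.0001_prior_0.3_beta_0.0_gamma_1.0_percent_1.0 \
    --flag ${flag} --dataset conll2003 --output 1
done
# aggregate, after pointing the filenames in final_evl.py at the outputs above
python final_evl.py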

Phase two: train the adaPU model

Dictionary generation

Print the parameters:

python ada_dict_generation.py -h

optional arguments:
  -h, --help            show this help message and exit
  --beta BETA           beta of pu learning (default 0.0)
  --gamma GAMMA         gamma of pu learning (default 1.0)
  --drop_out DROP_OUT   dropout rate
  --m M                 class balance rate
  --flag FLAG           entity type (PER/LOC/ORG/MISC)
  --dataset DATASET     name of the dataset
  --lr LR               learning rate
  --batch_size BATCH_SIZE
                        batch size for training and testing
  --iter ITER           iteration time
  --unlabeled UNLABELED
                        use unlabeled data or not
  --pert PERT           percentage of data used for training
  --model MODEL         saved model name

e.g.:

python ada_dict_generation.py --model saved_model/bnpu_conll2003_PER_lr_0.0001_prior_0.3_beta_0.0_gamma_1.0_percent_1.0 --flag PER --iter 1

Adaptive training

Print the parameters:

python adaptive_pu_model.py -h

optional arguments:
  -h, --help            show this help message and exit
  --beta BETA           beta of pu learning (default 0.0)
  --gamma GAMMA         gamma of pu learning (default 1.0)
  --drop_out DROP_OUT   dropout rate
  --m M                 class balance rate
  --p P                 estimate value of prior
  --flag FLAG           entity type (PER/LOC/ORG/MISC)
  --dataset DATASET     name of the dataset
  --lr LR               learning rate
  --batch_size BATCH_SIZE
                        batch size for training and testing
  --output OUTPUT       write the test result, set 1 for writing result to
                        file
  --model MODEL         saved model name
  --iter ITER           iteration time

e.g.:

python adaptive_pu_model.py --model saved_model/bnpu_conll2003_PER_lr_0.0001_prior_0.3_beta_0.0_gamma_1.0_percent_1.0 --flag PER --iter 1

Replace the saved model name and the iteration number on each round of adaptive learning. Within a given iteration, the --iter value passed to dictionary generation and to adaptive training must be the same, as in the sketch below.
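
A sketch of the alternating loop; the model name is a placeholder, and in practice --model for each round should point at the model saved by the previous round:

MODEL=saved_model/bnpu_conll2003_PER_lr_0.0001_prior_0.3_beta_0.0_gamma_1.0_percent_1.0
for i in 1 2 3; do
  # the --iter value must match between the two steps of a round
  python ada_dict_generation.py --model ${MODEL} --flag PER --iter ${i}
  python adaptive_pu_model.py --model ${MODEL} --flag PER --iter ${i}
done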

Cite

Please cite our ACL 2019 paper:

@inproceedings{peng2019distantly,
  title={Distantly Supervised Named Entity Recognition using Positive-Unlabeled Learning},
  author={Peng, Minlong and Xing, Xiaoyu and Zhang, Qi and Fu, Jinlan and Huang, Xuanjing},
  booktitle={Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL)},
  year={2019}
}