Named Entity Recognition as Dependency Parsing

Introduction

This repository contains code introduced in the following paper:

Named Entity Recognition as Dependency Parsing
Juntao Yu, Bernd Bohnet and Massimo Poesio
In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL), 2020

Setup Environments

The code is written in Python 2 and TensorFlow 1.0. A Python 3 and TensorFlow 2.0 version is provided by Amir (see Other Versions).
Before starting, install all the required packages listed in requirements.txt using pip install -r requirements.txt.
Then download the BERT models: for English we used the original cased BERT-Large model, and for other languages we used the cased BERT-Base multilingual model.
After that, modify and run extract_bert_features/extract_bert_features.sh to compute the BERT embeddings for your training or test data.
You also need to download the context-independent word embeddings required by the system, such as fastText or GloVe.

To use a pre-trained model

Pre-trained models can be downloaded from this link. We provide all nine pre-trained models reported in our paper.
Choose the model you want to use and copy it to the logs/ folder.

Modify the test_path accordingly in experiments.conf:

The test_path is the path to a .jsonlines file. Each line of the .jsonlines file is a batch of sentences and must be in the following format:

{"doc_key": "batch_01", 
"ners": [[[0, 0, "PER"], [3, 3, "GPE"], [5, 5, "GPE"]], 
[[3, 3, "PER"], [10, 14, "ORG"], [20, 20, "GPE"], [20, 25, "GPE"], [22, 22, "GPE"]], 
[]], 
"sentences": [["Anwar", "arrived", "in", "Shanghai", "from", "Nanjing", "yesterday", "afternoon", "."], 
["This", "morning", ",", "Anwar", "attended", "the", "foundation", "laying", "ceremony", "of", "the", "Minhang", "China-Malaysia", "joint-venture", "enterprise", ",", "and", "after", "that", "toured", "Pudong", "'s", "Jingqiao", "export", "processing", "district", "."], 
["(", "End", ")"]]}

Each sentence in the batch corresponds to a list of NEs stored under the ners key. If a sentence does not contain any NEs, use an empty list [] instead.
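
The format described here can be produced with a few lines of Python. The following is a minimal sketch, not part of this repository: the helper names make_batch and write_jsonlines are hypothetical, it is written for Python 3, and it assumes (based on the example batch) that each NE is a [start, end, label] triple whose token indices are inclusive and relative to its own sentence.

```python
import json


def make_batch(doc_key, sentences, ners):
    """Build one batch record, checking every span against its sentence."""
    if len(sentences) != len(ners):
        raise ValueError("ners must contain exactly one list per sentence")
    for tokens, spans in zip(sentences, ners):
        for start, end, label in spans:
            # Assumption: start and end are inclusive token indices
            # within the sentence, as the example batch suggests.
            if not (0 <= start <= end < len(tokens)):
                raise ValueError(
                    "span (%d, %d, %s) is out of range" % (start, end, label)
                )
    return {"doc_key": doc_key, "ners": ners, "sentences": sentences}


def write_jsonlines(path, batches):
    """Write one JSON object per line, as a .jsonlines file expects."""
    with open(path, "w", encoding="utf-8") as f:
        for batch in batches:
            f.write(json.dumps(batch, ensure_ascii=False) + "\n")


# A two-sentence batch; the second sentence has no NEs, so it gets [].
batch = make_batch(
    "batch_01",
    [["Anwar", "arrived", "in", "Shanghai", "."], ["(", "End", ")"]],
    [[[0, 0, "PER"], [3, 3, "GPE"]], []],
)
```

Calling write_jsonlines("test.jsonlines", [batch]) would then write a file that test_path can point to. Checking the spans up front keeps an off-by-one index from surfacing only at evaluation time.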

Then run python evaluate.py config_name to start your evaluation.

To train your own model

You will additionally need to create the character vocabulary using python get_char_vocab.py train.jsonlines dev.jsonlines.
Then you can start training using python train.py config_name.
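
To illustrate what the character-vocabulary step works with, here is a minimal sketch that collects every character appearing in the sentences field of one or more .jsonlines files. It is not the repository's get_char_vocab.py: the function name collect_char_vocab is hypothetical, the sketch targets Python 3, and it makes no assumption about the output file the real script writes.

```python
import json


def collect_char_vocab(paths):
    """Return the sorted list of characters found in all tokens."""
    chars = set()
    for path in paths:
        with open(path, encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue  # tolerate blank lines between batches
                for sentence in json.loads(line)["sentences"]:
                    for token in sentence:
                        chars.update(token)  # adds each character of the token
    return sorted(chars)
```

For example, collect_char_vocab(["train.jsonlines", "dev.jsonlines"]) would cover both the training and development data, mirroring the two arguments passed to the script.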

Other Versions

Amir Zeldes kindly created a TensorFlow 2.0 and Python 3 ready version, which can be found here.
