Hironsan / Anago

Licence: MIT

Bidirectional LSTM-CRF and ELMo for Named-Entity Recognition, Part-of-Speech Tagging and so on.

Labels

deep-learning machine-learning keras natural-language-processing named-entity-recognition sequence-labeling

anaGo

anaGo is a Python library for sequence labeling (NER, PoS tagging, ...), implemented in Keras.

anaGo can solve sequence labeling tasks such as named entity recognition (NER), part-of-speech tagging (POS tagging), semantic role labeling (SRL) and so on. Unlike traditional sequence labeling solvers, anaGo does not need any language-dependent features to be defined. Thus, anaGo can easily be used for any language.

As an example of anaGo, the following image shows named entity recognition in English:

[Image: anaGo named entity recognition demo]

Get Started

In anaGo, the simplest type of model is the Sequence model. The Sequence model includes essential methods such as fit, score, analyze and save/load. For more complex features, use the anaGo modules directly, such as models, preprocessing and so on.

Here is the data loader:

>>> from anago.utils import load_data_and_labels

>>> x_train, y_train = load_data_and_labels('train.txt')
>>> x_test, y_test = load_data_and_labels('test.txt')
>>> x_train[0]
['EU', 'rejects', 'German', 'call', 'to', 'boycott', 'British', 'lamb', '.']
>>> y_train[0]
['B-ORG', 'O', 'B-MISC', 'O', 'O', 'O', 'B-MISC', 'O', 'O']
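
The loader returns two parallel lists, one of token sequences and one of tag sequences, with one entry per sentence. As a rough illustration of that shape only, and not anaGo's actual implementation, the sketch below parses CoNLL-style lines. It assumes one token and its tag per line separated by whitespace, with a blank line between sentences; the name `read_conll` is hypothetical.

```python
from typing import Iterable, List, Tuple


def read_conll(lines: Iterable[str]) -> Tuple[List[List[str]], List[List[str]]]:
    """Parse CoNLL-style lines into parallel token and tag lists (hypothetical helper)."""
    sents, labels = [], []
    words, tags = [], []
    for line in lines:
        line = line.strip()
        if line:
            # Assumed layout: the token is the first column, the tag is the last.
            columns = line.split()
            words.append(columns[0])
            tags.append(columns[-1])
        elif words:
            # A blank line closes the current sentence.
            sents.append(words)
            labels.append(tags)
            words, tags = [], []
    if words:
        # Flush the last sentence when the input has no trailing blank line.
        sents.append(words)
        labels.append(tags)
    return sents, labels


sample = ["EU\tB-ORG", "rejects\tO", "German\tB-MISC", "", "Peter\tB-PER", "Blackburn\tI-PER"]
x, y = read_conll(sample)
print(x[0], y[0])
```

To read from disk under the same assumptions, pass an open file object, since iterating over it yields lines.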

You can now iterate on your training data in batches:

>>> import anago

>>> model = anago.Sequence()
>>> model.fit(x_train, y_train, epochs=15)
Epoch 1/15
541/541 [==============================] - 166s 307ms/step - loss: 12.9774
...

Evaluate your performance in one line:

>>> model.score(x_test, y_test)
0.802  # micro-averaged F1 score
# For better performance, use pre-trained word embeddings.
# anaGo's best score so far is 90.94 (micro-averaged F1).
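
For sequence labeling, an F1 score like the one above is usually computed over whole entities rather than individual tokens. The sketch below shows one way a micro-averaged, entity-level F1 can be derived from IOB2 tags. It is a simplified illustration under that assumption, not anaGo's scoring code, and `extract_entities` and `f1_micro` are hypothetical names.

```python
from typing import List, Set, Tuple

Entity = Tuple[int, str, int, int]  # (sentence index, type, start, end-exclusive)


def extract_entities(tag_seqs: List[List[str]]) -> Set[Entity]:
    """Collect entity spans from IOB2 tag sequences."""
    entities = set()
    for sent_idx, tags in enumerate(tag_seqs):
        start, etype = None, None
        # A trailing "O" sentinel closes any entity still open at sentence end.
        for i, tag in enumerate(tags + ["O"]):
            prefix, _, label = tag.partition("-")
            continues = prefix == "I" and label == etype and start is not None
            if not continues:
                if start is not None:
                    entities.add((sent_idx, etype, start, i))
                    start, etype = None, None
                if prefix in ("B", "I"):
                    # A stray I- tag is treated as the start of a new entity.
                    start, etype = i, label
    return entities


def f1_micro(y_true: List[List[str]], y_pred: List[List[str]]) -> float:
    """Entity-level F1, pooling all entities across sentences."""
    true_entities = extract_entities(y_true)
    pred_entities = extract_entities(y_pred)
    correct = len(true_entities & pred_entities)
    precision = correct / len(pred_entities) if pred_entities else 0.0
    recall = correct / len(true_entities) if true_entities else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


y_true = [["B-ORG", "O", "B-MISC", "O", "O", "O", "B-MISC", "O", "O"]]
y_pred = [["B-ORG", "O", "B-MISC", "O", "O", "O", "O", "O", "O"]]
print(f1_micro(y_true, y_pred))
```

In this toy case the prediction finds two of the three gold entities and adds no false ones, so precision is 1 and recall is 2/3.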

Or tag new text:

>>> text = 'President Obama is speaking at the White House.'
>>> model.analyze(text)
{
    "words": [
        "President",
        "Obama",
        "is",
        "speaking",
        "at",
        "the",
        "White",
        "House."
    ],
    "entities": [
        {
            "beginOffset": 1,
            "endOffset": 2,
            "score": 1,
            "text": "Obama",
            "type": "PER"
        },
        {
            "beginOffset": 6,
            "endOffset": 8,
            "score": 1,
            "text": "White House.",
            "type": "LOC"
        }
    ]
}
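
In the response above, `beginOffset` and `endOffset` appear to be token indices into `words`, with the end index exclusive (tokens 6 to 8 give "White House."). The sketch below rebuilds each entity's text from those offsets. It works on a hand-copied dictionary rather than a live model call, and `entity_spans` is a hypothetical helper.

```python
from typing import Dict, List, Tuple


def entity_spans(result: Dict) -> List[Tuple[str, str]]:
    """Return (text, type) pairs rebuilt from token offsets (hypothetical helper)."""
    words = result["words"]
    spans = []
    for entity in result["entities"]:
        # Offsets are assumed to index into `words`, end-exclusive.
        tokens = words[entity["beginOffset"]:entity["endOffset"]]
        spans.append((" ".join(tokens), entity["type"]))
    return spans


# Response copied from the example above, trimmed to the fields used here.
result = {
    "words": ["President", "Obama", "is", "speaking", "at", "the", "White", "House."],
    "entities": [
        {"beginOffset": 1, "endOffset": 2, "text": "Obama", "type": "PER"},
        {"beginOffset": 6, "endOffset": 8, "text": "White House.", "type": "LOC"},
    ],
}
print(entity_spans(result))
```

The trailing period stays attached to "House." in this output, which suggests the input is split on whitespace; stripping punctuation beforehand may give cleaner entity text.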

To download a pre-trained model, call the download function:

>>> from anago.utils import download

>>> url = 'https://s3-ap-northeast-1.amazonaws.com/dev.tech-sketch.jp/chakki/public/conll2003_en.zip'
>>> weights, params, preprocessor = download(url)
>>> model = anago.Sequence.load(weights, params, preprocessor)
>>> model.score(x_test, y_test)
0.909446369856927

If you want to use ELMo for better performance (F1: 92.22), you can use ELModel and ELMoTransformer:

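from anago.models import ELModel
from anago.preprocessing import ELMoTransformer
from anago.trainer import Trainer

# Transforming datasets: fit the ELMo preprocessor on the training data.
p = ELMoTransformer()
p.fit(x_train, y_train)

# Building a model: build() returns the Keras model and its loss function.
model = ELModel(...)
model, loss = model.build()
model.compile(loss=loss, optimizer='adam')

# Training the model, using the test split for validation.
trainer = Trainer(model, preprocessor=p)
trainer.train(x_train, y_train, x_test, y_test)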

For further details, see anago/examples/elmo_example.py.

Feature Support

anaGo supports the following features:

Model Training
Model Evaluation
Tagging Text
Custom Model Support
Downloading pre-trained model
GPU Support
Character feature
CRF Support
Custom Callback Support
💥(new) ELMo

anaGo officially supports Python 3.4–3.6.

Installation

To install anaGo, simply use pip:

$ pip install anago

or install from the repository:

$ git clone https://github.com/Hironsan/anago.git
$ cd anago
$ python setup.py install

Documentation

(coming soon)

Reference

This library is based on the following papers:

Lample, Guillaume, et al. "Neural architectures for named entity recognition." arXiv preprint arXiv:1603.01360 (2016).
Peters, Matthew E., et al. "Deep contextualized word representations." arXiv preprint arXiv:1802.05365 (2018).
