
d5555 / Tageditor

Licence: mit

🏖TagEditor - Annotation tool for spaCy

Labels

machine-learning nlp data-science natural-language-processing neural-networks annotation spacy

Projects that are alternatives of or similar to Tageditor

Prodigy Recipes

🍳 Recipes for the Prodigy, our fully scriptable annotation tool

Stars: ✭ 229 (+148.91%)

Mutual labels: data-science, natural-language-processing, annotation, spacy

💫 Industrial-strength Natural Language Processing (NLP) in Python

Stars: ✭ 21,978 (+23789.13%)

Mutual labels: data-science, natural-language-processing, neural-networks, spacy

Jupyterlab Prodigy

🧬 A JupyterLab extension for annotating data with Prodigy

Stars: ✭ 97 (+5.43%)

Mutual labels: data-science, natural-language-processing, annotation, spacy

Datacamp Python Data Science Track

All the slides, accompanying code and exercises all stored in this repo. 🎈

Stars: ✭ 250 (+171.74%)

Mutual labels: data-science, natural-language-processing, neural-networks

Datasets, tools, and benchmarks for representation learning of code.

Stars: ✭ 1,378 (+1397.83%)

Mutual labels: data-science, natural-language-processing, neural-networks

Our goal is to build an open-source writing assistant/spell checker that tackles many different problems in the Turkish NLP literature together, proposes unique approaches, and fills the gaps in existing work. It corrects spelling mistakes in users' text with a deep learning approach and also performs semantic analysis of the text, so that errors arising in that context can be detected and corrected as well.

Stars: ✭ 165 (+79.35%)

Mutual labels: data-science, natural-language-processing, neural-networks

Awesome Distributed Deep Learning

A curated list of awesome Distributed Deep Learning resources.

Stars: ✭ 277 (+201.09%)

Mutual labels: data-science, natural-language-processing, neural-networks

Tensorlayer Tricks

How to use TensorLayer

Stars: ✭ 357 (+288.04%)

Mutual labels: data-science, natural-language-processing, neural-networks

Learn Data Science For Free

This repository is a combination of different resources lying scattered all over the internet. The reason for making such a repository is to combine all the valuable resources in a sequential manner, so that it helps every beginner who is in search of a free and structured learning resource for Data Science. For Constant Updates Follow me in …

Stars: ✭ 4,757 (+5070.65%)

Mutual labels: data-science, natural-language-processing, neural-networks

💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy

Stars: ✭ 508 (+452.17%)

Mutual labels: data-science, natural-language-processing, spacy

A Python package for gender classification.

Stars: ✭ 64 (-30.43%)

Mutual labels: data-science, natural-language-processing

A Super-Lightweight Annotation Tool for Experts: Label text in a terminal with just Python

Stars: ✭ 61 (-33.7%)

Mutual labels: natural-language-processing, annotation

An open-source platform for automating tasks using machine learning models

Stars: ✭ 61 (-33.7%)

Mutual labels: data-science, neural-networks

Bidirectional Attention Flow for Machine Comprehension implemented in Keras 2

Stars: ✭ 60 (-34.78%)

Mutual labels: natural-language-processing, neural-networks

Intent classifier

Stars: ✭ 67 (-27.17%)

Mutual labels: natural-language-processing, neural-networks

Text Analytics With Python

Learn how to process, classify, cluster, summarize, understand syntax, semantics and sentiment of text data with the power of Python! This repository contains code and datasets used in my book, "Text Analytics with Python" published by Apress/Springer.

Stars: ✭ 1,132 (+1130.43%)

Mutual labels: natural-language-processing, spacy

Lda Topic Modeling

A PureScript, browser-based implementation of LDA topic modeling.

Stars: ✭ 91 (-1.09%)

Mutual labels: data-science, natural-language-processing

🦆 Contextually-keyed word vectors

Stars: ✭ 1,184 (+1186.96%)

Mutual labels: natural-language-processing, spacy

Python nlp tutorial

This repository provides everything to get started with Python for Text Mining / Natural Language Processing (NLP)

Stars: ✭ 72 (-21.74%)

Mutual labels: natural-language-processing, spacy

Mckinsey Smartcities Traffic Prediction

Adventure into using multi attention recurrent neural networks for time-series (city traffic) for the 2017-11-18 McKinsey IronMan (24h non-stop) prediction challenge

Stars: ✭ 49 (-46.74%)

Mutual labels: data-science, neural-networks


TagEditor(v3.0.3) annotation tool

TagEditor is a desktop application (requires Windows 10, 64-bit) designed to annotate text for training with the spaCy library.
With TagEditor you can label dependencies, parts of speech, named entities, text categories and coreference resolution, or create your own customized training data.

Installation

No installation required.
Download and unpack TagEditor.7z
Run 'TagEditor.exe' in the main folder.

Usage

Insert your text or open a text file and press Start tagging (or choose one of the options in Menu/Tools). Choose the annotation types and labels, then press Ok.

Select a tag in the TAG SET panel, then select a word to assign the tag. Select a head tag to assign a dependency if you are working in the Dependencies window.
Right-click on a word to edit, delete or insert a word, or to merge or split sentences. To merge sentences, right-click on the first word of the second sentence; it is checked as Sentence start. Uncheck it and the sentence will merge with the previous sentence.
To delete all newline characters and extra whitespace in the text, select the Words tab and press Remove Whitespaces.
Press the Create DATA button to create training data in "simple training style" or JSON. You can save it in a simple text format or as a Python file...
Save the project for future editing. Load the project to continue where you left off.
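
As an illustration of what "simple training style" data looks like, here is a minimal sketch. It assumes the format used in spaCy's own training examples, a list of (text, annotations) pairs with character offsets; the sentence, the labels and the helper `entity_texts` are hypothetical, and the exact output of Create DATA may differ.

```python
# Hypothetical sample in spaCy's "simple training style":
# each item is (text, {"entities": [(start_char, end_char, label), ...]}).
TRAIN_DATA = [
    (
        "Apple opened an office in Toronto.",
        {"entities": [(0, 5, "ORG"), (26, 33, "GPE")]},
    ),
]


def entity_texts(example):
    """Return (substring, label) pairs so the offsets can be checked by eye."""
    text, annotations = example
    return [(text[start:end], label) for start, end, label in annotations["entities"]]


for example in TRAIN_DATA:
    print(entity_texts(example))
```

Slicing the text with each offset pair is a quick way to confirm that the annotations line up with the words you meant to label before training.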

If you don't have a pretrained model for a given language, select the language from the list for proper tokenization.

Try NeuralGym to train a spaCy model with your training data.

Named Entities
First click on a label in the TAG SET panel, then select text in the main window. To delete an assigned label from the text, just click on it. Create output data with char/token offsets or in the BILUO / IOB scheme. Nested or overlapping tags are allowed.
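
To make the relationship between token offsets and the BILUO / IOB schemes concrete, here is a small sketch in plain Python. The function names and the sample sentence are hypothetical, and it assumes non-overlapping spans, since a single tag sequence cannot represent nested or overlapping entities.

```python
def spans_to_biluo(n_tokens, spans):
    """Convert (start_token, end_token_exclusive, label) spans to BILUO tags.

    Assumes the spans do not overlap.
    """
    tags = ["O"] * n_tokens
    for start, end, label in spans:
        if end - start == 1:
            tags[start] = f"U-{label}"          # single-token entity
        else:
            tags[start] = f"B-{label}"          # first token
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{label}"          # inner tokens
            tags[end - 1] = f"L-{label}"        # last token
    return tags


def biluo_to_iob(tags):
    """Map BILUO tags to IOB: U becomes B, L becomes I, the rest is unchanged."""
    mapping = {"U": "B", "L": "I"}
    return [t if t == "O" else mapping.get(t[0], t[0]) + t[1:] for t in tags]


tokens = ["Bank", "of", "Canada", "is", "in", "Ottawa", "."]
spans = [(0, 3, "ORG"), (5, 6, "GPE")]

biluo = spans_to_biluo(len(tokens), spans)
print(biluo)
print(biluo_to_iob(biluo))
```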

Create a dataset with the selected items and save it to a txt or json file, or print it on the screen.

POS tags
In this window you can edit fine-grained POS tags and also view coarse-grained POS tags and morphological features.

Dependencies
Select a tag in the TAG SET panel, then click on a word in the editor window to assign the tag. Click on another word (token) to assign a head tag. Click on the word again to remove the tag.
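
For orientation, the sketch below shows how dependency annotations are commonly written in spaCy's simple training style: one head index and one dependency label per token, with the root token pointing at itself. Whether TagEditor exports exactly this layout is an assumption; the sentence and the helper `arcs` are hypothetical.

```python
tokens = ["She", "reads", "books", "."]

# heads[i] is the index of the head of token i; the root is its own head.
annotation = {
    "heads": [1, 1, 1, 1],
    "deps": ["nsubj", "ROOT", "dobj", "punct"],
}


def arcs(tokens, annotation):
    """Return (token, dependency label, head token) triples."""
    return [
        (token, dep, tokens[head])
        for token, head, dep in zip(tokens, annotation["heads"], annotation["deps"])
    ]


# A well-formed tree has exactly one token that is its own head.
roots = [i for i, head in enumerate(annotation["heads"]) if head == i]

for arc in arcs(tokens, annotation):
    print(arc)
```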

Co-reference tagger
Coreference annotation follows the PreCo 'Data Format'.
The dataset can be downloaded from here: https://github.com/d5555/Coreference-dataset
Compatible with NeuralCoref 4.0. To use NeuralCoref for annotating, select "Enable NeuralCoref" after 'Start tagging'. Set the 'greedyness' parameter to 0.55.

https://preschool-lab.github.io/PreCo/
https://arxiv.org/abs/1810.09807
"sentences" is a list of sentences. Each sentence is a list of tokens. Each token is a string, which can be a word or a punctuation mark.
"mention_clusters" is a list of mention clusters. Each mention cluster is a list of mentions. Each mention is a tuple of integers [sentence_idx, begin_idx, end_idx]. sentence_idx is the index of the sentence containing the mention. begin_idx is the index of the first token of the mention in the sentence. end_idx is the index of the last token of the mention in the sentence plus one. All indices are zero-based.
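
The two fields described above can be read with a few lines of Python. This is a minimal sketch with a made-up record (the sentences and clusters are hypothetical, not taken from PreCo), assuming one JSON object per line as in a .jsonl file.

```python
import json

# Hypothetical record in the PreCo-style layout, serialized as one JSONL line.
record_line = json.dumps({
    "sentences": [
        ["Anna", "lost", "her", "keys", "."],
        ["She", "found", "them", "later", "."],
    ],
    "mention_clusters": [
        [[0, 0, 1], [0, 2, 3], [1, 0, 1]],   # Anna / her / She
        [[0, 2, 4], [1, 2, 3]],              # her keys / them
    ],
})


def cluster_texts(record):
    """Resolve each [sentence_idx, begin_idx, end_idx] mention to its text."""
    sentences = record["sentences"]
    return [
        [" ".join(sentences[s][begin:end]) for s, begin, end in cluster]
        for cluster in record["mention_clusters"]
    ]


record = json.loads(record_line)
print(cluster_texts(record))
```

Because end_idx is exclusive, a plain Python slice `tokens[begin_idx:end_idx]` recovers the mention without any off-by-one adjustment.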
Select a word or a span of words in the editor window. It becomes a singleton (a single entity) with no connection to other entities and is framed with a dashed line. Then select another span. Every time you select an entity, it is highlighted with a green frame. While it is selected, click on another entity: the two will be linked together, highlighted in the same color, and given the same coref number (the number in the right corner of the frame). That simple!
To deselect, just click on empty space in the main window.
To unlink a span from an entity, select it and then click on it again. It will turn into a singleton. You can also use the table on the right side. If the text is long and you don't want to scroll, just click on an entity in the table to link spans to it. Entities that are not singletons are added to the table automatically, though you can add singletons too. The entity color can be changed, except for singletons.
You can load data from the PreCo dataset into TagEditor directly. Unzip the PreCo dataset, run TagEditor and select menu File->Load PreCO/Coref->(select file). You can test it with the file coref_example.jsonl

Text Categories
In the Text Categories window you can assign labels to paragraphs, sentences or spans (see below).
Select the score in the TAG SET panel, True or False (i.e. 1.0 or 0.0), and select a category label. Go to the editor window and click on a sentence. The category and score will be added. You can easily switch the score between True and False by clicking on the score label in the editor window. Multiple, non-mutually exclusive labels are supported.
Use the Assign/unassign all check button to assign or unassign all labels for all sentences in one click. Then you can manually change the True/False status of each label or delete a label in the editor window.
For demo purposes, the text classifier of this tool was trained on the IMDB dataset with the labels 'POSITIVE' and 'NEGATIVE'.
https://spacy.io/usage/training#textcat
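
As a rough illustration, text category annotations in spaCy's simple training style are usually a "cats" dictionary that maps each label to a score of 1.0 or 0.0. The sample sentences and the helper `toggle` below are hypothetical; `toggle` only mimics what clicking a score label does in the editor.

```python
# Hypothetical examples: each label carries its own True/False score,
# so labels do not have to be mutually exclusive.
TEXTCAT_DATA = [
    ("A wonderful, moving film.", {"cats": {"POSITIVE": 1.0, "NEGATIVE": 0.0}}),
    ("Two hours I will never get back.", {"cats": {"POSITIVE": 0.0, "NEGATIVE": 1.0}}),
]


def toggle(cats, label):
    """Return a copy of cats with the score of one label flipped."""
    flipped = dict(cats)
    flipped[label] = 1.0 - flipped[label]
    return flipped


first_cats = TEXTCAT_DATA[0][1]["cats"]
print(toggle(first_cats, "NEGATIVE"))
```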
'Spans classification mode' allows multiple overlapping labels. It can be used as an all-purpose text tagger with the data format (index of first token, index of last token + 1, label name). Indices are zero-based.
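
The span format above can be sketched as follows. The tokens, labels and helper functions are hypothetical; the only thing taken from the description is the tuple layout (first token index, last token index + 1, label) with zero-based indices.

```python
tokens = ["The", "new", "phone", "is", "great", "but", "pricey"]

# (first token index, last token index + 1, label); spans may overlap.
spans = [(0, 3, "PRODUCT"), (3, 5, "PRAISE"), (0, 7, "REVIEW")]


def span_texts(tokens, spans):
    """Return (label, text) for each span."""
    return [(label, " ".join(tokens[start:end])) for start, end, label in spans]


def overlapping_pairs(spans):
    """Return label pairs whose token ranges share at least one token."""
    pairs = []
    for i, (s1, e1, l1) in enumerate(spans):
        for s2, e2, l2 in spans[i + 1:]:
            if s1 < e2 and s2 < e1:
                pairs.append((l1, l2))
    return pairs


print(span_texts(tokens, spans))
print(overlapping_pairs(spans))
```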


To use your own pretrained models or other spaCy models with TagEditor, acquire the full version of TagEditor.

*If you have any suggestions for improving the program or adding extra features, feel free to leave a comment or email [email protected]


Stars: ✭ 92
