golsun / NLP-tools

Licence: other

Useful Python NLP tools (evaluation, GUI, tokenization)

Programming Languages

python


Projects that are alternatives to or similar to NLP-tools

Fastnlp

fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.

Stars: ✭ 2,441 (+6158.97%)

Mutual labels: text-processing, nlp-parsing, nlp-library

Ekphrasis

Ekphrasis is a text processing tool geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags) and spell correction, using word statistics from two large corpora (English Wikipedia and Twitter, 330 million English tweets).

Stars: ✭ 433 (+1010.26%)

Mutual labels: text-processing, nlp-library

Pyarabic

pyarabic

Stars: ✭ 183 (+369.23%)

Mutual labels: text-processing, nlp-library

Pynlpl

PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).

Stars: ✭ 426 (+992.31%)

Mutual labels: text-processing, nlp-library

GrammarEngine

Grammatical Dictionary of the Russian Language (+ English, Japanese, etc.)

Stars: ✭ 68 (+74.36%)

Mutual labels: nlp-parsing, nlp-library

Tmtoolkit

Text Mining and Topic Modeling Toolkit for Python with parallel processing power

Stars: ✭ 135 (+246.15%)

Mutual labels: evaluation, text-processing

precision-recall-distributions

Assessing Generative Models via Precision and Recall (official repository)

Stars: ✭ 80 (+105.13%)

Mutual labels: evaluation, evaluation-metrics

PySODEvalToolkit

PySODEvalToolkit: A Python-based Evaluation Toolbox for Salient Object Detection and Camouflaged Object Detection

Stars: ✭ 59 (+51.28%)

Mutual labels: evaluation, evaluation-metrics

VirtualBLU

A Virtual Assistant for Windows PC with wicked Qt Graphics.

Stars: ✭ 41 (+5.13%)

Mutual labels: nlp-parsing

classy

classy is a simple-to-use library for building high-performance Machine Learning models in NLP.

Stars: ✭ 61 (+56.41%)

Mutual labels: nlp-library

hck

A sharp cut(1) clone.

Stars: ✭ 542 (+1289.74%)

Mutual labels: text-processing

AIODrive

Official Python/PyTorch Implementation for "All-In-One Drive: A Large-Scale Comprehensive Perception Dataset with High-Density Long-Range Point Clouds"

Stars: ✭ 32 (-17.95%)

Mutual labels: evaluation

edd

Erlang Declarative Debugger

Stars: ✭ 20 (-48.72%)

Mutual labels: evaluation

midi degradation toolkit

A toolkit for generating datasets of midi files which have been degraded to be 'un-musical'.

Stars: ✭ 29 (-25.64%)

Mutual labels: evaluation

gnu-linux-shell-scripting

A foundation for GNU/Linux shell scripting

Stars: ✭ 23 (-41.03%)

Mutual labels: text-processing

stringx

Drop-in replacements for base R string functions powered by stringi

Stars: ✭ 14 (-64.1%)

Mutual labels: text-processing

image-matching-toolbox

This is a toolbox repository to help evaluate various methods that perform image matching from a pair of images.

Stars: ✭ 252 (+546.15%)

Mutual labels: evaluation

support-tickets-classification

This case study shows how to create a model for text analysis and classification and deploy it as a web service in Azure cloud in order to automatically classify support tickets. This project is a proof of concept made by Microsoft (Commercial Software Engineering team) in collaboration with Endava http://endava.com/en

Stars: ✭ 142 (+264.1%)

Mutual labels: text-processing

text

Qiniu Text Processing Libraries for Go

Stars: ✭ 25 (-35.9%)

Mutual labels: text-processing

Giveme5W

Extraction of the five journalistic W-questions (5W) from news articles

Stars: ✭ 16 (-58.97%)

Mutual labels: nlp-library


What does it do?

Provides easy-to-use Python tools for:

evaluation: calculate automated NLP metrics (BLEU, NIST, METEOR, entropy, etc.)

from metrics import nlp_metrics
nist, bleu, meteor, entropy, diversity, avg_len = nlp_metrics(
    path_refs=["demo/ref0.txt", "demo/ref1.txt"],
    path_hyp="demo/hyp.txt")

# nist = [1.8338, 2.0838, 2.1949, 2.1949]
# bleu = [0.4667, 0.441, 0.4017, 0.3224]
# meteor = 0.2832
# entropy = [2.5232, 2.4849, 2.1972, 1.7918]
# diversity = [0.8667, 1.000]
# avg_len = 5.0000
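
For readers unfamiliar with the entropy and diversity scores, here is a minimal sketch of how such n-gram statistics are commonly defined. It is not taken from this repository: the helper names `ngrams`, `distinct_n` and `ngram_entropy` are hypothetical, and whitespace tokenization is an assumption. The sample values above do look consistent with a natural-log entropy (for example, 2.4849 is approximately ln 12), but the actual implementation in `metrics` may differ.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams (as tuples) of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def _ngram_counts(sentences, n):
    # Assumption: sentences are already tokenized and space-separated.
    return Counter(g for s in sentences for g in ngrams(s.split(), n))


def distinct_n(sentences, n):
    """Diversity: number of unique n-grams divided by total n-grams."""
    counts = _ngram_counts(sentences, n)
    total = sum(counts.values())
    return len(counts) / total if total else 0.0


def ngram_entropy(sentences, n):
    """Shannon entropy (natural log) of the n-gram frequency distribution."""
    counts = _ngram_counts(sentences, n)
    total = sum(counts.values())
    if not total:
        return 0.0
    return -sum(c / total * math.log(c / total) for c in counts.values())


hyps = ["a b a b", "a c"]
print(distinct_n(hyps, 1), distinct_n(hyps, 2))
print(ngram_entropy(hyps, 1))
```

Higher values of either score indicate less repetitive output, which is why they are often reported next to BLEU-style overlap metrics for dialog systems.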

tokenization: clean strings and handle punctuation, contractions, URLs, mentions, tags, etc.

from data_prepare import clean_str
s = " I don't know:). how about this?https://github.com"
clean_str(s)

# i do n't know :) . how about this ? __url__
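
To give a feel for what this kind of cleaning involves, here is a rough regex-based sketch. It is not the code behind `clean_str`: the function name `simple_clean`, the rule order and the small emoticon list are all assumptions made for illustration, and mentions and tags are left out. Only the `__url__` placeholder is taken from the example output above.

```python
import re

URL_RE = re.compile(r"https?://\S+")
EMOTICON_RE = re.compile(r"(:\)|:\(|;\))")  # assumed, deliberately tiny list
PUNCT_RE = re.compile(r"([?!,.])")


def simple_clean(s):
    """Lowercase, mask URLs, split contractions and space out punctuation."""
    s = s.lower()
    s = URL_RE.sub(" __url__ ", s)       # mask URLs before touching '.' or '?'
    s = re.sub(r"n't\b", " n't", s)      # don't -> do n't
    s = EMOTICON_RE.sub(r" \1 ", s)      # keep emoticons as single tokens
    s = PUNCT_RE.sub(r" \1 ", s)         # separate sentence punctuation
    return " ".join(s.split())           # collapse repeated whitespace


print(simple_clean(" I don't know:). how about this?https://github.com"))
```

URLs are masked first so that the later punctuation rule does not break them into pieces.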

dialog GUI: provides a graphical user interface (GUI) for chatting with a dialog system. You only need to supply a respond() function.

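from PyQt5 import QtWidgets   # explicit import; PyQt5 is required for the GUI
from dialog_gui import *

def my_respond_func(inp):
    # TODO: replace this placeholder with your own system
    # input: type=str, value=conversation history; turns are delimited by 'EOS'
    # return: a list of (score, hyp) tuples based on the input
    return [(1.0, 'placeholder response')]
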
app = QtWidgets.QApplication([])
respond_funcs = [my_respond_func]
gui = DialogGUI(respond_funcs, ['my_system_name'])
gui.w.update()
app.exec_()
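
As an illustration of the respond function contract described in the comments above (a history string in, a list of (score, hyp) tuples out), here is a toy responder that can be tried without the GUI. The name `echo_respond`, the scores and the canned replies are hypothetical. Returning the candidates sorted by score is an assumption, since the source does not say whether the GUI expects any particular order.

```python
def echo_respond(inp):
    """Toy responder: echo the most recent turn of the conversation."""
    # Turns in the history are delimited by 'EOS'; drop empty fragments.
    turns = [t.strip() for t in inp.split("EOS") if t.strip()]
    last_turn = turns[-1] if turns else ""
    candidates = [
        (0.5, "tell me more"),
        (1.0, "you said: " + last_turn),
    ]
    # Assumption: present the highest-scoring hypothesis first.
    return sorted(candidates, key=lambda pair: pair[0], reverse=True)


print(echo_respond("hi EOS how are you"))
```

A function like this could be passed in the respond_funcs list in place of my_respond_func to check that the interface works before wiring in a real model.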

Requirements

Tested with Python 2.7 and 3.6.
The GUI requires PyQt5.
For the evaluation part, download the following third-party packages and save them in a new folder named 3rdparty:
- mteval-v14c.pl (ftp://jaguar.ncsl.nist.gov/mt/resources/mteval-v14c.pl) to compute NIST. You may need to install the following Perl modules (e.g. with cpan install): XML::Twig, Sort::Naturally and String::Util
- meteor-1.5 to compute METEOR. It requires Java
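
Before running the evaluation, a quick check along the following lines may help catch a missing dependency. This is only a sketch: the function name `check_eval_requirements` is hypothetical, and it assumes that both items sit directly inside 3rdparty under the names listed above, which the source does not spell out. It needs Python 3.3 or later for `shutil.which`.

```python
import os
import shutil


def check_eval_requirements(root="3rdparty"):
    """Return a list of problems found; an empty list means nothing is missing."""
    problems = []
    # Assumed layout: 3rdparty/mteval-v14c.pl and 3rdparty/meteor-1.5
    for name in ("mteval-v14c.pl", "meteor-1.5"):
        path = os.path.join(root, name)
        if not os.path.exists(path):
            problems.append("missing " + path)
    # NIST scoring needs Perl and METEOR needs Java.
    for exe in ("perl", "java"):
        if shutil.which(exe) is None:
            problems.append(exe + " not found on PATH")
    return problems


for problem in check_eval_requirements():
    print(problem)
```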
