
golsun / NLP-tools

Licence: other
Useful Python NLP tools (evaluation, GUI, tokenization)

Programming Languages

python

Projects that are alternatives to or similar to NLP-tools

Fastnlp
fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
Stars: ✭ 2,441 (+6158.97%)
Mutual labels:  text-processing, nlp-parsing, nlp-library
Ekphrasis
Ekphrasis is a text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags) and spell correction, using word statistics from 2 big corpora (English Wikipedia, Twitter - 330 million English tweets).
Stars: ✭ 433 (+1010.26%)
Mutual labels:  text-processing, nlp-library
Pyarabic
pyarabic
Stars: ✭ 183 (+369.23%)
Mutual labels:  text-processing, nlp-library
Pynlpl
PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).
Stars: ✭ 426 (+992.31%)
Mutual labels:  text-processing, nlp-library
GrammarEngine
Grammatical Dictionary of the Russian Language (+ English, Japanese, etc.)
Stars: ✭ 68 (+74.36%)
Mutual labels:  nlp-parsing, nlp-library
Tmtoolkit
Text Mining and Topic Modeling Toolkit for Python with parallel processing power
Stars: ✭ 135 (+246.15%)
Mutual labels:  evaluation, text-processing
precision-recall-distributions
Assessing Generative Models via Precision and Recall (official repository)
Stars: ✭ 80 (+105.13%)
Mutual labels:  evaluation, evaluation-metrics
PySODEvalToolkit
PySODEvalToolkit: A Python-based Evaluation Toolbox for Salient Object Detection and Camouflaged Object Detection
Stars: ✭ 59 (+51.28%)
Mutual labels:  evaluation, evaluation-metrics
VirtualBLU
A Virtual Assistant for Windows PC with wicked Qt Graphics.
Stars: ✭ 41 (+5.13%)
Mutual labels:  nlp-parsing
classy
classy is a simple-to-use library for building high-performance Machine Learning models in NLP.
Stars: ✭ 61 (+56.41%)
Mutual labels:  nlp-library
hck
A sharp cut(1) clone.
Stars: ✭ 542 (+1289.74%)
Mutual labels:  text-processing
AIODrive
Official Python/PyTorch Implementation for "All-In-One Drive: A Large-Scale Comprehensive Perception Dataset with High-Density Long-Range Point Clouds"
Stars: ✭ 32 (-17.95%)
Mutual labels:  evaluation
edd
Erlang Declarative Debugger
Stars: ✭ 20 (-48.72%)
Mutual labels:  evaluation
midi degradation toolkit
A toolkit for generating datasets of midi files which have been degraded to be 'un-musical'.
Stars: ✭ 29 (-25.64%)
Mutual labels:  evaluation
gnu-linux-shell-scripting
A foundation for GNU/Linux shell scripting
Stars: ✭ 23 (-41.03%)
Mutual labels:  text-processing
stringx
Drop-in replacements for base R string functions powered by stringi
Stars: ✭ 14 (-64.1%)
Mutual labels:  text-processing
image-matching-toolbox
This is a toolbox repository to help evaluate various methods that perform image matching from a pair of images.
Stars: ✭ 252 (+546.15%)
Mutual labels:  evaluation
support-tickets-classification
This case study shows how to create a model for text analysis and classification and deploy it as a web service in Azure cloud in order to automatically classify support tickets. This project is a proof of concept made by Microsoft (Commercial Software Engineering team) in collaboration with Endava http://endava.com/en
Stars: ✭ 142 (+264.1%)
Mutual labels:  text-processing
text
Qiniu Text Processing Libraries for Go
Stars: ✭ 25 (-35.9%)
Mutual labels:  text-processing
Giveme5W
Extraction of the five journalistic W-questions (5W) from news articles
Stars: ✭ 16 (-58.97%)
Mutual labels:  nlp-library

What does it do?

Provides simple Python tools for:

  • evaluation: calculate automated NLP metrics (BLEU, NIST, METEOR, entropy, etc.)
from metrics import nlp_metrics
nist, bleu, meteor, entropy, diversity, avg_len = nlp_metrics(
    path_refs=["demo/ref0.txt", "demo/ref1.txt"],
    path_hyp="demo/hyp.txt")

# nist = [1.8338, 2.0838, 2.1949, 2.1949]
# bleu = [0.4667, 0.441, 0.4017, 0.3224]
# meteor = 0.2832
# entropy = [2.5232, 2.4849, 2.1972, 1.7918]
# diversity = [0.8667, 1.000]
# avg_len = 5.0000
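
To make the last three outputs more concrete, here is a minimal sketch of how n-gram entropy, distinct-n diversity and average length are commonly computed. It is not taken from this repository: the helper names (ngrams, ngram_entropy, distinct_n), the natural-log Shannon entropy and the whitespace tokenization are all assumptions, and the three-sentence corpus is made up for illustration.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_entropy(sentences, n):
    """Shannon entropy (natural log) of the n-gram frequency distribution.
    Assumption: the repository may define entropy differently."""
    counts = Counter(g for s in sentences for g in ngrams(s.split(), n))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum(c / total * math.log(c / total) for c in counts.values())

def distinct_n(sentences, n):
    """Ratio of unique n-grams to all n-grams (often called distinct-n)."""
    grams = [g for s in sentences for g in ngrams(s.split(), n)]
    return len(set(grams)) / len(grams) if grams else 0.0

# Hypothetical hypotheses, one sentence per line of a hyp file.
hyps = ["a b c d e", "f g h i j", "a k b l m"]

entropy = [round(ngram_entropy(hyps, n), 4) for n in range(1, 5)]
diversity = [round(distinct_n(hyps, n), 4) for n in (1, 2)]
avg_len = sum(len(s.split()) for s in hyps) / len(hyps)
print(entropy, diversity, avg_len)
```

Under these assumptions, each list position corresponds to one n-gram order (1 to 4 for entropy, 1 and 2 for diversity), which matches the shape of the sample output above.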
  • tokenization: clean a string and handle punctuation, contractions, URLs, mentions, tags, etc.
from data_prepare import clean_str
s = " I don't know:). how about this?https://github.com"
clean_str(s)

# i do n't know :) . how about this ? __url__
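
For readers curious what this kind of cleaning involves, the following is a rough regex-based sketch. It is not the repository's clean_str: the function name simple_clean and the specific rules (lowercasing, a single ":)" emoticon, a small punctuation set) are assumptions chosen only to reproduce the example above, and mentions and tags are left out.

```python
import re

def simple_clean(s):
    """Hypothetical, simplified cleaner; the real clean_str may differ."""
    s = s.lower()
    # Replace URLs first so the dots inside them are not split as punctuation.
    s = re.sub(r"https?://\S+", " __url__ ", s)
    # Split the contraction "n't" off its verb: "don't" -> "do n't".
    s = re.sub(r"n't\b", " n't", s)
    # Keep the ":)" emoticon together as one token.
    s = re.sub(r":\)", " :) ", s)
    # Put spaces around common sentence punctuation.
    s = re.sub(r"([.,!?])", r" \1 ", s)
    # Collapse repeated whitespace and trim the ends.
    return " ".join(s.split())

print(simple_clean(" I don't know:). how about this?https://github.com"))
# i do n't know :) . how about this ? __url__
```

The order of the substitutions matters: URLs and emoticons are handled before generic punctuation so that their characters are not broken apart.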
  • dialog GUI: provides a graphical user interface (GUI). You only need to supply a respond() function.
from PyQt5 import QtWidgets
from dialog_gui import *

def my_respond_func(inp):
    # inp: str, the conversation history, with turns delimited by 'EOS'
    # return: a list of (score, hyp) tuples based on inp
    raise NotImplementedError("TODO: generate responses here")

app = QtWidgets.QApplication([])
respond_funcs = [my_respond_func]
gui = DialogGUI(respond_funcs, ['my_system_name'])
gui.w.update()
app.exec_()
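
As an illustration of the respond-function contract described in the comments above, here is a small sketch of a responder that can be tested without the GUI. The function name echo_respond, the canned replies, the scores, and the assumption that a higher score means a better hypothesis are all made up. The exact way turns are delimited (here, the bare word EOS) is also an assumption.

```python
import re

def echo_respond(inp):
    """Hypothetical responder: takes the conversation history as one string
    and returns a list of (score, hyp) tuples."""
    # Split the history into turns on the standalone word 'EOS'.
    turns = [t for t in re.split(r"\s*\bEOS\b\s*", inp) if t.strip()]
    last = turns[-1].strip() if turns else ""
    candidates = [
        (0.5, "Tell me more."),
        (1.0, "You said: " + last),
    ]
    # Highest score first (assumed ordering, not confirmed by the source).
    return sorted(candidates, reverse=True)

print(echo_respond("hi EOS how are you"))
```

A function like this could then be passed in the respond_funcs list in place of my_respond_func.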

Requirements

  • Tested with Python 2.7 and 3.6
  • For the GUI, you need PyQt5
  • For the evaluation part, please download the following 3rd-party packages and save them in a new folder named 3rdparty
    • mteval-v14c.pl (ftp://jaguar.ncsl.nist.gov/mt/resources/mteval-v14c.pl) to compute NIST. You may need to install the following Perl modules (e.g. with cpan install): XML::Twig, Sort::Naturally and String::Util
    • meteor-1.5 to compute METEOR. It requires Java
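
Before running the evaluation, a quick sanity check along the following lines may help. This is only a sketch: the function name check_eval_setup is hypothetical, and the exact layout inside 3rdparty (the script placed directly in the folder, METEOR in a subfolder called meteor-1.5) is an assumption, since the requirements above do not spell it out.

```python
import os
import shutil

def check_eval_setup(root="3rdparty"):
    """Report which evaluation prerequisites appear to be in place.
    The paths checked here are assumed, not confirmed by the project."""
    return {
        "perl on PATH": shutil.which("perl") is not None,
        "java on PATH": shutil.which("java") is not None,
        "mteval script": os.path.isfile(os.path.join(root, "mteval-v14c.pl")),
        "meteor folder": os.path.isdir(os.path.join(root, "meteor-1.5")),
    }

for name, ok in check_eval_setup().items():
    print(("ok      " if ok else "missing ") + name)
```

Any item reported as missing points at a requirement from the list above that still needs attention. The Perl modules themselves are not checked here.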