Fastnlp: fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
Stars: ✭ 2,441 (+6158.97%)
Ekphrasis: Ekphrasis is a text processing tool geared towards text from social networks such as Twitter or Facebook. It performs tokenization, word normalization, word segmentation (for splitting hashtags), and spell correction, using word statistics from two large corpora (English Wikipedia and Twitter, 330 million English tweets).
Stars: ✭ 433 (+1010.26%)
Pynlpl: PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as extracting n-grams and frequency lists, and for building simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL), as well as clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).
Stars: ✭ 426 (+992.31%)
PySODEvalToolkit: A Python-based Evaluation Toolbox for Salient Object Detection and Camouflaged Object Detection
Stars: ✭ 59 (+51.28%)
Pyarabic: pyarabic
Stars: ✭ 183 (+369.23%)
Tmtoolkit: Text Mining and Topic Modeling Toolkit for Python with parallel processing power
Stars: ✭ 135 (+246.15%)
GrammarEngine: Grammatical Dictionary of the Russian Language (+ English, Japanese, etc.)
Stars: ✭ 68 (+74.36%)
TRUNAJOD2.0: An easy-to-use library to extract indices from texts.
Stars: ✭ 18 (-53.85%)
hck: A sharp cut(1) clone.
Stars: ✭ 542 (+1289.74%)
CBLUE: Chinese medical information processing benchmark. CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark
Stars: ✭ 379 (+871.79%)
Hr: Easy Access to Uppercase H
Stars: ✭ 56 (+43.59%)
VirtualBLU: A Virtual Assistant for Windows PC with wicked Qt Graphics.
Stars: ✭ 41 (+5.13%)
classy: classy is a simple-to-use library for building high-performance Machine Learning models in NLP.
Stars: ✭ 61 (+56.41%)
image-matching-toolbox: This is a toolbox repository to help evaluate various methods that perform image matching from a pair of images.
Stars: ✭ 252 (+546.15%)
S-measure: Structure-measure: A New Way to Evaluate Foreground Maps, IJCV2021 (ICCV 2017-Spotlight)
Stars: ✭ 43 (+10.26%)
datasets: 🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Stars: ✭ 13,870 (+35464.1%)
text: Qiniu Text Processing Libraries for Go
Stars: ✭ 25 (-35.9%)
Giveme5W: Extraction of the five journalistic W-questions (5W) from news articles
Stars: ✭ 16 (-58.97%)
OpenPrompt: An Open-Source Framework for Prompt-Learning.
Stars: ✭ 1,769 (+4435.9%)
Textrude: Code generation from YAML/JSON/CSV models via SCRIBAN templates
Stars: ✭ 79 (+102.56%)
simple NER: Simple rule-based named entity recognition
Stars: ✭ 29 (-25.64%)
AIODrive: Official Python/PyTorch Implementation for "All-In-One Drive: A Large-Scale Comprehensive Perception Dataset with High-Density Long-Range Point Clouds"
Stars: ✭ 32 (-17.95%)
Compare-UserJS: PowerShell script for comparing user.js (or prefs.js) files.
Stars: ✭ 79 (+102.56%)
edd: Erlang Declarative Debugger
Stars: ✭ 20 (-48.72%)
midi degradation toolkit: A toolkit for generating datasets of MIDI files that have been degraded to be 'un-musical'.
Stars: ✭ 29 (-25.64%)
lingua-go: 👄 The most accurate natural language detection library for Go, suitable for long and short text alike
Stars: ✭ 684 (+1653.85%)
minie: An open information extraction system that provides compact extractions
Stars: ✭ 83 (+112.82%)
stringx: Drop-in replacements for base R string functions powered by stringi
Stars: ✭ 14 (-64.1%)
cinje: A Pythonic and ultra fast template engine DSL.
Stars: ✭ 26 (-33.33%)
WhatsMissingInGeoparsing: The accompanying code and data for the Springer 2017 publication "What's missing in geographical parsing?" in Language Resources and Evaluation.
Stars: ✭ 15 (-61.54%)
textstat: Ruby gem to calculate statistics from text to determine readability, complexity and grade level of a particular corpus.
Stars: ✭ 25 (-35.9%)
TextDatasetCleaner: 🔬 Cleaning datasets of garbage (normalization, preprocessing)
Stars: ✭ 27 (-30.77%)
pdq evaluation: Evaluation code for using the probabilistic detection quality (PDQ) measure for probabilistic object detection tasks. Currently supports COCO and robotic vision challenge (RVC) data.
Stars: ✭ 34 (-12.82%)
support-tickets-classification: This case study shows how to create a model for text analysis and classification and deploy it as a web service in the Azure cloud in order to automatically classify support tickets. This project is a proof of concept made by Microsoft (Commercial Software Engineering team) in collaboration with Endava http://endava.com/en
Stars: ✭ 142 (+264.1%)
nlcli: Natural language interface for the command line.
Stars: ✭ 21 (-46.15%)
f1-communities: A novel approach to evaluate community detection algorithms on ground truth
Stars: ✭ 20 (-48.72%)
hama-py: 🦛 Python Hangul processing library. Python Korean Morphological Analyzer
Stars: ✭ 16 (-58.97%)
table-evaluator: Evaluate real and synthetic datasets with each other
Stars: ✭ 44 (+12.82%)
nervaluate: Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval’13
Stars: ✭ 40 (+2.56%)
webMUSHRA: MUSHRA-compliant experiment software based on the Web Audio API
Stars: ✭ 171 (+338.46%)
SuffixTree: Optimized implementation of a suffix tree in Python using Ukkonen's algorithm.
Stars: ✭ 38 (-2.56%)
syntaxnet-api: A small HTTP API for SyntaxNet
Stars: ✭ 20 (-48.72%)
NLP Toolkit: Library of state-of-the-art models (PyTorch) for NLP tasks
Stars: ✭ 92 (+135.9%)
TextFeatureSelection: Python library for feature selection on text features. It provides a filter method, a genetic algorithm, and TextFeatureSelectionEnsemble for improving text classification models.
Stars: ✭ 42 (+7.69%)
go-eek: Blazingly fast and safe Go evaluation library, built on top of the Go pkg/plugin package
Stars: ✭ 37 (-5.13%)
andaluh-js: Transliterate español (Spanish) spelling to andaluz proposals using JavaScript
Stars: ✭ 22 (-43.59%)
MusDr: Evaluation metrics for machine-composed symbolic music. Paper: "The Jazz Transformer on the Front Line: Exploring the Shortcomings of AI-Composed Music through Quantitative Measures", ISMIR 2020
Stars: ✭ 38 (-2.56%)
finglish: A Finglish to Persian converter.
Stars: ✭ 60 (+53.85%)
CYK-Parser: A CYK parser written in Python 3.
Stars: ✭ 24 (-38.46%)
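Nuts: Testing ground for solutions to common natural language processing tasks (mainly text classification, sequence labeling, automatic question answering, etc.)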
Stars: ✭ 21 (-46.15%)
frog: Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg memory-based learning software package.
Stars: ✭ 70 (+79.49%)