Jiayan (甲言), an NLP toolkit for Classical Chinese (古汉语/古文/文言文/文言). It supports lexicon construction, tokenization, POS tagging, sentence segmentation, and punctuation.

✭ 167

python nlp

Radish

C++ framework for model training and inference

✭ 168

deep-learning nlp

Rouge 2.0

ROUGE automatic summarization evaluation toolkit. Supports ROUGE-[N, L, S, SU], stemming and stopwords in different languages, Unicode text evaluation, and CSV output.

✭ 167

java nlp metrics evaluation text-summarization

Node Postal

NodeJS bindings to libpostal for fast international address parsing/normalization

✭ 165

nlp native binding address

Indic Bert

BERT-based Multilingual Model for Indian Languages

✭ 160

python nlp language-model

Turkish Stemmer Python

🐍 Turkish Language Stemmer for Python

✭ 165

python language nlp natural-language-processing

Improved Dynamic Memory Networks Dmn Plus

Theano Implementation of DMN+ (Improved Dynamic Memory Networks) from the paper by Xiong, Merity, & Socher at MetaMind, http://arxiv.org/abs/1603.01417 (Dynamic Memory Networks for Visual and Textual Question Answering)

✭ 165

python deep-learning nlp neural-network deep-neural-networks question-answering

Text Emotion Classification

Archived - not answering issues

✭ 165

jupyter-notebook nlp keras deep-neural-networks sentiment-classification

Fixy

Our goal is to build an open-source spelling assistant/checker that tackles many different problems in the Turkish NLP literature together, proposes unique approaches, and fills the gaps left by existing work. It aims to fix spelling errors in users' text with a deep learning approach while also performing semantic analysis of the text, so that errors arising in that context can be detected and corrected as well.

✭ 165

python jupyter-notebook deep-learning nlp data-science keras neural-network natural-language-processing artificial-intelligence ai neural-networks deeplearning

Metalearning4nlp Papers

A list of recent papers about Meta / few-shot learning methods applied in NLP areas.

✭ 163

nlp meta-learning dialogue-systems

Prosodic

Prosodic: a metrical-phonological parser, written in Python. For English and Finnish, with flexible language support.

✭ 162

python nlp linguistics

Chatspace

A word-spacing model from Pingpong that works well with chat-style text!

✭ 163

python pytorch nlp korean

Negspacy

spaCy pipeline object for negating concepts in text

✭ 162

python nlp spacy

Xk Time

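xk-time is a toolkit for time conversion, time calculation, time formatting, time parsing, calendars, cron expressions, and time NLP. It uses Java 8, is thread-safe and easy to use, ships more than 70 common date-formatting templates, supports the Java 8 time classes and Date, and is lightweight with no third-party dependencies.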

✭ 162

java nlp calendar formatter date cron calculator

Solrtexttagger

A text tagger based on Lucene / Solr, using FST technology

✭ 162

java nlp named-entity-recognition solr

Textlint

The pluggable natural language linter for text and markdown.

✭ 2,158

javascript typescript nlp markdown linter lint natural-language textlint

Sru

SRU is a recurrent unit that can run over 10 times faster than cuDNN LSTM, with no loss of accuracy on the many tasks tested.

✭ 2,009

python Cuda C++ shell CMake deep-learning pytorch nlp recurrent-neural-networks

Multi rake

Multilingual Rapid Automatic Keyword Extraction (RAKE) for Python

✭ 162

python nlp text-mining

Denspi

Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index (DenSPI)

✭ 162

python nlp question-answering

Tokenizers

Fast, Consistent Tokenization of Natural Language Text

✭ 161

r nlp rstats r-package text-mining tokenizer

Lazynlp

Library to scrape and clean web pages to create massive datasets.

✭ 1,985

python nlp data-science natural-language-processing artificial-intelligence language-model text-mining open

Kasaya

A "WYSIWYG" (sort of) scripting language and runtime for browser automation

✭ 1,906

javascript nlp testing automation browser

Unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Udpipe

R package for Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing Based on the UDPipe Natural Language Processing Toolkit

✭ 160

r nlp natural-language-processing r-package text-mining tokenizer pos-tagging

Pyate

PYthon Automated Term Extraction

✭ 161

html nlp ai

Ai Hackathon 2018

"Go beyond limits and challenge your imagination!" Naver AI Hackathon 2018 (contest ended)

✭ 160

machine-learning nlp ai hackathon

Nlp bahasa resources

A Curated List of Datasets and Usable Library Resources for NLP in Bahasa Indonesia

✭ 158

library nlp natural-language-processing dataset sentiment-analysis packages corpus

Datetimeseer

A painless way to pick future time.

✭ 159

java android nlp datetime

Awesome Text Classification

Awesome-Text-Classification: projects, papers, and tutorials.

✭ 158

tensorflow awesome nlp classification text-classification text-mining text-analysis nltk

Ruijin round2

Ruijin Hospital MMC AI-Assisted Knowledge Graph Construction Competition, second round

✭ 159

jupyter-notebook nlp relation-extraction

Keras Xlnet

Implementation of XLNet that can load pretrained checkpoints

✭ 159

python nlp keras language-model

Transformers for text classification

Text classification based on Transformers

✭ 158

python nlp text-classification

Pytorch Nlp

Basic Utilities for PyTorch Natural Language Processing (NLP)

✭ 1,996

python shell deep-learning machine-learning pytorch nlp neural-network natural-language-processing dataset metrics embeddings data-loader sru word-vectors pytorch-nlp torchnlp

Nlpre

Python library for Natural Language Preprocessing (NLPre)

✭ 158

python nlp natural-language-processing text-processing

Vdcnn

Implementation of Very Deep Convolutional Neural Network for Text Classification

✭ 158

python tensorflow nlp keras convolutional-neural-networks text-classification keras-tensorflow

Ape

Parser for Attempto Controlled English (ACE)

✭ 156

prolog nlp ace

Awesome Nlp

📖 A curated list of resources dedicated to Natural Language Processing (NLP)

✭ 12,626

deep-learning machine-learning awesome awesome-list nlp natural-language-processing text-mining

Awesome Pytorch List

A comprehensive list of PyTorch-related content on GitHub, such as different models, implementations, helper libraries, tutorials, etc.

✭ 12,475

deep-learning machine-learning pytorch awesome awesome-list computer-vision nlp data-science neural-network natural-language-processing facebook tutorials papers cv nlp-library utility-library pytorch-tutorials pytorch-model

Sling

SLING - A natural language frame semantics parser

✭ 1,892

C++ python javascript Starlark HTML shell machine-learning nlp neural-network natural-language-processing natural-language-understanding jit-compiler frame-semantic-parsing

Pdfanno

Linguistic Annotation and Visualization Tool for PDF Documents

✭ 156

javascript nlp pdf annotation

Speech signal processing and classification

Front-end speech processing aims at extracting proper features from short-term segments of a speech utterance, known as frames. It is a prerequisite step toward any pattern recognition problem employing speech or audio (e.g., music). Here, we are interested in voice disorder classification, that is, in developing two-class classifiers that can discriminate between utterances of a subject suffering from, say, vocal fold paralysis and utterances of a healthy subject. The mathematical modeling of the speech production system in humans suggests that an all-pole system function is justified [1-3]. As a consequence, linear prediction coefficients (LPCs) constitute a first choice for modeling the magnitude of the short-term spectrum of speech. LPC-derived cepstral coefficients are guaranteed to discriminate between the system (e.g., vocal tract) contribution and that of the excitation. Taking into account the characteristics of the human ear, the mel-frequency cepstral coefficients (MFCCs) emerged as descriptive features of the speech spectral envelope. Similarly to MFCCs, the perceptual linear prediction coefficients (PLPs) could also be derived. These so-called traditional features will be tested against agnostic features extracted by convolutional neural networks (CNNs) (e.g., auto-encoders) [4]. The pattern recognition step will be based on Gaussian Mixture Model based classifiers, K-nearest neighbor classifiers, Bayes classifiers, as well as Deep Neural Networks. The Massachusetts Eye and Ear Infirmary Dataset (MEEI-Dataset) [5] will be exploited. At the application level, a library for feature extraction and classification in Python will be developed. Credible publicly available resources, such as KALDI, will be used toward achieving our goal. Comparisons will be made against [6-8].

✭ 155

python nlp natural-language-processing feature-extraction speech-processing classifier nltk

Java Deep Learning Cookbook

Code for Java Deep Learning Cookbook

✭ 156