A simple NLP library that allows profiling datasets with one or more text columns. Given a dataset and the name of a column containing text data, NLP Profiler returns either high-level insights or low-level, granular statistical information about the text in that column.

Stars: ✭ 181 (-92.94%)

Mutual labels: natural-language-processing

Lazynlp

Library to scrape and clean web pages to create massive datasets.

Stars: ✭ 1,985 (-22.58%)

Mutual labels: natural-language-processing

Udpipe

R package for Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing Based on the UDPipe Natural Language Processing Toolkit

Stars: ✭ 160 (-93.76%)

Mutual labels: natural-language-processing

Displacy Ent

💥 displaCy-ent.js: An open-source named entity visualiser for the modern web

Stars: ✭ 191 (-92.55%)

Mutual labels: natural-language-processing

Deep Survey Text Classification

The project surveys 16+ Natural Language Processing (NLP) research papers that propose novel Deep Neural Network Models for Text Classification, based on Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN). It also implements each of the models using TensorFlow and Keras.

Stars: ✭ 187 (-92.71%)

Mutual labels: natural-language-processing

Gerbil

GERBIL - General Entity annotatoR Benchmark

Stars: ✭ 180 (-92.98%)

Mutual labels: named-entity-recognition

Covid Papers Browser

Browse Covid-19 & SARS-CoV-2 Scientific Papers with Transformers 🦠 📖

Stars: ✭ 161 (-93.72%)

Mutual labels: natural-language-processing

Ngx Dynamic Dashboard Framework

This is a JSON driven angular x based dashboard framework that is inspired by JIRA's dashboard implementation and https://github.com/raulgomis/angular-dashboard-framework

Stars: ✭ 160 (-93.76%)

Mutual labels: natural-language-processing

Stopwords

Default English stopword lists from many different sources

Stars: ✭ 179 (-93.02%)

Mutual labels: natural-language-processing

Nlp bahasa resources

A Curated List of Dataset and Usable Library Resources for NLP in Bahasa Indonesia

Stars: ✭ 158 (-93.84%)

Mutual labels: natural-language-processing

Mixtext

MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification

Stars: ✭ 159 (-93.8%)

Mutual labels: natural-language-processing

Bert Vocab Builder

Builds a WordPiece (subword) vocabulary compatible with Google Research's BERT

Stars: ✭ 187 (-92.71%)

Mutual labels: natural-language-processing

Cookiecutter Spacy Fastapi

Cookiecutter API for creating Custom Skills for Azure Search using Python and Docker

Stars: ✭ 179 (-93.02%)

Mutual labels: natural-language-processing

Pytorch Nlp

Basic Utilities for PyTorch Natural Language Processing (NLP)

Stars: ✭ 1,996 (-22.15%)

Mutual labels: natural-language-processing

Mtbook

Machine Translation: Foundations and Models (《机器翻译：基础与模型》), by Xiao Tong and Zhu Jingbo

Stars: ✭ 2,307 (-10.02%)

Mutual labels: natural-language-processing

Cs224n 2019

My completed implementation solutions for CS224N 2019

Stars: ✭ 178 (-93.06%)

Mutual labels: natural-language-processing

Mishkal

Mishkal is Arabic text vocalization software

Stars: ✭ 158 (-93.84%)

Mutual labels: natural-language-processing

Nlpre

Python library for Natural Language Preprocessing (NLPre)

Stars: ✭ 158 (-93.84%)

Mutual labels: natural-language-processing

Nlvr

Cornell NLVR and NLVR2 are natural language grounding datasets. Each example shows a visual input and a sentence describing it, and is annotated with the truth-value of the sentence.

Stars: ✭ 192 (-92.51%)

Mutual labels: natural-language-processing

Delbot

It understands your voice commands, searches news and knowledge sources, and summarizes and reads out content to you.

Stars: ✭ 191 (-92.55%)

Mutual labels: natural-language-processing

Deepinterests

Deeply Interesting (深度有趣)

Stars: ✭ 2,232 (-12.95%)

Mutual labels: natural-language-processing

Kashgari

Kashgari is a production-level NLP transfer learning framework built on top of tf.keras for text labeling and text classification; it includes Word2Vec, BERT, and GPT2 language embeddings.

Stars: ✭ 2,235 (-12.83%)

Mutual labels: named-entity-recognition

Gensim

Topic Modelling for Humans

Stars: ✭ 12,763 (+397.78%)

Mutual labels: natural-language-processing

Awesome Nlp

📖 A curated list of resources dedicated to Natural Language Processing (NLP)

Stars: ✭ 12,626 (+392.43%)

Mutual labels: natural-language-processing

Nel

Entity linking framework

Stars: ✭ 176 (-93.14%)

Mutual labels: natural-language-processing

Awesome Pytorch List

A comprehensive list of PyTorch-related content on GitHub, such as different models, implementations, helper libraries, tutorials, etc.

Stars: ✭ 12,475 (+386.54%)

Mutual labels: natural-language-processing

Sling

SLING - A natural language frame semantics parser

Stars: ✭ 1,892 (-26.21%)

Mutual labels: natural-language-processing

Deep Generative Models For Natural Language Processing

DGMs for NLP. A roadmap.

Stars: ✭ 185 (-92.78%)

Mutual labels: natural-language-processing

Fastnlp

fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.

Stars: ✭ 2,441 (-4.8%)

Mutual labels: natural-language-processing

Visdial Rl

PyTorch code for Learning Cooperative Visual Dialog Agents using Deep Reinforcement Learning

Stars: ✭ 157 (-93.88%)

Mutual labels: natural-language-processing

Holiday Cn

📅🇨🇳 Chinese statutory holiday data, automatically scraped daily from State Council announcements

Stars: ✭ 157 (-93.88%)

Mutual labels: natural-language-processing

Cleannlp

R package providing annotators and a normalized data model for natural language processing

Stars: ✭ 174 (-93.21%)

Mutual labels: natural-language-processing

Speech signal processing and classification

Front-end speech processing aims at extracting proper features from short-term segments of a speech utterance, known as frames. It is a prerequisite step toward any pattern recognition problem employing speech or audio (e.g., music). Here, we are interested in voice disorder classification, that is, in developing two-class classifiers that can discriminate between utterances of a subject suffering from, say, vocal fold paralysis and utterances of a healthy subject. The mathematical modeling of the speech production system in humans suggests that an all-pole system function is justified [1-3]. As a consequence, linear prediction coefficients (LPCs) constitute a first choice for modeling the magnitude of the short-term spectrum of speech. LPC-derived cepstral coefficients are guaranteed to discriminate between the system (e.g., vocal tract) contribution and that of the excitation. Taking into account the characteristics of the human ear, the mel-frequency cepstral coefficients (MFCCs) emerged as descriptive features of the speech spectral envelope. Similarly to MFCCs, the perceptual linear prediction coefficients (PLPs) could also be derived. The aforementioned, so to speak, traditional features will be tested against agnostic features extracted by convolutive neural networks (CNNs) (e.g., auto-encoders) [4]. The pattern recognition step will be based on Gaussian Mixture Model based classifiers, K-nearest neighbor classifiers, Bayes classifiers, as well as Deep Neural Networks. The Massachusetts Eye and Ear Infirmary Dataset (MEEI-Dataset) [5] will be exploited. At the application level, a library for feature extraction and classification in Python will be developed. Credible publicly available resources, such as KALDI, will be used toward achieving our goal. Comparisons will be made against [6-8].

Stars: ✭ 155 (-93.95%)

Mutual labels: natural-language-processing

Swagaf

Repository for paper "SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference"

Stars: ✭ 156 (-93.92%)

Mutual labels: natural-language-processing

Dostoevsky

Sentiment analysis library for the Russian language

Stars: ✭ 191 (-92.55%)

Mutual labels: natural-language-processing

Neuralqa

NeuralQA: A Usable Library for Question Answering on Large Datasets with BERT