Open-korean-corporaOpen Korean NLP Dataset Curation for the Users All Around the Globe
Stars: ✭ 82 (+41.38%)
ProsodyHelsinki Prosody Corpus and A System for Predicting Prosodic Prominence from Text
Stars: ✭ 139 (+139.66%)
Kor2vecLibrary for Korean morpheme and word vector representation
Stars: ✭ 64 (+10.34%)
FakenewscorpusA dataset of millions of news articles scraped from a curated list of data sources.
Stars: ✭ 255 (+339.66%)
Pytreebank😡😇 Stanford Sentiment Treebank loader in Python
Stars: ✭ 93 (+60.34%)
ChazutsuThe tool to make NLP datasets ready to use
Stars: ✭ 238 (+310.34%)
MtntCode for the collection and analysis of the MTNT dataset
Stars: ✭ 48 (-17.24%)
Text2sql DataA collection of datasets that pair questions with SQL queries.
Stars: ✭ 287 (+394.83%)
DoccanoOpen source annotation tool for machine learning practitioners.
Stars: ✭ 5,600 (+9555.17%)
Oie ResourcesA curated list of Open Information Extraction (OIE) resources: papers, code, data, etc.
Stars: ✭ 283 (+387.93%)
Open Korean TextOpen Korean Text Processor - An Open-source Korean Text Processor
Stars: ✭ 438 (+655.17%)
Mams For AbsaA Multi-Aspect Multi-Sentiment Dataset for aspect-based sentiment analysis.
Stars: ✭ 135 (+132.76%)
CoarijCorpus of Annual Reports in Japan
Stars: ✭ 55 (-5.17%)
Pytorch Bert Crf NerBERT+CRF based Named Entity Recognition model for Korean, built with KoBERT and CRF
Stars: ✭ 236 (+306.9%)
Pytorch NlpBasic Utilities for PyTorch Natural Language Processing (NLP)
Stars: ✭ 1,996 (+3341.38%)
Nlp bahasa resourcesA Curated List of Datasets and Usable Library Resources for NLP in Bahasa Indonesia
Stars: ✭ 158 (+172.41%)
BondBOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision
Stars: ✭ 96 (+65.52%)
Ua GecUA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language
Stars: ✭ 108 (+86.21%)
Hunspell Dict KoKorean spellchecking dictionary for Hunspell
Stars: ✭ 187 (+222.41%)
WikisqlA large annotated semantic parsing corpus for developing natural language interfaces.
Stars: ✭ 965 (+1563.79%)
Research papersA record of papers I have read and the notes I took on them, plus some paper reading lists and academic blog posts.
Stars: ✭ 55 (-5.17%)
Covid 19Novel Coronavirus 2019 time series data on cases
Stars: ✭ 1,060 (+1727.59%)
Tensorflow Lstm SinTensorFlow 1.3 experiment with LSTM (and GRU) RNNs for sine prediction
Stars: ✭ 52 (-10.34%)
Codar✅ CODAR is a Framework built using PyTorch to analyze post (Text+Media) and predict Cyber Bullying and offensive content. 💬📷
Stars: ✭ 52 (-10.34%)
View Finding NetworkA deep ranking network that learns to find good compositions in a photograph.
Stars: ✭ 57 (-1.72%)
Iob2corpusJapanese IOB2 tagged corpus for Named Entity Recognition.
Stars: ✭ 51 (-12.07%)
DemosSome JavaScript works published as demos, mostly ML or DS
Stars: ✭ 55 (-5.17%)
Images Web CrawlerThis package is a complete tool for creating a large dataset of images (designed especially, but not only, for machine learning enthusiasts). It can crawl the web, download images, rename, resize, and convert the images, and merge folders.
Stars: ✭ 51 (-12.07%)
Lingua FrancaMycroft's multilingual text parsing and formatting library
Stars: ✭ 51 (-12.07%)
Yesterday I LearnedBrainfarts are caused by the rupturing of the cerebral sphincter.
Stars: ✭ 50 (-13.79%)
CourseraforumsAnonymized versions of the discussion threads from the forums of 60 Coursera MOOCs
Stars: ✭ 50 (-13.79%)
Spark NkpNatural Korean Processor for Apache Spark
Stars: ✭ 50 (-13.79%)
Gobyexample🎁 Go By Example, Korean version
Stars: ✭ 50 (-13.79%)
Php MlPHP-ML - Machine Learning library for PHP
Stars: ✭ 7,900 (+13520.69%)
CorenlpStanford CoreNLP: A Java suite of core NLP tools.
Stars: ✭ 8,248 (+14120.69%)
Knyfeknyfe is a Python utility for rapid exploration of datasets.
Stars: ✭ 54 (-6.9%)
PatternWeb mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
Stars: ✭ 8,112 (+13886.21%)
Control User CursorAlter user cursor behavior. Simulates the user's cursor and can apply transformations to it.
Stars: ✭ 1,050 (+1710.34%)
Stevens Vlp16 DatasetThis dataset was captured using a Velodyne VLP-16 mounted on a UGV (a Clearpath Jackal) on the Stevens Institute of Technology campus
Stars: ✭ 58 (+0%)
Geodata BrFree open public domain geographic data of Brazil available in multiple languages and formats.
Stars: ✭ 57 (-1.72%)
Covidnet CtCOVID-Net Open Source Initiative - Models and Data for COVID-19 Detection in Chest CT
Stars: ✭ 57 (-1.72%)
Distil💧 In memory dataset filtering, inspired by snikch/aggro
Stars: ✭ 49 (-15.52%)
ScdvText classification with Sparse Composite Document Vectors.
Stars: ✭ 54 (-6.9%)
Li emnlp 2017Deep Recurrent Generative Decoder for Abstractive Text Summarization in DyNet
Stars: ✭ 56 (-3.45%)
Jieba Php"Jieba" (結巴, Chinese for "to stutter") Chinese text segmentation: built to be the best PHP Chinese word segmentation module.
Stars: ✭ 1,073 (+1750%)
Cs224n SolutionsSolutions for CS224n course from Stanford University: Natural Language Processing with Deep Learning
Stars: ✭ 48 (-17.24%)