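Chinese Names Corpus
Chinese personal names corpus. Name generator. Chinese full names, surnames, given names, forms of address, Japanese names, transliterated names, English names. Can be used for Chinese word segmentation and person-name entity recognition.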
Stars: ✭ 3,053 (+11642.31%)
Fakenewscorpus
A dataset of millions of news articles scraped from a curated list of data sources.
Stars: ✭ 255 (+880.77%)
Nlp chinese corpus
Large Scale Chinese Corpus for NLP
Stars: ✭ 6,656 (+25500%)
Dialog corpus
Corpus for training Chinese and English dialogue systems (Datasets for Training Chatbot System)
Stars: ✭ 1,662 (+6292.31%)
Ua Gec
UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language
Stars: ✭ 108 (+315.38%)
Pubmed Rct
PubMed 200k RCT dataset: a large dataset for sequential sentence classification.
Stars: ✭ 101 (+288.46%)
Coarij
Corpus of Annual Reports in Japan
Stars: ✭ 55 (+111.54%)
Prosody
Helsinki Prosody Corpus and A System for Predicting Prosodic Prominence from Text
Stars: ✭ 139 (+434.62%)
Medmnist
[ISBI'21] MedMNIST Classification Decathlon: A Lightweight AutoML Benchmark for Medical Image Analysis
Stars: ✭ 338 (+1200%)
Clue
Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard
Stars: ✭ 2,425 (+9226.92%)
Dataset List
Lists of text corpora and more (mainly Japanese)
Stars: ✭ 84 (+223.08%)
Nlp bahasa resources
A Curated List of Dataset and Usable Library Resources for NLP in Bahasa Indonesia
Stars: ✭ 158 (+507.69%)
climateR
An R 📦 for getting point and gridded climate data by AOI
Stars: ✭ 93 (+257.69%)
infirmary-integrated
Medical device simulator for training healthcare professionals.
Stars: ✭ 27 (+3.85%)
DeepSentiPers
Repository for the experiments described in the paper named "DeepSentiPers: Novel Deep Learning Models Trained Over Proposed Augmented Persian Sentiment Corpus"
Stars: ✭ 17 (-34.62%)
covid19-data-greece
Datasets and analysis of the Novel Coronavirus (COVID-19) outbreak in Greece
Stars: ✭ 16 (-38.46%)
Audio-Classification-using-CNN-MLP
Multi-class audio classification using Deep Learning (MLP, CNN): the objective of this project is to build a multi-class classifier to identify the sound of a bee, a cricket, or noise.
Stars: ✭ 36 (+38.46%)
Filipino-Text-Benchmarks
Open-source benchmark datasets and pretrained transformer models in the Filipino language.
Stars: ✭ 22 (-15.38%)
squad-v1.1-pt
Portuguese translation of the SQuAD dataset
Stars: ✭ 13 (-50%)
BugZoo
Keep your bugs contained. A platform for studying historical software bugs.
Stars: ✭ 49 (+88.46%)
MICCAI21 MMQ
Multiple Meta-model Quantifying for Medical Visual Question Answering
Stars: ✭ 16 (-38.46%)
pump-and-dump-dataset
Additional material for paper: Pump and Dumps in the Bitcoin Era: Real Time Detection of Cryptocurrency Market Manipulations, ICCCN '20
Stars: ✭ 66 (+153.85%)
dialogue-datasets
Collects open dialog corpora and some useful data processing utilities.
Stars: ✭ 24 (-7.69%)
tracing-vs-freehand
Tracing Versus Freehand for Evaluating Computer-Generated Drawings (SIGGRAPH 2021)
Stars: ✭ 21 (-19.23%)
BIRL
BIRL: Benchmark on Image Registration methods with Landmark validations
Stars: ✭ 66 (+153.85%)
SpiCE-Corpus
An open-access corpus of conversational bilingual speech in Cantonese and English
Stars: ✭ 33 (+26.92%)
humanapi
The easiest way to integrate health data from anywhere - https://www.humanapi.co
Stars: ✭ 21 (-19.23%)
snorkeling
Extracting biomedical relationships from literature with Snorkel 🏊
Stars: ✭ 56 (+115.38%)
wolfpacs
WolfPACS is a DICOM load balancer written in Erlang.
Stars: ✭ 1 (-96.15%)
ChRIS ui
UI for ChRIS
Stars: ✭ 20 (-23.08%)
fastmorph
Fast corpus search engine originally made for the Corpus of the Written Tatar language
Stars: ✭ 14 (-46.15%)
OpenDialog
An Open-Source Package for Chinese Open-domain Conversational Chatbot (Chinese chit-chat dialogue system; one-click deployment of a WeChat chit-chat bot)
Stars: ✭ 94 (+261.54%)
dicom
C++11 and Boost based implementation of the DICOM standard.
Stars: ✭ 14 (-46.15%)
Indian ParallelCorpus
Curated list of publicly available parallel corpora for Indian Languages
Stars: ✭ 23 (-11.54%)
user quality
Dataset for Software Evolution and Quality Improvement
Stars: ✭ 27 (+3.85%)
dog days
Using AWS RDS and S3 to store data about my dogs' vaccination and medical records. Creating an R shiny app to keep track of and share records with vets. 🐶 🐶
Stars: ✭ 44 (+69.23%)
aarogya seva
A beautiful 😍 covid-19 app with self-assessment and more.
Stars: ✭ 118 (+353.85%)
dotty dict
Dictionary wrapper for quick access to deeply nested keys.
Stars: ✭ 67 (+157.69%)
HJDataset
A Large Dataset of Historical Japanese Documents with Complex Layouts
Stars: ✭ 19 (-26.92%)
HAR
Recognize one of six human activities such as standing, sitting, and walking using a Softmax Classifier trained on mobile phone sensor data.
Stars: ✭ 18 (-30.77%)