dialogue-datasets: Collects open dialogue corpora and some useful data-processing utilities.
Stars: ✭ 24 (-63.08%)
TV4Dialog: No description or website provided.
Stars: ✭ 33 (-49.23%)
BSD: The Business Scene Dialogue corpus
Stars: ✭ 51 (-21.54%)
kanji-frequency: Kanji usage frequency data collected from various sources
Stars: ✭ 92 (+41.54%)
Cluecorpus2020: Large-scale Pre-training Corpus for Chinese (100G Chinese pre-training corpus)
Stars: ✭ 278 (+327.69%)
KWDLC: Kyoto University Web Document Leads Corpus
Stars: ✭ 64 (-1.54%)
compact-wine: No description or website provided.
Stars: ✭ 87 (+33.85%)
frostpunk mod: Frostpunk / Mod Tools / Unofficial Japanese localization mod tool
Stars: ✭ 17 (-73.85%)
sample-ui-vue-pages: Bootstrap + Vue.js [ Scss / Babel ] (Multi-Page/SSR Model)
Stars: ✭ 20 (-69.23%)
geodaData: Data package for accessing GeoDa datasets using R
Stars: ✭ 15 (-76.92%)
nytwit: New York Times Word Innovation Types dataset
Stars: ✭ 21 (-67.69%)
mlx: Machine Learning eXchange (MLX). Data and AI Assets Catalog and Execution Engine
Stars: ✭ 132 (+103.08%)
proiel-treebank: Official releases of the PROIEL treebank of ancient Indo-European languages
Stars: ✭ 30 (-53.85%)
rasa ch faq: A rasa demo bot built with rasa, with some surprising features such as FAQ, knowledge graph, and multi-turn dialogue
Stars: ✭ 156 (+140%)
delitos-caba: 🚓 Crime dataset for the City of Buenos Aires, Argentina
Stars: ✭ 44 (-32.31%)
industrial-ml-datasets: A curated list of publicly available datasets for machine learning research in the area of manufacturing
Stars: ✭ 45 (-30.77%)
gum: Repository for the Georgetown University Multilayer Corpus (GUM)
Stars: ✭ 71 (+9.23%)
jmdict-kindle: Japanese - English dictionary for Kindle based on the JMdict / EDICT database
Stars: ✭ 151 (+132.31%)
OpenConvert: Text conversion tool (from e.g. Word, HTML, txt) to the corpus formats TEI or FoLiA
Stars: ✭ 20 (-69.23%)
DANeS: DANeS is an open-source e-newspaper dataset created in collaboration between DATASET JSC (dataset.vn) and AIV Group (aivgroup.vn)
Stars: ✭ 64 (-1.54%)
tvsub: TVsub: DCU-Tencent Chinese-English Dialogue Corpus
Stars: ✭ 40 (-38.46%)
zkanji: Japanese language study suite and dictionary
Stars: ✭ 55 (-15.38%)
humanflow2: Official repository of Learning Multi-Human Optical Flow (IJCV 2019)
Stars: ✭ 37 (-43.08%)
biomechanics dataset: Information on publicly available data sets for biomechanics.
Stars: ✭ 31 (-52.31%)
torchgeo: TorchGeo: datasets, samplers, transforms, and pre-trained models for geospatial data
Stars: ✭ 1,125 (+1630.77%)
Kawazu: A C# library for converting Japanese sentences to Hiragana, Katakana or Romaji, with furigana and okurigana modes supported. Inspired by the Kuroshiro project.
Stars: ✭ 33 (-49.23%)
CHR: SIXray: A Large-scale Security Inspection X-ray Benchmark in CVPR 2019
Stars: ✭ 78 (+20%)
next-qrcode: React hooks for generating QRCode for your next React apps.
Stars: ✭ 87 (+33.85%)
lang-ja: Manages the Japanese language files that are distributed with Vim.
Stars: ✭ 20 (-69.23%)
bugrepo: A collection of publicly available bug reports
Stars: ✭ 93 (+43.08%)
opensource-voice-tools: A repo listing known open source voice tools, ordered by where they sit in the voice stack
Stars: ✭ 21 (-67.69%)
git-rdm: A research data management plugin for the Git version control system.
Stars: ✭ 34 (-47.69%)
trafilatura: Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
Stars: ✭ 711 (+993.85%)
kanjigrid: A web-app displaying the 2200 kanji characters taught in James Heisig's "Remembering the Kanji", 6th edition.
Stars: ✭ 37 (-43.08%)
rclc: Rich Context leaderboard competition, including the corpus and current SOTA for required tasks.
Stars: ✭ 20 (-69.23%)
german-nouns: A list of ~100,000 German nouns and their grammatical properties, compiled from WiktionaryDE as a CSV file, plus a module to look up the data and parse compound words.
Stars: ✭ 101 (+55.38%)
DiscEval: Discourse Based Evaluation of Language Understanding
Stars: ✭ 18 (-72.31%)
ocr2text: Convert a PDF via OCR to a TXT file in UTF-8 encoding
Stars: ✭ 90 (+38.46%)
kanjigrid: Fork of the Kanji Grid addon for Anki
Stars: ✭ 21 (-67.69%)
Kaku: 画 - Japanese OCR Dictionary
Stars: ✭ 160 (+146.15%)
dagpi: Dagpi is a powerful and fast API that performs image manipulation and serves datasets. It is written in Rust and Python. Perfect for Discord bots, social media apps, camera apps and more.
Stars: ✭ 25 (-61.54%)
scrapeOP: A Python package for scraping oddsportal.com
Stars: ✭ 99 (+52.31%)
morghulis: No description or website provided.
Stars: ✭ 18 (-72.31%)