344 open source projects that are alternatives to or similar to Chinese Nlp Corpus

Weixin public corpus
WeChat official account corpus
Stars: ✭ 465 (+6.16%)
Mutual labels:  corpus, chinese-nlp
Small Chinese Corpus
Some useful small Chinese corpus datasets
Stars: ✭ 462 (+5.48%)
Mutual labels:  corpus, chinese-nlp
Nlp chinese corpus
Large Scale Chinese Corpus for NLP
Stars: ✭ 6,656 (+1419.63%)
Mutual labels:  corpus, chinese-nlp
Cluedatasetsearch
Search all Chinese NLP datasets, with commonly used English NLP datasets included
Stars: ✭ 2,112 (+382.19%)
Mutual labels:  datasets, corpus
Cluecorpus2020
Large-scale Pre-training Corpus for Chinese (100G)
Stars: ✭ 278 (-36.53%)
Mutual labels:  datasets, corpus
Gossiping Chinese Corpus
Chinese question-and-answer corpus from the PTT Gossiping board
Stars: ✭ 137 (-68.72%)
Mutual labels:  corpus, chinese-nlp
open2ch-dialogue-corpus
A dialogue corpus built by crawling Open2ch (おーぷん2ちゃんねる)
Stars: ✭ 65 (-85.16%)
Mutual labels:  corpus, datasets
datasets
TFDS data loaders for sign language datasets.
Stars: ✭ 17 (-96.12%)
Mutual labels:  datasets
Medical Datasets
tracking medical datasets, with a focus on medical imaging
Stars: ✭ 296 (-32.42%)
Mutual labels:  datasets
covid-19-data-cleanup
Scripts to clean up data from https://github.com/CSSEGISandData/COVID-19
Stars: ✭ 25 (-94.29%)
Mutual labels:  datasets
open-discourse
Open Discourse is the first fully comprehensive corpus of the plenary proceedings of the federal German Parliament (Bundestag).
Stars: ✭ 47 (-89.27%)
Mutual labels:  corpus
Medical-Names-Corpus
Medical corpus. Corpus of medical institution names. Drug standard codes.
Stars: ✭ 26 (-94.06%)
Mutual labels:  corpus
Ltp
Language Technology Platform
Stars: ✭ 3,648 (+732.88%)
Mutual labels:  chinese-nlp
EdgarAllanPoetry
Computer-generated poetry
Stars: ✭ 22 (-94.98%)
Mutual labels:  corpus
Wordless
An Integrated Corpus Tool With Multilingual Support for the Study of Language, Literature, and Translation
Stars: ✭ 378 (-13.7%)
Mutual labels:  corpus
fastmorph
Fast corpus search engine originally made for the Corpus of Written Tatar language
Stars: ✭ 14 (-96.8%)
Mutual labels:  corpus
Open3d Ml
An extension of Open3D to address 3D Machine Learning tasks
Stars: ✭ 284 (-35.16%)
Mutual labels:  datasets
Awesome Autonomous Vehicle
Chinese edition of a resource list for autonomous driving
Stars: ✭ 389 (-11.19%)
Mutual labels:  datasets
podium
Podium: a framework agnostic Python NLP library for data loading and preprocessing
Stars: ✭ 55 (-87.44%)
Mutual labels:  datasets
Paperrobot
Code for PaperRobot: Incremental Draft Generation of Scientific Ideas
Stars: ✭ 372 (-15.07%)
Mutual labels:  datasets
Thulac Java
An Efficient Lexical Analyzer for Chinese
Stars: ✭ 285 (-34.93%)
Mutual labels:  chinese-nlp
disent
🧶 Modular VAE disentanglement framework for python built with PyTorch Lightning ▸ Including metrics and datasets ▸ With strongly supervised, weakly supervised and unsupervised methods ▸ Easily configured and run with Hydra config ▸ Inspired by disentanglement_lib
Stars: ✭ 41 (-90.64%)
Mutual labels:  datasets
DeepSentiPers
Repository for the experiments described in the paper named "DeepSentiPers: Novel Deep Learning Models Trained Over Proposed Augmented Persian Sentiment Corpus"
Stars: ✭ 17 (-96.12%)
Mutual labels:  corpus
Meglass
An eyeglass face dataset collected and cleaned for face recognition evaluation, CCBR 2018.
Stars: ✭ 281 (-35.84%)
Mutual labels:  datasets
dplace-data
The data repository for the D-PLACE Project (Database of Places, Language, Culture and Environment)
Stars: ✭ 49 (-88.81%)
Mutual labels:  datasets
download audioset
📁 This repo makes it easy to download the raw audio files from AudioSet (32.45 GB, 632 classes).
Stars: ✭ 53 (-87.9%)
Mutual labels:  datasets
newsletter-archive
Markdown archive & RSS/Atom feeds for Data Is Plural.
Stars: ✭ 65 (-85.16%)
Mutual labels:  datasets
Awesome Segmentation Saliency Dataset
A collection of some datasets for segmentation / saliency detection. Welcome to PR...😄
Stars: ✭ 315 (-28.08%)
Mutual labels:  datasets
Indian ParallelCorpus
Curated list of publicly available parallel corpus for Indian Languages
Stars: ✭ 23 (-94.75%)
Mutual labels:  corpus
Awesome Cybersecurity Datasets
A curated list of amazingly awesome Cybersecurity datasets
Stars: ✭ 380 (-13.24%)
Mutual labels:  datasets
Writing-editing-Network
Code for Paper Abstract Writing through Editing Mechanism
Stars: ✭ 72 (-83.56%)
Mutual labels:  datasets
Chineseaddress ocr
OCR for Chinese addresses in photographed documents, implemented using CTPN+CTC+Address Correction.
Stars: ✭ 309 (-29.45%)
Mutual labels:  chinese-nlp
wordfish-python
extract relationships from standardized terms from corpus of interest with deep learning 🐟
Stars: ✭ 19 (-95.66%)
Mutual labels:  corpus
Projects
🪐 End-to-end NLP workflows from prototype to production
Stars: ✭ 397 (-9.36%)
Mutual labels:  datasets
NetEmb-Datasets
A collection of real-world networks/graphs for Network Embedding
Stars: ✭ 18 (-95.89%)
Mutual labels:  datasets
Cleora
Cleora AI is a general-purpose model for efficient, scalable learning of stable and inductive entity embeddings for heterogeneous relational data.
Stars: ✭ 303 (-30.82%)
Mutual labels:  datasets
databrewer-recipes
DataBrewer Recipes Repository.
Stars: ✭ 19 (-95.66%)
Mutual labels:  datasets
Fuzzdata
Fuzzing resources for feeding various fuzzers with input. 🔧
Stars: ✭ 376 (-14.16%)
Mutual labels:  corpus
recurrent-defocus-deblurring-synth-dual-pixel
Reference github repository for the paper "Learning to Reduce Defocus Blur by Realistically Modeling Dual-Pixel Data". We propose a procedure to generate realistic DP data synthetically. Our synthesis approach mimics the optical image formation found on DP sensors and can be applied to virtual scenes rendered with standard computer software. Lev…
Stars: ✭ 30 (-93.15%)
Mutual labels:  datasets
Tdc
Therapeutics Data Commons: Machine Learning Datasets and Tasks for Therapeutics
Stars: ✭ 291 (-33.56%)
Mutual labels:  datasets
Species-Names-Corpus
Species name corpus. Plant names, animal names.
Stars: ✭ 23 (-94.75%)
Mutual labels:  corpus
Zhparser
zhparser is a PostgreSQL extension for full-text search of Chinese language
Stars: ✭ 418 (-4.57%)
Mutual labels:  chinese-nlp
TSForecasting
This repository contains the implementations related to the experiments of a set of publicly available datasets that are used in the time series forecasting research space.
Stars: ✭ 53 (-87.9%)
Mutual labels:  datasets
Dr.sure
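🏫 Deep learning study notes, plus notes on experience using TensorFlow and PyTorch. Dr. Sure adds the latest techniques he comes across to the project from time to time; criticism and corrections are welcome.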
Stars: ✭ 365 (-16.67%)
Mutual labels:  datasets
Kartaslov
Stars: ✭ 270 (-38.36%)
Mutual labels:  datasets
opendatasets
A Python library for downloading datasets from Kaggle, Google Drive, and other online sources.
Stars: ✭ 161 (-63.24%)
Mutual labels:  datasets
dialogue-datasets
A collection of open dialogue corpora and some useful data processing utilities.
Stars: ✭ 24 (-94.52%)
Mutual labels:  corpus
Animal Matting
GitHub repository for the paper "End-to-end Animal Image Matting"
Stars: ✭ 363 (-17.12%)
Mutual labels:  datasets
Filipino-Text-Benchmarks
Open-source benchmark datasets and pretrained transformer models in the Filipino language.
Stars: ✭ 22 (-94.98%)
Mutual labels:  corpus
Korpora
Korean corpus repository
Stars: ✭ 270 (-38.36%)
Mutual labels:  corpus
ml4se
A curated list of papers, theses, datasets, and tools related to the application of Machine Learning for Software Engineering
Stars: ✭ 46 (-89.5%)
Mutual labels:  datasets
Awesome Holistic 3d
A list of papers and resources (data,code,etc) for holistic 3D reconstruction in computer vision
Stars: ✭ 387 (-11.64%)
Mutual labels:  datasets
fuzzing-corpus
My fuzzing corpus
Stars: ✭ 120 (-72.6%)
Mutual labels:  corpus
Hub
Dataset format for AI. Build, manage, & visualize datasets for deep learning. Stream data real-time to PyTorch/TensorFlow & version-control it. https://activeloop.ai
Stars: ✭ 4,003 (+813.93%)
Mutual labels:  datasets
SpiCE-Corpus
An open-access corpus of conversational bilingual speech in Cantonese and English
Stars: ✭ 33 (-92.47%)
Mutual labels:  corpus
databrewer
The missing datasets manager. Like Homebrew but for datasets. A CLI tool to search and discover datasets!
Stars: ✭ 39 (-91.1%)
Mutual labels:  datasets
Chakin
Simple downloader for pre-trained word vectors
Stars: ✭ 323 (-26.26%)
Mutual labels:  datasets
Roapi
Create full-fledged APIs for static datasets without writing a single line of code.
Stars: ✭ 253 (-42.24%)
Mutual labels:  datasets
ck-env
CK repository with components and automation actions to enable portable workflows across diverse platforms including Linux, Windows, MacOS and Android. It includes software detection plugins and meta packages (code, data sets, models, scripts, etc) with the possibility of multiple versions to co-exist in a user or system environment:
Stars: ✭ 67 (-84.7%)
Mutual labels:  datasets
RData.jl
Read R data files from Julia
Stars: ✭ 49 (-88.81%)
Mutual labels:  datasets
1-60 of 344 similar projects