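Chinese Names Corpus
A corpus of Chinese personal names and a name generator. Covers Chinese full names, surnames, given names, and forms of address, plus Japanese names, transliterated names, and English names. Useful for Chinese word segmentation and person-name entity recognition.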
Stars: ✭ 3,053 (+251.73%)
Prosody
Helsinki Prosody Corpus and a System for Predicting Prosodic Prominence from Text
Stars: ✭ 139 (-83.99%)
Bond
BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision
Stars: ✭ 96 (-88.94%)
Dataset List
Lists of text corpora and more (mainly Japanese)
Stars: ✭ 84 (-90.32%)
Clue
Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard
Stars: ✭ 2,425 (+179.38%)
Ua Gec
UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language
Stars: ✭ 108 (-87.56%)
Dialog corpus
Datasets for training Chinese and English chatbot systems
Stars: ✭ 1,662 (+91.47%)
Coarij
Corpus of Annual Reports in Japan
Stars: ✭ 55 (-93.66%)
Nlp chinese corpus
Large Scale Chinese Corpus for NLP
Stars: ✭ 6,656 (+666.82%)
Cluener2020
CLUENER2020: Fine-Grained Named Entity Recognition for Chinese
Stars: ✭ 689 (-20.62%)
Nlp bahasa resources
A Curated List of Datasets and Usable Library Resources for NLP in Bahasa Indonesia
Stars: ✭ 158 (-81.8%)
Fakenewscorpus
A dataset of millions of news articles scraped from a curated list of data sources.
Stars: ✭ 255 (-70.62%)
Chatito
🎯🗯 Generate datasets for AI chatbots, NLP tasks, named entity recognition or text classification models using a simple DSL!
Stars: ✭ 678 (-21.89%)
Person search
Joint Detection and Identification Feature Learning for Person Search
Stars: ✭ 666 (-23.27%)
Awesome Project Ideas
Curated list of Machine Learning, NLP, Vision, Recommender Systems Project Ideas
Stars: ✭ 6,114 (+604.38%)
Cophy
"CoPhy: Counterfactual Learning of Physical Dynamics", F. Baradel, N. Neverova, J. Mille, G. Mori, C. Wolf, ICLR'2020
Stars: ✭ 24 (-97.24%)
Datastream.io
An open-source framework for real-time anomaly detection using Python, ElasticSearch and Kibana
Stars: ✭ 814 (-6.22%)
Quanteda
An R package for the Quantitative Analysis of Textual Data
Stars: ✭ 647 (-25.46%)
Naive Bayes Classifier
A Naive Bayes classifier that uses the Bernoulli and multinomial Naive Bayes models to classify text documents as ham or spam.
Stars: ✭ 6 (-99.31%)
Mobius
C# and F# language binding and extensions to Apache Spark
Stars: ✭ 929 (+7.03%)
Proteinnet
Standardized data set for machine learning of protein structure
Stars: ✭ 664 (-23.5%)
Covid Ct
COVID-CT-Dataset: A CT Scan Dataset about COVID-19
Stars: ✭ 820 (-5.53%)
Bert Ner Pytorch
Chinese NER (Named Entity Recognition) using BERT (Softmax, CRF, Span)
Stars: ✭ 654 (-24.65%)
Facerank
FaceRank: rank faces with a CNN model based on TensorFlow (Keras version added). QQ group: 167122861. Technical support: http://tensorflow123.com
Stars: ✭ 841 (-3.11%)
Devblogs
2600+ developer-related blogs and publications.
Stars: ✭ 637 (-26.61%)
Osint collection
Maintained collection of OSINT-related resources. (All Free & Actionable)
Stars: ✭ 809 (-6.8%)
Imagenetscraper
👁 Bulk-download all thumbnails from an ImageNet synset, with optional rescaling
Stars: ✭ 24 (-97.24%)
Esc 50
ESC-50: Dataset for Environmental Sound Classification
Stars: ✭ 631 (-27.3%)
Gensim Data
Data repository for pretrained NLP models and NLP corpora.
Stars: ✭ 622 (-28.34%)
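Chatbot cn
A chatbot for the finance and legal domains (with some chit-chat capability). Its main modules include information extraction, NLU, NLG, and a knowledge graph; the front end is integrated with Django, and RESTful interfaces for the NLP and knowledge-graph components are already packaged.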
Stars: ✭ 791 (-8.87%)
Label Studio
Label Studio is a multi-type data labeling and annotation tool with standardized output format
Stars: ✭ 7,264 (+736.87%)
Dict build
Automatically builds a Chinese lexicon: http://www.matrix67.com/blog/archives/5044
Stars: ✭ 599 (-30.99%)
Khayyam
106 Omar Khayyam quatrains in YAML format.
Stars: ✭ 8 (-99.08%)
Chinesener
Chinese named entity recognition and entity extraction; TensorFlow, PyTorch, BiLSTM+CRF
Stars: ✭ 938 (+8.06%)
Rdhs
API Client and Data Munging for the Demographic and Health Survey Data
Stars: ✭ 22 (-97.47%)
Natasha
Solves basic Russian NLP tasks, API for lower level Natasha projects
Stars: ✭ 788 (-9.22%)
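Xmnlp
xmnlp: provides Chinese word segmentation, part-of-speech tagging, named entity recognition, sentiment analysis, text correction, text-to-pinyin conversion, text summarization, radical lookup, and other features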
Stars: ✭ 591 (-31.91%)
Lm Lstm Crf
Empower Sequence Labeling with Task-Aware Language Model
Stars: ✭ 778 (-10.37%)
Cvat
Powerful and efficient Computer Vision Annotation Tool (CVAT)
Stars: ✭ 6,557 (+655.41%)
Seq2seq Chatbot
Chatbot in 200 lines of code using TensorLayer
Stars: ✭ 777 (-10.48%)
Total Text Dataset
Total Text Dataset. It consists of 1555 images with more than 3 different text orientations: Horizontal, Multi-Oriented, and Curved, one of a kind.
Stars: ✭ 580 (-33.18%)
Sequence Labeling Bilstm Crf
The classical BiLSTM-CRF model implemented in TensorFlow, for sequence labeling tasks. In Vex version, everything is configurable.
Stars: ✭ 579 (-33.29%)
Nas Bench 201
NAS-Bench-201 API and Instruction
Stars: ✭ 537 (-38.13%)