G2pc
g2pC: A Context-aware Grapheme-to-Phoneme Conversion module for Chinese
Stars: ✭ 155 (+811.76%)
classy
classy is a simple-to-use library for building high-performance Machine Learning models in NLP.
Stars: ✭ 61 (+258.82%)
Jcseg
Jcseg is a lightweight NLP framework developed in Java. It provides CJK and English segmentation based on the MMSEG algorithm, along with keyword extraction, key sentence extraction, and summary extraction based on the TextRank algorithm. Jcseg has a built-in HTTP server and search modules for the latest Lucene, Solr, and Elasticsearch.
Stars: ✭ 754 (+4335.29%)
Chinesenlp
Datasets and SOTA results for every field of Chinese NLP
Stars: ✭ 1,206 (+6994.12%)
HugsVision
HugsVision is an easy-to-use Hugging Face wrapper for state-of-the-art computer vision
Stars: ✭ 154 (+805.88%)
Nlp chinese corpus
Large Scale Chinese Corpus for NLP
Stars: ✭ 6,656 (+39052.94%)
text2text
Text2Text: Cross-lingual natural language processing and generation toolkit
Stars: ✭ 188 (+1005.88%)
Lac
Baidu NLP: word segmentation, part-of-speech tagging, named entity recognition, word importance
Stars: ✭ 2,792 (+16323.53%)
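Nlp4han
Chinese natural language processing toolkit [sentence splitting / word segmentation / part-of-speech tagging / chunking / syntactic parsing / semantic analysis / NER / n-gram / HMM / pronoun resolution / sentiment analysis / spell checking]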
Stars: ✭ 206 (+1111.76%)
Friso
High-performance Chinese tokenizer with both GBK and UTF-8 charset support, based on the MMSEG algorithm and developed in ANSI C. Fully modular, so it can be easily embedded in other programs such as MySQL, PostgreSQL, and PHP.
Stars: ✭ 313 (+1741.18%)
NLP-paper
🎨🎨 NLP (natural language processing) tutorial 🎨🎨 https://dataxujing.github.io/NLP-paper/
Stars: ✭ 23 (+35.29%)
efficientnet-jax
EfficientNet, MobileNetV3, MobileNetV2, MixNet, etc in JAX w/ Flax Linen and Objax
Stars: ✭ 114 (+570.59%)
NAG-BERT
[EACL'21] Non-Autoregressive with Pretrained Language Model
Stars: ✭ 47 (+176.47%)
NLPIR-ICTCLAS
The Java Package of NLPIR-ICTCLAS.
Stars: ✭ 16 (-5.88%)
classifier multi label
multi-label, classifier, text classification, multi-label text classification, BERT, ALBERT, multi-label-classification
Stars: ✭ 127 (+647.06%)
bert nli
A Natural Language Inference (NLI) model based on Transformers (BERT and ALBERT)
Stars: ✭ 97 (+470.59%)
sticker2
Further developed as SyntaxDot: https://github.com/tensordot/syntaxdot
Stars: ✭ 14 (-17.65%)
Text-Summarization
Abstractive and Extractive Text summarization using Transformers.
Stars: ✭ 38 (+123.53%)
DE-LIMIT
DeEpLearning models for MultIlingual haTespeech (DELIMIT): Benchmarking multilingual models across 9 languages and 16 datasets.
Stars: ✭ 90 (+429.41%)
SA-BERT
CIKM 2020: Speaker-Aware BERT for Multi-Turn Response Selection in Retrieval-Based Chatbots
Stars: ✭ 71 (+317.65%)
CVAE Dial
CVAE_XGate model in paper "Xu, Dusek, Konstas, Rieser. Better Conversations by Modeling, Filtering, and Optimizing for Coherence and Diversity"
Stars: ✭ 16 (-5.88%)
dynmt-py
Neural machine translation implementation using dynet's python bindings
Stars: ✭ 17 (+0%)
anonymisation
Anonymization of legal cases (Fr) based on Flair embeddings
Stars: ✭ 85 (+400%)
hard-label-attack
Natural Language Attacks in a Hard Label Black Box Setting.
Stars: ✭ 26 (+52.94%)
Transformer-Transducer
PyTorch implementation of "Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss" (ICASSP 2020)
Stars: ✭ 61 (+258.82%)
Cross-Lingual-MRC
Cross-Lingual Machine Reading Comprehension (EMNLP 2019)
Stars: ✭ 66 (+288.24%)
wechsel
Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.
Stars: ✭ 39 (+129.41%)
HE2LaTeX
Converting handwritten equations to LaTeX
Stars: ✭ 84 (+394.12%)
PromptPapers
Must-read papers on prompt-based tuning for pre-trained language models.
Stars: ✭ 2,317 (+13529.41%)
psr2r-sniffer
A PSR-2-R code sniffer and code-style auto-correction-tool - including many useful additions
Stars: ✭ 32 (+88.24%)
lex
Lex is an implementation of the lex tool in Ruby.
Stars: ✭ 49 (+188.24%)
googlecodelabs
Train Your Neural Networks Much Faster with TPUs
Stars: ✭ 116 (+582.35%)
consistency
Implementation of models in our EMNLP 2019 paper: A Logic-Driven Framework for Consistency of Neural Models
Stars: ✭ 26 (+52.94%)
KitanaQA
KitanaQA: Adversarial training and data augmentation for neural question-answering models
Stars: ✭ 58 (+241.18%)
trove
Weakly supervised medical named entity classification
Stars: ✭ 55 (+223.53%)
JointIDSF
BERT-based joint intent detection and slot filling with intent-slot attention mechanism (INTERSPEECH 2021)
Stars: ✭ 55 (+223.53%)
strollr2d icassp2017
Image Denoising Codes using STROLLR learning, the Matlab implementation of the paper in ICASSP2017
Stars: ✭ 22 (+29.41%)
hunspell
High-Performance Stemmer, Tokenizer, and Spell Checker for R
Stars: ✭ 101 (+494.12%)
tokenizer
A simple tokenizer in Ruby for NLP tasks.
Stars: ✭ 44 (+158.82%)
Transformers-Tutorials
This repository contains demos I made with the Transformers library by HuggingFace.
Stars: ✭ 2,828 (+16535.29%)
banglabert
This repository contains the official release of the model "BanglaBERT" and associated downstream finetuning code and datasets introduced in the paper titled "BanglaBERT: Language Model Pretraining and Benchmarks for Low-Resource Language Understanding Evaluation in Bangla" accepted in Findings of the Annual Conference of the North American Chap…
Stars: ✭ 186 (+994.12%)
rustfst
Rust re-implementation of OpenFST - library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). A Python binding is also available.
Stars: ✭ 104 (+511.76%)
BertSimilarity
Computing the similarity of two sentences with Google's BERT algorithm. Semantic similarity computation. Text similarity computation.
Stars: ✭ 348 (+1947.06%)
SCINet
Forecast time series and stock prices with SCINet
Stars: ✭ 28 (+64.71%)
Cross-Domain-CWS
Code for IJCAI 2018 paper "Neural Networks Incorporating Unlabeled and Partially-labeled Data for Cross-domain Chinese Word Segmentation"
Stars: ✭ 14 (-17.65%)
neural tokenizer
Tokenize English sentences using neural networks.
Stars: ✭ 64 (+276.47%)
Electra with tensorflow
This is an implementation of ELECTRA according to the paper "ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators"
Stars: ✭ 13 (-23.53%)
CAIL
Entry model for the reading comprehension task of the CAIL2019 challenge (法研杯)
Stars: ✭ 34 (+100%)