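tfbertPretrained model invocation based on TensorFlow 1.x, with support for single-machine multi-GPU training, gradient accumulation, XLA acceleration, and mixed precision. Allows flexible training, validation, and prediction.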
Stars: ✭ 54 (-71.28%)
InstahelpInstahelp is a Q&A portal website similar to Quora
Stars: ✭ 21 (-88.83%)
lorcaNatural Language Processing for Spanish in Node.js. Stemmer, sentiment analysis, readability, tf-idf with batteries, concordance and more!
Stars: ✭ 95 (-49.47%)
roberta-wwm-base-distillThis is a RoBERTa wwm base distilled model, which was distilled from RoBERTa wwm by RoBERTa wwm large
Stars: ✭ 61 (-67.55%)
luceneApache Lucene open-source search software
Stars: ✭ 1,009 (+436.7%)
nalcosSearch Git commits in natural language
Stars: ✭ 50 (-73.4%)
fb scraperFBLYZE is a Facebook scraping system and analysis system.
Stars: ✭ 61 (-67.55%)
bookworm📚 social networks from novels
Stars: ✭ 72 (-61.7%)
customized-symspellJava port of SymSpell: 1 million times faster through Symmetric Delete spelling correction algorithm
Stars: ✭ 51 (-72.87%)
nlp-notebooksA collection of natural language processing notebooks.
Stars: ✭ 19 (-89.89%)
JD2Skills-BERT-XMLCCode and Dataset for the Bhola et al. (2020) Retrieving Skills from Job Descriptions: A Language Model Based Extreme Multi-label Classification Framework
Stars: ✭ 33 (-82.45%)
linguistic-style-transfer-pytorchImplementation of "Disentangled Representation Learning for Non-Parallel Text Style Transfer" (ACL 2019) in PyTorch
Stars: ✭ 55 (-70.74%)
DRhardSIGIR'21: Optimizing DR with hard negatives and achieving SOTA first-stage retrieval performance on TREC DL Track.
Stars: ✭ 93 (-50.53%)
DeepNERAn Easy-to-use, Modular and Prolongable package of deep-learning based Named Entity Recognition Models.
Stars: ✭ 9 (-95.21%)
pair2vecpair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference
Stars: ✭ 62 (-67.02%)
webdatasetA high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.
Stars: ✭ 816 (+334.04%)
BM25Transformer(Python) transform a document-term matrix to an Okapi/BM25 representation
Stars: ✭ 50 (-73.4%)
ODSQAODSQA: OPEN-DOMAIN SPOKEN QUESTION ANSWERING DATASET
Stars: ✭ 43 (-77.13%)
easseEasier Automatic Sentence Simplification Evaluation
Stars: ✭ 109 (-42.02%)
tutorialsA tutorial series by Preferred.AI
Stars: ✭ 136 (-27.66%)
ilmultiTooling to play around with multilingual machine translation for Indian Languages.
Stars: ✭ 19 (-89.89%)
CODERCODER: Knowledge infused cross-lingual medical term embedding for term normalization. [JBI, ACL-BioNLP 2022]
Stars: ✭ 24 (-87.23%)
mtdataA tool that locates, downloads, and extracts machine translation corpora
Stars: ✭ 95 (-49.47%)
srctools for fast reading of docs
Stars: ✭ 40 (-78.72%)
tika-similarityTika-Similarity uses the Tika-Python package (Python port of Apache Tika) to compute file similarity based on Metadata features.
Stars: ✭ 92 (-51.06%)
DeepLTranslatorThe DeepL Translator is an API written in Java that translates sentences via the DeepL website, without an API key.
Stars: ✭ 45 (-76.06%)
psr2r-snifferA PSR-2-R code sniffer and code-style auto-correction-tool - including many useful additions
Stars: ✭ 32 (-82.98%)
mnist-challengeMy solution to TUM's Machine Learning MNIST challenge 2016-2017 [winner]
Stars: ✭ 68 (-63.83%)
UnetsImplementation of UNets for Lung Segmentation
Stars: ✭ 18 (-90.43%)
semantic-parsing-dualSource code and data for ACL 2019 Long Paper "Semantic Parsing with Dual Learning".
Stars: ✭ 17 (-90.96%)
audio degraderAudio degradation toolbox in python, with a command-line tool. It is useful to apply controlled degradations to audio: e.g. data augmentation, evaluation in noisy conditions, etc.
Stars: ✭ 40 (-78.72%)
syntaxmakerThe NLG tool for Finnish
Stars: ✭ 19 (-89.89%)
TCEThis repository contains the code implementation used in the paper Temporally Coherent Embeddings for Self-Supervised Video Representation Learning (TCE).
Stars: ✭ 51 (-72.87%)
deepfrogAn NLP-suite powered by deep learning
Stars: ✭ 16 (-91.49%)
spellchecker-wasmSpellcheckerWasm is an extremely fast spellchecker for WebAssembly based on SymSpell
Stars: ✭ 46 (-75.53%)
WSDM-Cup-2019[ACM-WSDM] 3rd place solution at WSDM Cup 2019, Fake News Classification on Kaggle.
Stars: ✭ 62 (-67.02%)
GNN-Recommender-SystemsAn index of recommendation algorithms that are based on Graph Neural Networks.
Stars: ✭ 505 (+168.62%)
factedit🧐 Code & Data for Fact-based Text Editing (Iso et al; ACL 2020)
Stars: ✭ 16 (-91.49%)
transformers-interpretModel explainability that works seamlessly with 🤗 transformers. Explain your transformers model in just 2 lines of code.
Stars: ✭ 861 (+357.98%)
language-plannerOfficial Code for "Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents"
Stars: ✭ 84 (-55.32%)
elastic transformersMaking BERT stretchy. Semantic Elasticsearch with Sentence Transformers
Stars: ✭ 153 (-18.62%)
PororoQAPororoQA, https://arxiv.org/abs/1707.00836
Stars: ✭ 26 (-86.17%)
WikiTableQuestionsA dataset of complex questions on semi-structured Wikipedia tables
Stars: ✭ 81 (-56.91%)
tf-idf-pythonTerm frequency–inverse document frequency for Chinese novel/documents implemented in python.
Stars: ✭ 98 (-47.87%)
bredonA modern CSS value compiler in JavaScript
Stars: ✭ 39 (-79.26%)
ark-nlpA private nlp coding package, which quickly implements the SOTA solutions.
Stars: ✭ 232 (+23.4%)
liblexC library for Lexical Analysis
Stars: ✭ 25 (-86.7%)
XORQAThis is the official repository for NAACL 2021, "XOR QA: Cross-lingual Open-Retrieval Question Answering".
Stars: ✭ 61 (-67.55%)
relation-networkTensorFlow Implementation of Relation Networks for the bAbI QA Task, detailed in "A Simple Neural Network Module for Relational Reasoning" [https://arxiv.org/abs/1706.01427] by Santoro et al.
Stars: ✭ 45 (-76.06%)
gnn-lspeSource code for GNN-LSPE (Graph Neural Networks with Learnable Structural and Positional Representations), ICLR 2022
Stars: ✭ 165 (-12.23%)
denspiReal-Time Open-Domain Question Answering with Dense-Sparse Phrase Index (DenSPI)
Stars: ✭ 188 (+0%)