Lmdb Embeddings
Fast word vectors with little memory usage in Python
Stars: ✭ 404 (+562.3%)
Dutchembeddings
Repository for the word embeddings experiments described in "Evaluating Unsupervised Dutch Word Embeddings as a Linguistic Resource", presented at LREC 2016.
Stars: ✭ 71 (+16.39%)
Sensegram
Making sense embedding out of word embeddings using graph-based word sense induction
Stars: ✭ 209 (+242.62%)
Natasha
Solves basic Russian NLP tasks, API for lower level Natasha projects
Stars: ✭ 788 (+1191.8%)
Officer
👮 officer: office documents from R
Stars: ✭ 405 (+563.93%)
Vectorhub
Vector Hub - Library for easy discovery and consumption of state-of-the-art models to turn data into vectors (text2vec, image2vec, video2vec, graph2vec, bert, inception, etc.)
Stars: ✭ 317 (+419.67%)
Polyfuzz
Fuzzy string matching, grouping, and evaluation.
Stars: ✭ 292 (+378.69%)
Dogembeddings
Rare pupper image compression model for word-embedding-esque operations.
Stars: ✭ 30 (-50.82%)
Wikipedia2vec
A tool for learning vector representations of words and entities from Wikipedia
Stars: ✭ 655 (+973.77%)
Unioffice
Pure Go library for creating and processing Office Word (.docx), Excel (.xlsx) and PowerPoint (.pptx) documents
Stars: ✭ 3,111 (+5000%)
Fiduswriter
Fidus Writer is an online collaborative editor for academics.
Stars: ✭ 405 (+563.93%)
Nlp Cube
Natural Language Processing Pipeline - Sentence Splitting, Tokenization, Lemmatization, Part-of-speech Tagging and Dependency Parsing
Stars: ✭ 353 (+478.69%)
Sensitive
Sensitive-word lookup, validation, filtering, and replacement 🤓 FindAll, Validate, Filter and Replace words.
Stars: ✭ 292 (+378.69%)
Docconv
Converts PDF, DOC, DOCX, XML, HTML, RTF, etc to plain text
Stars: ✭ 735 (+1104.92%)
Vuewordcloud
Generates a cloud out of the words.
Stars: ✭ 284 (+365.57%)
Word2html
A quick and dirty script to convert a Word (docx) document to HTML.
Stars: ✭ 44 (-27.87%)
Speedtorch
Library for faster pinned CPU <-> GPU transfer in PyTorch
Stars: ✭ 615 (+908.2%)
Hetu
A high-performance distributed deep learning system targeting large-scale and automated distributed training.
Stars: ✭ 78 (+27.87%)
go2vec
Read and use word2vec vectors in Go
Stars: ✭ 44 (-27.87%)
game2vec
TensorFlow implementation of word2vec applied to the https://www.kaggle.com/tamber/steam-video-games dataset, using both CBOW and Skip-gram.
Stars: ✭ 62 (+1.64%)
Bpemb
Pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE)
Stars: ✭ 909 (+1390.16%)
Ner Lstm
Named Entity Recognition using multilayered bidirectional LSTM
Stars: ✭ 532 (+772.13%)
markdown-to-document
A Markdown CLI to easily generate HTML documents from Markdown files
Stars: ✭ 28 (-54.1%)
Multi Class Text Classification Cnn
Classify Kaggle Consumer Finance Complaints into 11 classes. Build the model with CNN (Convolutional Neural Network) and Word Embeddings on TensorFlow.
Stars: ✭ 410 (+572.13%)
gscloudplugin
Print PDF, HTML, images, Word, Excel, PPT, custom drawings, and Microsoft reports from the browser. Silent printing. Bluetooth printing. Read and write serial port data. Read weights from electronic scales.
Stars: ✭ 18 (-70.49%)
Eda nlp
Data augmentation for NLP, presented at EMNLP 2019
Stars: ✭ 902 (+1378.69%)
Contextualized Topic Models
A Python package to run contextualized topic modeling. CTMs combine BERT with topic models to get coherent topics. Also supports multilingual tasks. Cross-lingual Zero-shot model published at EACL 2021.
Stars: ✭ 318 (+421.31%)
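Word Checker
🇨🇳🇬🇧 Chinese and English word spelling corrector. (Detection of commonly confused Chinese characters, Chinese spelling detection and correction, and an English word spell-checking tool.)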
Stars: ✭ 48 (-21.31%)
Cleora
Cleora AI is a general-purpose model for efficient, scalable learning of stable and inductive entity embeddings for heterogeneous relational data.
Stars: ✭ 303 (+396.72%)
Awesome 2vec
Curated list of 2vec-type embedding models
Stars: ✭ 784 (+1185.25%)
Philo2vec
An implementation of word2vec applied to [stanford philosophy encyclopedia](http://plato.stanford.edu/)
Stars: ✭ 33 (-45.9%)
Docx
A Ruby library/gem for interacting with .docx files
Stars: ✭ 288 (+372.13%)
Ngram2vec
Four word embedding models implemented in Python, supporting arbitrary context features.
Stars: ✭ 703 (+1052.46%)
Decagon
Graph convolutional neural network for multirelational link prediction
Stars: ✭ 268 (+339.34%)
Ml Surveys
📋 Survey papers summarizing advances in deep learning, NLP, CV, graphs, reinforcement learning, recommendations, etc.
Stars: ✭ 1,063 (+1642.62%)
Hub
A library for transfer learning by reusing parts of TensorFlow models.
Stars: ✭ 3,007 (+4829.51%)
Node2vec
Implementation of the node2vec algorithm.
Stars: ✭ 654 (+972.13%)
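Keras Textclassification
Chinese long-text classification, short-sentence classification, multi-label classification, and sentence-pair similarity (Chinese Text Classification of Keras NLP, multi-label classify, or sentence classify, long or short); base classes for building character, word, and sentence embedding layers (embeddings) and network layers (graph); FastText, TextCNN, CharCNN, TextRNN, RCNN, DCNN, DPCNN, VDCNN, CRNN, Bert, Xlnet, Albert, Attention, DeepMoji, HAN, capsule network (CapsuleNet), Transformer-encode, Seq2seq, SWEM, LEAM, TextGCN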
Stars: ✭ 914 (+1398.36%)
cade
Compass-aligned Distributional Embeddings. Align embeddings from different corpora
Stars: ✭ 29 (-52.46%)
Multi Class Text Classification Cnn Rnn
Classify Kaggle San Francisco Crime Description into 39 classes. Build the model with CNN, RNN (GRU and LSTM) and Word Embeddings on TensorFlow.
Stars: ✭ 570 (+834.43%)
GemBox.Document.Examples
Read, write, convert and print document files (DOCX, DOC, PDF, HTML, XPS, RTF, and TXT) in a simple and efficient way.
Stars: ✭ 53 (-13.11%)
Desktopeditors
An office suite that combines text, spreadsheet and presentation editors, allowing you to create, view and edit local documents
Stars: ✭ 1,008 (+1552.46%)
watset-java
An implementation of the Watset clustering algorithm in Java.
Stars: ✭ 24 (-60.66%)
Vicword
A word segmenter written in pure PHP
Stars: ✭ 516 (+745.9%)
Deep Mihash
Code for papers "Hashing with Mutual Information" (TPAMI 2019) and "Hashing with Binary Matrix Pursuit" (ECCV 2018)
Stars: ✭ 13 (-78.69%)
lingose-notation
The best mnemonics and notational system of English words.
Stars: ✭ 17 (-72.13%)
Awesome Persian Nlp Ir
Curated List of Persian Natural Language Processing and Information Retrieval Tools and Resources
Stars: ✭ 460 (+654.1%)
Twelveish
🕛 Twelveish - Android Wear/Wear OS Watch Face
Stars: ✭ 29 (-52.46%)
rgpipe
lesspipe for ripgrep for common new filetypes using few dependencies
Stars: ✭ 21 (-65.57%)
Lightly
A Python library for self-supervised learning on images.
Stars: ✭ 439 (+619.67%)
Pytorch Continuous Bag Of Words
The Continuous Bag-of-Words model (CBOW) is frequently used in NLP deep learning. It's a model that tries to predict words given the context of a few words before and a few words after the target word.
Stars: ✭ 50 (-18.03%)
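
The CBOW description in the last entry can be made concrete with a small sketch of how (context, target) training pairs might be built from a token sequence. This is only an illustration of the idea, not code from the listed repository: the function name `cbow_pairs` and the `window` parameter are assumptions introduced here, and no model training is shown.

```python
from typing import List, Tuple


def cbow_pairs(tokens: List[str], window: int = 2) -> List[Tuple[List[str], str]]:
    """Build (context, target) pairs for CBOW-style training.

    For each position, the context is up to `window` tokens before and
    up to `window` tokens after the target word. The name and signature
    are hypothetical, chosen for this sketch only.
    """
    pairs = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]       # words before the target
        right = tokens[i + 1:i + 1 + window]      # words after the target
        context = left + right
        if context:  # skip single-token inputs that have no context
            pairs.append((context, target))
    return pairs


if __name__ == "__main__":
    sentence = "the quick brown fox jumps".split()
    for context, target in cbow_pairs(sentence, window=2):
        print(context, "->", target)
```

A CBOW model would then typically average (or sum) the embeddings of the context words and train a classifier to predict the target word from that combined vector.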