record-linkage-resources: Resources for tackling record linkage / deduplication / data matching problems
Stars: ✭ 67 (-30.21%)
splink: Implementation of Fellegi-Sunter's canonical model of record linkage in Apache Spark, including an EM algorithm to estimate parameters
Stars: ✭ 181 (+88.54%)
Dedupe: 🆔 A Python library for accurate and scalable fuzzy matching, record deduplication and entity resolution.
Stars: ✭ 3,241 (+3276.04%)
Jodie: A PyTorch implementation of the ACM SIGKDD 2019 paper "Predicting Dynamic Embedding Trajectory in Temporal Interaction Networks"
Stars: ✭ 172 (+79.17%)
Decagon: Graph convolutional neural network for multirelational link prediction
Stars: ✭ 268 (+179.17%)
Graph 2d cnn: Code and data for the paper 'Classifying Graphs as Images with Convolutional Neural Networks' (new title: 'Graph Classification with 2D Convolutional Neural Networks')
Stars: ✭ 67 (-30.21%)
stance: Learned string similarity for entity names using optimal transport.
Stars: ✭ 27 (-71.87%)
snowman: Snowman App, a Data Matching Benchmark Platform.
Stars: ✭ 25 (-73.96%)
Merge-Machine: Merge Dirty Data with Clean Reference Tables
Stars: ✭ 35 (-63.54%)
Libpostal: A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data.
Stars: ✭ 3,312 (+3350%)
TCE: Code for the paper Temporally Coherent Embeddings for Self-Supervised Video Representation Learning (TCE).
Stars: ✭ 51 (-46.87%)
image embeddings: Using EfficientNet to provide embeddings for retrieval
Stars: ✭ 107 (+11.46%)
zingg: Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Stars: ✭ 655 (+582.29%)
CodeT5: Code for CodeT5, a new code-aware pre-trained encoder-decoder model.
Stars: ✭ 390 (+306.25%)
gan tensorflow: Automatic feature engineering with Generative Adversarial Networks in TensorFlow.
Stars: ✭ 48 (-50%)
KGE-LDA: Knowledge Graph Embedding LDA (AAAI 2017)
Stars: ✭ 35 (-63.54%)
lda2vec: Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec (paper: https://arxiv.org/abs/1605.02019)
Stars: ✭ 27 (-71.87%)
game-feature-learning: Code for the paper "Cross-Domain Self-supervised Multi-task Feature Learning using Synthetic Imagery", Ren et al., CVPR'18
Stars: ✭ 68 (-29.17%)
cpnet: Learning Video Representations from Correspondence Proposals (CVPR 2019 Oral)
Stars: ✭ 93 (-3.12%)
ncc: Neural Code Comprehension: A Learnable Representation of Code Semantics
Stars: ✭ 162 (+68.75%)
muse-as-service: REST API for sentence tokenization and embedding using the Multilingual Universal Sentence Encoder.
Stars: ✭ 45 (-53.12%)
embeddinghub: A vector database for machine learning embeddings.
Stars: ✭ 645 (+571.88%)
autoencoders tensorflow: Automatic feature engineering with deep learning and Bayesian inference in TensorFlow.
Stars: ✭ 66 (-31.25%)
SentimentAnalysis: Word embeddings (BOW, TF-IDF, Word2Vec, BERT) + base classifiers (SVM, Naive Bayes, Decision Tree, Random Forest) + pre-trained BERT on TensorFlow Hub + 1-D CNN and bidirectional LSTM on the IMDB Movie Reviews dataset
Stars: ✭ 40 (-58.33%)
code-compass: A contextual search engine for software packages built on import2vec embeddings (https://www.code-compass.com)
Stars: ✭ 33 (-65.62%)
dduper: Fast block-level out-of-band BTRFS deduplication tool.
Stars: ✭ 108 (+12.5%)
Keras-Application-Zoo: Reference implementations of popular DL models missing from keras-applications & keras-contrib
Stars: ✭ 31 (-67.71%)
text2text: Cross-lingual natural language processing and generation toolkit
Stars: ✭ 188 (+95.83%)
IR2Vec: Implementation of IR2Vec, published in ACM TACO
Stars: ✭ 28 (-70.83%)
CODER: Knowledge infused cross-lingual medical term embedding for term normalization. [JBI, ACL-BioNLP 2022]
Stars: ✭ 24 (-75%)
word2vec-tsne: Google News and Leo Tolstoy: Visualizing Word2Vec Word Embeddings using t-SNE.
Stars: ✭ 59 (-38.54%)
conciliator: OpenRefine reconciliation services for VIAF, ORCID, and Open Library, plus a framework for creating more.
Stars: ✭ 95 (-1.04%)
factorized: [ICLR 2019] Learning Factorized Multimodal Representations
Stars: ✭ 49 (-48.96%)
acid-store: A library for secure, deduplicated, transactional, and verifiable data storage
Stars: ✭ 48 (-50%)
SimCLR: PyTorch implementation of "A Simple Framework for Contrastive Learning of Visual Representations"
Stars: ✭ 65 (-32.29%)
FLANN.jl: A Julia wrapper for the Fast Library for Approximate Nearest Neighbors (FLANN)
Stars: ✭ 14 (-85.42%)
proto: Proto-RL: Reinforcement Learning with Prototypical Representations
Stars: ✭ 67 (-30.21%)
relation-network: TensorFlow implementation of Relation Networks for the bAbI QA task, detailed in "A Simple Neural Network Module for Relational Reasoning" [https://arxiv.org/abs/1706.01427] by Santoro et al.
Stars: ✭ 45 (-53.12%)
whatis: WhatIs.this: simple entity resolution through Wikipedia
Stars: ✭ 18 (-81.25%)
ExCon: Explanation-driven Supervised Contrastive Learning
Stars: ✭ 17 (-82.29%)
LSCDetection: Data Sets and Models for Evaluation of Lexical Semantic Change Detection
Stars: ✭ 17 (-82.29%)
entity-network: TensorFlow implementation of "Tracking the World State with Recurrent Entity Networks" [https://arxiv.org/abs/1612.03969] by Henaff, Weston, Szlam, Bordes, and LeCun.
Stars: ✭ 58 (-39.58%)
EgoNet: Official project website for the CVPR 2021 paper "Exploring intermediate representation for monocular vehicle pose estimation"
Stars: ✭ 111 (+15.63%)
awesome-multimodal-ml: Reading list for research topics in multimodal machine learning
Stars: ✭ 3,125 (+3155.21%)
SentimentAnalysis: Sentiment Analysis: Deep Bi-LSTM+attention model
Stars: ✭ 32 (-66.67%)
deep-scite: 🚣 A simple recommendation engine (by way of convolutions and embeddings) written in TensorFlow
Stars: ✭ 20 (-79.17%)
ethereum-privacy: Profiling and Deanonymizing Ethereum Users
Stars: ✭ 37 (-61.46%)
towhee: A framework dedicated to making neural data processing pipelines simple and fast.
Stars: ✭ 821 (+755.21%)
VarCLR: Variable Semantic Representation Pre-training via Contrastive Learning
Stars: ✭ 30 (-68.75%)