


Triplet loss in TensorFlow

Author: Olivier Moindrot

This repository contains a triplet loss implementation in TensorFlow with online triplet mining. Please check the blog post for a full description.

The code structure is adapted from code I wrote for CS230 in this repository at tensorflow/vision. A set of tutorials for this code can be found here.

Requirements

We recommend using python3 and a virtual environment. The default venv should be used, or virtualenv with python3.

python3 -m venv .env
source .env/bin/activate
pip install -r requirements_cpu.txt

If you are using a GPU, install tensorflow-gpu through the GPU requirements file instead:

pip install -r requirements_gpu.txt

Triplet loss


Triplet loss on two positive faces (Obama) and one negative face (Macron)

The interesting part, defining triplet loss with online triplet mining, can be found in model/triplet_loss.py.

Everything is explained in the blog post.
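As a quick reminder of what is being optimized, here is a minimal pure-Python sketch of the loss for a single triplet, max(d(a, p) - d(a, n) + margin, 0). It is only an illustration and not the code in model/triplet_loss.py; the helper names and the toy 2-D embeddings are made up for the example.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin):
    """Loss for one triplet: max(d(a, p) - d(a, n) + margin, 0)."""
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(d_pos - d_neg + margin, 0.0)


# Toy 2-D embeddings (hypothetical values, chosen so distances are integers).
anchor = [0.0, 0.0]
positive = [3.0, 4.0]        # same class as the anchor, distance 5
easy_negative = [6.0, 8.0]   # other class, distance 10 (far away)
hard_negative = [0.0, 1.0]   # other class, distance 1 (closer than the positive)

easy = triplet_loss(anchor, positive, easy_negative, margin=1.0)
hard = triplet_loss(anchor, positive, hard_negative, margin=1.0)
print(easy, hard)
```

The easy negative is already further from the anchor than the positive by more than the margin, so it contributes no loss. The hard negative is closer to the anchor than the positive, so it is penalized.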

To use the "batch all" version, you can do:

from model.triplet_loss import batch_all_triplet_loss

loss, fraction_positive = batch_all_triplet_loss(labels, embeddings, margin, squared=False)

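In this case fraction_positive is the fraction of valid triplets that still have a positive loss, i.e. the hard and semi-hard triplets. It is a useful value to plot in TensorBoard to track how many triplets are still contributing to training.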
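To make the two return values concrete, here is a slow, loop-based reference sketch of the "batch all" strategy in pure Python. It is not the vectorized TensorFlow code from model/triplet_loss.py. The function name batch_all_reference is hypothetical, and averaging the loss over the triplets with positive loss is an assumption made for this sketch.

```python
import math


def pairwise_distances(embeddings, squared=False):
    """Matrix of (optionally squared) Euclidean distances between all rows."""
    n = len(embeddings)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(embeddings[i], embeddings[j]))
            dist[i][j] = d2 if squared else math.sqrt(d2)
    return dist


def batch_all_reference(labels, embeddings, margin, squared=False):
    """Loop over every valid (anchor, positive, negative) triplet in the batch.

    Returns the mean loss over triplets with positive loss and the fraction
    of valid triplets that have positive loss.
    """
    dist = pairwise_distances(embeddings, squared)
    n = len(labels)
    total, num_valid, num_positive = 0.0, 0, 0
    for a in range(n):
        for p in range(n):
            if p == a or labels[p] != labels[a]:
                continue  # positive must be a different sample of the same class
            for neg in range(n):
                if labels[neg] == labels[a]:
                    continue  # negative must come from another class
                num_valid += 1
                loss = max(dist[a][p] - dist[a][neg] + margin, 0.0)
                if loss > 0.0:
                    total += loss
                    num_positive += 1
    mean_loss = total / num_positive if num_positive else 0.0
    fraction_positive = num_positive / num_valid if num_valid else 0.0
    return mean_loss, fraction_positive


# Toy batch: two classes with 1-D embeddings (made-up values).
labels = [0, 0, 1, 1]
embeddings = [[0.0], [1.0], [3.0], [10.0]]
loss, fraction_positive = batch_all_reference(labels, embeddings, margin=1.0)
print(loss, fraction_positive)
```

In this toy batch only 2 of the 8 valid triplets violate the margin, so fraction_positive is 0.25. As training progresses you would expect this value to go down.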

To use the "batch hard" version, you can do:

from model.triplet_loss import batch_hard_triplet_loss

loss = batch_hard_triplet_loss(labels, embeddings, margin, squared=False)
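For comparison, here is a loop-based sketch of the "batch hard" strategy in pure Python: for each anchor, keep only the furthest positive and the closest negative. Again this is an illustration, not the TensorFlow implementation. The name batch_hard_reference is hypothetical, and skipping anchors that have no positive or no negative in the batch is a simplification made for this sketch.

```python
import math


def distance(u, v, squared=False):
    """(Optionally squared) Euclidean distance between two vectors."""
    d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return d2 if squared else math.sqrt(d2)


def batch_hard_reference(labels, embeddings, margin, squared=False):
    """Mean over anchors of max(hardest_positive - hardest_negative + margin, 0)."""
    n = len(labels)
    losses = []
    for a in range(n):
        pos = [distance(embeddings[a], embeddings[j], squared)
               for j in range(n) if j != a and labels[j] == labels[a]]
        neg = [distance(embeddings[a], embeddings[j], squared)
               for j in range(n) if labels[j] != labels[a]]
        if not pos or not neg:
            continue  # simplification: skip anchors without a usable triplet
        hardest_positive = max(pos)  # furthest sample of the same class
        hardest_negative = min(neg)  # closest sample of another class
        losses.append(max(hardest_positive - hardest_negative + margin, 0.0))
    return sum(losses) / len(losses) if losses else 0.0


# Toy batch: two classes with 1-D embeddings (made-up values).
labels = [0, 0, 1, 1]
embeddings = [[0.0], [1.0], [3.0], [10.0]]
loss = batch_hard_reference(labels, embeddings, margin=1.0)
print(loss)
```

Only one anchor (the sample at 3.0) has a hardest triplet that violates the margin, which gives a per-anchor loss of 6 and a batch mean of 1.5.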

Training on MNIST

To run a new experiment called base_model, do:

python train.py --model_dir experiments/base_model

You will first need to create a configuration file like this one: params.json. This JSON file specifies all the hyperparameters for the model. All the weights and summaries will be saved in model_dir.
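If you prefer to generate the configuration file from a script, a minimal sketch could look like the following. Every key and value below is hypothetical and only illustrates the idea of a flat JSON file of hyperparameters; check the params.json shipped with the repository for the keys that train.py actually expects. Placing the file inside the model directory is also an assumption.

```python
import json
import os


def write_params(model_dir, params):
    """Write params as params.json inside model_dir and return the file path."""
    os.makedirs(model_dir, exist_ok=True)
    path = os.path.join(model_dir, "params.json")
    with open(path, "w") as f:
        json.dump(params, f, indent=4)
    return path


def load_params(path):
    """Read the hyperparameters back as a plain dict."""
    with open(path) as f:
        return json.load(f)


# Hypothetical hyperparameters: names and values are placeholders,
# not the schema used by this repository.
example_params = {
    "learning_rate": 0.001,
    "batch_size": 64,
    "num_epochs": 20,
    "embedding_size": 64,
    "margin": 0.5,
    "squared": False,
    "triplet_strategy": "batch_all",
}
```

Calling write_params("experiments/base_model", example_params) would then create experiments/base_model/params.json.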

Once trained, you can visualize the embeddings by running:

python visualize_embeddings.py --model_dir experiments/base_model

And run tensorboard in the experiment directory:

tensorboard --logdir experiments/base_model

Here is the result (link to gif):


Embeddings of the MNIST test images visualized with t-SNE (perplexity 25)

Test

To run all the tests, run this from the project directory:

pytest

To run a specific test:

pytest model/tests/test_triplet_loss.py

Resources

Blog post explaining this project.
Source code for the built-in TensorFlow function for semi hard online mining triplet loss: tf.contrib.losses.metric_learning.triplet_semihard_loss.
Facenet paper introducing online triplet mining
Detailed explanation of online triplet mining in In Defense of the Triplet Loss for Person Re-Identification
Blog post by Brandon Amos on online triplet mining: OpenFace 0.2.0: Higher accuracy and halved execution time.
The coursera lecture on triplet loss
