
tofunlp / sister

Licence: other
SImple SenTence EmbeddeR

Programming Languages

python

Projects that are alternatives of or similar to sister

Filipino-Text-Benchmarks
Open-source benchmark datasets and pretrained transformer models in the Filipino language.
Stars: ✭ 22 (-66.67%)
Mutual labels:  transformer, bert
bert in a flask
A dockerized flask API, serving ALBERT and BERT predictions using TensorFlow 2.0.
Stars: ✭ 32 (-51.52%)
Mutual labels:  transformer, bert
text-generation-transformer
text generation based on transformer
Stars: ✭ 36 (-45.45%)
Mutual labels:  transformer, bert
golgotha
Contextualised Embeddings and Language Modelling using BERT and Friends using R
Stars: ✭ 39 (-40.91%)
Mutual labels:  transformer, bert
Bertviz
Tool for visualizing attention in the Transformer model (BERT, GPT-2, Albert, XLNet, RoBERTa, CTRL, etc.)
Stars: ✭ 3,443 (+5116.67%)
Mutual labels:  transformer, bert
are-16-heads-really-better-than-1
Code for the paper "Are Sixteen Heads Really Better than One?"
Stars: ✭ 128 (+93.94%)
Mutual labels:  transformer, bert
SIGIR2021 Conure
One Person, One Model, One World: Learning Continual User Representation without Forgetting
Stars: ✭ 23 (-65.15%)
Mutual labels:  transformer, bert
COVID-19-Tweet-Classification-using-Roberta-and-Bert-Simple-Transformers
Rank 1 / 216
Stars: ✭ 24 (-63.64%)
Mutual labels:  transformer, bert
Transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Stars: ✭ 55,742 (+84357.58%)
Mutual labels:  transformer, bert
Nlp Tutorial
Natural Language Processing Tutorial for Deep Learning Researchers
Stars: ✭ 9,895 (+14892.42%)
Mutual labels:  transformer, bert
transformer-models
Deep Learning Transformer models in MATLAB
Stars: ✭ 90 (+36.36%)
Mutual labels:  transformer, bert
Pytorch Sentiment Analysis
Tutorials on getting started with PyTorch and TorchText for sentiment analysis.
Stars: ✭ 3,209 (+4762.12%)
Mutual labels:  word-embeddings, bert
Xpersona
XPersona: Evaluating Multilingual Personalized Chatbot
Stars: ✭ 54 (-18.18%)
Mutual labels:  transformer, bert
semantic-document-relations
Implementation, trained models and result data for the paper "Pairwise Multi-Class Document Classification for Semantic Relations between Wikipedia Articles"
Stars: ✭ 21 (-68.18%)
Mutual labels:  transformer, bert
PDN
The official PyTorch implementation of "Pathfinder Discovery Networks for Neural Message Passing" (WebConf '21)
Stars: ✭ 44 (-33.33%)
Mutual labels:  transformer, bert
bert-as-a-service TFX
End-to-end pipeline with TFX to train and deploy a BERT model for sentiment analysis.
Stars: ✭ 32 (-51.52%)
Mutual labels:  transformer, bert
NLP-paper
🎨🎨 NLP (natural language processing) tutorials 🎨🎨 https://dataxujing.github.io/NLP-paper/
Stars: ✭ 23 (-65.15%)
Mutual labels:  transformer, bert
Kevinpro-NLP-demo
All the NLP you need here. Personal implementations of some fun NLP demos, currently including PyTorch implementations of 13 NLP applications.
Stars: ✭ 117 (+77.27%)
Mutual labels:  transformer, bert
Bert Pytorch
Google AI 2018 BERT pytorch implementation
Stars: ✭ 4,642 (+6933.33%)
Mutual labels:  transformer, bert
Awesome Sentence Embedding
A curated list of pretrained sentence and word embedding models
Stars: ✭ 1,973 (+2889.39%)
Mutual labels:  word-embeddings, bert

sister

SISTER (SImple SenTence EmbeddeR)

Installation

pip install sister

Basic Usage

import sister
sentence_embedding = sister.MeanEmbedding(lang="en")

sentence = "I am a dog."
vector = sentence_embedding(sentence)
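
The returned vector can then be used like any ordinary sentence embedding. For example (a minimal sketch, assuming the embeddings behave as plain numpy arrays), the similarity of two sentences can be computed with cosine similarity; the snippet below is illustrative and not part of sister itself.

import numpy as np
import sister

sentence_embedding = sister.MeanEmbedding(lang="en")

# Embed two sentences and compare them.
vector_a = sentence_embedding("I am a dog.")
vector_b = sentence_embedding("I am a cat.")

# Cosine similarity: higher values mean more similar sentences.
cosine = np.dot(vector_a, vector_b) / (np.linalg.norm(vector_a) * np.linalg.norm(vector_b))
print(cosine)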

If you have your own custom model file, you can load that too. (The data format has to be loadable as gensim.models.KeyedVectors for word2vec model files.)

import sister
from sister.word_embedders import Word2VecEmbedding

sentence_embedding = sister.MeanEmbedding(
    lang="ja", word_embedder=Word2VecEmbedding(model_path="/path/to/model")
)

sentence = "I am a dog."
vector = sentence_embedding(sentence)
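
If you do not yet have such a file, one way to produce one is to train a word2vec model with gensim and save its word vectors. The sketch below uses a toy corpus and a placeholder path; that Word2VecEmbedding accepts a file saved this way is an assumption based on the gensim.models.KeyedVectors note above.

from gensim.models import Word2Vec

# Toy corpus: a list of tokenized sentences (replace with real data).
corpus = [["i", "am", "a", "dog"], ["i", "want", "to", "be", "a", "cat"]]

# Train a small word2vec model (gensim >= 4 uses vector_size; older versions use size).
model = Word2Vec(corpus, vector_size=300, min_count=1)

# Save only the word vectors; the result is loadable as gensim.models.KeyedVectors.
model.wv.save("/path/to/model")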

Supported languages:

  • English
  • Japanese
  • French

To support a new language, implement a Tokenizer (inheriting from sister.tokenizers.Tokenizer) and add a fastText pre-trained model URL to word_embedders.get_fasttext() (the list of model URLs).
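
As a rough illustration of the first step, a new tokenizer might look like the sketch below; the tokenize method name and the naive whitespace splitting are assumptions, so check sister.tokenizers.Tokenizer for the actual interface.

from sister.tokenizers import Tokenizer

class WhitespaceTokenizer(Tokenizer):
    """Illustrative tokenizer that simply splits on whitespace."""

    def tokenize(self, sentence):
        # A real tokenizer for a new language would wrap a proper
        # language-specific tokenization library here.
        return sentence.split()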

BERT models are supported for en, fr, and ja (as of 2020-06-29).

Specifically, ALBERT is used for English, CamemBERT for French, and BERT for Japanese.
To use BERT, install sister with pip install 'sister[bert]'.

import sister
bert_embedding = sister.BertEmbedding(lang="en")

sentence = "I am a dog."
vector = bert_embedding(sentence)

You can also give it multiple sentences at once, which is more efficient.

import sister
bert_embedding = sister.BertEmbedding(lang="en")

sentences = ["I am a dog.", "I want to be a cat."]
vectors = bert_embedding(sentences)
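
Since the batched call returns one vector per input sentence, the results can also be used for a simple nearest-neighbour lookup. The sketch below uses numpy and assumes, as in the examples above, that a single sentence yields a 1-D vector and a list of sentences yields one vector per sentence.

import numpy as np
import sister

bert_embedding = sister.BertEmbedding(lang="en")

corpus = ["I am a dog.", "I want to be a cat.", "The weather is nice today."]
corpus_vectors = np.asarray(bert_embedding(corpus))

query_vector = bert_embedding("Dogs are my favourite animals.")

# Cosine similarity between the query and every corpus sentence.
scores = corpus_vectors @ query_vector / (
    np.linalg.norm(corpus_vectors, axis=1) * np.linalg.norm(query_vector)
)
print(corpus[int(np.argmax(scores))])  # Most similar corpus sentence.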