PolyAI-LDN / Polyai Models

Licence: apache-2.0
Neural Models for Conversational AI

Programming Languages

python

Projects that are alternatives of or similar to Polyai Models

Xlnet extension tf
XLNet Extension in TensorFlow
Stars: ✭ 109 (-44.1%)
Mutual labels:  natural-language-processing, natural-language-understanding
Awesome Hungarian Nlp
A curated list of NLP resources for Hungarian
Stars: ✭ 121 (-37.95%)
Mutual labels:  natural-language-processing, natural-language-understanding
Deep Nlp Seminars
Materials for deep NLP course
Stars: ✭ 113 (-42.05%)
Mutual labels:  natural-language-processing, natural-language-understanding
Easy Bert
A Dead Simple BERT API for Python and Java (https://github.com/google-research/bert)
Stars: ✭ 106 (-45.64%)
Mutual labels:  natural-language-processing, natural-language-understanding
Natural Language Processing Specialization
This repo contains my coursework, assignments, and Slides for Natural Language Processing Specialization by deeplearning.ai on Coursera
Stars: ✭ 151 (-22.56%)
Mutual labels:  natural-language-processing, natural-language-understanding
Chatbot
A Russian-language chatbot
Stars: ✭ 106 (-45.64%)
Mutual labels:  natural-language-processing, natural-language-understanding
Turkish Morphology
A two-level morphological analyzer for Turkish.
Stars: ✭ 121 (-37.95%)
Mutual labels:  natural-language-processing, natural-language-understanding
Spark Nlp Models
Models and Pipelines for the Spark NLP library
Stars: ✭ 88 (-54.87%)
Mutual labels:  natural-language-processing, natural-language-understanding
Dialogflow Ruby Client
Ruby SDK for Dialogflow
Stars: ✭ 148 (-24.1%)
Mutual labels:  natural-language-processing, natural-language-understanding
Tod Bert
Pre-Trained Models for ToD-BERT
Stars: ✭ 143 (-26.67%)
Mutual labels:  natural-language-processing, natural-language-understanding
Spokestack Python
Spokestack is a library that allows a user to easily incorporate a voice interface into any Python application.
Stars: ✭ 103 (-47.18%)
Mutual labels:  natural-language-processing, natural-language-understanding
Efaqa Corpus Zh
❤️ Emotional First Aid Dataset: a Chinese corpus of psychological counseling Q&A and chatbot dialogues
Stars: ✭ 170 (-12.82%)
Mutual labels:  natural-language-processing, natural-language-understanding
Chinese nlu by using rasa nlu
Use RASA NLU to build a Chinese Natural Language Understanding (NLU) system
Stars: ✭ 99 (-49.23%)
Mutual labels:  natural-language-processing, natural-language-understanding
Transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Stars: ✭ 55,742 (+28485.64%)
Mutual labels:  natural-language-processing, natural-language-understanding
Bert As Service
Mapping a variable-length sentence to a fixed-length vector using BERT model
Stars: ✭ 9,779 (+4914.87%)
Mutual labels:  natural-language-processing, natural-language-understanding
Dialoglue
DialoGLUE: A Natural Language Understanding Benchmark for Task-Oriented Dialogue
Stars: ✭ 120 (-38.46%)
Mutual labels:  natural-language-processing, natural-language-understanding
Mt Dnn
Multi-Task Deep Neural Networks for Natural Language Understanding
Stars: ✭ 72 (-63.08%)
Mutual labels:  natural-language-processing, natural-language-understanding
Dialogue Understanding
This repository contains PyTorch implementation for the baseline models from the paper Utterance-level Dialogue Understanding: An Empirical Study
Stars: ✭ 77 (-60.51%)
Mutual labels:  natural-language-processing, natural-language-understanding
Chars2vec
Character-based word embeddings model based on RNN for handling real world texts
Stars: ✭ 130 (-33.33%)
Mutual labels:  natural-language-processing, natural-language-understanding
Sling
SLING - A natural language frame semantics parser
Stars: ✭ 1,892 (+870.26%)
Mutual labels:  natural-language-processing, natural-language-understanding

polyai-models

Neural Models for Conversational AI

This repo shares models from PolyAI publications, including the efficient ConveRT dual-encoder model. The models are shared as Tensorflow Hub modules, listed below. We also share example code and utility classes, though for many use cases the Tensorflow Hub URLs will be enough.

Requirements

Using these models requires Tensorflow Hub and Tensorflow Text. In particular, Tensorflow Text provides ops that allow the model to work directly on text, so no pre-processing or tokenization is required from the user. You must import tensorflow_text before loading the Tensorflow Hub modules, or you will see an error about 'ops missing from the python registry'. The models are compatible with any of the following combinations:

  • Tensorflow 1.14 and Tensorflow Text 0.6.0 (used for the tests and examples in this repo)
  • Tensorflow 1.15 and Tensorflow Text 1.15.x
  • Tensorflow 2.0 and Tensorflow Text 2.0.x

A list of available versions can be found on the Tensorflow Text GitHub repo. Note that for Tensorflow 2.0 you may need to disable eager execution with tf.compat.v1.disable_eager_execution().
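
For reference, here is a minimal setup sketch, assuming Tensorflow 2.0 and the ConveRT module URL listed below:

import tensorflow as tf
import tensorflow_hub as tfhub
import tensorflow_text  # registers the text ops; must be imported before loading the module

# Only needed on Tensorflow 2.0, where the TF1-style hub.Module API requires graph mode.
tf.compat.v1.disable_eager_execution()

module = tfhub.Module(
  "https://github.com/PolyAI-LDN/polyai-models/releases/download/v1.0/model.tar.gz")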

Models

ConveRT

This is the ConveRT dual-encoder model, which uses subword representations and lighter-weight, more efficient transformer-style blocks to encode text, as described in the ConveRT paper. It provides powerful representations for conversational data, and can also be used as a response ranker. The model costs under $100 to train from scratch, can be quantized to under 60MB, and is competitive with larger Transformer networks on conversational tasks. We share an unquantized version of the model, facilitating fine-tuning. Please get in touch if you are interested in using the quantized ConveRT model. The Tensorflow Hub URL is:

module = tfhub.Module("https://github.com/PolyAI-LDN/polyai-models/releases/download/v1.0/model.tar.gz")

See the convert-examples.ipynb notebook for some examples of how to use this model.

TFHub signatures

default

Takes as input sentences, a string tensor of sentences to encode. Outputs 1024-dimensional vectors, giving a representation for each sentence. These are the output of the sqrt-N reduction in the shared transformer encoder. These representations work well as input to classification models. Note that these vectors are not normalized in any way, so you may find that L2-normalizing them helps for learning, especially when using SGD.

sentence_encodings = module(
  ["hello how are you?", "what is your name?", "thank you good bye"])

encode_context

Takes as input contexts, a string tensor of contexts to encode. Outputs 512-dimensional vectors, giving the context representation of each input. These are trained to have a high cosine-similarity with the response representations of good responses (from the encode_response signature).

context_encodings = module(
  ["hello how are you?", "what is your name?", "thank you good bye"],
  signature="encode_context")

encode_response

Takes as input responses, a string tensor of responses to encode. Outputs 512-dimensional vectors, giving the response representation of each input. These are trained to have a high cosine-similarity with the context representations of good corresponding contexts (from the encode_context signature).

response_encodings = module(
  ["i am well", "I am Matt", "bye!"],
  signature="encode_response")

encode_sequence

Takes as input sentences, a string tensor of sentences to encode. This outputs sequence encodings, a 3-tensor of shape [batch_size, max_sequence_length, 512], as well as the corresponding subword tokens, a utf8-encoded matrix of shape [batch_size, max_sequence_length]. The tokens matrix is padded with empty strings, which may help in masking the sequence tensor. The encoder_utils.py library has a few functions for dealing with these tokenizations, including a detokenization function, and a function that infers byte spans in the original strings.

output = module(
  ["i am well", "I am Matt", "bye!"],
  signature="encode_sequence", as_dict=True)
sequence_encodings = output['sequence_encoding']
tokens = output['tokens']
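
Since the tokens matrix is padded with empty strings, a padding mask for the sequence encodings can be derived from it directly. A minimal sketch, assuming tensorflow is imported as tf and using the tensors from the snippet above:

# True where there is a real subword token, False at padded positions.
mask = tf.not_equal(tokens, "")
# Zero out the encodings at padded positions before any pooling or attention.
masked_encodings = sequence_encodings * tf.cast(
  tf.expand_dims(mask, -1), sequence_encodings.dtype)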

tokenize

Takes as input sentences, a string tensor of sentences to encode. This outputs the corresponding subword tokens, a utf8-encoded matrix of shape [batch_size, max_sequence_length]. The tokens matrix is padded with empty strings. Usually this process is internal to the network, but for some applications it may be useful to access the internal tokenization.

tokens = module(
  ["i am well", "I am Matt", "bye!"],
  signature="tokenize")

Multi-Context ConveRT

This is the multi-context ConveRT model from the ConveRT paper, which uses extra contexts from the conversational history to refine the context representations. This is an unquantized version of the model. The Tensorflow Hub URL is:

module = tfhub.Module("https://github.com/PolyAI-LDN/polyai-models/releases/download/v1.0/model_multicontext.tar.gz")

TFHub signatures

This model has the same signatures as the ConveRT encoder, except that the encode_context signature also takes the extra contexts as input. The extra contexts are the previous messages in the dialogue (typically at most 10) prior to the immediate context, and must be joined with spaces from most recent to oldest.

For example, consider the dialogue:

A: Hey!
B: Hello how are you?
A: Fine, strange weather recently right?
B: Yeah

then the context representation is computed as:

context = ["Yeah"]
extra_context = ["Fine, strange weather recently right? Hello how are you? Hey!"]
context_encodings = module(
  {
    'context': context,
    'extra_context': extra_context,
  },
  signature="encode_context",
)

See encoder_client.py for code that computes these features.
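
As an illustration only (a hypothetical helper, not the encoder_client.py implementation), the features for the example above could be built from a turn history like this:

def build_context_features(history):
    """history: dialogue turns, oldest first, e.g. ["Hey!", "Hello how are you?", ...]."""
    context = [history[-1]]
    # Previous turns (typically at most 10), joined with spaces from most recent to oldest.
    extra_context = [" ".join(reversed(history[:-1]))]
    return {"context": context, "extra_context": extra_context}

features = build_context_features(
    ["Hey!", "Hello how are you?", "Fine, strange weather recently right?", "Yeah"])
context_encodings = module(features, signature="encode_context")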

ConveRT finetuned on Ubuntu

This is the multi-context ConveRT model, fine-tuned to the DSTC7 Ubuntu response ranking task. It has the exact same signatures as the multi-context model, and has Tensorflow Hub URL https://github.com/PolyAI-LDN/polyai-models/releases/download/v1.0/model_ubuntu.tar.gz. Note that this model requires prefixing the extra context features with "0: ", "1: ", "2: ", etc.

The dstc7/evaluate_encoder.py script demonstrates using this encoder to reproduce the results from the ConveRT paper.
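
As a rough illustration of that prefixing convention (the index ordering is an assumption here; see dstc7/evaluate_encoder.py for the exact formatting):

previous_turns = ["Fine, strange weather recently right?", "Hello how are you?", "Hey!"]
# Assumption: "0: " marks the most recent extra context, "1: " the next, and so on.
extra_context = [" ".join(
    f"{i}: {turn}" for i, turn in enumerate(previous_turns))]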

Intent Detection Benchmarks

A set of intent detectors trained on top of ConveRT and other sentence encoders can be found in the intent_detection directory. These are the intent detectors presented in Efficient Intent Detection with Dual Sentence Encoders.

Keras layers

Keras layers for the above encoder models are implemented in encoder_layers.py. These may be useful for building a model that extends the encoder models, and/or fine-tuning them to your own data.

Encoder client

A Python class EncoderClient is implemented in encoder_client.py, which gives a simple interface for encoding sentences, contexts, and responses with the above models. It takes Python strings as input and returns numpy matrices as output:

client = encoder_client.EncoderClient(
    "http://models.poly-ai.com/convert/v1/model.tar.gz")

# We will find good responses to the following context.    
context_encodings = client.encode_contexts(["What's your name?"])

# Let's rank the following responses as candidates.
candidate_responses = ["No thanks.", "I'm Matt.", "Hey.", "I have a dog."]
response_encodings = client.encode_responses(candidate_responses)

# The scores are computed using the dot product.
scores = response_encodings.dot(context_encodings.T).flatten()

# Output the top scoring response.
top_idx = scores.argmax()
print(f"Best response: {candidate_responses[top_idx]}, score: {scores[top_idx]:.3f}")

# This should print "Best response: I'm Matt., score: 0.377".

Internally it implements caching, deduplication, and batching to help speed up encoding. Note that because it does the batching internally, you can pass very large lists of sentences to encode without running out of memory.

Citations

@article{Henderson2019convert,
    title={{ConveRT}: Efficient and Accurate Conversational Representations from Transformers},
    author={Matthew Henderson and I{\~{n}}igo Casanueva and Nikola Mrk\v{s}i\'{c} and Pei-Hao Su and Tsung-Hsien Wen and Ivan Vuli\'{c}},
    journal={CoRR},
    volume={abs/1911.03688},
    year={2019},
    url={http://arxiv.org/abs/1911.03688},
}

Development

Setting up an environment for development:

  • Create a python 3 virtual environment
python3 -m venv ./venv
  • Install the requirements
. venv/bin/activate
pip install -r requirements.txt
  • Run the unit tests
python -m unittest discover -p '*_test.py' .

Pull requests will trigger a CircleCI build that:

  • runs flake8 and isort
  • runs the unit tests