tanulsingh / Humour.ai Language Model That Can Crack Jokes

Language Model that makes you Laugh.

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Humour.ai Language Model That Can Crack Jokes

gdc
Code for the ICLR 2021 paper "A Distributional Approach to Controlled Text Generation"
Stars: ✭ 94 (+422.22%)
Mutual labels:  nlg, language-model
Pykaldi
A Python wrapper for Kaldi
Stars: ✭ 756 (+4100%)
Mutual labels:  language-model
Nlp Paper
NLP Paper
Stars: ✭ 484 (+2588.89%)
Mutual labels:  language-model
Keras Language Modeling
📖 Some language modeling tools for Keras
Stars: ✭ 666 (+3600%)
Mutual labels:  language-model
Albert pytorch
A Lite Bert For Self-Supervised Learning Language Representations
Stars: ✭ 539 (+2894.44%)
Mutual labels:  language-model
Simplenlg
Java API for Natural Language Generation. Originally developed by Ehud Reiter, of the University of Aberdeen's Department of Computing Science and co-founder of Arria NLG. This git repo is the official SimpleNLG version.
Stars: ✭ 708 (+3833.33%)
Mutual labels:  nlg
Practical Pytorch
Go to https://github.com/pytorch/tutorials - this repo is deprecated and no longer maintained
Stars: ✭ 4,329 (+23950%)
Mutual labels:  nlg
Nlg Eval
Evaluation code for various unsupervised automated metrics for Natural Language Generation.
Stars: ✭ 822 (+4466.67%)
Mutual labels:  nlg
Lightnlp
A deep learning framework for natural language processing based on PyTorch and torchtext.
Stars: ✭ 739 (+4005.56%)
Mutual labels:  language-model
Dl Nlp Readings
My Reading Lists of Deep Learning and Natural Language Processing
Stars: ✭ 656 (+3544.44%)
Mutual labels:  language-model
Kobert
Korean BERT pre-trained cased (KoBERT)
Stars: ✭ 591 (+3183.33%)
Mutual labels:  language-model
Deberta
The implementation of DeBERTa
Stars: ✭ 541 (+2905.56%)
Mutual labels:  language-model
Tc Bot
User Simulation for Task-Completion Dialogues
Stars: ✭ 733 (+3972.22%)
Mutual labels:  nlg
Ctcdecoder
Connectionist Temporal Classification (CTC) decoding algorithms: best path, prefix search, beam search and token passing. Implemented in Python.
Stars: ✭ 529 (+2838.89%)
Mutual labels:  language-model
Lm Lstm Crf
Empower Sequence Labeling with Task-Aware Language Model
Stars: ✭ 778 (+4222.22%)
Mutual labels:  language-model
Tokenizers
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
Stars: ✭ 5,077 (+28105.56%)
Mutual labels:  language-model
Awesome Bert Nlp
A curated list of NLP resources focused on BERT, attention mechanism, Transformer networks, and transfer learning.
Stars: ✭ 567 (+3050%)
Mutual labels:  language-model
Chatito
🎯🗯 Generate datasets for AI chatbots, NLP tasks, named entity recognition or text classification models using a simple DSL!
Stars: ✭ 678 (+3666.67%)
Mutual labels:  nlg
Chinese Electra
Pre-trained Chinese ELECTRA models
Stars: ✭ 830 (+4511.11%)
Mutual labels:  language-model
Chatbot cn
A chatbot for the finance and judicial domains (with some chit-chat ability). Its main modules include information extraction, NLU, NLG, and a knowledge graph; the front end is integrated via Django, and RESTful interfaces for the NLP and KG modules are already wrapped.
Stars: ✭ 791 (+4294.44%)
Mutual labels:  nlg

Humour.ai

I have seen a lot of people do cool projects in Computer Vision (the more hyped field), but I have hardly ever seen something good in NLP. After learning about transformers, I thought I should do something in NLP, so I fine-tuned GPT-2 with a language model head on short jokes scraped from Reddit.

Humour.ai tries to complete sentences in a humorous way, given some input words.

I tested my model on unseen words and sentences and got some really cool and surprising results.

The first one is really hilarious, given that the model doesn't know my name 😂😂

[Screenshots: sample outputs from the model, captioned "Language Model that can make you Laugh"]

Data

The first challenge for any machine learning project is getting data suited to the task. Fortunately, I didn't have to do much here: I found an awesome dataset on Kaggle consisting of short jokes scraped from Reddit, laid out in a tidy DataFrame.
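For reference, a minimal loading sketch follows. The file name `shortjokes.csv` and the `ID`/`Joke` column names are my assumptions about the Kaggle dataset's layout; adjust them to match your download.

```python
import pandas as pd

# Assumed layout of the Kaggle short-jokes dataset: a shortjokes.csv
# file with "ID" and "Joke" columns (adjust if your download differs).
df = pd.read_csv("shortjokes.csv")

print(df.shape)           # number of jokes and columns
print(df["Joke"].head())  # quick sanity check of the raw text
```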

Pre-Processing

OpenAI's GPT-2 is a Transformer-type architecture that uses the decoder half of the Transformer. It is well known for its language modelling ability, which is why I used it to build Humour.ai.

There are two ways the data can be presented to the model, depending on the objective you want to achieve:

  • Joke generator
  • Humorous Sentence Completion

Let's look at these two separately.

Joke Generation

In this task the model simply generates jokes, given the length of each joke and the number of jokes you want. We prepend 'JOKE:' to every joke in the DataFrame and append '<|endoftext|>' at the end of each joke, which tells the model where a joke ends. At inference time, we simply provide the number of jokes and the length of each, and the model prints out jokes based on what it has learned.
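As a rough sketch (reusing the assumed `shortjokes.csv` layout from the Data section), the formatting step might look like this:

```python
import pandas as pd

df = pd.read_csv("shortjokes.csv")  # assumed file/column names, see Data section

# Prepend 'JOKE:' and append GPT-2's end-of-text token so the model
# learns where each joke begins and ends.
df["text"] = "JOKE: " + df["Joke"].str.strip() + " <|endoftext|>"

print(df["text"].iloc[0])
```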

Humorous Sentence Completion

This is something new: a simple tweak to the task above. Here the model tries to complete a sentence in a humorous way, given input words it has never seen before.

For this task, I took only the jokes in the dataset that were of the question-answer type and started with Why, When, How, etc., and processed the data into this format:

<|soq|> question <|sep|> answer <|endoftext|>

It looks like the input to a question-answering system, except that the whole string is treated as one sequence, instead of assigning different token_type_ids to the question and the answer.
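A hedged sketch of this filtering and formatting step follows; the wh-word list and the single-question-mark heuristic are my assumptions, not necessarily the exact filter used in the repo.

```python
import pandas as pd

df = pd.read_csv("shortjokes.csv")  # assumed file/column names, see Data section

# Keep question-answer style jokes that start with a wh-word and contain
# exactly one '?', then flatten each into one training string.
is_qa = df["Joke"].str.match(r"(Why|When|How|What|Who|Where)\b", na=False) & (
    df["Joke"].str.count(r"\?") == 1
)

def to_training_string(joke: str) -> str:
    question, answer = joke.split("?", 1)
    return f"<|soq|> {question.strip()}? <|sep|> {answer.strip()} <|endoftext|>"

qa_texts = df.loc[is_qa, "Joke"].apply(to_training_string)
print(qa_texts.iloc[0])
```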

Model

I used the HuggingFace library for the GPT-2 model, and the whole codebase is written in PyTorch. I would be more than happy if someone took this model and wrote its equivalent in Keras/TF (that would be a good exercise). The modelling and inference code is easy to understand and self-explanatory if one reads the HuggingFace docs.
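To make the setup concrete, here is a minimal sketch of the fine-tuning and sampling steps with HuggingFace `transformers`. This is my reconstruction from the description above, not the repository's exact code; the special-token registration and the single toy example are assumptions.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
# Register the custom markers from the pre-processing step (assumed setup)
# and reuse the end-of-text token for padding.
tokenizer.add_special_tokens({"additional_special_tokens": ["<|soq|>", "<|sep|>"]})
tokenizer.pad_token = tokenizer.eos_token

model = GPT2LMHeadModel.from_pretrained("gpt2").to(device)
model.resize_token_embeddings(len(tokenizer))  # account for the new tokens

# One toy training example; in practice this comes from the DataFrame above.
texts = ["<|soq|> Why did the chicken cross the road? "
         "<|sep|> To get to the other side. <|endoftext|>"]
batch = tokenizer(texts, max_length=64, truncation=True,
                  padding="max_length", return_tensors="pt").to(device)

# For causal LM fine-tuning the labels are the input ids themselves; the
# library shifts them internally. (A fuller version would set padded label
# positions to -100 so they are ignored by the loss.)
loss = model(**batch, labels=batch["input_ids"]).loss
loss.backward()

# Inference: prompt with unseen words and sample a humorous completion.
model.eval()
prompt = tokenizer("<|soq|> Why do programmers", return_tensors="pt").to(device)
out = model.generate(**prompt, max_length=64, do_sample=True, top_k=50,
                     num_return_sequences=3, pad_token_id=tokenizer.eos_token_id)
for seq in out:
    print(tokenizer.decode(seq, skip_special_tokens=True))
```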

HyperParameters

I tested two batch sizes and two learning rates; the latter works better. Training the model for the second task (Humorous Sentence Completion) takes about 5 hours on GPUs; full timings are in the table below.

| Task | Batch Size | Max Len | Epochs | Learning Rate | Train Time on GPUs | Train Time on TPUs |
|------|-----------|---------|--------|---------------|--------------------|--------------------|
| Humorous Sentence Completion | 32 | 64 | 4 | 3e-5 | 4.5 hours | 2.5 hours |
| Humorous Sentence Completion | 16 | 64 | 4 | 2e-5 | 5.5 hours | 3 hours |
| Joke Generation | 32 | 64 | 4 | 3e-5 | 6.5 hours | 2.5 hours |
| Joke Generation | 16 | 64 | 4 | 2e-5 | 7.5 hours | 3 hours |
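To make these settings concrete, here is a sketch of an optimizer and scheduler set up from the first row of the table. The warmup-free linear schedule and the dataset size are assumptions on my part.

```python
import torch
from transformers import GPT2LMHeadModel, get_linear_schedule_with_warmup

EPOCHS, BATCH_SIZE, MAX_LEN, LR = 4, 32, 64, 3e-5  # first row of the table

model = GPT2LMHeadModel.from_pretrained("gpt2")

# Assumed dataset size; replace with len(train_dataset) in practice.
num_examples = 230_000
total_steps = (num_examples // BATCH_SIZE) * EPOCHS

optimizer = torch.optim.AdamW(model.parameters(), lr=LR)
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=0, num_training_steps=total_steps
)

# Typical step order inside the training loop:
#   loss.backward(); optimizer.step(); scheduler.step(); optimizer.zero_grad()
```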

End Notes

  • Feel free to fork, experiment, and play with the model. I have uploaded the code for the different tasks in different folders.
  • I will also be uploading trained weights so that anyone can load them and play with the model by just running the inference file.
  • I will be uploading the code for training on TPUs soon.