
zake7749 / Fill-the-GAP

Licence: other
[ACL-WS] 4th place solution to gendered pronoun resolution challenge on Kaggle

Programming Languages

Jupyter Notebook
11667 projects

Projects that are alternatives of or similar to Fill-the-GAP

WSDM-Cup-2019
[ACM-WSDM] 3rd place solution at WSDM Cup 2019, Fake News Classification on Kaggle.
Stars: ✭ 62 (+376.92%)
Mutual labels:  natural-language-inference, bert, natural-language-understanding
gender-unbiased BERT-based pronoun resolution
Source code for the ACL workshop paper and Kaggle competition by Google AI team
Stars: ✭ 42 (+223.08%)
Mutual labels:  kaggle, bert, coreference-resolution
banglabert
This repository contains the official release of the model "BanglaBERT" and associated downstream finetuning code and datasets introduced in the paper titled "BanglaBERT: Language Model Pretraining and Benchmarks for Low-Resource Language Understanding Evaluation in Bangla" accepted in Findings of the Annual Conference of the North American Chap…
Stars: ✭ 186 (+1330.77%)
Mutual labels:  natural-language-inference, bert
bert nli
A Natural Language Inference (NLI) model based on Transformers (BERT and ALBERT)
Stars: ✭ 97 (+646.15%)
Mutual labels:  natural-language-inference, bert
nlp-notebooks
A collection of natural language processing notebooks.
Stars: ✭ 19 (+46.15%)
Mutual labels:  natural-language-inference, natural-language-understanding
Bert As Service
Mapping a variable-length sentence to a fixed-length vector using BERT model
Stars: ✭ 9,779 (+75123.08%)
Mutual labels:  bert, natural-language-understanding
Mt Dnn
Multi-Task Deep Neural Networks for Natural Language Understanding
Stars: ✭ 1,871 (+14292.31%)
Mutual labels:  bert, natural-language-understanding
TextFeatureSelection
Python library for feature selection for text features. It has filter method, genetic algorithm and TextFeatureSelectionEnsemble for improving text classification models. Helps improve your machine learning models
Stars: ✭ 42 (+223.08%)
Mutual labels:  natural-language-inference, natural-language-understanding
label-studio-transformers
Label data using HuggingFace's transformers and automatically get a prediction service
Stars: ✭ 117 (+800%)
Mutual labels:  bert, natural-language-understanding
Nlp Recipes
Natural Language Processing Best Practices & Examples
Stars: ✭ 5,783 (+44384.62%)
Mutual labels:  natural-language-inference, natural-language-understanding
Discovery
Mining Discourse Markers for Unsupervised Sentence Representation Learning
Stars: ✭ 48 (+269.23%)
Mutual labels:  natural-language-inference, natural-language-understanding
Opencog
A framework for integrated Artificial Intelligence & Artificial General Intelligence (AGI)
Stars: ✭ 2,132 (+16300%)
Mutual labels:  natural-language-inference, natural-language-understanding
Tokenizers
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
Stars: ✭ 5,077 (+38953.85%)
Mutual labels:  bert, natural-language-understanding
classy
classy is a simple-to-use library for building high-performance Machine Learning models in NLP.
Stars: ✭ 61 (+369.23%)
Mutual labels:  bert, natural-language-understanding
Transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Stars: ✭ 55,742 (+428684.62%)
Mutual labels:  bert, natural-language-understanding
text2class
Multi-class text categorization using state-of-the-art pre-trained contextualized language models, e.g. BERT
Stars: ✭ 15 (+15.38%)
Mutual labels:  bert, natural-language-understanding
bert extension tf
BERT Extension in TensorFlow
Stars: ✭ 29 (+123.08%)
Mutual labels:  bert, natural-language-understanding
DiscEval
Discourse Based Evaluation of Language Understanding
Stars: ✭ 18 (+38.46%)
Mutual labels:  bert, natural-language-understanding
NSP-BERT
The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task —— Next Sentence Prediction"
Stars: ✭ 166 (+1176.92%)
Mutual labels:  natural-language-inference, bert
Gluon Nlp
NLP made easy
Stars: ✭ 2,344 (+17930.77%)
Mutual labels:  natural-language-inference, natural-language-understanding

Fill-the-GAP

This is the 4th place solution to the Gendered Pronoun Resolution competition on Kaggle.

Solution Overview

1. Input Dropout

I have experimented with BERT on other tasks and found that there is some redundancy in the BERT vectors: even when we use only a small portion (around 50%) of the BERT vector, we can still get satisfactory performance.

Based on this observation, I placed a dropout with a large rate just after the input layer. This can be seen as a kind of model boosting, similar to training several prototypes on subsets randomly sampled from the BERT vector.
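A minimal sketch of this idea, assuming a Keras-style model (the hidden size and dropout rate below are illustrative, not the repository's exact configuration):

```python
# Minimal sketch: a large dropout applied directly to pre-extracted BERT
# vectors for the two candidate names (A, B) and the pronoun (P).
# Dimensions and rate are assumptions for illustration.
from tensorflow.keras import layers

BERT_DIM = 1024  # e.g. BERT-large hidden size


def input_with_dropout(rate=0.6):
    inp_a = layers.Input(shape=(BERT_DIM,), name="bert_a")
    inp_b = layers.Input(shape=(BERT_DIM,), name="bert_b")
    inp_p = layers.Input(shape=(BERT_DIM,), name="bert_p")

    # Dropout right after the input layer: each training step effectively
    # sees a random subset of the BERT dimensions.
    drop = layers.Dropout(rate)
    return (inp_a, inp_b, inp_p), (drop(inp_a), drop(inp_b), drop(inp_p))
```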

2. Word Encoder

As mentioned in Section 1, using the BERT output directly may not be ideal because of this redundancy. Therefore, I use a word encoder to down-project the BERT vector into a lower-dimensional space where task-related features can be extracted efficiently.

The word encoder is a simple affine transformation with SELU activation, and it is shared for A, B, and P. I tried designing separate word encoders for names and pronouns, and making the word encoder deeper with highway transformations, but all of these resulted in overfitting.
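For concreteness, a minimal sketch of such a shared encoder under the same assumed Keras-style setup (the projection size is illustrative):

```python
from tensorflow.keras import layers


def shared_word_encoder(a, b, p, proj_dim=256):
    # A single affine transformation with SELU activation, reused for
    # A, B, and P so that its weights are shared across the three mentions.
    encoder = layers.Dense(proj_dim, activation="selu", name="word_encoder")
    return encoder(a), encoder(b), encoder(p)
```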

This idea is also inspired by multi-head transformations. I implemented a multi-head NLI encoder, but it only improved the performance by ~0.0005 while taking much more computation time, so a single head seems good enough for this task.

3. Answer selection using NLI architectures

I consider this task a sub-task of answer selection. Given queries A and B and an answer P, we can model the relations between queries and answers with a heuristic interaction:

I(Q, A) = [[Q; A], Q - A, Q * A]

and then extract features from the interaction vector I(Q, A) with a siamese encoder. The overall architecture would be like this:

[Figure: overall model architecture]
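A minimal sketch of the interaction and the siamese feature extractor, again assuming a Keras-style setup (the hidden size and the 3-way classifier for A / B / Neither are illustrative):

```python
from tensorflow.keras import layers


def interact(q, a):
    # I(Q, A) = [[Q; A], Q - A, Q * A]: concatenation, difference, and
    # element-wise product of a query and the answer.
    return layers.Concatenate()(
        [q, a, layers.Subtract()([q, a]), layers.Multiply()([q, a])]
    )


def nli_head(enc_a, enc_b, enc_p, hidden=128):
    # The same (siamese) encoder extracts features from both interactions.
    siamese = layers.Dense(hidden, activation="selu", name="nli_encoder")
    feat_a = siamese(interact(enc_a, enc_p))  # relation between A and P
    feat_b = siamese(interact(enc_b, enc_p))  # relation between B and P
    merged = layers.Concatenate()([feat_a, feat_b])
    # 3-way output: the pronoun refers to A, to B, or to neither.
    return layers.Dense(3, activation="softmax", name="output")(merged)
```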

Finally, here is a brief performance report of my models:

| Model                                 | 5-fold CV on Stage 1 |
|---------------------------------------|----------------------|
| Base BERT                             | 0.50                 |
| Base BERT + input dropout             | 0.45                 |
| Base BERT + input dropout + NLI       | 0.43                 |
| Base BERT + all                       | 0.39                 |
| Large BERT + input dropout            | 0.39                 |
| Large BERT + all                      | 0.32                 |
| Ensemble of Base BERT and Large BERT  | 0.30                 |

Note

The code is still being cleaned up, and some hacky methods remain as a trade-off between efficiency and scalability. For notebooks stage 0.1 ~ 0.6, it is not necessary to use a for loop to dump features from each layer; the official API supports dumping all of them at the same time.
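For illustration only, here is one way to obtain all hidden layers in a single forward pass with the HuggingFace transformers library; this library choice is an assumption for the example and not necessarily the API referred to above:

```python
# Illustrative sketch (assumes the HuggingFace transformers library, not
# necessarily the API used in the notebooks): get every hidden layer at once.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-large-uncased")
model = BertModel.from_pretrained("bert-large-uncased", output_hidden_states=True)

inputs = tokenizer("She told Mary that she would come.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Tuple with one tensor per layer (embeddings + every Transformer layer),
# so no per-layer loop is needed to dump the features.
all_layers = outputs.hidden_states
print(len(all_layers), all_layers[0].shape)
```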

Citation

If you find this repository useful for your research, please cite our paper:

@inproceedings{yang2019fill,
  title={Fill the GAP: Exploiting BERT for Pronoun Resolution},
  author={Yang, Kai-Chou and Niven, Timothy and Chou, Tzu Hsuan and Kao, Hung-Yu},
  booktitle={Proceedings of the First Workshop on Gender Bias in Natural Language Processing},
  pages={102--106},
  year={2019}
}