gentaiscool / few-shot-lm

License: other
The source code of "Language Models are Few-shot Multilingual Learners" (MRL @ EMNLP 2021)

Programming Languages

python
139335 projects - #7 most used programming language
shell
77523 projects

Projects that are alternatives to, or similar to, few-shot-lm

Trankit
Trankit is a Light-Weight Transformer-based Python Toolkit for Multilingual Natural Language Processing
Stars: ✭ 311 (+871.88%)
Mutual labels:  multilingual, language-model
Transferlearning
Transfer learning / domain adaptation / domain generalization / multi-task learning, etc. Papers, code, datasets, applications, tutorials.
Stars: ✭ 8,481 (+26403.13%)
Mutual labels:  few-shot, few-shot-learning
Tokenizers
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
Stars: ✭ 5,077 (+15765.63%)
Mutual labels:  gpt, language-model
minGPT-TF
A minimal TF2 re-implementation of the OpenAI GPT training
Stars: ✭ 36 (+12.5%)
Mutual labels:  gpt, language-model
gpt-j-api
API for the GPT-J language model 🦜, including a FastAPI backend and a Streamlit frontend
Stars: ✭ 248 (+675%)
Mutual labels:  gpt, language-model
FSL-Mate
FSL-Mate: A collection of resources for few-shot learning (FSL).
Stars: ✭ 1,346 (+4106.25%)
Mutual labels:  few-shot, few-shot-learning
Black-Box-Tuning
ICML'2022: Black-Box Tuning for Language-Model-as-a-Service
Stars: ✭ 99 (+209.38%)
Mutual labels:  language-model, few-shot-learning
lowshot-shapebias
Learning low-shot object classification with explicit shape bias learned from point clouds
Stars: ✭ 37 (+15.63%)
Mutual labels:  few-shot, few-shot-learning
MeTAL
Official PyTorch implementation of "Meta-Learning with Task-Adaptive Loss Function for Few-Shot Learning" (ICCV2021 Oral)
Stars: ✭ 24 (-25%)
Mutual labels:  few-shot-learning
react-translator-component
React language translation module for developing a multilingual project.
Stars: ✭ 13 (-59.37%)
Mutual labels:  multilingual
finetuner
Finetuning any DNN for better embedding on neural search tasks
Stars: ✭ 442 (+1281.25%)
Mutual labels:  few-shot-learning
DataAugmentationNMT
Data Augmentation for Neural Machine Translation
Stars: ✭ 26 (-18.75%)
Mutual labels:  language-model
kwx
BERT, LDA, and TFIDF based keyword extraction in Python
Stars: ✭ 33 (+3.13%)
Mutual labels:  multilingual
ke-dialogue
KE-Dialogue: Injecting a knowledge graph into a fully end-to-end dialogue system.
Stars: ✭ 39 (+21.88%)
Mutual labels:  gpt
Hands-On-Deep-Learning-Algorithms-with-Python
Hands-On Deep Learning Algorithms with Python, By Packt
Stars: ✭ 76 (+137.5%)
Mutual labels:  few-shot-learning
academic
Jekyll theme with a focus on simplicity, typography and flexibility
Stars: ✭ 71 (+121.88%)
Mutual labels:  multilingual
Meta-TTS
Official repository of https://arxiv.org/abs/2111.04040v1
Stars: ✭ 69 (+115.63%)
Mutual labels:  few-shot-learning
python-arpa
🐍 Python library for n-gram models in ARPA format
Stars: ✭ 35 (+9.38%)
Mutual labels:  language-model
MemoPainter-PyTorch
An unofficial implementation of MemoPainter (Coloring With Limited Data: Few-shot Colorization via Memory Augmented Networks) using the PyTorch framework.
Stars: ✭ 63 (+96.88%)
Mutual labels:  few-shot-learning
CDFSL-ATA
[IJCAI 2021] Cross-Domain Few-Shot Classification via Adversarial Task Augmentation
Stars: ✭ 21 (-34.37%)
Mutual labels:  few-shot-learning

Language Models are Few-shot Multilingual Learners

License: MIT

Paper

This is the source code of the paper above ([Arxiv] [ACL Anthology]). The code is written in PyTorch. If you use the source code or datasets included in this toolkit in your work, please cite the following paper:

@inproceedings{winata-etal-2021-language,
    title = "Language Models are Few-shot Multilingual Learners",
    author = "Winata, Genta Indra  and
      Madotto, Andrea  and
      Lin, Zhaojiang  and
      Liu, Rosanne  and
      Yosinski, Jason  and
      Fung, Pascale",
    booktitle = "Proceedings of the 1st Workshop on Multilingual Representation Learning",
    month = nov,
    year = "2021",
    address = "Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.mrl-1.1",
    pages = "1--15",
}
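
For intuition, the few-shot setup studied in the paper amounts to scoring each verbalized label by the language model's log-probability given a prompt built from a handful of labeled examples (cf. the --use_log_prob flag below). The following is a minimal sketch of that idea with Hugging Face Transformers; the model, prompt format, and label verbalizers are illustrative stand-ins, not the exact ones used in evaluate.py.

# Minimal sketch: few-shot classification by log-probability scoring.
# The prompt format and labels below are illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # small stand-in LM
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

few_shot = (
    "sentence: play some jazz music\nintent: play music\n"
    "sentence: will it rain tomorrow\nintent: get weather\n"
)
query = "sentence: book a table for two\nintent:"
labels = [" play music", " get weather", " book restaurant"]

def label_log_prob(prompt, label):
    # Sum the log-probabilities of the label tokens given the prompt.
    full = tokenizer(prompt + label, return_tensors="pt")
    n_prompt = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
    with torch.no_grad():
        log_probs = torch.log_softmax(model(**full).logits, dim=-1)
    ids = full.input_ids[0]
    return sum(
        log_probs[0, pos - 1, ids[pos]].item()
        for pos in range(n_prompt, ids.shape[0])
    )

scores = {l: label_log_prob(few_shot + query, l) for l in labels}
print(max(scores, key=scores.get))  # highest-scoring label wins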


Setup Environment

GPU Machine

pip install -r requirements.txt

GPU Machine for Running GPT-J 6B Model

apt install zstd

# the "slim" version contain only bf16 weights and no optimizer parameters, which minimizes bandwidth and memory
wget -c https://the-eye.eu/public/AI/GPT-J-6B/step_383500_slim.tar.zstd

tar -I zstd -xf step_383500_slim.tar.zstd

pip install -r mesh-transformer-jax/requirements.txt

# jax 0.2.12 is required due to a regression with xmap in 0.2.13
pip install mesh-transformer-jax/ jax==0.2.12

# replace cuda101 with your CUDA version, i.e. cuda[your_cuda_version]
pip install jaxlib==0.1.67+cuda101 -f https://storage.googleapis.com/jax-releases/jax_releases.html
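
After installing, a quick sanity check (not part of the original instructions) confirms that the pinned jax/jaxlib build actually sees the GPU:

# Run in Python: verify the jax version and visible devices.
import jax
print(jax.__version__)  # expect 0.2.12, per the pin above
print(jax.devices())    # should list GPU devices, not only the CPU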

How to run

Zero-shot Cross-task

❱❱❱ CUDA_VISIBLE_DEVICES=0 python evaluate.py  --dataset snips --model_checkpoint facebook/bart-large-mnli --cuda --length 5 --label_type value --src_lang en --tgt_lang en --seed 42 --use_log_prob --use_confidence --is_cross_task
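
The facebook/bart-large-mnli checkpoint casts classification as natural-language inference: each candidate label is scored as a hypothesis entailed by the input. As a point of reference, the same mechanism is exposed by the Hugging Face zero-shot pipeline (this is separate from evaluate.py, whose prompts and scoring options differ):

# Illustrative only: zero-shot intent classification via NLI entailment,
# the mechanism behind facebook/bart-large-mnli in the command above.
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
result = classifier(
    "play the latest album by coldplay",
    candidate_labels=["play music", "get weather", "book restaurant"],
)
print(result["labels"][0], result["scores"][0])  # top label and its score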

Finetune

❱❱❱ CUDA_VISIBLE_DEVICES=0 python finetune.py  --dataset snips --model_checkpoint bert-base-multilingual-uncased --cuda --label_type value --src_lang en --tgt_lang en --seed 42 
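
finetune.py handles dataset loading, training, and evaluation internally; its core step is standard sequence-classification fine-tuning of mBERT. A bare-bones sketch of that step on toy data (the examples and label set below are made up for illustration):

# Bare-bones fine-tuning of bert-base-multilingual-uncased on toy data;
# finetune.py adds real datasets, batching, and evaluation on top of this.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-uncased", num_labels=2
)

texts = ["play some jazz", "will it rain tomorrow"]  # toy utterances
labels = torch.tensor([0, 1])  # 0 = play_music, 1 = get_weather (made up)

batch = tokenizer(texts, padding=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

model.train()
for _ in range(3):  # a few toy steps
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    print(float(loss))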