codertimo / KorQuAD-Question-Generation

Licence: other
question generation model with KorQuAD dataset

Programming Languages

python
Makefile


Question Generation (QG) Model with KorQuAD

This project builds a question generation (QG) model on top of the pretrained SKT-AI/KoGPT2 model. The QG model is trained on KorQuAD v1.0, a question answering dataset.

Usage

Data preparation

Download the KorQuAD v1.0 dataset, which is used for training, evaluation, and generation.

make prepare-dataset
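
KorQuAD v1.0 is distributed in the SQuAD v1 JSON layout, so each QA record can be flattened into a (context, answer, question) triple before it is fed to a QG model. The sketch below illustrates that step only; it is not this repository's loader. The names `load_dataset` and `iter_qg_examples` and the inline sample record are hypothetical, and the file path is whatever `make prepare-dataset` produced on your machine.

```python
import json
from typing import Dict, Iterator, Tuple


def load_dataset(path: str) -> Dict:
    """Read a SQuAD-style JSON file (hypothetical helper, path supplied by caller)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def iter_qg_examples(dataset: Dict) -> Iterator[Tuple[str, str, str]]:
    """Yield (context, answer, question) triples from SQuAD-style data."""
    for article in dataset["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                # Use the first reference answer; a record may list several.
                answer = qa["answers"][0]["text"]
                yield context, answer, qa["question"]


# Hypothetical record in the SQuAD v1 layout (illustrative, not real KorQuAD data).
sample = {
    "data": [
        {
            "title": "Seoul",
            "paragraphs": [
                {
                    "context": "Seoul is the capital of South Korea.",
                    "qas": [
                        {
                            "id": "example-0",
                            "question": "What is the capital of South Korea?",
                            "answers": [{"text": "Seoul", "answer_start": 0}],
                        }
                    ],
                }
            ],
        }
    ]
}

examples = list(iter_qg_examples(sample))
for context, answer, question in examples:
    print(f"context={context!r} answer={answer!r} question={question!r}")
```

For QG the roles are reversed relative to QA: the context and answer are the input, and the question is the target the model learns to produce.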

Training

The following command runs training.

python -m scripts.run_fine_tune --train-batch-size 16 --eval-batch-size 16 --epochs 5

Evaluation (perplexity on the dev set)

MODEL_PATH="artifacts/gpt2_xxxxxxxx/gpt2_step_x.pth"
python -m scripts.run_evaluation --model-path $MODEL_PATH --batch-size 50
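
Perplexity (PPL) is the exponential of the average per-token negative log-likelihood, so lower is better. The sketch below shows only that arithmetic. How `scripts.run_evaluation` collects and aggregates losses is not shown in this README, so the `perplexity` function and the sample losses are assumptions made for illustration.

```python
import math
from typing import Sequence


def perplexity(token_nlls: Sequence[float]) -> float:
    """Return exp(mean negative log-likelihood) over all scored tokens."""
    if not token_nlls:
        raise ValueError("need at least one token loss")
    return math.exp(sum(token_nlls) / len(token_nlls))


# Hypothetical per-token losses (natural log) gathered over a dev set.
# A model that assigns probability 1/4 to every target token has PPL of about 4.
losses = [math.log(4.0)] * 10
print(perplexity(losses))
```

Averaging should be done over tokens across the whole dev set rather than over per-batch perplexities, since the mean of exponentials is not the exponential of the mean.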

Question generation (generate questions for the dev set)

Decoding results

Question Generation POC spreadsheet: decoding results for the KorQuAD v1.0 dev set.

Decoding used beam search with a beam_size of 5.

MODEL_PATH="artifacts/gpt2_xxxxxxxx/gpt2_step_x.pth"
python -m scripts.run_generate --model-path $MODEL_PATH --output-path decoded.tsv
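
To make the beam-search setting concrete, here is a minimal sketch of the algorithm over a toy next-token table. Only the beam size of 5 comes from this README. `beam_search`, `toy_model`, and the `EOS` marker are hypothetical stand-ins, and the decoder in `scripts.run_generate` may differ (for example in length normalization or stopping rules).

```python
import math
from typing import Callable, Dict, List, Tuple

EOS = "</s>"  # hypothetical end-of-sequence marker
Hypothesis = Tuple[Tuple[str, ...], float]


def beam_search(
    next_log_probs: Callable[[Tuple[str, ...]], Dict[str, float]],
    beam_size: int = 5,
    max_len: int = 10,
) -> List[Hypothesis]:
    """Return hypotheses sorted by total log-probability, best first."""
    beams: List[Hypothesis] = [((), 0.0)]
    finished: List[Hypothesis] = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates: List[Hypothesis] = []
        for tokens, score in beams:
            for token, logp in next_log_probs(tokens).items():
                candidates.append((tokens + (token,), score + logp))
        # Keep only the beam_size highest-scoring expansions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_size]:
            if tokens[-1] == EOS:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off at max_len
    return sorted(finished, key=lambda c: c[1], reverse=True)


def toy_model(prefix: Tuple[str, ...]) -> Dict[str, float]:
    """Toy stand-in for a language model: fixed next-token log-probabilities."""
    table = {
        (): {"who": math.log(0.6), "what": math.log(0.4)},
        ("who",): {"wrote": math.log(0.5), "is": math.log(0.5)},
        ("what",): {"is": math.log(0.9), EOS: math.log(0.1)},
    }
    return table.get(prefix, {EOS: 0.0})


best_tokens, best_score = beam_search(toy_model, beam_size=5)[0]
greedy_tokens, greedy_score = beam_search(toy_model, beam_size=1)[0]
print(best_tokens, best_score)
print(greedy_tokens, greedy_score)
```

In this toy table, greedy decoding (`beam_size=1`) commits to the locally most likely first token and ends with a lower total score, while a wider beam keeps the alternative prefix alive long enough to find the higher-probability sequence.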

Download the trained QG model

Author

by Junseong Kim (Scatter Lab, Pingpong AI)
