
Andrew03 / transformer-abstractive-summarization

Licence: other
Code for the paper "Efficient Adaptation of Pretrained Transformers for Abstractive Summarization"

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to transformer-abstractive-summarization

TS3000 TheChatBOT
It's a social networking chatbot trained on a Reddit dataset. It supports open-ended queries and is built on the concept of Neural Machine Translation. Beware, it can be sarcastic, just like its creator 😝. By the way, it uses the PyTorch framework and Python 3.
Stars: ✭ 20 (-70.59%)
Mutual labels:  pytorch-nlp
Awesome-Pytorch-Tutorials
Awesome Pytorch Tutorials
Stars: ✭ 23 (-66.18%)
Mutual labels:  pytorch-nlp
Text-Classification-LSTMs-PyTorch
The aim of this repository is to show a baseline model for text classification by implementing an LSTM-based model in PyTorch. To provide a better understanding of the model, it is trained on a Tweets dataset provided by Kaggle.
Stars: ✭ 45 (-33.82%)
Mutual labels:  pytorch-nlp
py-lingualytics
A text analytics library with support for codemixed data
Stars: ✭ 36 (-47.06%)
Mutual labels:  pytorch-nlp
Entity2Topic
[NAACL2018] Entity Commonsense Representation for Neural Abstractive Summarization
Stars: ✭ 20 (-70.59%)
Mutual labels:  document-summarization
Intelligent Document Finder
Document Search Engine Tool
Stars: ✭ 45 (-33.82%)
Mutual labels:  document-summarization
Pytorch Sentiment Analysis
Tutorials on getting started with PyTorch and TorchText for sentiment analysis.
Stars: ✭ 3,209 (+4619.12%)
Mutual labels:  pytorch-nlp
Pytorch Nlp
Basic Utilities for PyTorch Natural Language Processing (NLP)
Stars: ✭ 1,996 (+2835.29%)
Mutual labels:  pytorch-nlp
Pytorchdocs
Official PyTorch tutorials in Chinese, covering a 60-minute quick-start guide, in-depth tutorials, computer vision, natural language processing, generative adversarial networks, and reinforcement learning. Stars and forks welcome!
Stars: ✭ 1,705 (+2407.35%)
Mutual labels:  pytorch-nlp
Pytorch Seq2seq
Tutorials on implementing a few sequence-to-sequence (seq2seq) models with PyTorch and TorchText.
Stars: ✭ 3,418 (+4926.47%)
Mutual labels:  pytorch-nlp
pytorch-transformer-chatbot
A simple chitchat chatbot built with the Transformer API introduced in PyTorch v1.2.
Stars: ✭ 44 (-35.29%)
Mutual labels:  pytorch-nlp
nlp classification
Implementations of NLP classification papers with PyTorch and gluonnlp
Stars: ✭ 224 (+229.41%)
Mutual labels:  pytorch-nlp

Code for the paper "Efficient Adaptation of Pretrained Transformers for Abstractive Summarization"

Requirements

To run the training script in train.py, you will additionally need the following (an example install command is given after the list):

  • PyTorch (version >=0.4)
  • tqdm
  • pyrouge
  • newsroom
  • tensorflow (cpu version is ok)
  • nltk
  • spacy (and 'en' model)
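
A minimal install sketch, assuming each package name matches its PyPI name (the newsroom library may instead need to be installed from its GitHub repository):

pip install torch tqdm pyrouge newsroom tensorflow nltk spacy
python -m spacy download en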

You can download the weights of the OpenAI pre-trained version by cloning Alec Radford's repo and placing the model folder containing the pre-trained weights in the present repo.
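
For example, assuming the weights come from OpenAI's finetune-transformer-lm repository (the usual source of these pre-trained weights):

git clone https://github.com/openai/finetune-transformer-lm.git
cp -r finetune-transformer-lm/model ./model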

In order to run this code, you will need to pre-process the datasets with BPE using the scripts provided in the scripts directory.

Dataset Preprocessing

The training and evaluation scripts expect three output files in total: train_encoded.jsonl, val_encoded.jsonl, and test_encoded.jsonl.

CNN/Daily Mail

The data and splits used in the paper can be downloaded from OpenNMT. First, remove the start and end sentence tags using the sed command given at that link. To process the data, run the following command:

python scripts/encode_cnndm.py --src_file {source file} --tgt_file {target file} --out_file {output file}
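
For example (hypothetical file names; the sed expressions assume OpenNMT's <t> and </t> sentence tags):

sed -i 's/<t>//g; s/<\/t>//g' cnndm/train.txt.tgt.tagged
python scripts/encode_cnndm.py --src_file cnndm/train.txt.src --tgt_file cnndm/train.txt.tgt.tagged --out_file train_encoded.jsonl

Repeat for the validation and test splits to produce val_encoded.jsonl and test_encoded.jsonl.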

XSum

The data and splits used in the paper can be scraped with the XSum repository's scripts. Run its commands up through the "Extract text from HTML Files" section. To process the data, run the following command:

python scripts/encode_xsum.py --summary_dir {summary directory} --splits_file {split file} --train_file {train file} --val_file {val file} --test_file {test_file}
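
For example (hypothetical paths; the split file name follows the XSum repository's convention):

python scripts/encode_xsum.py \
  --summary_dir xsum-extracts-from-downloads \
  --splits_file XSum-TRAINING-DEV-TEST-SPLIT-90-5-5.json \
  --train_file train_encoded.jsonl \
  --val_file val_encoded.jsonl \
  --test_file test_encoded.jsonl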

Newsroom

The data and splits used in the paper can be downloaded from Newsroom. To process the data, run the following command:

python scripts/encode_newsroom.py --in_file {input split file} --out_file {output file}
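
For example (hypothetical paths), run the script once per split to produce the three expected files:

python scripts/encode_newsroom.py --in_file newsroom/train.jsonl --out_file train_encoded.jsonl
python scripts/encode_newsroom.py --in_file newsroom/dev.jsonl --out_file val_encoded.jsonl
python scripts/encode_newsroom.py --in_file newsroom/test.jsonl --out_file test_encoded.jsonl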

Training

To train a model, run the following command:

python train.py \
  --data_dir {directory containing encoded data} \
  --output_dir {name of folder to save data in} \
  --experiment_name {name of experiment to save data with} \
  --show_progress \
  --doc_model \
  --num_epochs_dat 10 \
  --num_epochs_ft 10 \
  --n_batch 16 \
  --accum_iter 4 \
  --use_pretrain

This trains the pre-trained document embedding model on the dataset for 10 epochs of domain-adaptive training followed by 10 epochs of fine-tuning. The model is trained with an effective batch size of 64, since the actual batch size is 16 and gradients are accumulated over 4 batches. The batch size must be divisible by the number of GPUs available. Training is currently optimized for multi-GPU usage and may not work on single-GPU machines.
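
As a generic illustration of how --accum_iter yields the effective batch size (this is not the repository's actual training loop; model, loader, optimizer, and loss_fn are placeholders):

def train_epoch(model, loader, optimizer, loss_fn, accum_iter=4):
    # With a per-step batch of 16 and accum_iter = 4, parameters are updated
    # once every 4 batches, i.e. with an effective batch size of 16 * 4 = 64.
    model.train()
    optimizer.zero_grad()
    for step, (inputs, targets) in enumerate(loader):
        loss = loss_fn(model(inputs), targets)
        (loss / accum_iter).backward()   # scale so accumulated gradients average over the effective batch
        if (step + 1) % accum_iter == 0:
            optimizer.step()             # one parameter update per accum_iter mini-batches
            optimizer.zero_grad()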

Evaluation

To evaluate a model, run the following command:

python evaluate.py \
  --data_file {path to encoded data file} \
  --checkpoint {checkpoint to load model weights from} \
  --beam {beam size to do beam search with} \
  --doc_model \
  --save_file {file to output results to} \
  --n_batch {batch size for evaluation, must be divisible by number of gpus}

This evaluates the document embedding model on the test set. Evaluation is currently optimized for multi-GPU usage and may not work on single-GPU machines. Since the evaluation script will leave out some examples if the number of data points isn't divisible by the number of GPUs, you may need to run the create_small_test.py script to recover the examples that were left out and aggregate the results at the end.
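
For example, with hypothetical values filled in for the placeholders:

python evaluate.py \
  --data_file test_encoded.jsonl \
  --checkpoint output/summarization/checkpoint_best.pt \
  --beam 4 \
  --doc_model \
  --save_file results.jsonl \
  --n_batch 16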
