EdinburghNLP / Xsum

Licence: MIT

Topic-Aware Convolutional Neural Networks for Extreme Summarization

Extreme Summarization

This repository contains data and code for our EMNLP 2018 paper "Don't Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization". Please contact me at [email protected] with any questions.

Please cite this paper if you use our code or data.

@InProceedings{xsum-emnlp,
  author =      "Shashi Narayan and Shay B. Cohen and Mirella Lapata",
  title =       "Don't Give Me the Details, Just the Summary! {T}opic-Aware Convolutional Neural Networks for Extreme Summarization",
  booktitle =   "Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing",
  year =        "2018",
  address =     "Brussels, Belgium",
}

Extreme Summarization (XSum) dataset

You can always build the dataset using the instructions below. The original dataset is also available upon request.

Instructions to download and preprocess the extreme summarization dataset are here.

Looking for a Running Demo of Our System?

A running demo of our abstractive system can be found here.

Pretrained models and Test Predictions (Narayan et al., EMNLP 2018)

Pretrained ConvS2S model and dictionary files (1.1GB)
Pretrained Topic-ConvS2S model and dictionary files (1.2GB)
Pretrained Gensim LDA model (200MB)
Our model Predictions
Human Evaluation Data

Topic-Aware Convolutional Model for Extreme Summarization

This repository contains PyTorch code for our Topic-ConvS2S model. Our code builds on an earlier version of the Facebook AI Research Sequence-to-Sequence Toolkit (fairseq).

We also release the code for the ConvS2S model. It uses optimized hyperparameters for extreme summarization. Our release facilitates the replication of our experiments, such as training from scratch or predicting with released pretrained models, as reported in the paper.

Installation

Our code requires PyTorch version 0.4.0 or 0.4.1. Please follow the instructions here: https://github.com/pytorch/pytorch#installation.
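Before installing the two packages, it may help to confirm that the installed PyTorch matches one of the two supported versions. The sketch below is a minimal illustration; the helper name is_supported_pytorch is hypothetical and not part of this repository.

```python
# Hypothetical helper (not part of the repository): checks a PyTorch
# version string against the two versions named in this README.
SUPPORTED_VERSIONS = ("0.4.0", "0.4.1")


def is_supported_pytorch(version: str) -> bool:
    """Return True if `version` names PyTorch 0.4.0 or 0.4.1."""
    # Drop a local build suffix such as "+cu90", then keep only the
    # major.minor.patch components (so "0.4.1.post2" becomes "0.4.1").
    base = version.split("+", 1)[0]
    core = ".".join(base.split(".")[:3])
    return core in SUPPORTED_VERSIONS
```

With PyTorch installed, this could be called as is_supported_pytorch(torch.__version__).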

After PyTorch is installed, you can install ConvS2S and Topic-ConvS2S:

# Install ConvS2S
cd ./XSum-ConvS2S
pip install -r requirements.txt
python setup.py build
python setup.py develop

# Install Topic-ConvS2S
cd ../XSum-Topic-ConvS2S
pip install -r requirements.txt
python setup.py build
python setup.py develop

Training a New Model

Data Preprocessing

We partition the extracted dataset into training, development and test sets. The input document is truncated to 400 tokens and the length of the summary is limited to 90 tokens. Both document and summary files are lowercased.
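The truncation and lowercasing described above can be sketched roughly as follows. This is only an illustration, assuming the text is already tokenized and tokens are separated by whitespace; the function name and constants are hypothetical, and the preprocessing scripts in this repository may differ in detail.

```python
# Illustrative sketch only; assumes whitespace-separated tokens.
MAX_DOC_TOKENS = 400      # document length limit stated in this README
MAX_SUMMARY_TOKENS = 90   # summary length limit stated in this README


def truncate_and_lowercase(text: str, max_tokens: int) -> str:
    """Lowercase `text` and keep at most `max_tokens` whitespace tokens."""
    tokens = text.lower().split()
    return " ".join(tokens[:max_tokens])


document = truncate_and_lowercase("The Quick Brown Fox " * 200, MAX_DOC_TOKENS)
summary = truncate_and_lowercase("A Fox Runs .", MAX_SUMMARY_TOKENS)
```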

ConvS2S

python scripts/xsum-preprocessing-convs2s.py

It generates the following files in the "data-convs2s" directory:

train.document and train.summary
validation.document and validation.summary
test.document and test.summary

Lines in document and summary files are paired as (input document, corresponding output summary).
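Because the pairing is purely by line position, a quick sanity check is to read both files in parallel and confirm that the line counts match. The sketch below is one way to do that; pair_lines is a hypothetical helper, not a script from this repository.

```python
from typing import Iterable, List, Tuple


def pair_lines(documents: Iterable[str],
               summaries: Iterable[str]) -> List[Tuple[str, str]]:
    """Pair the i-th document line with the i-th summary line."""
    docs = [line.rstrip("\n") for line in documents]
    sums = [line.rstrip("\n") for line in summaries]
    if len(docs) != len(sums):
        raise ValueError(
            "line count mismatch: %d documents vs %d summaries"
            % (len(docs), len(sums))
        )
    return list(zip(docs, sums))


# Usage with the generated files (open file objects iterate by line):
# with open("data-convs2s/train.document", encoding="utf-8") as d, \
#      open("data-convs2s/train.summary", encoding="utf-8") as s:
#     pairs = pair_lines(d, s)
```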

TEXT=./data-convs2s
python XSum-ConvS2S/preprocess.py --source-lang document --target-lang summary --trainpref $TEXT/train --validpref $TEXT/validation --testpref $TEXT/test --destdir ./data-convs2s-bin --joined-dictionary --nwordstgt 50000 --nwordssrc 50000

This will create binarized data that will be used for model training. It also generates source and target dictionary files. In this case, both files are identical (due to "--joined-dictionary") and have 50000 tokens.
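To illustrate what a joined dictionary means, the sketch below counts tokens over both the document and summary sides and keeps the most frequent ones in a single shared vocabulary. It is a simplified illustration under the assumption of whitespace tokenization; build_joined_vocab is a hypothetical name, and the real preprocess.py also handles special symbols and binarization, which are omitted here.

```python
from collections import Counter
from typing import Iterable, List


def build_joined_vocab(source_lines: Iterable[str],
                       target_lines: Iterable[str],
                       max_words: int = 50000) -> List[str]:
    """Build one vocabulary shared by the source and target sides."""
    counts = Counter()
    for line in source_lines:
        counts.update(line.split())
    for line in target_lines:
        counts.update(line.split())
    # most_common sorts by descending frequency and truncates to max_words.
    return [word for word, _ in counts.most_common(max_words)]


vocab = build_joined_vocab(["a b a"], ["b c b"], max_words=2)
```

Since both sides draw from the same counts, the source and target dictionaries come out identical, which matches the behaviour described above.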

Topic-ConvS2S

python scripts/xsum-preprocessing-topic-convs2s.py

It generates the following files in the "data-topic-convs2s" directory:

train.document, train.summary, train.document-lemma and train.doc-topics
validation.document, validation.summary, validation.document-lemma and validation.doc-topics
test.document, test.summary, test.document-lemma and test.doc-topics

Lines in document, summary, document-lemma and doc-topics files are paired as (input document, output summary, input lemmatized document, document topic vector).

TEXT=./data-topic-convs2s
python XSum-Topic-ConvS2S/preprocess.py --source-lang document --target-lang summary --trainpref $TEXT/train --validpref $TEXT/validation --testpref $TEXT/test --destdir ./data-topic-convs2s --joined-dictionary --nwordstgt 50000 --nwordssrc 50000 --output-format raw

This will generate source and target dictionary files. In this case, both files are identical (due to "--joined-dictionary") and have 50000 tokens. It operates on the raw format data.

Model Training

By default, the code will use all available GPUs on your machine. We have used the CUDA_VISIBLE_DEVICES environment variable to select specific GPU(s).
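When launching training from a Python wrapper rather than the shell, the same GPU selection can be made by setting the variable in the child process environment. The sketch below assumes such a wrapper; env_with_gpus is a hypothetical helper and not part of this repository.

```python
import os
from typing import Dict, Iterable, Optional


def env_with_gpus(gpu_ids: Iterable[int],
                  base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment restricted to the given GPU ids."""
    env = dict(os.environ if base_env is None else base_env)
    # CUDA reads a comma-separated list of device indices from this variable.
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


# Usage (equivalent to prefixing the command with CUDA_VISIBLE_DEVICES=1):
# import subprocess
# subprocess.run(["python", "XSum-ConvS2S/train.py", "./data-convs2s-bin"],
#                env=env_with_gpus([1]))
```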

ConvS2S

CUDA_VISIBLE_DEVICES=1 python XSum-ConvS2S/train.py ./data-convs2s-bin --source-lang document --target-lang summary --max-sentences 32 --arch fconv --criterion label_smoothed_cross_entropy --max-epoch 200 --clip-norm 0.1 --lr 0.10 --dropout 0.2 --save-dir ./checkpoints-convs2s --no-progress-bar --log-interval 10

Topic-ConvS2S

CUDA_VISIBLE_DEVICES=1 python XSum-Topic-ConvS2S/train.py ./data-topic-convs2s --source-lang document --target-lang summary --doctopics doc-topics --max-sentences 32 --arch fconv --criterion label_smoothed_cross_entropy --max-epoch 200 --clip-norm 0.1 --lr 0.10 --dropout 0.2 --save-dir ./checkpoints-topic-convs2s --no-progress-bar --log-interval 10
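Both training commands pass --clip-norm 0.1. Assuming this flag bounds the global L2 norm of the gradients, as gradient-norm clipping usually does, the effect can be sketched in plain Python on a flat list of gradient values. This is an illustration of the general technique, not the code used by train.py.

```python
import math
from typing import List


def clip_by_global_norm(grads: List[float], max_norm: float) -> List[float]:
    """Rescale `grads` so that their L2 norm does not exceed `max_norm`."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm or total_norm == 0.0:
        # Already within the bound: leave the gradients unchanged.
        return list(grads)
    scale = max_norm / total_norm
    return [g * scale for g in grads]


# A gradient with norm 5.0 is scaled down to norm 0.1.
clipped = clip_by_global_norm([3.0, 4.0], max_norm=0.1)
```

The direction of the gradient is preserved; only its magnitude is capped.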

Generation with Pre-trained Models

ConvS2S

CUDA_VISIBLE_DEVICES=1 python XSum-ConvS2S/generate.py ./data-convs2s --path ./checkpoints-convs2s/checkpoint_best.pt --batch-size 1 --beam 10 --replace-unk --source-lang document --target-lang summary > test-output-convs2s-checkpoint-best.pt

Make sure that ./data-convs2s also has the source and target dictionary files.

Topic-ConvS2S

CUDA_VISIBLE_DEVICES=1 python XSum-Topic-ConvS2S/generate.py ./data-topic-convs2s --path ./checkpoints-topic-convs2s/checkpoint_best.pt --batch-size 1 --beam 10 --replace-unk --source-lang document --target-lang summary --doctopics doc-topics --encoder-embed-dim 512 > test-output-topic-convs2s-checkpoint-best.pt

Make sure that ./data-topic-convs2s has the test files to decode as well as the source and target dictionary files.

Extract final hypothesis

python scripts/extract-hypothesis-fairseq.py -o test-output-convs2s-checkpoint-best.pt -f final-test-output-convs2s-checkpoint-best.pt
python scripts/extract-hypothesis-fairseq.py -o test-output-topic-convs2s-checkpoint-best.pt -f final-test-output-topic-convs2s-checkpoint-best.pt
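The extraction script itself is not shown here. As a rough sketch of what such a step might do, the code below assumes the generation output contains hypothesis lines of the form "H-<id><TAB><score><TAB><text>", as fairseq's generate.py typically prints, and returns the hypothesis texts ordered by sample id. The function name and the assumed line format are illustrative and may not match scripts/extract-hypothesis-fairseq.py.

```python
from typing import Dict, Iterable, List


def extract_hypotheses(lines: Iterable[str]) -> List[str]:
    """Collect hypothesis texts from generation output, ordered by id."""
    hypotheses: Dict[int, str] = {}
    for line in lines:
        # Assumed format: "H-<id>\t<score>\t<hypothesis text>"
        if not line.startswith("H-"):
            continue
        parts = line.rstrip("\n").split("\t", 2)
        if len(parts) < 3:
            continue
        sample_id = int(parts[0][2:])
        hypotheses[sample_id] = parts[2]
    return [hypotheses[i] for i in sorted(hypotheses)]


sample_output = [
    "S-1\tsecond source document\n",
    "H-1\t-0.52\tsecond summary\n",
    "S-0\tfirst source document\n",
    "H-0\t-0.31\tfirst summary\n",
]
result = extract_hypotheses(sample_output)
```

Sorting by id restores the order of the test file, since batched decoding may emit samples out of order.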
