
magic282 / Neusum

Code for the ACL 2018 paper "Neural Document Summarization by Jointly Learning to Score and Select Sentences"


Projects that are alternatives to or similar to NeuSum

Summarization Papers
Stars: ✭ 238 (+66.43%)
Mutual labels:  natural-language-processing, summarization
Pythonrouge
Python wrapper for evaluating summarization quality by ROUGE package
Stars: ✭ 155 (+8.39%)
Mutual labels:  natural-language-processing, summarization
Text summarization with tensorflow
Implementation of a seq2seq model for summarization of textual data. Demonstrated on amazon reviews, github issues and news articles.
Stars: ✭ 226 (+58.04%)
Mutual labels:  natural-language-processing, summarization
Textrank
TextRank implementation for Python 3.
Stars: ✭ 1,008 (+604.9%)
Mutual labels:  natural-language-processing, summarization
Nlp Papers
Papers and Book to look at when starting NLP 📚
Stars: ✭ 111 (-22.38%)
Mutual labels:  natural-language-processing, summarization
Pytextrank
Python implementation of TextRank for phrase extraction and summarization of text documents
Stars: ✭ 1,675 (+1071.33%)
Mutual labels:  natural-language-processing, summarization
Unified Summarization
Official code for the paper: A Unified Model for Extractive and Abstractive Summarization using Inconsistency Loss.
Stars: ✭ 114 (-20.28%)
Mutual labels:  natural-language-processing, summarization
Files2rouge
Calculating ROUGE score between two files (line-by-line)
Stars: ✭ 120 (-16.08%)
Mutual labels:  natural-language-processing, summarization
Zamia Ai
Free and open source A.I. system based on Python, TensorFlow and Prolog.
Stars: ✭ 133 (-6.99%)
Mutual labels:  natural-language-processing
Deeplearning.ai
Stars: ✭ 139 (-2.8%)
Mutual labels:  natural-language-processing
Awesome Ai Services
An overview of the AI-as-a-service landscape
Stars: ✭ 133 (-6.99%)
Mutual labels:  natural-language-processing
Cocoaai
🤖 The Cocoa Artificial Intelligence Lab
Stars: ✭ 134 (-6.29%)
Mutual labels:  natural-language-processing
Learn To Select Data
Code for Learning to select data for transfer learning with Bayesian Optimization
Stars: ✭ 140 (-2.1%)
Mutual labels:  natural-language-processing
Tokenizer
Fast and customizable text tokenization library with BPE and SentencePiece support
Stars: ✭ 132 (-7.69%)
Mutual labels:  natural-language-processing
Paper Survey
📚Survey of previous research and related works on machine learning (especially Deep Learning) in Japanese
Stars: ✭ 140 (-2.1%)
Mutual labels:  natural-language-processing
Scattertext Pydata
Notebooks for the Seattle PyData 2017 talk on Scattertext
Stars: ✭ 132 (-7.69%)
Mutual labels:  natural-language-processing
Uda
Unsupervised Data Augmentation (UDA)
Stars: ✭ 1,877 (+1212.59%)
Mutual labels:  natural-language-processing
Onnxt5
Summarization, translation, sentiment-analysis, text-generation and more at blazing speed using a T5 version implemented in ONNX.
Stars: ✭ 143 (+0%)
Mutual labels:  summarization
Nlpaug
Data augmentation for NLP
Stars: ✭ 2,761 (+1830.77%)
Mutual labels:  natural-language-processing
Kaggle Crowdflower
1st Place Solution for CrowdFlower Product Search Results Relevance Competition on Kaggle.
Stars: ✭ 1,708 (+1094.41%)
Mutual labels:  natural-language-processing

NeuSum

This repository contains code for the ACL 2018 paper "Neural Document Summarization by Jointly Learning to Score and Select Sentences"

About this code

PyTorch version: This code requires PyTorch v0.3.x.

Python version: This code requires Python 3.
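As a quick sanity check, the interpreter and framework versions can be verified like this (a sketch; the `python3` command name and the import checks are assumptions, not part of the repository):

```shell
# Check the Python major version (the code requires Python 3)
python3 -c "import sys; assert sys.version_info[0] == 3, 'Python 3 required'"

# Warn if PyTorch is missing or is not the 0.3.x series this code targets
python3 -c "import torch; assert torch.__version__.startswith('0.3')" \
  || echo "warning: this code targets PyTorch v0.3.x"
```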

How to run

Prepare the dataset and code

Make a folder for the code and data:

NEUSUM_HOME=~/workspace/neusum
mkdir -p $NEUSUM_HOME/code
cd $NEUSUM_HOME/code
git clone --recursive https://github.com/magic282/NeuSum.git

After preparation, the workspace looks like:

neusum
├── code
│   └── NeuSum
│       └── neusum_pt
│           ├── neusum
│           └── PyRouge
└── data
    └── cnndm
        ├── dev
        ├── glove
        ├── models
        └── train
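The data side of the tree above is not created by the clone; one way to set it up (a sketch reusing the same NEUSUM_HOME variable as before) is:

```shell
# Create the data layout shown in the tree above
NEUSUM_HOME=~/workspace/neusum
mkdir -p $NEUSUM_HOME/data/cnndm/dev \
         $NEUSUM_HOME/data/cnndm/glove \
         $NEUSUM_HOME/data/cnndm/models \
         $NEUSUM_HOME/data/cnndm/train
```

The preprocessed dataset, GloVe vectors, and trained models then go into the matching subdirectories.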

The paper used the CNN / Daily Mail dataset.

Background reading:

About the CNN Daily Mail Dataset
About the CNN Daily Mail Dataset 2

Setup the environment

Package Requirements:

nltk, numpy, pytorch

Warning: Older versions of NLTK have a bug in the PorterStemmer. Therefore, a fresh installation or update of NLTK is recommended.
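A quick way to confirm that the installed NLTK stems correctly (an assumed check, not part of the repository; a healthy PorterStemmer reduces "running" to "run"):

```shell
# Exercise NLTK's PorterStemmer; fall back to a hint if nltk is absent
python3 -c "from nltk.stem.porter import PorterStemmer; print(PorterStemmer().stem('running'))" \
  || echo "nltk not installed; try: pip install --upgrade nltk"
```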

A Docker image is also provided.

Docker image

docker pull magic282/pytorch:0.3.0

Run training

The file run.sh is an example; modify it to match your configuration.

Without Docker

bash $NEUSUM_HOME/code/NeuSum/neusum_pt/run.sh $NEUSUM_HOME/data/cnndm $NEUSUM_HOME/code/NeuSum/neusum_pt

With Docker

nvidia-docker run --rm -ti -v $NEUSUM_HOME:/workspace magic282/pytorch:0.3.0

Then, inside the container:

bash code/NeuSum/neusum_pt/run.sh /workspace/data/cnndm /workspace/code/NeuSum/neusum_pt