
isomap / factedit

Licence: other
🧐 Code & Data for Fact-based Text Editing (Iso et al., ACL 2020)

Programming Languages

  • Python
  • Jsonnet

Projects that are alternatives of or similar to factedit

Accelerated Text
Accelerated Text is a no-code natural language generation platform. It will help you construct document plans which define how your data is converted to textual descriptions varying in wording and structure.
Stars: ✭ 256 (+1500%)
Mutual labels:  text-generation, natural-language-generation, nlg
Kenlg Reading
Reading list for knowledge-enhanced text generation, with a survey
Stars: ✭ 257 (+1506.25%)
Mutual labels:  text-generation, natural-language-generation, nlg
Awesome Nlg
A curated list of resources dedicated to Natural Language Generation (NLG)
Stars: ✭ 211 (+1218.75%)
Mutual labels:  natural-language-generation, nlg
PlanSum
[AAAI2021] Unsupervised Opinion Summarization with Content Planning
Stars: ✭ 25 (+56.25%)
Mutual labels:  text-generation, natural-language-generation
RapLyrics-Back
Model training, custom generative function and training for raplyrics.eu - A rap music lyrics generation project
Stars: ✭ 14 (-12.5%)
Mutual labels:  text-generation, nlg
awesome-nlg
A curated list of resources dedicated to Natural Language Generation (NLG)
Stars: ✭ 386 (+2312.5%)
Mutual labels:  natural-language-generation, nlg
Nlg Eval
Evaluation code for various unsupervised automated metrics for Natural Language Generation.
Stars: ✭ 822 (+5037.5%)
Mutual labels:  natural-language-generation, nlg
Entity2Topic
[NAACL2018] Entity Commonsense Representation for Neural Abstractive Summarization
Stars: ✭ 20 (+25%)
Mutual labels:  text-generation, natural-language-generation
uctf
Unsupervised Controllable Text Generation (Applied to text Formalization)
Stars: ✭ 19 (+18.75%)
Mutual labels:  natural-language-generation, nlg
Describing a knowledge base
Code for Describing a Knowledge Base
Stars: ✭ 42 (+162.5%)
Mutual labels:  text-generation, natural-language-generation
Paperrobot
Code for PaperRobot: Incremental Draft Generation of Scientific Ideas
Stars: ✭ 372 (+2225%)
Mutual labels:  text-generation, natural-language-generation
spring
SPRING is a seq2seq model for Text-to-AMR and AMR-to-Text (AAAI2021).
Stars: ✭ 103 (+543.75%)
Mutual labels:  natural-language-generation, data-to-text
Simplenlg
Java API for Natural Language Generation. Originally developed by Ehud Reiter of the University of Aberdeen's Department of Computing Science, co-founder of Arria NLG. This git repo is the official SimpleNLG version.
Stars: ✭ 708 (+4325%)
Mutual labels:  natural-language-generation, nlg
Practical Pytorch
Go to https://github.com/pytorch/tutorials - this repo is deprecated and no longer maintained
Stars: ✭ 4,329 (+26956.25%)
Mutual labels:  natural-language-generation, nlg
Gluon Nlp
NLP made easy
Stars: ✭ 2,344 (+14550%)
Mutual labels:  natural-language-generation, nlg
Question generation
Neural question generation using transformers
Stars: ✭ 356 (+2125%)
Mutual labels:  natural-language-generation, nlg
syntaxmaker
The NLG tool for Finnish
Stars: ✭ 19 (+18.75%)
Mutual labels:  natural-language-generation, nlg
Gumbel-CRF
Implementation of NeurIPS 20 paper: Latent Template Induction with Gumbel-CRFs
Stars: ✭ 51 (+218.75%)
Mutual labels:  text-generation, data-to-text
transformer-drg-style-transfer
This repository has scripts and Jupyter notebooks to perform all the different steps involved in the Delete, Retrieve, Generate approach for controlled text style transfer
Stars: ✭ 97 (+506.25%)
Mutual labels:  text-generation, nlg
question generator
An NLP system for generating reading comprehension questions
Stars: ✭ 188 (+1075%)
Mutual labels:  natural-language-generation, nlg

Fact-based Text Editing


Code and Datasets for Fact-based Text Editing (Iso et al., ACL 2020).

Dataset

Datasets are created from publicly available table-to-text datasets. The dataset created from "webnlg" is referred to as the "webedit" data, and the dataset created from "rotowire(-modified)" as the "rotoedit" data.

To extract the data, run tar -jxvf webedit.tar.bz2 to form a webedit/ directory (and similarly for rotoedit.tar.bz2).
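The extracted datasets are JSON Lines files (one JSON object per line, e.g. webedit/dev.jsonl). As a minimal sketch of how to inspect one instance, the key names below ("triples", "draft", "revised") are illustrative assumptions; check a real line after extracting the archives to confirm the actual schema:

```python
import json

# One line of a *.jsonl file is one editing instance stored as JSON.
# The keys here are hypothetical stand-ins for the real schema.
sample_line = json.dumps({
    "triples": [["Alan_Bean", "occupation", "Test_pilot"]],
    "draft":   "Alan Bean is an engineer .".split(),
    "revised": "Alan Bean is a test pilot .".split(),
})

example = json.loads(sample_line)
print(sorted(example.keys()))   # ['draft', 'revised', 'triples']
print(example["triples"][0])    # ['Alan_Bean', 'occupation', 'Test_pilot']
```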

Model overview

The model, which we call FactEditor, consists of three components: a buffer for storing the draft text and its representations, a stream for storing the revised text and its representations, and a memory for storing the triples and their representations.

FactEditor scans the text in the buffer, copies parts of the text from the buffer into the stream if they are described by the triples in the memory, deletes parts of the text if they are not mentioned in the triples, and inserts new parts of text into the stream that are present only in the triples.
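To illustrate just that control flow (not the neural model itself), a toy rule-based version of the buffer-to-stream loop might look like the following; the capitalization heuristic for spotting entity tokens and the example sentence are purely hypothetical:

```python
# Toy sketch of FactEditor's copy/delete/insert behaviour over a draft.
# Tokens licensed by a triple are copied into the stream, unsupported
# entity-like tokens are deleted, and objects mentioned only in the
# triples are inserted in their place.

def edit(draft, triples):
    licensed = {t for triple in triples for t in triple}
    # Objects the triples mention but the draft does not.
    missing = [obj for _, _, obj in triples if obj not in draft]
    stream = []
    for token in draft:  # scan the buffer left to right
        if token[0].isupper() and token not in licensed:
            # Delete the unsupported token and insert pending facts here.
            stream.extend(missing)
            missing = []
        else:
            stream.append(token)  # copy the token into the stream
    stream.extend(missing)  # any facts still not realized
    return stream

draft = ["Alan", "is", "an", "Engineer", "."]
triples = [("Alan", "occupation", "Astronaut")]
print(" ".join(edit(draft, triples)))  # Alan is an Astronaut .
```

The real model predicts these actions with learned representations of the buffer, stream, and triple memory rather than string matching.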

Usage

Dependencies

  • The code was written for Python 3.X and requires AllenNLP.
  • Dependencies can be installed using requirements.txt.

Training

Set your config file path and serialization directory as environment variables:

export CONFIG=<path to the config file>
export SERIALIZATION_DIR=<path to the serialization_dir>

Then you can train FactEditor:

allennlp train $CONFIG \
            -s $SERIALIZATION_DIR \
            --include-package editor

For example, the following is a sample command for training the model on the WebEdit dataset:

allennlp train config/webedit.jsonnet \
            -s models/webedit \
            --include-package editor 

Decoding

Set the dataset you want to decode and the model checkpoint you want to use as environment variables:

export INPUT_FILE=<path to the dev/test file>
export ARCHIVE_FILE=<path to the model archive file>

Then you can decode with FactEditor:

python predict.py $INPUT_FILE \
                  $ARCHIVE_FILE \
                  --cuda_device -1

To run on a GPU, use --cuda_device 0 (or another CUDA device ID).

To run the model with a pretrained checkpoint on the development set of the WebEdit data:

python predict.py ./data/webedit/dev.jsonl \
                  ./models/webedit.tar.gz \
                  --cuda_device -1

References

@InProceedings{iso2020fact,
    author = {Iso, Hayate and
              Qiao, Chao and
              Li, Hang},
    title = {Fact-based Text Editing},
    booktitle = {Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL)},
    pages = {171--182},
    year = {2020}
}