
isomap / factedit

Licence: other
🧐 Code & Data for Fact-based Text Editing (Iso et al., ACL 2020)

Programming Languages

  • Python
  • Jsonnet

Projects that are alternatives of or similar to factedit

Accelerated Text
Accelerated Text is a no-code natural language generation platform. It will help you construct document plans which define how your data is converted to textual descriptions varying in wording and structure.
Stars: ✭ 256 (+1500%)
Mutual labels:  text-generation, natural-language-generation, nlg
Kenlg Reading
Reading list for knowledge-enhanced text generation, with a survey
Stars: ✭ 257 (+1506.25%)
Mutual labels:  text-generation, natural-language-generation, nlg
Awesome Nlg
A curated list of resources dedicated to Natural Language Generation (NLG)
Stars: ✭ 211 (+1218.75%)
Mutual labels:  natural-language-generation, nlg
PlanSum
[AAAI2021] Unsupervised Opinion Summarization with Content Planning
Stars: ✭ 25 (+56.25%)
Mutual labels:  text-generation, natural-language-generation
RapLyrics-Back
Model training, custom generative function and training for raplyrics.eu - A rap music lyrics generation project
Stars: ✭ 14 (-12.5%)
Mutual labels:  text-generation, nlg
awesome-nlg
A curated list of resources dedicated to Natural Language Generation (NLG)
Stars: ✭ 386 (+2312.5%)
Mutual labels:  natural-language-generation, nlg
Nlg Eval
Evaluation code for various unsupervised automated metrics for Natural Language Generation.
Stars: ✭ 822 (+5037.5%)
Mutual labels:  natural-language-generation, nlg
Entity2Topic
[NAACL2018] Entity Commonsense Representation for Neural Abstractive Summarization
Stars: ✭ 20 (+25%)
Mutual labels:  text-generation, natural-language-generation
uctf
Unsupervised Controllable Text Generation (Applied to text Formalization)
Stars: ✭ 19 (+18.75%)
Mutual labels:  natural-language-generation, nlg
Describing a knowledge base
Code for Describing a Knowledge Base
Stars: ✭ 42 (+162.5%)
Mutual labels:  text-generation, natural-language-generation
Paperrobot
Code for PaperRobot: Incremental Draft Generation of Scientific Ideas
Stars: ✭ 372 (+2225%)
Mutual labels:  text-generation, natural-language-generation
spring
SPRING is a seq2seq model for Text-to-AMR and AMR-to-Text (AAAI2021).
Stars: ✭ 103 (+543.75%)
Mutual labels:  natural-language-generation, data-to-text
Simplenlg
Java API for Natural Language Generation. Originally developed by Ehud Reiter of the University of Aberdeen's Department of Computing Science, co-founder of Arria NLG. This git repo is the official SimpleNLG version.
Stars: ✭ 708 (+4325%)
Mutual labels:  natural-language-generation, nlg
Practical Pytorch
Go to https://github.com/pytorch/tutorials - this repo is deprecated and no longer maintained
Stars: ✭ 4,329 (+26956.25%)
Mutual labels:  natural-language-generation, nlg
Gluon Nlp
NLP made easy
Stars: ✭ 2,344 (+14550%)
Mutual labels:  natural-language-generation, nlg
Question generation
Neural question generation using transformers
Stars: ✭ 356 (+2125%)
Mutual labels:  natural-language-generation, nlg
syntaxmaker
The NLG tool for Finnish
Stars: ✭ 19 (+18.75%)
Mutual labels:  natural-language-generation, nlg
Gumbel-CRF
Implementation of NeurIPS 20 paper: Latent Template Induction with Gumbel-CRFs
Stars: ✭ 51 (+218.75%)
Mutual labels:  text-generation, data-to-text
transformer-drg-style-transfer
This repository has scripts and Jupyter notebooks to perform all the different steps involved in the Delete, Retrieve, Generate approach for controlled text style transfer
Stars: ✭ 97 (+506.25%)
Mutual labels:  text-generation, nlg
question generator
An NLP system for generating reading comprehension questions
Stars: ✭ 188 (+1075%)
Mutual labels:  natural-language-generation, nlg

Fact-based Text Editing


Code and Datasets for Fact-based Text Editing (Iso et al., ACL 2020).

Dataset

Datasets are created from publicly available table-to-text datasets. The dataset created from "webnlg" is referred to as the "webedit" data, and the dataset created from "rotowire(-modified)" as the "rotoedit" data.

To extract the data, run tar -jxvf webedit.tar.bz2 to form a webedit/ directory (and similarly for rotoedit.tar.bz2).
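The extracted datasets are JSON Lines files (one JSON object per line, e.g. webedit/dev.jsonl). As a minimal sketch of how to inspect one instance, the key names below ("triples", "draft", "revised") are illustrative assumptions; check a real line after extracting the archives to confirm the actual schema:

```python
import json

# One line of a *.jsonl file is one editing instance stored as JSON.
# The keys here are hypothetical stand-ins for the real schema.
sample_line = json.dumps({
    "triples": [["Alan_Bean", "occupation", "Test_pilot"]],
    "draft":   "Alan Bean is an engineer .".split(),
    "revised": "Alan Bean is a test pilot .".split(),
})

example = json.loads(sample_line)
print(sorted(example.keys()))   # ['draft', 'revised', 'triples']
print(example["triples"][0])    # ['Alan_Bean', 'occupation', 'Test_pilot']
```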

Model overview

The model, which we call FactEditor, consists of three components: a buffer for storing the draft text and its representations, a stream for storing the revised text and its representations, and a memory for storing the triples and their representations.

FactEditor scans the text in the buffer, copies parts of the text from the buffer into the stream if they are described by the triples in the memory, deletes parts of the text if they are not mentioned in the triples, and inserts new parts of text into the stream that are present only in the triples.
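To illustrate just that control flow (not the neural model itself), a toy rule-based version of the buffer-to-stream loop might look like the following; the capitalization heuristic for spotting entity tokens and the example sentence are purely hypothetical:

```python
# Toy sketch of FactEditor's copy/delete/insert behaviour over a draft.
# Tokens licensed by a triple are copied into the stream, unsupported
# entity-like tokens are deleted, and objects mentioned only in the
# triples are inserted in their place.

def edit(draft, triples):
    licensed = {t for triple in triples for t in triple}
    # Objects the triples mention but the draft does not.
    missing = [obj for _, _, obj in triples if obj not in draft]
    stream = []
    for token in draft:  # scan the buffer left to right
        if token[0].isupper() and token not in licensed:
            # Delete the unsupported token and insert pending facts here.
            stream.extend(missing)
            missing = []
        else:
            stream.append(token)  # copy the token into the stream
    stream.extend(missing)  # any facts still not realized
    return stream

draft = ["Alan", "is", "an", "Engineer", "."]
triples = [("Alan", "occupation", "Astronaut")]
print(" ".join(edit(draft, triples)))  # Alan is an Astronaut .
```

The real model predicts these actions with learned representations of the buffer, stream, and triple memory rather than string matching.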

Usage

Dependencies

  • The code was written for Python 3.X and requires AllenNLP.
  • Dependencies can be installed using requirements.txt.

Training

Set your config file path and serialization directory as environment variables:

export CONFIG=<path to the config file>
export SERIALIZATION_DIR=<path to the serialization_dir>

Then you can train FactEditor:

allennlp train $CONFIG \
            -s $SERIALIZATION_DIR \
            --include-package editor

For example, the following is a sample command for training the model on the WebEdit dataset:

allennlp train config/webedit.jsonnet \
            -s models/webedit \
            --include-package editor 

Decoding

Set the dataset you want to decode and the model checkpoint you want to use as environment variables:

export INPUT_FILE=<path to the dev/test file>
export ARCHIVE_FILE=<path to the model archive file>

Then you can decode with FactEditor:

python predict.py $INPUT_FILE \
                  $ARCHIVE_FILE \
                  --cuda_device -1

To run on a GPU, use --cuda_device 0 (or another CUDA device ID).

To run the model with a pretrained checkpoint on the development set of the WebEdit data:

python predict.py ./data/webedit/dev.jsonl \
                  ./models/webedit.tar.gz \
                  --cuda_device -1

References

@InProceedings{iso2020fact,
    author = {Iso, Hayate and
              Qiao, Chao and
              Li, Hang},
    title = {Fact-based Text Editing},
    booktitle = {Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL)},
    pages = {171--182},
    year = {2020}
}