rowanz / Grover

Licence: apache-2.0
Code for Defending Against Neural Fake News, https://rowanzellers.com/grover/

Programming Languages

python

Projects that are alternatives of or similar to Grover

amrlib
A python library that makes AMR parsing, generation and visualization simple.
Stars: ✭ 107 (-86.18%)
Mutual labels:  text-generation
Gpt2client
✍🏻 gpt2-client: Easy-to-use TensorFlow Wrapper for GPT-2 117M, 345M, 774M, and 1.5B Transformer Models 🤖 📝
Stars: ✭ 322 (-58.4%)
Mutual labels:  text-generation
Textgenrnn
Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.
Stars: ✭ 4,584 (+492.25%)
Mutual labels:  text-generation
skip-thought-gan
Generating Text through Adversarial Training(GAN) using Skip-Thought Vectors
Stars: ✭ 44 (-94.32%)
Mutual labels:  text-generation
Kenlg Reading
Reading list for knowledge-enhanced text generation, with a survey
Stars: ✭ 257 (-66.8%)
Mutual labels:  text-generation
Nlp Projects
word2vec, sentence2vec, machine reading comprehension, dialog system, text classification, pretrained language model (i.e., XLNet, BERT, ELMo, GPT), sequence labeling, information retrieval, information extraction (i.e., entity, relation and event extraction), knowledge graph, text generation, network embedding
Stars: ✭ 360 (-53.49%)
Mutual labels:  text-generation
pytorch-transformer-chatbot
A simple chitchat chatbot built with the Transformer API introduced in PyTorch v1.2
Stars: ✭ 44 (-94.32%)
Mutual labels:  text-generation
Cdial Gpt
A Large-scale Chinese Short-Text Conversation Dataset and Chinese pre-training dialog models
Stars: ✭ 596 (-23%)
Mutual labels:  text-generation
Accelerated Text
Accelerated Text is a no-code natural language generation platform. It will help you construct document plans which define how your data is converted to textual descriptions varying in wording and structure.
Stars: ✭ 256 (-66.93%)
Mutual labels:  text-generation
Gpt2 Chinese
Chinese version of GPT2 training code, using BERT tokenizer.
Stars: ✭ 4,592 (+493.28%)
Mutual labels:  text-generation
text-generation-transformer
text generation based on transformer
Stars: ✭ 36 (-95.35%)
Mutual labels:  text-generation
Textbox
TextBox is an open-source library for building text generation system.
Stars: ✭ 257 (-66.8%)
Mutual labels:  text-generation
Awesome Text Generation
A curated list of recent models of text generation and application
Stars: ✭ 370 (-52.2%)
Mutual labels:  text-generation
ebe-dataset
Evidence-based Explanation Dataset (AACL-IJCNLP 2020)
Stars: ✭ 16 (-97.93%)
Mutual labels:  text-generation
Textgan Pytorch
TextGAN is a PyTorch framework for Generative Adversarial Networks (GANs) based text generation models.
Stars: ✭ 479 (-38.11%)
Mutual labels:  text-generation
download-tweets-ai-text-gen-plus
Python script to download public Tweets from a given Twitter account into a format suitable for AI text generation
Stars: ✭ 26 (-96.64%)
Mutual labels:  text-generation
Tg Reading List
A text generation reading list maintained by Tsinghua Natural Language Processing Group.
Stars: ✭ 352 (-54.52%)
Mutual labels:  text-generation
Texar Pytorch
Integrating the Best of TF into PyTorch, for Machine Learning, Natural Language Processing, and Text Generation. This is part of the CASL project: http://casl-project.ai/
Stars: ✭ 636 (-17.83%)
Mutual labels:  text-generation
Leakgan
The codes of paper "Long Text Generation via Adversarial Training with Leaked Information" on AAAI 2018. Text generation using GAN and Hierarchical Reinforcement Learning.
Stars: ✭ 533 (-31.14%)
Mutual labels:  text-generation
Paperrobot
Code for PaperRobot: Incremental Draft Generation of Scientific Ideas
Stars: ✭ 372 (-51.94%)
Mutual labels:  text-generation

Grover

UPDATE, Sept 17 2019. We got into NeurIPS (camera ready coming soon!) and we've made Grover-Mega publicly available without you needing to fill out the form. You can download it using download_model.py.

(aka, code for Defending Against Neural Fake News)

Grover is a model for Neural Fake News -- both generation and detection. However, it probably can also be used for other generation tasks.

Visit our project page at rowanzellers.com/grover, the AI2 online demo, or read the full paper at arxiv.org/abs/1905.12616.

What's in this repo?

We are releasing the following:

  • Code for the Grover generator (in lm/). This involves training the model as a language model across fields.
  • Code for the Grover discriminator (in discrimination/). With few changes, you can run Grover as a discriminator to detect Neural Fake News.
  • Code for generating from a Grover model, in sample/.
  • Code for making your own RealNews dataset in realnews/.
  • Model checkpoints freely available online for all of the Grover models. To use the RealNews dataset for research, please submit this form and message me on Twitter or through email. You will need to use a valid account that has Google Cloud enabled; otherwise, I won't be able to give you access 😢

Scroll down 👇 for some easy-to-use instructions for setting up Grover to generate news articles.

Setting up your environment

NOTE: If you only care about making your own RealNews dataset, you will need to set up a separate environment just for that, using an AWS machine (see realnews/).

There are a few ways you can run Grover:

  • Generation mode (inference). This requires a GPU because I wasn't able to get top-p sampling, or caching of transformer hidden states, to work on a TPU.
  • LM Validation mode (perplexity). This could be run on a GPU or a TPU, but I've only tested this with TPU inference.
  • LM Training mode. This requires a large TPU pod.
  • Discrimination mode (training). This requires a TPU pod.
  • Discrimination mode (inference). This could be run on a GPU or a TPU, but I've only tested this with TPU inference.
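
For readers unfamiliar with the top-p (nucleus) sampling mentioned under generation mode, here is a minimal sketch of the idea in plain Python. It is not the code in sample/; the function name top_p_sample and the default p=0.95 are assumptions made for illustration only.

```python
import math
import random


def top_p_sample(logits, p=0.95, rng=random):
    """Sample one token id, restricted to the smallest set of tokens
    whose total probability reaches p (hypothetical helper)."""
    # Softmax with max-subtraction for numerical stability.
    peak = max(logits)
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Rank token ids from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Keep the smallest prefix whose cumulative probability reaches p.
    nucleus, mass = [], 0.0
    for token_id in ranked:
        nucleus.append(token_id)
        mass += probs[token_id]
        if mass >= p:
            break

    # Sample inside the nucleus; random.choices renormalises the weights.
    return rng.choices(nucleus, weights=[probs[i] for i in nucleus], k=1)[0]
```

Passing a seeded random.Random instance as rng makes the draw reproducible. A real generator would apply this step once per generated token, on the logits produced by the model.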

NOTE: You might be able to get things to work on different hardware. However, it would likely take a lot of engineering work, and I don't recommend it if you can avoid it. Please don't contact me with requests like this, as there's not much help I can give you.

I used Python 3.6 for everything. Usually I set it up using the following commands:

curl -o ~/miniconda.sh https://repo.continuum.io/miniconda/Miniconda3-4.5.4-Linux-x86_64.sh && \
     chmod +x ~/miniconda.sh && \
     ~/miniconda.sh -b -p ~/conda && \
     rm ~/miniconda.sh && \
     ~/conda/bin/conda install -y python=3.6

Then run pip install -r requirements-gpu.txt if you're installing on a GPU, or pip install -r requirements-tpu.txt for TPU.

Misc notes/tips:

  • If you have a lot of projects on your machine, you might want to use an anaconda environment to handle them all. Use conda create -n grover python=3.6 to create an environment named grover. To enter the environment use source activate grover. To leave use source deactivate.
  • I'm using TensorFlow 1.13.1, which requires CUDA 10.0. You'll need to install that from the NVIDIA website. I usually install it into /usr/local/cuda-10.0/, so you will need to run export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64 so TensorFlow knows where to find it.
  • I always set my PYTHONPATH to the root directory. While in the grover directory, run export PYTHONPATH=$(pwd) to set it.
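
The tips above can be sanity-checked with a short script before launching anything expensive. This is only a sketch: check_environment is a hypothetical helper that is not part of the repo, and the CUDA check only matters for the GPU setup.

```python
import os
import sys


def check_environment(env=None, version=None):
    """Return a list of warnings about the setup described above.

    Hypothetical helper: `env` and `version` can be injected for testing,
    otherwise the current process environment and interpreter are used.
    """
    env = os.environ if env is None else env
    version = tuple(sys.version_info[:2] if version is None else version)
    warnings = []
    if version != (3, 6):
        warnings.append("Python %d.%d found; these notes assume 3.6" % version)
    # Only relevant on a GPU machine with CUDA installed as described above.
    if "cuda-10.0" not in env.get("LD_LIBRARY_PATH", ""):
        warnings.append("LD_LIBRARY_PATH does not mention cuda-10.0")
    if not env.get("PYTHONPATH"):
        warnings.append("PYTHONPATH is not set; run export PYTHONPATH=$(pwd)")
    return warnings


if __name__ == "__main__":
    for line in check_environment() or ["environment matches the notes above"]:
        print(line)
```

An empty list means the three settings line up with the notes; it does not verify that TensorFlow itself can see the GPU.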

Quickstart: setting up Grover for generation!

  1. Set up your environment. Here's the easy way, assuming anaconda is installed: conda create -y -n grover python=3.6 && source activate grover && pip install -r requirements-gpu.txt
  2. Download the model using python download_model.py base
  3. Now generate: PYTHONPATH=$(pwd) python sample/contextual_generate.py -model_config_fn lm/configs/base.json -model_ckpt models/base/model.ckpt -metadata_fn sample/april2019_set_mini.jsonl -out_fn april2019_set_mini_out.jsonl

Congrats! You can view the generations, conditioned on the domain/headline/date/authors, in april2019_set_mini_out.jsonl.
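
To take a first look at the output file, a small reader like the following may help. It is a sketch that assumes only that each non-empty line of the .jsonl file is one JSON object; the helper names are hypothetical, and it reports whatever field names it finds rather than assuming any.

```python
import json


def read_jsonl(path):
    """Yield one dict per non-empty line of a JSON Lines file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def summarize(path):
    """Return (record count, sorted field names seen across all records)."""
    count, fields = 0, set()
    for record in read_jsonl(path):
        count += 1
        fields.update(record.keys())
    return count, sorted(fields)
```

For example, summarize("april2019_set_mini_out.jsonl") returns the number of generated records and the list of fields each record may contain, which is a quick way to find where the generated text lives.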

FAQ: What's the deal with the release of Grover?

Our core position is that it is important to release possibly-dangerous models to researchers. At the same time, we believe Grover-Mega isn't particularly useful to anyone who isn't doing research in this area, particularly as we have an online web demo available and the model is computationally expensive. We were previously a bit stricter and limited initial use of Grover-Mega to researchers. Now that several months have passed since we put the paper on arXiv, and since several other large-scale language models have been publicly released, we figured that there is little harm in fully releasing Grover-Mega.

Bibtex

@inproceedings{zellers2019grover,
    title={Defending Against Neural Fake News},
    author={Zellers, Rowan and Holtzman, Ari and Rashkin, Hannah and Bisk, Yonatan and Farhadi, Ali and Roesner, Franziska and Choi, Yejin},
    booktitle={Advances in Neural Information Processing Systems 32},
    year={2019}
}