
RecipeNLG: A Cooking Recipes Dataset for Semi-Structured Text Generation

This is an archive of the code used to produce the dataset and results presented in our INLG 2020 paper: RecipeNLG: A Cooking Recipes Dataset for Semi-Structured Text Generation

What's exciting about it?

The dataset we publish contains 2,231,142 cooking recipes (over 2 million). It is processed more carefully and provides more samples than any other dataset in the area.

Where is the dataset?

Please visit our project website, recipenlg.cs.put.poznan.pl, to download it.
NOTE: The dataset contains all the data we gathered, including recipes taken from other datasets. To access only the recipes we gathered ourselves (free of artifacts such as "12" appearing instead of "1/2"), filter the dataset for source=Gathered. This yields approximately 1.6M recipes of higher quality.
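The source=Gathered filter can be applied with pandas; the sketch below uses a small in-memory sample standing in for the downloaded CSV, and the non-Gathered source label shown is a hypothetical placeholder, so check the actual column values in the file you download:

```python
import io

import pandas as pd

# Hypothetical sample mimicking the dataset layout: the real downloaded
# CSV has a `source` column marking which recipes we gathered ourselves.
csv_data = io.StringIO(
    "title,source\n"
    "No-Bake Nut Cookies,Gathered\n"
    "Jewell Ball's Chicken,Gathered\n"
    "Creamy Corn,OtherDataset\n"
)

df = pd.read_csv(csv_data)

# Keep only the recipes gathered by the RecipeNLG authors.
gathered = df[df["source"] == "Gathered"]
print(len(gathered))  # 2 of the 3 sample rows
```

On the full dataset, replace the in-memory sample with `pd.read_csv("full_dataset.csv")` (or whatever filename the download uses) and the same boolean filter applies.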

I've used the dataset in my research. How do I cite you?

Use the following BibTeX entry:

@inproceedings{bien-etal-2020-recipenlg,
    title = "{R}ecipe{NLG}: A Cooking Recipes Dataset for Semi-Structured Text Generation",
    author = "Bie{\'n}, Micha{\l}  and
      Gilski, Micha{\l}  and
      Maciejewska, Martyna  and
      Taisner, Wojciech  and
      Wisniewski, Dawid  and
      Lawrynowicz, Agnieszka",
    booktitle = "Proceedings of the 13th International Conference on Natural Language Generation",
    month = dec,
    year = "2020",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.inlg-1.4",
    pages = "22--28",
}

Where are your models?

The PyTorch model is available in the HuggingFace model hub as mbien/recipenlg. You can therefore easily import it into your solution as follows:

from transformers import AutoTokenizer, AutoModelWithLMHead
tokenizer = AutoTokenizer.from_pretrained("mbien/recipenlg")
model = AutoModelWithLMHead.from_pretrained("mbien/recipenlg")

You can also check the generation performance interactively on our website (link above).
The spaCy NER model is available in the ner directory.
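Building on the import snippet above, generation might look like the sketch below. The control-token prompt format is an assumption based on the paper's semi-structured recipe scheme, and `build_prompt`/`generate_recipe` are illustrative helper names, so verify the exact tokens against the model card before relying on them:

```python
def build_prompt(ingredients):
    # ASSUMPTION: control tokens follow the paper's semi-structured
    # scheme; confirm the exact token names on the model card.
    return (
        "<RECIPE_START> <INPUT_START> "
        + " <NEXT_INPUT> ".join(ingredients)
        + " <INPUT_END>"
    )

def generate_recipe(ingredients, max_length=256):
    # Imports are deferred so that building prompts does not require
    # transformers to be installed or the model weights to be downloaded.
    from transformers import AutoModelWithLMHead, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("mbien/recipenlg")
    model = AutoModelWithLMHead.from_pretrained("mbien/recipenlg")
    inputs = tokenizer(build_prompt(ingredients), return_tensors="pt")
    outputs = model.generate(**inputs, max_length=max_length, do_sample=True)
    return tokenizer.decode(outputs[0])
```

Note that calling `generate_recipe(["chicken", "rice", "soy sauce"])` downloads the model weights from the hub on first use.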

Could you explain X and Y?

Sure! If you feel some information is missing from our paper, please check our thesis first, which is much more detailed. If you have further questions, you're invited to open a GitHub issue; we will respond as quickly as we can!

How to run the code?

We worked on the project interactively, and our core result is a new dataset. That's why the repo is a set of loosely connected Python files and Jupyter notebooks rather than a runnable solution in itself. However, if you feel some part crucial for reproduction is missing, or you are dedicated to making the experience smoother, send us a feature request or (preferably) a pull request.
