
sebastianruder / Nlp Progress

Licence: MIT
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.


Projects that are alternatives of or similar to Nlp Progress

Spark Nlp
State of the Art Natural Language Processing
Stars: ✭ 2,518 (-87.1%)
Mutual labels:  natural-language-processing, named-entity-recognition, machine-translation
Nlg Eval
Evaluation code for various unsupervised automated metrics for Natural Language Generation.
Stars: ✭ 822 (-95.79%)
Mutual labels:  dialogue, natural-language-processing, machine-translation
Vncorenlp
A Vietnamese natural language processing toolkit (NAACL 2018)
Stars: ✭ 354 (-98.19%)
Mutual labels:  natural-language-processing, named-entity-recognition
Msr Nlp Projects
This is a list of open-source projects at Microsoft Research NLP Group
Stars: ✭ 92 (-99.53%)
Mutual labels:  dialogue, natural-language-processing
InformationExtractionSystem
Information Extraction System can perform NLP tasks like Named Entity Recognition, Sentence Simplification, Relation Extraction etc.
Stars: ✭ 27 (-99.86%)
Mutual labels:  named-entity-recognition, nlp-tasks
Rnnlg
RNNLG is an open source benchmark toolkit for Natural Language Generation (NLG) in spoken dialogue system application domains. It is released by Tsung-Hsien (Shawn) Wen from Cambridge Dialogue Systems Group under Apache License 2.0.
Stars: ✭ 487 (-97.5%)
Mutual labels:  dialogue, natural-language-processing
Dialogue Understanding
This repository contains PyTorch implementation for the baseline models from the paper Utterance-level Dialogue Understanding: An Empirical Study
Stars: ✭ 77 (-99.61%)
Mutual labels:  dialogue, natural-language-processing
SequenceToSequence
A seq2seq with attention dialogue/MT model implemented by TensorFlow.
Stars: ✭ 11 (-99.94%)
Mutual labels:  machine-translation, dialogue
Spacy Lookup
Named Entity Recognition based on dictionaries
Stars: ✭ 212 (-98.91%)
Mutual labels:  natural-language-processing, named-entity-recognition
Spacy Streamlit
👑 spaCy building blocks and visualizers for Streamlit apps
Stars: ✭ 360 (-98.16%)
Mutual labels:  natural-language-processing, named-entity-recognition
Trade Dst
Source code for transferable dialogue state generator (TRADE, Wu et al., 2019). https://arxiv.org/abs/1905.08743
Stars: ✭ 287 (-98.53%)
Mutual labels:  dialogue, natural-language-processing
Ner
Named Entity Recognition
Stars: ✭ 288 (-98.52%)
Mutual labels:  natural-language-processing, named-entity-recognition
Multiwoz
Source code for end-to-end dialogue model from the MultiWOZ paper (Budzianowski et al. 2018, EMNLP)
Stars: ✭ 384 (-98.03%)
Mutual labels:  dialogue, natural-language-processing
Pytorch Bert Crf Ner
Korean named entity recognizer built with KoBERT and CRF (BERT+CRF based Named Entity Recognition model for Korean)
Stars: ✭ 236 (-98.79%)
Mutual labels:  natural-language-processing, named-entity-recognition
Dilated Cnn Ner
Dilated CNNs for NER in TensorFlow
Stars: ✭ 222 (-98.86%)
Mutual labels:  natural-language-processing, named-entity-recognition
Tod Bert
Pre-Trained Models for ToD-BERT
Stars: ✭ 143 (-99.27%)
Mutual labels:  dialogue, natural-language-processing
Bytenet Tensorflow
ByteNet for character-level language modelling
Stars: ✭ 319 (-98.37%)
Mutual labels:  natural-language-processing, machine-translation
Pytorch graph Rel
A PyTorch implementation of GraphRel
Stars: ✭ 204 (-98.95%)
Mutual labels:  natural-language-processing, named-entity-recognition
Hardware Aware Transformers
[ACL 2020] HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
Stars: ✭ 206 (-98.94%)
Mutual labels:  natural-language-processing, machine-translation
Chatbot ner
chatbot_ner: Named Entity Recognition for chatbots.
Stars: ✭ 273 (-98.6%)
Mutual labels:  natural-language-processing, named-entity-recognition

Tracking Progress in Natural Language Processing

Table of contents

English

Vietnamese

Hindi

Chinese

For more tasks, datasets and results in Chinese, check out the Chinese NLP website.

French

Russian

Spanish

Portuguese

Korean

Nepali

Bengali

Persian

Turkish

German

This document aims to track the progress in Natural Language Processing (NLP) and give an overview of the state-of-the-art (SOTA) across the most common NLP tasks and their corresponding datasets.

It aims to cover both traditional and core NLP tasks such as dependency parsing and part-of-speech tagging as well as more recent ones such as reading comprehension and natural language inference. The main objective is to provide the reader with a quick overview of benchmark datasets and the state-of-the-art for their task of interest, which serves as a stepping stone for further research. To this end, if there is a place where results for a task are already published and regularly maintained, such as a public leaderboard, the reader will be pointed there.

If you want to find this document again in the future, just go to nlpprogress.com or nlpsota.com in your browser.

Contributing

Guidelines

Results   Results reported in published papers are preferred; an exception may be made for influential preprints.

Datasets   Datasets should have been used for evaluation in at least one published paper besides the one that introduced the dataset.

Code   We recommend adding a link to an implementation if one is available. You can add a Code column (see below) to the table if it does not exist. In the Code column, indicate an official implementation with Official. If an unofficial implementation is available, use Link (see below). If no implementation is available, you can leave the cell empty.

Adding a new result

If you would like to add a new result, you can just click on the small edit button in the top-right corner of the file for the respective task (see below).

Click on the edit button to add a file

This allows you to edit the file in Markdown. Simply add a row to the corresponding table in the same format. Make sure that the table stays sorted (with the best result on top). After you've made your change, check that the table still renders correctly by clicking on the "Preview changes" tab at the top of the page. If everything looks good, go to the bottom of the page, where you will see the form below.
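
Keeping the table sorted by hand is easy to get wrong on long tables. The following is a minimal sketch, not part of the repository, of inserting a row so that the best result stays on top. The function names, the `higher_is_better` flag, and the example models and scores are all hypothetical; it assumes the score is a plain number, optionally wrapped in `**` for bold.

```python
def split_row(line: str) -> list[str]:
    """Split one Markdown table line into stripped cell strings."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def parse_score(cell: str) -> float:
    """Convert a score cell such as '93.5' or '**93.5**' to a float."""
    return float(cell.strip().strip("*"))


def insert_result(table: str, new_row: list[str], score_col: int = 1,
                  higher_is_better: bool = True) -> str:
    """Return the table with new_row added and rows sorted best-first."""
    lines = table.strip().splitlines()
    header, separator = lines[0], lines[1]
    rows = [split_row(line) for line in lines[2:]]
    rows.append(new_row)
    # For metrics such as perplexity, pass higher_is_better=False.
    rows.sort(key=lambda row: parse_score(row[score_col]),
              reverse=higher_is_better)
    body = ["| " + " | ".join(row) + " |" for row in rows]
    return "\n".join([header, separator] + body)


if __name__ == "__main__":
    # Placeholder table; the entries are made up for illustration.
    example = "\n".join([
        "| Model | Score | Paper / Source | Code |",
        "| --- | --- | --- | --- |",
        "| Model A | 92.1 | Paper A | Official |",
        "| Model B | 90.4 | Paper B | |",
    ])
    print(insert_result(example, ["Model C", "91.0", "Paper C", "Link"]))
```

The output can be pasted back into the Markdown file; the "Preview changes" tab remains the final check.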

Fill out the file change information

Add a name for your proposed change, an optional description, indicate that you would like to "Create a new branch for this commit and start a pull request", and click on "Propose file change".

Adding a new dataset or task

To add a new dataset or task, you can also follow the steps above. Alternatively, you can fork the repository. In both cases, follow the steps below:

  1. If your task is completely new, create a new file and link to it in the table of contents above.
  2. If not, add your task or dataset to the respective section of the corresponding file (in alphabetical order).
  3. Briefly describe the dataset/task and include relevant references.
  4. Describe the evaluation setting and evaluation metric.
  5. Show what an annotated example of the dataset/task looks like.
  6. Add a download link if available.
  7. Copy the table below and fill in at least two results (including the state-of-the-art) for your dataset/task (change Score to the metric of your dataset). If your dataset/task has multiple metrics, add them to the right of Score.
  8. Submit your change as a pull request.

| Model | Score | Paper / Source | Code |
| --- | --- | --- | --- |
|  |  |  |  |
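
For a dataset with several metrics, the template from step 7 can be generated rather than edited by hand. This is a small sketch under the assumption that the metric columns sit between Model and Paper / Source, as step 7 describes; the function name `table_template` and the metric names in the example are hypothetical.

```python
def table_template(metrics: list[str]) -> str:
    """Build an empty Markdown results table with one column per metric."""
    columns = ["Model", *metrics, "Paper / Source", "Code"]
    header = "| " + " | ".join(columns) + " |"
    separator = "| " + " | ".join("---" for _ in columns) + " |"
    # One empty row to be filled in with the first result.
    empty_row = "| " + " | ".join("" for _ in columns) + " |"
    return "\n".join([header, separator, empty_row])


if __name__ == "__main__":
    # Example metric names only; use the metrics of your own dataset.
    print(table_template(["F1", "Accuracy"]))
```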

Wish list

These are tasks and datasets that are still missing:

  • Bilingual dictionary induction
  • Discourse parsing
  • Keyphrase extraction
  • Knowledge base population (KBP)
  • More dialogue tasks
  • Semi-supervised learning
  • Frame-semantic parsing (FrameNet full-sentence analysis)

Exporting into a structured format

You can extract all the data into a structured, machine-readable JSON format with parsed tasks, descriptions and SOTA tables.

The instructions are in structured/README.md.
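
To give a feel for what a machine-readable SOTA table can look like, here is an illustrative sketch that turns one Markdown results table into JSON records. It is not the repository's exporter, and the output shape (a list of objects keyed by column name) is an assumption made for this example; the actual format is described in structured/README.md. The function names and the sample rows are hypothetical.

```python
import json


def split_row(line: str) -> list[str]:
    """Split one Markdown table line into stripped cell strings."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def table_to_records(table: str) -> list[dict[str, str]]:
    """Map each data row of a Markdown table to a dict keyed by column name."""
    lines = [line for line in table.strip().splitlines() if line.strip()]
    header = split_row(lines[0])
    # lines[1] is the separator row (| --- | --- | ...), so data starts at index 2.
    return [dict(zip(header, split_row(line))) for line in lines[2:]]


if __name__ == "__main__":
    # Placeholder rows; the entries are made up for illustration.
    example = "\n".join([
        "| Model | Score | Paper / Source | Code |",
        "| --- | --- | --- | --- |",
        "| Model A | 92.1 | Paper A | Official |",
        "| Model B | 90.4 | Paper B | Link |",
    ])
    print(json.dumps(table_to_records(example), indent=2))
```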

Instructions for building the site locally

Instructions for building the website locally using Jekyll can be found here.
