IlyaGusev / gazeta

Licence: other
Gazeta: Dataset for automatic summarization of Russian news

Programming Languages

python

Projects that are alternatives to, or similar to, gazeta

xl-sum
This repository contains the code, data, and models of the paper titled "XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages" published in Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021.
Stars: ✭ 160 (+540%)
Mutual labels:  text-summarization, abstractive-text-summarization, abstractive-summarization, text-summarisation, summarization-corpora, summarization-dataset
DocSum
A tool to automatically summarize documents abstractively using the BART or PreSumm Machine Learning Model.
Stars: ✭ 58 (+132%)
Mutual labels:  text-summarization, summarization, abstractive-text-summarization, abstractive-summarization
PlanSum
[AAAI2021] Unsupervised Opinion Summarization with Content Planning
Stars: ✭ 25 (+0%)
Mutual labels:  text-summarization, summarization, abstractive-text-summarization, abstractive-summarization
Copycat-abstractive-opinion-summarizer
ACL 2020 Unsupervised Opinion Summarization as Copycat-Review Generation
Stars: ✭ 76 (+204%)
Mutual labels:  summarization, abstractive-text-summarization, abstractive-summarization
Entity2Topic
[NAACL2018] Entity Commonsense Representation for Neural Abstractive Summarization
Stars: ✭ 20 (-20%)
Mutual labels:  text-summarization, summarization, abstractive-summarization
nlp-akash
Natural Language Processing notes and implementations.
Stars: ✭ 66 (+164%)
Mutual labels:  text-summarization, summarization
TextRank-node
No description or website provided.
Stars: ✭ 21 (-16%)
Mutual labels:  text-summarization, summarization
seq3
Source code for the NAACL 2019 paper "SEQ^3: Differentiable Sequence-to-Sequence-to-Sequence Autoencoder for Unsupervised Abstractive Sentence Compression"
Stars: ✭ 121 (+384%)
Mutual labels:  summarization, abstractive-summarization
Transformersum
Models to perform neural summarization (extractive and abstractive) using machine learning transformers and a tool to convert abstractive summarization datasets to the extractive task.
Stars: ✭ 107 (+328%)
Mutual labels:  text-summarization, summarization
awesome-text-summarization
Text summarization starting from scratch.
Stars: ✭ 86 (+244%)
Mutual labels:  text-summarization, abstractive-summarization
data-summ-cnn dailymail
non-anonymized cnn/dailymail dataset for text summarization
Stars: ✭ 12 (-52%)
Mutual labels:  summarization, abstractive-text-summarization
Textrank
TextRank implementation for Python 3.
Stars: ✭ 1,008 (+3932%)
Mutual labels:  text-summarization, summarization
Text summarization with tensorflow
Implementation of a seq2seq model for summarization of textual data. Demonstrated on amazon reviews, github issues and news articles.
Stars: ✭ 226 (+804%)
Mutual labels:  text-summarization, summarization
factsumm
FactSumm: Factual Consistency Scorer for Abstractive Summarization
Stars: ✭ 83 (+232%)
Mutual labels:  summarization, abstractive-summarization
Pythonrouge
Python wrapper for evaluating summarization quality by ROUGE package
Stars: ✭ 155 (+520%)
Mutual labels:  text-summarization, summarization
Text-Summarization
Abstractive and Extractive Text summarization using Transformers.
Stars: ✭ 38 (+52%)
Mutual labels:  text-summarization, abstractive-summarization
ds
👨‍🔬 In Russian: A regularly updated, structured collection of free Data Science resources: courses, books, open data, blogs, and ready-made solutions.
Stars: ✭ 102 (+308%)
Mutual labels:  russian-language
code summarization public
source code for 'Improving automatic source code summarization via deep reinforcement learning'
Stars: ✭ 71 (+184%)
Mutual labels:  summarization
Intelligent Document Finder
Document Search Engine Tool
Stars: ✭ 45 (+80%)
Mutual labels:  text-summarization
FewSum
Few-shot learning framework for opinion summarization published at EMNLP 2020.
Stars: ✭ 29 (+16%)
Mutual labels:  summarization

Gazeta dataset

Paper: Dataset for Automatic Summarization of Russian News

Download

Dropbox:

UPDATE:

Other sources:

Trained MBART model:

https://huggingface.co/IlyaGusev/mbart_ru_sum_gazeta
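
A minimal sketch of how the trained model might be loaded and used for summarization with the Hugging Face `transformers` library. The model identifier comes from the URL above. The helper names (`load_summarizer`, `summarize`) and the generation settings (`max_input_tokens`, `num_beams`, `no_repeat_ngram_size`) are illustrative assumptions, not values confirmed by this README.

```python
MODEL_NAME = "IlyaGusev/mbart_ru_sum_gazeta"  # identifier taken from the URL above


def load_summarizer(model_name: str = MODEL_NAME):
    """Download and return (tokenizer, model).

    The import is deferred so this module can be loaded even when
    `transformers` (and a backend such as PyTorch) is not installed.
    """
    from transformers import MBartForConditionalGeneration, MBartTokenizer

    tokenizer = MBartTokenizer.from_pretrained(model_name)
    model = MBartForConditionalGeneration.from_pretrained(model_name)
    return tokenizer, model


def summarize(text, tokenizer, model,
              max_input_tokens=600,    # assumption: truncation length for the article
              num_beams=5,             # assumption: beam search width
              no_repeat_ngram_size=4): # assumption: discourages repeated phrases
    """Return a summary string for a single Russian news article."""
    inputs = tokenizer(
        [text],
        max_length=max_input_tokens,
        truncation=True,
        padding=True,
        return_tensors="pt",
    )
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        num_beams=num_beams,
        no_repeat_ngram_size=no_repeat_ngram_size,
    )[0]
    return tokenizer.decode(output_ids, skip_special_tokens=True)


# Example usage (downloads the model weights on first call):
#   tokenizer, model = load_summarizer()
#   print(summarize(article_text, tokenizer, model))
```

Loading is kept separate from inference so the weights are downloaded once and reused across many articles.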

Additional notes

  • Legal basis for distribution of the dataset:
  • Cleaning: Open In Colab
  • Data analysis: Open In Colab
  • Summarization methods: Open In Colab
  • Other Russian summarization datasets:

Contacts

Citation

@InProceedings{Gusev2020gazeta,
    author="Gusev, Ilya",
    title="Dataset for Automatic Summarization of Russian News",
    booktitle="Artificial Intelligence and Natural Language",
    year="2020",
    publisher="Springer International Publishing",
    address="Cham",
    pages="{122--134}",
    isbn="978-3-030-59082-6",
    doi={10.1007/978-3-030-59082-6_9}
}