chakki-works / Sumeval
Licence: apache-2.0
Well tested & Multi-language evaluation framework for text summarization.
Stars: ✭ 483
Programming Languages
python
Projects that are alternatives of or similar to Sumeval
nlp-akash
Natural Language Processing notes and implementations.
Stars: ✭ 66 (-86.34%)
Mutual labels: text-summarization
gazeta
Gazeta: Dataset for automatic summarization of Russian news
Stars: ✭ 25 (-94.82%)
Mutual labels: text-summarization
TextRank-node
No description or website provided.
Stars: ✭ 21 (-95.65%)
Mutual labels: text-summarization
Intelligent Document Finder
Document Search Engine Tool
Stars: ✭ 45 (-90.68%)
Mutual labels: text-summarization
Entity2Topic
[NAACL2018] Entity Commonsense Representation for Neural Abstractive Summarization
Stars: ✭ 20 (-95.86%)
Mutual labels: text-summarization
Text-Summarization
Abstractive and Extractive Text summarization using Transformers.
Stars: ✭ 38 (-92.13%)
Mutual labels: text-summarization
Macropodus
Macropodus is a natural language processing toolkit based on the Albert+BiLSTM+CRF deep learning architecture. It provides Chinese word segmentation (CWS), part-of-speech (POS) tagging, named entity recognition (NER), new word discovery, keyword extraction, text summarization, text similarity, a scientific calculator, conversion between Chinese numerals and Arabic (or Roman) numerals, conversion between traditional and simplified Chinese, and pinyin conversion.
Stars: ✭ 309 (-36.02%)
Mutual labels: text-summarization
NLP Toolkit
Library of state-of-the-art models (PyTorch) for NLP tasks
Stars: ✭ 92 (-80.95%)
Mutual labels: text-summarization
Text-Summarization
Given a text document, it produces its summary
Stars: ✭ 30 (-93.79%)
Mutual labels: text-summarization
Text-Summarization-Repo
A repository that collects the main research topics, must-read papers, and available models and data in the field of text summarization, together with recommended materials.
Stars: ✭ 213 (-55.9%)
Mutual labels: text-summarization
xl-sum
This repository contains the code, data, and models of the paper titled "XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages" published in Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021.
Stars: ✭ 160 (-66.87%)
Mutual labels: text-summarization
summarize-webpage
A small NLP SaaS project that summarizes a webpage
Stars: ✭ 34 (-92.96%)
Mutual labels: text-summarization
DocSum
A tool to automatically summarize documents abstractively using the BART or PreSumm Machine Learning Model.
Stars: ✭ 58 (-87.99%)
Mutual labels: text-summarization
awesome-text-summarization
Text summarization starting from scratch.
Stars: ✭ 86 (-82.19%)
Mutual labels: text-summarization
allsummarizer
Multilingual automatic text summarizer using statistical approach and extraction
Stars: ✭ 28 (-94.2%)
Mutual labels: text-summarization
Text summurization abstractive methods
Multiple implementations of abstractive text summarization, using Google Colab
Stars: ✭ 359 (-25.67%)
Mutual labels: text-summarization
Keras Text Summarization
Text summarization using seq2seq in Keras
Stars: ✭ 260 (-46.17%)
Mutual labels: text-summarization
abtextsum
Abstractive text summarization based on deep learning and semantic content generalization
Stars: ✭ 14 (-97.1%)
Mutual labels: text-summarization
Well tested & Multi-language evaluation framework for Text Summarization.
- Well tested
  - The ROUGE-X scores are tested against the original Perl script (ROUGE-1.5.5.pl).
  - The BLEU score is calculated by SacréBLEU, which produces the same values as the official script (mteval-v13a.pl) used by WMT.
- Multi-language
  - In addition to English, Japanese is also supported, and support for other languages is easy to add.
And, of course, the implementation is pure Python!
How to use
from sumeval.metrics.rouge import RougeCalculator

rouge = RougeCalculator(stopwords=True, lang="en")

rouge_1 = rouge.rouge_n(
    summary="I went to the Mars from my living town.",
    references="I went to Mars",
    n=1)

rouge_2 = rouge.rouge_n(
    summary="I went to the Mars from my living town.",
    references=["I went to Mars", "It's my living town"],
    n=2)

rouge_l = rouge.rouge_l(
    summary="I went to the Mars from my living town.",
    references=["I went to Mars", "It's my living town"])

# You need spaCy to calculate ROUGE-BE
rouge_be = rouge.rouge_be(
    summary="I went to the Mars from my living town.",
    references=["I went to Mars", "It's my living town"])

print("ROUGE-1: {}, ROUGE-2: {}, ROUGE-L: {}, ROUGE-BE: {}".format(
    rouge_1, rouge_2, rouge_l, rouge_be
).replace(", ", "\n"))

from sumeval.metrics.bleu import BLEUCalculator

bleu = BLEUCalculator()
score = bleu.bleu("I am waiting on the beach",
                  "He is walking on the beach")

bleu_ja = BLEUCalculator(lang="ja")
score_ja = bleu_ja.bleu("私はビーチで待ってる", "彼がベンチで待ってる")
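For readers unfamiliar with ROUGE-L, the following is a minimal, standalone sketch of the idea behind it: the score is built on the longest common subsequence (LCS) between the summary and a reference. It is not sumeval's implementation. It assumes a single reference, lower-cased whitespace tokenization, no stopword removal, and an alpha-weighted F-measure; the function names `lcs_length` and `rouge_l` are illustrative only.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    # previous[j] is the LCS length of the tokens of `a` seen so far and b[:j].
    previous = [0] * (len(b) + 1)
    for token_a in a:
        current = [0]
        for j, token_b in enumerate(b, start=1):
            if token_a == token_b:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def rouge_l(summary_tokens, reference_tokens, alpha=0.5):
    """Single-reference ROUGE-L sketch: LCS-based precision/recall F-measure."""
    lcs = lcs_length(summary_tokens, reference_tokens)
    precision = lcs / len(summary_tokens) if summary_tokens else 0.0
    recall = lcs / len(reference_tokens) if reference_tokens else 0.0
    denominator = (1 - alpha) * precision + alpha * recall
    return precision * recall / denominator if denominator else 0.0


# Assumed tokenization: lower-cased, split on whitespace, punctuation dropped.
summary = "i went to the mars from my living town".split()
reference = "i went to mars".split()

print(lcs_length(summary, reference))
print(rouge_l(summary, reference))
```

Because the library applies its own tokenizer and, with stopwords=True, removes stopwords, the values it reports for the same sentences may differ from this sketch.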
From the command line
sumeval r-nlb "I'm living New York its my home town so awesome" "My home town is awesome"
Output:
{
  "options": {
    "stopwords": true,
    "stemming": false,
    "word_limit": -1,
    "length_limit": -1,
    "alpha": 0.5,
    "input-summary": "I'm living New York its my home town so awesome",
    "input-references": [
      "My home town is awesome"
    ]
  },
  "averages": {
    "ROUGE-1": 0.7499999999999999,
    "ROUGE-2": 0.6666666666666666,
    "ROUGE-L": 0.7499999999999999,
    "ROUGE-BE": 0
  },
  "scores": [
    {
      "ROUGE-1": 0.7499999999999999,
      "ROUGE-2": 0.6666666666666666,
      "ROUGE-L": 0.7499999999999999,
      "ROUGE-BE": 0
    }
  ]
}
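To make the ROUGE-1 and ROUGE-2 numbers above easier to interpret, here is a small standalone sketch of how an n-gram overlap score with an alpha weight can be computed. It is an illustration under stated assumptions, not sumeval's code: the token lists are a guess at what might remain after stopword removal (the actual tokenizer and stopword list may differ), and the `ngrams` and `rouge_n` helpers are hypothetical names.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(summary_tokens, reference_tokens, n=1, alpha=0.5):
    """ROUGE-N sketch: clipped n-gram matches and an alpha-weighted F-measure."""
    summary_counts = ngrams(summary_tokens, n)
    reference_counts = ngrams(reference_tokens, n)
    # Counter intersection keeps the minimum count per n-gram (clipping).
    matches = sum((summary_counts & reference_counts).values())
    precision = matches / max(sum(summary_counts.values()), 1)
    recall = matches / max(sum(reference_counts.values()), 1)
    denominator = (1 - alpha) * precision + alpha * recall
    return precision * recall / denominator if denominator else 0.0


# Assumed token lists after stopword removal (hypothetical, for illustration).
summary = ["living", "york", "home", "town", "awesome"]
reference = ["home", "town", "awesome"]

print(rouge_n(summary, reference, n=1))
print(rouge_n(summary, reference, n=2))
```

With these assumed tokens, the unigram case has precision 3/5 and recall 1, and the bigram case has precision 2/4 and recall 1, which gives roughly 0.75 and 0.667 at alpha = 0.5. Raising alpha weights precision more heavily; lowering it favors recall.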
You can also use file input. Run sumeval -h for details.
Install
pip install sumeval
Dependencies
- BLEU depends on SacréBLEU.
- To calculate ROUGE-BE, spaCy is required.
- To use lang ja, janome or MeCab is required.
  - To get the ROUGE-BE score in particular, GiNZA is additionally needed.
- To use lang zh, jieba is required.
  - To get the ROUGE-BE score in particular, pyhanlp is additionally needed.
Test
sumeval uses two packages to test its scores.
- pythonrouge
  - It calls the original Perl script.
  - pip install git+https://github.com/tagucci/pythonrouge.git
- rougescore
  - It is a simple Python implementation of the ROUGE score.
  - pip install git+https://github.com/bdusell/rougescore.git
Welcome Contribution 🎉
Add supported language
The tokenization and dependency parsing logic for each language is located in sumeval/metrics/lang.
You can add a language class by inheriting from BaseLang.
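As a rough illustration of the inheritance pattern described above, the sketch below defines a stand-in base class and a subclass for a new language. Everything here is an assumption: `BaseLangSketch` is not sumeval's `BaseLang`, and the `tokenize` and `is_stop_word` method names, the `LangDE` class, and the stop word list are hypothetical. Check the classes in sumeval/metrics/lang for the actual interface before writing a real one.

```python
class BaseLangSketch:
    """Hypothetical stand-in for a language base class (not sumeval's BaseLang)."""

    def __init__(self, lang, stop_words=()):
        self.lang = lang
        self.stop_words = set(stop_words)

    def tokenize(self, text):
        # Subclasses supply the language-specific tokenizer.
        raise NotImplementedError

    def is_stop_word(self, word):
        return word.lower() in self.stop_words


class LangDE(BaseLangSketch):
    """Illustrative German support using naive whitespace tokenization."""

    def __init__(self):
        super().__init__("de", stop_words={"der", "die", "das", "und", "ist"})

    def tokenize(self, text):
        # Strip surrounding punctuation from each whitespace-separated piece.
        words = (piece.strip(".,!?;:\"'") for piece in text.split())
        return [word for word in words if word]


lang = LangDE()
tokens = lang.tokenize("Das Haus ist groß und alt.")
content = [token for token in tokens if not lang.is_stop_word(token)]
print(content)
```

A real language class would likely delegate tokenization to a proper tokenizer for that language, as the Japanese and Chinese support does with the packages listed under Dependencies.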