chakki-works / Sumeval

Licence: apache-2.0

Well tested & Multi-language evaluation framework for text summarization.

Well tested
- The ROUGE-X scores are tested by comparing them with the output of the original Perl script (ROUGE-1.5.5.pl).
- The BLEU score is calculated by SacréBLEU, which produces the same values as the official script (mteval-v13a.pl) used by WMT.
Multi-language
- Japanese is supported in addition to English, and support for other languages is easy to add.

And of course, the implementation is pure Python!

How to use

from sumeval.metrics.rouge import RougeCalculator


# Remove stopwords before scoring and use the English tokenizer.
rouge = RougeCalculator(stopwords=True, lang="en")

# references accepts a single string...
rouge_1 = rouge.rouge_n(
    summary="I went to the Mars from my living town.",
    references="I went to Mars",
    n=1)

# ...or a list of reference summaries.
rouge_2 = rouge.rouge_n(
    summary="I went to the Mars from my living town.",
    references=["I went to Mars", "It's my living town"],
    n=2)

rouge_l = rouge.rouge_l(
    summary="I went to the Mars from my living town.",
    references=["I went to Mars", "It's my living town"])

# You need spaCy to calculate ROUGE-BE
rouge_be = rouge.rouge_be(
    summary="I went to the Mars from my living town.",
    references=["I went to Mars", "It's my living town"])

# Print one score per line.
print("ROUGE-1: {}\nROUGE-2: {}\nROUGE-L: {}\nROUGE-BE: {}".format(
    rouge_1, rouge_2, rouge_l, rouge_be
))
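For readers unfamiliar with the metric, here is a minimal sketch of the idea behind ROUGE-N: count the n-grams shared by the summary and the references, then combine precision and recall. This is not sumeval's implementation. The name rouge_n_sketch, the whitespace tokenizer, the pooling over multiple references, and the alpha weighting are all assumptions made for illustration, and the sketch does no stopword removal or stemming, so its values will generally differ from those of RougeCalculator.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter (multiset) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_sketch(summary, references, n=1, alpha=0.5):
    """Illustrative ROUGE-N (hypothetical helper, not part of sumeval).

    Assumptions: lowercase whitespace tokenization, overlap pooled over all
    references, and F = P*R / (alpha*R + (1 - alpha)*P), which is the
    harmonic mean of precision and recall when alpha is 0.5.
    """
    if isinstance(references, str):
        references = [references]
    summary_counts = ngrams(summary.lower().split(), n)
    overlap = 0
    reference_total = 0
    for reference in references:
        reference_counts = ngrams(reference.lower().split(), n)
        # Counter intersection keeps the minimum count of each shared n-gram.
        overlap += sum((summary_counts & reference_counts).values())
        reference_total += sum(reference_counts.values())
    # The summary is compared once per reference, so scale its n-gram count.
    summary_total = sum(summary_counts.values()) * len(references)
    if overlap == 0:
        return 0.0
    precision = overlap / summary_total
    recall = overlap / reference_total
    return (precision * recall) / (alpha * recall + (1 - alpha) * precision)


print(rouge_n_sketch("I went to the Mars from my living town.", "I went to Mars", n=1))
```

Because the clipped overlap is computed with Counter intersection, a repeated word in the summary cannot be credited more often than it appears in the reference.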

from sumeval.metrics.bleu import BLEUCalculator


bleu = BLEUCalculator()
score = bleu.bleu("I am waiting on the beach",
                  "He is walking on the beach")

bleu_ja = BLEUCalculator(lang="ja")
score_ja = bleu_ja.bleu("私はビーチで待ってる", "彼がベンチで待ってる")

From the command line

sumeval r-nlb "I'm living New York its my home town so awesome" "My home town is awesome"

Output:

{
  "options": {
    "stopwords": true,
    "stemming": false,
    "word_limit": -1,
    "length_limit": -1,
    "alpha": 0.5,
    "input-summary": "I'm living New York its my home town so awesome",
    "input-references": [
      "My home town is awesome"
    ]
  },
  "averages": {
    "ROUGE-1": 0.7499999999999999,
    "ROUGE-2": 0.6666666666666666,
    "ROUGE-L": 0.7499999999999999,
    "ROUGE-BE": 0
  },
  "scores": [
    {
      "ROUGE-1": 0.7499999999999999,
      "ROUGE-2": 0.6666666666666666,
      "ROUGE-L": 0.7499999999999999,
      "ROUGE-BE": 0
    }
  ]
}
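Since the report is JSON, it is easy to post-process. The sketch below parses a report shaped like the one above and rounds the averages. The helper name rounded_averages is hypothetical, and the report text is pasted in as a string; capturing it from the sumeval command would assume that the command prints nothing but this JSON to standard output.

```python
import json

# Abbreviated copy of the report shown above, used here as sample input.
REPORT_TEXT = """
{
  "averages": {
    "ROUGE-1": 0.7499999999999999,
    "ROUGE-2": 0.6666666666666666,
    "ROUGE-L": 0.7499999999999999,
    "ROUGE-BE": 0
  },
  "scores": [
    {
      "ROUGE-1": 0.7499999999999999,
      "ROUGE-2": 0.6666666666666666,
      "ROUGE-L": 0.7499999999999999,
      "ROUGE-BE": 0
    }
  ]
}
"""


def rounded_averages(report_text, digits=3):
    """Parse a report and return its "averages" rounded to `digits` places."""
    report = json.loads(report_text)
    return {name: round(value, digits) for name, value in report["averages"].items()}


for name, value in rounded_averages(REPORT_TEXT).items():
    print(name, value)
```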

File input is also supported. Run sumeval -h for details.

Install

pip install sumeval

Dependencies

BLEU depends on SacréBLEU.
To calculate ROUGE-BE, spaCy is required.
To use lang ja, janome or MeCab is required.
- To calculate ROUGE-BE for Japanese, GiNZA is also needed.
To use lang zh, jieba is required.
- To calculate ROUGE-BE for Chinese, pyhanlp is also needed.

Test

sumeval checks its scores against two packages.

pythonrouge
- Calls the original Perl script
- pip install git+https://github.com/tagucci/pythonrouge.git
rougescore
- A simple Python implementation of the ROUGE score
- pip install git+https://github.com/bdusell/rougescore.git

Contributions Welcome 🎉

Add supported language

The tokenization and dependency parsing code for each language lives in sumeval/metrics/lang.

You can add a language class by inheriting from BaseLang.
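As a rough illustration of that pattern, the sketch below subclasses a small stand-in base class. BaseLangStandIn, its method names (tokenize, is_stop_word), and the LangDE example are all hypothetical and defined here only so the snippet runs on its own. The real BaseLang interface may differ, so check the existing classes in sumeval/metrics/lang before writing a real one.

```python
import re


class BaseLangStandIn:
    """Hypothetical stand-in for sumeval's BaseLang; the real interface may differ."""

    def __init__(self, lang):
        self.lang = lang

    def tokenize(self, text):
        raise NotImplementedError("subclasses provide the tokenizer")

    def is_stop_word(self, word):
        return False


class LangDE(BaseLangStandIn):
    """Illustrative German support: a regex tokenizer and a tiny stopword set."""

    STOPWORDS = frozenset({"der", "die", "das", "und", "ist"})

    def __init__(self):
        super().__init__("de")

    def tokenize(self, text):
        # \w+ matches runs of letters, digits, and underscores, including ä, ö, ü, ß.
        return re.findall(r"\w+", text.lower())

    def is_stop_word(self, word):
        return word.lower() in self.STOPWORDS


lang = LangDE()
tokens = lang.tokenize("Das Haus ist groß.")
print([t for t in tokens if not lang.is_stop_word(t)])
```

The point of the design is that language-specific work (tokenizing, stopword checks) sits behind one small class, so the scoring code does not need to change when a language is added.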
