Junpliu / ConDigSum

License: MIT
Code for EMNLP 2021 paper "Topic-Aware Contrastive Learning for Abstractive Dialogue Summarization"

Programming Languages

Python
139,335 projects - #7 most used programming language
CUDA
1,817 projects

Projects that are alternatives of or similar to ConDigSum

2021-dialogue-summary-competition
[2021 Hunminjeongeum Korean Speech & Natural Language AI Competition] A repository sharing team 알라꿍달라꿍's dialogue-summarization training and inference code from the dialogue summarization track.
Stars: ✭ 86 (+38.71%)
Mutual labels:  dialogue, summarization
DocSum
A tool to automatically summarize documents abstractively using the BART or PreSumm Machine Learning Model.
Stars: ✭ 58 (-6.45%)
Mutual labels:  bart, summarization
Html2article
Extracts the main article text from HTML web pages.
Stars: ✭ 441 (+611.29%)
Mutual labels:  topic
Kafka Monitor
Xinfra Monitor monitors the availability of Kafka clusters by producing synthetic workloads using end-to-end pipelines to obtain derived vital statistics - E2E latency, service produce/consume availability, offsets commit availability & latency, message loss rate and more.
Stars: ✭ 1,817 (+2830.65%)
Mutual labels:  topic
Kafkawize
Kafkawize : A Self service Apache Kafka Topic Management tool/portal. A Web application which automates the process of creating and browsing Kafka topics, acls, schemas by introducing roles/authorizations to users of various teams of an org.
Stars: ✭ 79 (+27.42%)
Mutual labels:  topic
Swoole Jobs
🚀 Dynamic multi-process worker queue based on Swoole; like Gearman but with higher performance.
Stars: ✭ 574 (+825.81%)
Mutual labels:  topic
Kafka Visualizer
A web client for visualizing your Apache Kafka topics live.
Stars: ✭ 98 (+58.06%)
Mutual labels:  topic
Logi Kafkamanager
A one-stop platform for Apache Kafka cluster metric monitoring and operations management.
Stars: ✭ 3,280 (+5190.32%)
Mutual labels:  topic
Ldagibbssampling
Open Source Package for Gibbs Sampling of LDA
Stars: ✭ 218 (+251.61%)
Mutual labels:  topic
Hackerqueue
Your favorite tech sites compiled down to topics you find interesting.
Stars: ✭ 55 (-11.29%)
Mutual labels:  topic
Weibo Topic Spider
A Weibo super-topic crawler: word-frequency statistics + sentiment analysis + simple classification, with newly added crawled data from the pneumonia (COVID-19) super-topic.
Stars: ✭ 128 (+106.45%)
Mutual labels:  topic
Ieml
IEML semantic language - a meaning-representation system based on semantic primitives and a regular grammar. Basic semantic relationships between concepts are automatically computed from syntactic similarities.
Stars: ✭ 41 (-33.87%)
Mutual labels:  topic
Bertopic
Leveraging BERT and c-TF-IDF to create easily interpretable topics.
Stars: ✭ 745 (+1101.61%)
Mutual labels:  topic
Qcloud Iot Sdk Embedded C
SDK for connecting to Tencent Cloud IoT from a device using embedded C.
Stars: ✭ 109 (+75.81%)
Mutual labels:  topic
Wsify
Just a tiny, simple and real-time self-hosted pub/sub messaging service
Stars: ✭ 452 (+629.03%)
Mutual labels:  topic
Proposal Smart Pipelines
Old archived draft proposal for smart pipelines. Go to the new Hack-pipes proposal at js-choi/proposal-hack-pipes.
Stars: ✭ 177 (+185.48%)
Mutual labels:  topic
Hacker News Digest
📰 A responsive interface of Hacker News with summaries and thumbnails.
Stars: ✭ 278 (+348.39%)
Mutual labels:  topic
Presentations
Holds and organizes all past, present, and future presentations at the meetup
Stars: ✭ 30 (-51.61%)
Mutual labels:  topic
Edamontology
EDAM is an ontology of bioinformatics types of data including identifiers, data formats, operations and topics.
Stars: ✭ 80 (+29.03%)
Mutual labels:  topic
SRB
Code for "Improving Semantic Relevance for Sequence-to-Sequence Learning of Chinese Social Media Text Summarization"
Stars: ✭ 41 (-33.87%)
Mutual labels:  summarization

Topic-Aware Contrastive Learning for Abstractive Dialogue Summarization

Junpeng Liu, Yanyan Zou, Hainan Zhang, Hongshen Chen, Zhuoye Ding, Caixia Yuan, Xiaojie Wang. Findings of EMNLP 2021. Paper

Requirements and Installation

Conda is highly recommended to manage your Python environment.

pip install --editable ./
pip install requests rouge==1.0.0
pip install transformers==4.4.0 bert-score==0.3.8
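The pinned versions above matter for reproducing the reported scores. As a quick sanity check before training, the pins can be verified at runtime with a stdlib-only sketch (`check_requirements` is a hypothetical helper, not part of this repository):

```python
from importlib.metadata import PackageNotFoundError, version

def check_requirements(reqs):
    """Return (name, pinned, found) triples for every requirement that is
    missing or installed at a version other than the pinned one."""
    problems = []
    for name, pinned in reqs:
        try:
            installed = version(name)
        except PackageNotFoundError:
            problems.append((name, pinned, "not installed"))
            continue
        if installed != pinned:
            problems.append((name, pinned, installed))
    return problems

# The versions pinned by the install commands above.
PINNED = [("rouge", "1.0.0"), ("transformers", "4.4.0"), ("bert-score", "0.3.8")]
```

Calling `check_requirements(PINNED)` should return an empty list when the environment matches the pins; anything else lists what to fix.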

Training ConDigSum model

Before training ConDigSum, please download BART-Large from here, and set PRETRAIN_PATH in the training scripts to the path of the downloaded model.pt.

For the SAMSum and MediaSum datasets, you can download the preprocessed data files directly (SAMSum, MediaSum); extracting them yields train_sh/SAMSumInd/ and train_sh/mediasum/.

Change to the working directory and download the GPT-2 BPE files:

cd train_sh
wget -N 'https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/encoder.json'
wget -N 'https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/vocab.bpe'
wget -N 'https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/dict.txt'

SAMSum dataset

# SAMSum
./train_samsum.sh [training_comment] [gpu_id]

MediaSum dataset

# MediaSum
./train_mediasum.sh [training_comment] [gpu_id]

Custom dataset

To facilitate training on custom datasets, a demo dataset is provided in the train_sh/customdata/ directory; prepare your own data files following the format of its *.jsonl files. Then run the pre-processing steps:

./bpe.sh
./binarize.sh
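For reference, the JSON Lines layout consumed by such preprocessing is one JSON object per line, with no surrounding list. A minimal sketch of writing and reading such a file follows; the field names `dialogue` and `summary` are illustrative assumptions only, so mirror the actual keys used in the demo *.jsonl files under train_sh/customdata/:

```python
import json

# Hypothetical field names for illustration only -- copy the keys
# from the demo files in train_sh/customdata/ for real training data.
examples = [
    {"dialogue": "A: Lunch at noon? B: Sure, see you then.",
     "summary": "A and B agree to meet for lunch at noon."},
]

# JSON Lines: one JSON object per line.
with open("customdata_demo.jsonl", "w", encoding="utf-8") as f:
    for ex in examples:
        f.write(json.dumps(ex, ensure_ascii=False) + "\n")

# Reading it back, line by line.
with open("customdata_demo.jsonl", encoding="utf-8") as f:
    loaded = [json.loads(line) for line in f]
```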

Testing ConDigSum model

Downloading pretrained ConDigSum models

Pretrained models and predictions are provided on Google Drive: SAMSum, MediaSum. After downloading, you will have train_sh/SAMSum.condigsum/checkpoint_best.pt and train_sh/MediaSum.condigsum/checkpoint_best.pt.

Evaluating models

# dataname=SAMSumInd or dataname=mediasum or dataname=customdata
# checkpoint_dir=SAMSum.condigsum or checkpoint_dir=MediaSum.condigsum

# generate predictions
cd train_sh
CUDA_VISIBLE_DEVICES=${GPU} python ./test.py --log_dir ${checkpoint_dir} --dataset ${dataname}

# compute ROUGE scores with files2rouge
files2rouge ${dataname}/test.target ${checkpoint_dir}/test.hypo

# compute BERTScore
CUDA_VISIBLE_DEVICES=${GPU} bert-score -r ${dataname}/test.target -c ${checkpoint_dir}/test.hypo --lang en --rescale_with_baseline
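files2rouge and bert-score produce the official numbers. For a rough sanity check on a single hypothesis/reference pair, ROUGE-1 F1 reduces to unigram overlap; the following simplified stdlib sketch (whitespace tokens, no stemming, no bootstrap resampling, so values will differ slightly from files2rouge) shows the computation:

```python
from collections import Counter

def rouge1_f1(hypothesis, reference):
    """Unigram-overlap ROUGE-1 F1 (simplified: lowercased whitespace
    tokens, no stemming, no stopword handling)."""
    hyp = Counter(hypothesis.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((hyp & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

For example, `rouge1_f1("the cat sat", "the cat")` matches two of three hypothesis tokens and both reference tokens, giving F1 = 0.8.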

Citation

@inproceedings{liu-etal-2021-topic-aware,
    title = "Topic-Aware Contrastive Learning for Abstractive Dialogue Summarization",
    author = "Liu, Junpeng  and
      Zou, Yanyan  and
      Zhang, Hainan  and
      Chen, Hongshen  and
      Ding, Zhuoye  and
      Yuan, Caixia  and
      Wang, Xiaojie",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
    month = nov,
    year = "2021",
    address = "Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.findings-emnlp.106",
    doi = "10.18653/v1/2021.findings-emnlp.106",
    pages = "1229--1243",
    abstract = "Unlike well-structured text, such as news reports and encyclopedia articles, dialogue content often comes from two or more interlocutors, exchanging information with each other. In such a scenario, the topic of a conversation can vary upon progression and the key information for a certain topic is often scattered across multiple utterances of different speakers, which poses challenges to abstractly summarize dialogues. To capture the various topic information of a conversation and outline salient facts for the captured topics, this work proposes two topic-aware contrastive learning objectives, namely coherence detection and sub-summary generation objectives, which are expected to implicitly model the topic change and handle information scattering challenges for the dialogue summarization task. The proposed contrastive objectives are framed as auxiliary tasks for the primary dialogue summarization task, united via an alternative parameter updating strategy. Extensive experiments on benchmark datasets demonstrate that the proposed simple method significantly outperforms strong baselines and achieves new state-of-the-art performance. The code and trained models are publicly available via .",
}

MISC

  1. To install Files2ROUGE on CentOS, you may need to install the following Perl dependencies first:
yum install -y "perl(XML::Parser)"
yum install -y "perl(XML::LibXML)"
yum install -y "perl(DB_File)"