
0x7o / text2keywords

License: MIT
Trained T5-base and T5-large models for creating keywords from text

Programming Languages

python

Projects that are alternatives of or similar to text2keywords

ttt
A package for fine-tuning Transformers with TPUs, written in Tensorflow2.0+
Stars: ✭ 35 (-33.96%)
Mutual labels:  transformers, t5
Nn
🧑‍🏫 50! Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
Stars: ✭ 5,720 (+10692.45%)
Mutual labels:  transformers, transformer
golgotha
Contextualised Embeddings and Language Modelling using BERT and Friends using R
Stars: ✭ 39 (-26.42%)
Mutual labels:  transformers, transformer
t5-japanese
Codes to pre-train Japanese T5 models
Stars: ✭ 39 (-26.42%)
Mutual labels:  transformer, t5
question generator
An NLP system for generating reading comprehension questions
Stars: ✭ 188 (+254.72%)
Mutual labels:  transformers, t5
COVID-19-Tweet-Classification-using-Roberta-and-Bert-Simple-Transformers
Rank 1 / 216
Stars: ✭ 24 (-54.72%)
Mutual labels:  transformers, transformer
trapper
State-of-the-art NLP through transformer models in a modular design and consistent APIs.
Stars: ✭ 28 (-47.17%)
Mutual labels:  transformers, transformer
fastT5
⚡ boost inference speed of T5 models by 5x & reduce the model size by 3x.
Stars: ✭ 421 (+694.34%)
Mutual labels:  transformer, t5
Transformer-MM-Explainability
[ICCV 2021- Oral] Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-based network. Including examples for DETR, VQA.
Stars: ✭ 484 (+813.21%)
Mutual labels:  transformers, transformer
Text-Summarization
Abstractive and Extractive Text summarization using Transformers.
Stars: ✭ 38 (-28.3%)
Mutual labels:  transformers, t5
chef-transformer
Chef Transformer 🍲.
Stars: ✭ 29 (-45.28%)
Mutual labels:  transformers, t5
MinTL
MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems
Stars: ✭ 61 (+15.09%)
Mutual labels:  transformer
GoEmotions-pytorch
Pytorch Implementation of GoEmotions 😍😢😱
Stars: ✭ 95 (+79.25%)
Mutual labels:  transformers
BERT-NER
Using pre-trained BERT models for Chinese and English NER with 🤗Transformers
Stars: ✭ 114 (+115.09%)
Mutual labels:  transformers
MISE
Multimodal Image Synthesis and Editing: A Survey
Stars: ✭ 214 (+303.77%)
Mutual labels:  transformers
CSV2RDF
Streaming, transforming, SPARQL-based CSV to RDF converter. Apache license.
Stars: ✭ 48 (-9.43%)
Mutual labels:  transformer
ParsBigBird
Persian Bert For Long-Range Sequences
Stars: ✭ 58 (+9.43%)
Mutual labels:  transformers
remixer-pytorch
Implementation of the Remixer Block from the Remixer paper, in Pytorch
Stars: ✭ 37 (-30.19%)
Mutual labels:  transformers
OverlapPredator
[CVPR 2021, Oral] PREDATOR: Registration of 3D Point Clouds with Low Overlap.
Stars: ✭ 293 (+452.83%)
Mutual labels:  transformer
optimum
🏎️ Accelerate training and inference of 🤗 Transformers with easy to use hardware optimization tools
Stars: ✭ 567 (+969.81%)
Mutual labels:  transformers

text to keywords

Trained T5-base and T5-large models for creating keywords from text. Supported languages: Russian (ru)

Pretraining Large version | Pretraining Base version

habr article

Usage

Example usage (the code returns a list of keywords; duplicates are possible):

Try Model Training In Colab!

pip install transformers sentencepiece
from itertools import groupby
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer
model_name = "0x7194633/keyt5-large" # or 0x7194633/keyt5-base
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

def generate(text, **kwargs):
    inputs = tokenizer(text, return_tensors='pt')
    with torch.no_grad():
        hypotheses = model.generate(**inputs, num_beams=5, **kwargs)
    s = tokenizer.decode(hypotheses[0], skip_special_tokens=True)
    # Split the ";"-separated output into a lowercase list of keywords
    s = s.replace('; ', ';').replace(' ;', ';').lower().split(';')[:-1]
    # Collapse consecutive duplicates
    s = [el for el, _ in groupby(s)]
    return s

article = """Reuters сообщил об отмене 3,6 тыс. авиарейсов из-за «омикрона» и погоды
Наибольшее число отмен авиарейсов 2 января пришлось на американские авиакомпании 
SkyWest и Southwest, у каждой — более 400 отмененных рейсов. При этом среди 
отмененных 2 января авиарейсов — более 2,1 тыс. рейсов в США. Также свыше 6400 
рейсов были задержаны."""

print(generate(article, top_p=1.0, max_length=64))  
# ['авиаперевозки', 'отмена авиарейсов', 'отмена рейсов', 'отмена авиарейсов', 'отмена рейсов', 'отмена авиарейсов']
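
Since generate() only collapses consecutive repeats (via itertools.groupby), repeated keywords can still appear, as in the output above. As an optional post-processing step (a small sketch, not part of the original example), all duplicates can be removed while preserving order:

keywords = generate(article, top_p=1.0, max_length=64)
unique_keywords = list(dict.fromkeys(keywords))  # keeps only the first occurrence of each keyword
print(unique_keywords)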

Training

To train the keyT5-base and keyT5-large models, you will need a table in CSV format, like this:

X | Y
Some text that is fed to the input | The text that should come out
Some text that is fed to the input | The text that should come out

The keyT5 models were trained on ~7,000 compressed habr.com articles (data.csv, collected with collect.py). The models support the Russian language exclusively!
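
As an illustration (not taken from the repository), such a table could be assembled with pandas; the column names X and Y follow the table above, the example rows are placeholders, and joining keywords with ";" mirrors the separator that generate() splits on:

import pandas as pd

# Hypothetical rows: X is the input text, Y is the ";"-separated keyword string
rows = [
    {"X": "Some text that is fed to the input", "Y": "keyword one;keyword two;"},
    {"X": "Some other input text", "Y": "another keyword;one more keyword;"},
]
pd.DataFrame(rows, columns=["X", "Y"]).to_csv("data.csv", index=False)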

Go to the training notebook and learn more about it:

Try Model Training In Colab!
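
The Colab notebook is the reference for training. As a rough sketch of what fine-tuning on such a CSV can look like with plain PyTorch and Hugging Face Transformers (the hyperparameters, sequence lengths, and label masking below are assumptions, not the notebook's exact code):

# Minimal fine-tuning sketch, assuming data.csv with columns X (input text) and Y (keywords)
import pandas as pd
import torch
from torch.utils.data import Dataset, DataLoader
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_name = "0x7194633/keyt5-base"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

class KeywordDataset(Dataset):
    def __init__(self, path):
        self.df = pd.read_csv(path)

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        row = self.df.iloc[idx]
        x = tokenizer(row["X"], truncation=True, max_length=512,
                      padding="max_length", return_tensors="pt")
        y = tokenizer(row["Y"], truncation=True, max_length=64,
                      padding="max_length", return_tensors="pt")
        labels = y.input_ids.squeeze(0)
        labels[labels == tokenizer.pad_token_id] = -100  # ignore padding in the loss
        return {"input_ids": x.input_ids.squeeze(0),
                "attention_mask": x.attention_mask.squeeze(0),
                "labels": labels}

loader = DataLoader(KeywordDataset("data.csv"), batch_size=4, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

model.train()
for epoch in range(3):
    for batch in loader:
        batch = {k: v.to(device) for k, v in batch.items()}
        loss = model(**batch).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()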
