boudinfl / Pke

Licence: gpl-3.0
Python Keyphrase Extraction module

Projects that are alternatives of or similar to Pke

Forte
Forte is a flexible and powerful NLP builder FOR TExt. This is part of the CASL project: http://casl-project.ai/
Stars: ✭ 89 (-89.59%)
Mutual labels:  information-retrieval, natural-language-processing
Drl4nlp.scratchpad
Notes on Deep Reinforcement Learning for Natural Language Processing papers
Stars: ✭ 26 (-96.96%)
Mutual labels:  information-retrieval, natural-language-processing
Awesome Hungarian Nlp
A curated list of NLP resources for Hungarian
Stars: ✭ 121 (-85.85%)
Mutual labels:  information-retrieval, natural-language-processing
Scdv
Text classification with Sparse Composite Document Vectors.
Stars: ✭ 54 (-93.68%)
Mutual labels:  information-retrieval, natural-language-processing
Cdqa
⛔ [NOT MAINTAINED] An End-To-End Closed Domain Question Answering System.
Stars: ✭ 500 (-41.52%)
Mutual labels:  information-retrieval, natural-language-processing
Gensim
Topic Modelling for Humans
Stars: ✭ 12,763 (+1392.75%)
Mutual labels:  information-retrieval, natural-language-processing
Neuralqa
NeuralQA: A Usable Library for Question Answering on Large Datasets with BERT
Stars: ✭ 185 (-78.36%)
Mutual labels:  information-retrieval, natural-language-processing
Vec4ir
Word Embeddings for Information Retrieval
Stars: ✭ 188 (-78.01%)
Mutual labels:  information-retrieval, natural-language-processing
Awesome Persian Nlp Ir
Curated List of Persian Natural Language Processing and Information Retrieval Tools and Resources
Stars: ✭ 460 (-46.2%)
Mutual labels:  information-retrieval, natural-language-processing
Catalyst
Accelerated deep learning R&D
Stars: ✭ 2,804 (+227.95%)
Mutual labels:  information-retrieval, natural-language-processing
Talisman
Straightforward fuzzy matching, information retrieval and NLP building blocks for JavaScript.
Stars: ✭ 584 (-31.7%)
Mutual labels:  information-retrieval, natural-language-processing
Deep Semantic Similarity Model
My Keras implementation of the Deep Semantic Similarity Model (DSSM)/Convolutional Latent Semantic Model (CLSM) described here: http://research.microsoft.com/pubs/226585/cikm2014_cdssm_final.pdf.
Stars: ✭ 509 (-40.47%)
Mutual labels:  information-retrieval, natural-language-processing
Knowledge Graphs
A collection of research on knowledge graphs
Stars: ✭ 845 (-1.17%)
Mutual labels:  information-retrieval, natural-language-processing
Entity Recognition Datasets
A collection of corpora for named entity recognition (NER) and entity recognition tasks. These annotated datasets cover a variety of languages, domains and entity types.
Stars: ✭ 891 (+4.21%)
Mutual labels:  natural-language-processing
Spacy Transformers
🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy
Stars: ✭ 919 (+7.49%)
Mutual labels:  natural-language-processing
String To Tree Nmt
Source code and data for the paper "Towards String-to-Tree Neural Machine Translation"
Stars: ✭ 16 (-98.13%)
Mutual labels:  natural-language-processing
Mesimp
Codes for "Training Simplification and Model Simplification for Deep Learning: A Minimal Effort Back Propagation Method"
Stars: ✭ 16 (-98.13%)
Mutual labels:  natural-language-processing
Date Info
API to let user fetch the events that happen(ed) on a specific date
Stars: ✭ 7 (-99.18%)
Mutual labels:  information-retrieval
Covid 19 Bert Researchpapers Semantic Search
BERT semantic search engine for searching literature research papers for coronavirus covid-19 in google colab
Stars: ✭ 23 (-97.31%)
Mutual labels:  natural-language-processing
Awesome Ai Ml Dl
Awesome Artificial Intelligence, Machine Learning and Deep Learning as we learn it. Study notes and a curated list of awesome resources of such topics.
Stars: ✭ 831 (-2.81%)
Mutual labels:  natural-language-processing

pke - python keyphrase extraction

pke is an open-source, Python-based keyphrase extraction toolkit. It provides an end-to-end keyphrase extraction pipeline in which each component can be modified or extended to develop new models. pke also makes it easy to benchmark state-of-the-art keyphrase extraction models, and ships with supervised models trained on the SemEval-2010 dataset.

Installation

To install pke from GitHub using pip:

pip install git+https://github.com/boudinfl/pke.git

pke also requires external resources that can be obtained using:

python -m nltk.downloader stopwords
python -m nltk.downloader universal_tagset
python -m spacy download en # download the English model

As of April 2019, pke only supports Python 3.6+.
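
Before running the examples, it can help to confirm that the interpreter version and the packages mentioned above are available. The following is a minimal sketch: `check_environment` is a hypothetical helper, not part of pke, and it only checks that the modules can be located, not that the NLTK or spaCy resources have been downloaded.

```python
import importlib.util
import sys


def check_environment(modules=("pke", "nltk", "spacy"), min_python=(3, 6)):
    """Return a list of problems found; an empty list means nothing is missing.

    Hypothetical helper for illustration only (not part of pke).
    """
    problems = []
    # pke states that it supports Python 3.6+ only.
    if sys.version_info < min_python:
        problems.append("Python %d.%d+ is required" % min_python)
    # find_spec returns None when a top-level module cannot be located.
    for name in modules:
        if importlib.util.find_spec(name) is None:
            problems.append("module not found: " + name)
    return problems
```

Calling `check_environment()` and printing each returned entry gives a quick list of what still needs to be installed.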

Minimal example

pke provides a standardized API for extracting keyphrases from a document. Start by typing the five lines below. To use another model, replace pke.unsupervised.TopicRank with that model (see the list of implemented models).

import pke

# initialize keyphrase extraction model, here TopicRank
extractor = pke.unsupervised.TopicRank()

# load the content of the document; the document is expected to be in raw
# format (i.e. a plain text file), and preprocessing is carried out using spaCy
extractor.load_document(input='/path/to/input.txt', language='en')

# keyphrase candidate selection, in the case of TopicRank: sequences of nouns
# and adjectives (i.e. `(Noun|Adj)*`)
extractor.candidate_selection()

# candidate weighting, in the case of TopicRank: using a random walk algorithm
extractor.candidate_weighting()

# N-best selection, keyphrases contains the 10 highest scored candidates as
# (keyphrase, score) tuples
keyphrases = extractor.get_n_best(n=10)
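
The steps above can be wrapped in a single function so that the model becomes a parameter. This is a sketch under stated assumptions: `extract_keyphrases` and `format_keyphrases` are hypothetical helpers, not part of pke, and the wrapper assumes the chosen model accepts the same calls with default arguments as the TopicRank example, which may not hold for every model.

```python
def extract_keyphrases(model_factory, path, language="en", n=10):
    """Run the load / select / weight / n-best steps with any extractor.

    `model_factory` is a zero-argument callable returning an extractor,
    for example `pke.unsupervised.TopicRank` (hypothetical wrapper).
    """
    extractor = model_factory()
    extractor.load_document(input=path, language=language)
    extractor.candidate_selection()
    extractor.candidate_weighting()
    # Returns a list of (keyphrase, score) tuples, highest scores first.
    return extractor.get_n_best(n=n)


def format_keyphrases(keyphrases):
    """Render (keyphrase, score) tuples as text lines, score first."""
    return ["%.4f  %s" % (score, phrase) for phrase, score in keyphrases]
```

With pke installed and imported, a call might look like `extract_keyphrases(pke.unsupervised.TopicRank, '/path/to/input.txt')`, and the result can be passed to `format_keyphrases` for printing.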

A detailed example is provided in the examples/ directory.

Getting started

Tutorials and code documentation are available at https://boudinfl.github.io/pke/.

Implemented models

pke currently implements the following keyphrase extraction models:

Citing pke

If you use pke, please cite the following paper:

@InProceedings{boudin:2016:COLINGDEMO,
  author    = {Boudin, Florian},
  title     = {pke: an open source python-based keyphrase extraction toolkit},
  booktitle = {Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations},
  month     = {December},
  year      = {2016},
  address   = {Osaka, Japan},
  pages     = {69--73},
  url       = {http://aclweb.org/anthology/C16-2015}
}