AlirezaTheH / perke

Licence: MIT license

A keyphrase extractor for Persian

Programming Languages

139335 projects - #7 most used programming language

Labels

nlp natural-language-processing information-retrieval text-mining data-mining keyword persian persian-language computational-linguistics text-processing data-processing keyword-extraction keyphrase-extraction keyword-extractor keyphrase keyphrase-extractor perisan-nlp

Projects that are alternatives of or similar to perke

CogComp's light-weight Python NLP annotators

Stars: ✭ 115 (+91.67%)

Mutual labels: text-mining, data-mining, text-processing

PersianStemmer-Python

Stars: ✭ 43 (-28.33%)

Mutual labels: information-retrieval, persian, persian-language

Large, curated set of benchmark datasets for evaluating automatic keyphrase extraction algorithms.

Stars: ✭ 125 (+108.33%)

Mutual labels: information-retrieval, keyword-extraction, keyphrase-extraction

corpusexplorer2.0

Corpus linguistics has never been so easy...

Stars: ✭ 16 (-73.33%)

Mutual labels: text-mining, data-mining, text-processing

advanced-text-mining

Covers natural language processing and text analysis methodology using the TEANAPS library.

Stars: ✭ 15 (-75%)

Mutual labels: text-mining, data-mining, text-processing

An open-source Python library for natural language processing and text analysis.

Stars: ✭ 91 (+51.67%)

Mutual labels: text-mining, data-mining, text-processing

RMDL: Random Multimodel Deep Learning for Classification

Stars: ✭ 375 (+525%)

Mutual labels: information-retrieval, text-mining, data-mining

Artificial Adversary

🗣️ Tool to generate adversarial text examples and test machine learning models against them

Stars: ✭ 348 (+480%)

Mutual labels: text-mining, data-mining, text-processing

Extract indicators of compromise from text, including "escaped" ones.

Stars: ✭ 148 (+146.67%)

Mutual labels: text-mining, data-mining, text-processing

📚 social networks from novels

Stars: ✭ 72 (+20%)

Mutual labels: information-retrieval, data-mining

learning2hash.github.io

Website for "A survey of learning to hash for Computer Vision" https://learning2hash.github.io

Stars: ✭ 14 (-76.67%)

Mutual labels: information-retrieval, text-mining

Evildork targeting your fiancee👁️

Stars: ✭ 46 (-23.33%)

Mutual labels: information-retrieval, keyword

ml-nlp-services

Machine learning, deep learning, natural language processing

Stars: ✭ 23 (-61.67%)

Mutual labels: information-retrieval, data-mining

AILA-Artificial-Intelligence-for-Legal-Assistance

Python implementations of the various methods used in FIRE 2019 conference.

Stars: ✭ 39 (-35%)

Mutual labels: information-retrieval, data-mining

Kex is a python library for unsupervised keyword extraction from a document, providing an easy interface and benchmarks on 15 public datasets.

Stars: ✭ 46 (-23.33%)

Mutual labels: information-retrieval, keyword-extraction

Gwu data mining

Materials for GWU DNSC 6279 and DNSC 6290.

Stars: ✭ 217 (+261.67%)

Mutual labels: text-mining, data-mining

A large scale feature extraction tool for text-based machine learning

Stars: ✭ 25 (-58.33%)

Mutual labels: information-retrieval, text-processing

Wordtokenizers.jl

High performance tokenizers for natural language processing and other related tasks

Stars: ✭ 63 (+5%)

Mutual labels: information-retrieval, data-mining

Dan Jurafsky Chris Manning Nlp

My solution to the Natural Language Processing course made by Dan Jurafsky, Chris Manning in Winter 2012.

Stars: ✭ 124 (+106.67%)

Mutual labels: information-retrieval, text-processing

Awesome Hungarian Nlp

A curated list of NLP resources for Hungarian

Stars: ✭ 121 (+101.67%)

Mutual labels: information-retrieval, text-mining

Perke

Perke is a Python keyphrase extraction package for the Persian language. It provides an end-to-end keyphrase extraction pipeline in which each component can be easily modified or extended to develop new models.

Installation

The easiest way to install is from PyPI:
```
pip install perke
```
Alternatively, you can install directly from GitHub:
```
pip install git+https://github.com/alirezatheh/perke.git
```
Perke also requires a trained POS tagger model. We use hazm's tagger model. You can download the latest hazm resources (tagger and parser models) with the following command:
```
python -m perke download
```
Alternatively, you can use another model with the same tag names and structure by placing it in the resources directory.

Simple Example

Perke provides a standardized API for extracting keyphrases from a text. The four steps below show how to use the TextRank keyphrase extractor.

```python
from perke.unsupervised.graph_based import TextRank

# Define the set of valid part of speech tags to occur in the model.
valid_pos_tags = {'N', 'Ne', 'AJ', 'AJe'}

# 1. Create a TextRank extractor.
extractor = TextRank(valid_pos_tags=valid_pos_tags)

# 2. Load the text.
extractor.load_text(input='text or path/to/input_file',
                    word_normalization_method=None)

# 3. Build the graph representation of the text and weight the
#    words. Keyphrase candidates are composed from the 33 percent
#    highest weighted words.
extractor.weight_candidates(window_size=2, top_t_percent=0.33)

# 4. Get the 10 highest weighted candidates as keyphrases.
keyphrases = extractor.get_n_best(n=10)
```

For other models, see the examples directory.
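To give a rough intuition for what step 3 does, here is a minimal, dependency-free sketch of the TextRank idea: words that co-occur within a sliding window are linked in a graph, a PageRank-style iteration weights them, and the top share of words is kept. This is not Perke's implementation. The function names (`build_cooccurrence_graph`, `rank_words`, `top_words`), the damping factor and the iteration count are assumptions made for illustration, and POS filtering and phrase composition are left out.

```python
from collections import defaultdict


def build_cooccurrence_graph(words, window_size=2):
    """Link each word to the words that follow it inside the window.

    With window_size=2 only directly adjacent words are linked.
    """
    graph = defaultdict(set)
    for i, word in enumerate(words):
        for j in range(i + 1, min(i + window_size, len(words))):
            other = words[j]
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def rank_words(graph, damping=0.85, iterations=50):
    """Weight words with an unweighted PageRank-style iteration."""
    scores = {word: 1.0 for word in graph}
    for _ in range(iterations):
        new_scores = {}
        for word in graph:
            # Each neighbour shares its score equally among its own links.
            incoming = sum(scores[n] / len(graph[n]) for n in graph[word])
            new_scores[word] = (1 - damping) + damping * incoming
        scores = new_scores
    return scores


def top_words(scores, top_t_percent=0.33):
    """Keep the given share of highest weighted words (at least one)."""
    keep = max(1, int(len(scores) * top_t_percent))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:keep])


# Toy input: an already tokenized and filtered word sequence.
words = 'graph rank graph word graph text'.split()
graph = build_cooccurrence_graph(words, window_size=2)
scores = rank_words(graph)
print(top_words(scores, top_t_percent=0.33))
```

In this toy input the word `graph` is adjacent to every other word, so it collects the highest weight and is the only word kept at a 33 percent cutoff. In Perke itself, the arguments `window_size` and `top_t_percent` of `weight_candidates` play the corresponding roles.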

Documentation

Documentation and references are available at Read The Docs.

Implemented Models

Perke currently implements the following keyphrase extraction models:

Unsupervised models
- Graph-based models
  - TextRank: article by Mihalcea and Tarau, 2004
  - SingleRank: article by Wan and Xiao, 2008
  - TopicRank: article by Bougouin, Boudin and Daille, 2013
  - PositionRank: article by Florescu and Caragea, 2017
  - MultipartiteRank: article by Boudin, 2018

Acknowledgements

Perke is inspired by pke.

Stars: ✭ 60
