
kampsy / gwizo

Licence: other
Simple Go implementation of the Porter Stemmer algorithm with powerful features.

Programming Languages

go
31211 projects - #10 most used programming language
shell
77523 projects

Projects that are alternatives of or similar to gwizo

stemmify
Ruby module that converts a word to its approximate root form with the Porter stemmer. For example, observing and observation reduce to observ.
Stars: ✭ 54 (+107.69%)
Mutual labels:  stemmer, porter-stemmer-algorithm
perstem
Persian stemmer and morphological analyzer
Stars: ✭ 18 (-30.77%)
Mutual labels:  stemmer
Kelime kok ayirici
Deep learning based (seq2seq) web application for finding word roots in Turkish - Turkish Stemmer (tr_stemmer)
Stars: ✭ 76 (+192.31%)
Mutual labels:  stemmer
rust-stemmers
A rust implementation of some popular snowball stemming algorithms
Stars: ✭ 85 (+226.92%)
Mutual labels:  nlp-stemming
Php Stemmer
Native PHP Stemmer
Stars: ✭ 84 (+223.08%)
Mutual labels:  stemmer
lara-hungarian-nlp
NLP class for rapid ChatBot development in Hungarian language
Stars: ✭ 27 (+3.85%)
Mutual labels:  stemmer
Arabic Light Stemmer
Arabic light stemmer. Light stemming for Arabic words removes prefixes and suffixes and normalizes words
Stars: ✭ 14 (-46.15%)
Mutual labels:  stemmer
lancaster-stemmer
Lancaster stemming algorithm
Stars: ✭ 22 (-15.38%)
Mutual labels:  stemmer
topic modelling financial news
Topic modelling on financial news with Natural Language Processing
Stars: ✭ 51 (+96.15%)
Mutual labels:  nlp-stemming
Cadmium
Natural Language Processing (NLP) library for Crystal
Stars: ✭ 172 (+561.54%)
Mutual labels:  stemmer
Stemmer
Fast Porter stemmer implementation
Stars: ✭ 86 (+230.77%)
Mutual labels:  stemmer
Stemmer
An English (Porter2) stemming implementation in Elixir.
Stars: ✭ 134 (+415.38%)
Mutual labels:  stemmer
hunspell
High-Performance Stemmer, Tokenizer, and Spell Checker for R
Stars: ✭ 101 (+288.46%)
Mutual labels:  stemmer
Qutuf
Qutuf (قُطُوْف): An Arabic Morphological analyzer and Part-Of-Speech tagger as an Expert System.
Stars: ✭ 84 (+223.08%)
Mutual labels:  stemmer
lorca
Natural Language Processing for Spanish in Node.js. Stemmer, sentiment analysis, readability, tf-idf with batteries, concordance and more!
Stars: ✭ 95 (+265.38%)
Mutual labels:  stemmer
Nlp Js Tools French
POS Tagger, lemmatizer and stemmer for french language in javascript
Stars: ✭ 32 (+23.08%)
Mutual labels:  stemmer
sastrawijs
Indonesian language stemmer. Javascript port of PHP Sastrawi project.
Stars: ✭ 30 (+15.38%)
Mutual labels:  stemmer
CISTEM
Stemmer for German
Stars: ✭ 33 (+26.92%)
Mutual labels:  stemmer
PersianStemmer-Python
PersianStemmer-Python
Stars: ✭ 43 (+65.38%)
Mutual labels:  stemmer
Turkish.php
Turkish Suffix Library for PHP - Turkish inflectional and derivational suffixes
Stars: ✭ 57 (+119.23%)
Mutual labels:  vowel

gwizo

Package gwizo implements the Porter Stemmer algorithm: Porter, M. "An algorithm for suffix stripping." Program 14.3 (1980): 130-137. Martin Porter, the algorithm's inventor, maintains a web page about the algorithm at http://www.tartarus.org/~martin/PorterStemmer/
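
To give a feel for what "suffix stripping" means, here is a minimal sketch of step 1a of Porter's paper, which handles plurals (SSES -> SS, IES -> I, SS -> SS, S -> nothing). The function name step1a is hypothetical and this is not gwizo's own code; it only illustrates the kind of rule the algorithm applies, and it assumes lowercase input.

```go
package main

import (
	"fmt"
	"strings"
)

// step1a applies the plural rules from step 1a of Porter's 1980 paper.
// The longest matching suffix wins, so the cases are checked in that order.
// Illustrative sketch only; step1a is a hypothetical name, not part of gwizo.
func step1a(word string) string {
	switch {
	case strings.HasSuffix(word, "sses"):
		return strings.TrimSuffix(word, "sses") + "ss"
	case strings.HasSuffix(word, "ies"):
		return strings.TrimSuffix(word, "ies") + "i"
	case strings.HasSuffix(word, "ss"):
		return word // SS -> SS: left unchanged
	case strings.HasSuffix(word, "s"):
		return strings.TrimSuffix(word, "s")
	}
	return word
}

func main() {
	for _, w := range []string{"caresses", "ponies", "caress", "cats"} {
		fmt.Printf("%s -> %s\n", w, step1a(w))
	}
}
```

The full algorithm chains several such steps, and most later rules are also conditioned on the measure of the remaining stem.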

Installation

To install, simply run in a terminal:

go get github.com/kampsy/gwizo

Stem

Stem returns the stem of a word.

package main

import (
  "fmt"
  "github.com/kampsy/gwizo"
)

func main() {
  stem := gwizo.Stem("abilities")
  fmt.Printf("Stem: %s\n", stem)
}
$ go run main.go

Stem: able

Vowels, Consonants and Measure

gwizo.Parse returns a Token with two fields: VowCon, the word's vowel-consonant pattern, and Measure, the measure value m in [v]vc{m}[c].

package main

import (
  "fmt"
  "github.com/kampsy/gwizo"
  "strings"
)

func main() {
  word := "abilities"
  token := gwizo.Parse(word)

  // VowCon
  fmt.Printf("%s has Pattern %s \n", word, token.VowCon)

  // Measure value [v]vc{m}[c]
  fmt.Printf("%s has Measure value %d \n", word, token.Measure)

  // Number of Vowels
  v := strings.Count(token.VowCon, "v")
  fmt.Printf("%s Has %d Vowels \n", word, v)

  // Number of Consonants
  c := strings.Count(token.VowCon, "c")
  fmt.Printf("%s Has %d Consonants\n", word, c)
}
$ go run main.go

abilities has Pattern vcvcvcvvc
abilities has Measure value 4
abilities Has 5 Vowels
abilities Has 4 Consonants
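
For readers curious how such a pattern and measure can be derived, the following is a minimal sketch based on the definitions in Porter's paper, where y counts as a vowel only when it follows a consonant. The names isVowel, pattern and measure are hypothetical, the input is assumed to be lowercase ASCII, and gwizo's internal implementation may differ.

```go
package main

import "fmt"

// isVowel reports whether word[i] is a vowel under Porter's definition:
// a, e, i, o, u, or y when the preceding letter is a consonant.
func isVowel(word string, i int) bool {
	switch word[i] {
	case 'a', 'e', 'i', 'o', 'u':
		return true
	case 'y':
		return i > 0 && !isVowel(word, i-1)
	}
	return false
}

// pattern maps each letter to 'v' or 'c'.
func pattern(word string) string {
	b := make([]byte, len(word))
	for i := 0; i < len(word); i++ {
		if isVowel(word, i) {
			b[i] = 'v'
		} else {
			b[i] = 'c'
		}
	}
	return string(b)
}

// measure counts vowel-run to consonant-run transitions, that is, the
// number of VC groups once repeated letters of the same kind are collapsed.
func measure(p string) int {
	m := 0
	for i := 1; i < len(p); i++ {
		if p[i-1] == 'v' && p[i] == 'c' {
			m++
		}
	}
	return m
}

func main() {
	p := pattern("abilities")
	fmt.Printf("pattern=%s measure=%d\n", p, measure(p))
}
```

For "abilities" this sketch yields the same pattern and measure as the output shown above.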

File Stem Performance

  package main

  import (
    "bufio"
    "fmt"
    "github.com/kampsy/gwizo"
    "os"
    "time"
  )

  func main() {
    // Time the whole run, including file I/O.
    curr := time.Now()
    writeOut()
    elaps := time.Since(curr)
    fmt.Println("============================")
    fmt.Println("Done After:", elaps)
    fmt.Println("============================")
  }

  // writeOut stems every line of input.txt and writes the results,
  // one stem per line, to stem.txt.
  func writeOut() {
    in, err := os.Open("input.txt")
    if err != nil {
      fmt.Println(err)
      return
    }
    defer in.Close()

    out, err := os.Create("stem.txt")
    if err != nil {
      fmt.Println(err)
      return
    }
    defer out.Close()

    scanner := bufio.NewScanner(in)
    for scanner.Scan() {
      txt := scanner.Text()
      stem := gwizo.Stem(txt)
      if _, err := fmt.Fprintln(out, stem); err != nil {
        fmt.Println(err)
        return
      }
      fmt.Println(txt, "--->", stem)
    }
    if err := scanner.Err(); err != nil {
      fmt.Println(err)
    }
  }
$ go run main.go