CurrySoftware / rust-stemmers

License: MIT
A Rust implementation of some popular Snowball stemming algorithms

Rust Stemmers

This crate implements several of the stemming algorithms from the Snowball project. The algorithms are compiled to Rust using the Rust backend of the Snowball compiler.

Supported Algorithms

  • Arabic
  • Armenian
  • Danish
  • Dutch
  • English
  • French
  • German
  • Greek
  • Hungarian
  • Italian
  • Norwegian
  • Portuguese
  • Romanian
  • Russian
  • Spanish
  • Swedish
  • Tamil
  • Turkish

Usage

extern crate rust_stemmers;
use rust_stemmers::{Algorithm, Stemmer};

fn main() {
    // Create a stemmer for the English language
    let en_stemmer = Stemmer::create(Algorithm::English);

    // Stem the word "fruitlessly"
    // Note: all algorithms expect their input to contain only lowercase characters.
    assert_eq!(en_stemmer.stem("fruitlessly"), "fruitless");
}
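
Because the algorithms expect lowercase input, raw text usually needs to be normalized before it is stemmed. The following is a minimal sketch of one way to do that using only the standard library. The helper name normalize_tokens is hypothetical and not part of rust-stemmers, and the tokenization (splitting on every non-alphabetic character) is a simplifying assumption that will, for example, break contractions apart.

```rust
/// Hypothetical helper (not part of rust-stemmers): lowercases the input and
/// splits it into alphabetic tokens so that each token is ready to be stemmed.
fn normalize_tokens(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphabetic()) // break on punctuation, digits, whitespace
        .filter(|token| !token.is_empty())   // drop empty pieces between separators
        .map(|token| token.to_lowercase())   // stemmers expect lowercase input
        .collect()
}
```

Each token returned by this helper could then be passed to the stemmer from the example above, e.g. en_stemmer.stem(&token).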

Related Projects

  • The stemmer crate provides bindings to the C Snowball implementation.