A library and microservice implementing the health and care terminology SNOMED CT with support for cross-maps, inference, fast full-text search, autocompletion, compositional grammar and the expression constraint language.

Stars: ✭ 131 (+385.19%)

Mutual labels: lucene

LogiEM

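A zero-intrusion, multi-tenant Elasticsearch GUI management platform built around clusters and indices, aimed at Elasticsearch developers and operations engineers.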

Stars: ✭ 209 (+674.07%)

Mutual labels: lucene

Code4java

Repository for my java projects.

Stars: ✭ 164 (+507.41%)

Mutual labels: lucene

jease

Jease is a Java CMS framework based on Object Database

Stars: ✭ 25 (-7.41%)

Mutual labels: lucene

cloud-note

Wudao Cloud Note, a clone of Youdao Cloud Note built with plain JSP.

Stars: ✭ 66 (+144.44%)

Mutual labels: lucene

solr

Apache Solr open-source search software

Stars: ✭ 651 (+2311.11%)

Mutual labels: lucene

Examine

A .NET indexing and search engine powered by Lucene.Net

Stars: ✭ 208 (+670.37%)

Mutual labels: lucene

Clavin

CLAVIN (Cartographic Location And Vicinity INdexer) is an open source software package for document geoparsing and georesolution that employs context-based geographic entity resolution.

Stars: ✭ 237 (+777.78%)

Mutual labels: lucene

lqt

Lucene Query Tool

Stars: ✭ 19 (-29.63%)

Mutual labels: lucene

Smartstorenet

Open Source ASP.NET MVC Enterprise eCommerce Shopping Cart Solution

Stars: ✭ 2,363 (+8651.85%)

Mutual labels: lucene

luceneappengine

This project provides a directory useful to build Lucene and Google App Engine powered applications

Stars: ✭ 16 (-40.74%)

Mutual labels: lucene

Eclipse Instasearch

Eclipse plug-in for fast code search

Stars: ✭ 165 (+511.11%)

Mutual labels: lucene

RedisDirectory

🔒 A simple Redis-based index storage engine for Lucene. Star me if you like it!

Stars: ✭ 18 (-33.33%)

Mutual labels: lucene

lucene-postings-format

At-a-glance overview diagrams of Apache Lucene's default PostingsFormat (inverted index binary format).

Stars: ✭ 65 (+140.74%)

Mutual labels: lucene

IndexWikipedia

A simple utility to index wikipedia dumps using Lucene.

Stars: ✭ 20 (-25.93%)

Mutual labels: lucene

Valley-eCommerce-prototype

An eCommerce website prototype with a layered architecture and MVC using Spring Boot v1.2, Spring Security, Hibernate, and Apache Lucene for full-text search. The front end uses Bootstrap, Typeahead.js and Graph.js, with Thymeleaf for rendering.

Stars: ✭ 28 (+3.7%)

Mutual labels: lucene


lucene-arabic-analyzer

Apache Lucene analyzer for Arabic language with root based stemmer.


Introduction

Stemming algorithms are used in information retrieval systems, text classifiers, indexers and text mining to extract roots of different words, so that words derived from the same stem or root are grouped together.

Version 2.x is based on the AlKhalil Morpho System.
Version 1.x is based on the Khoja stemmer.

ArabicRootExtractorAnalyzer does the following:

Normalize input text by removing diacritics: e.g. "الْعَالَمِينَ" will be converted to "العالمين".
Extract word's root: e.g. "العالمين" will be converted to "علم".

This way, documents are indexed by the roots of their words, so searching the index for "علم" or "عالم" returns all documents containing "الْعَالَمِينَ".
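The two steps above can be illustrated with a small, standard-library-only sketch. It is not the library's implementation: the class `RootIndexSketch`, its methods, and the hand-written `ROOTS` table are hypothetical stand-ins for the real stemmer, and the word/root pairs are limited to the examples given in this README. The diacritic range used (U+064B to U+0652, plus U+0670) is an assumption about which marks to strip.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy illustration of root-based indexing; NOT the library's implementation. */
final class RootIndexSketch {

    // Hypothetical hand-written root table standing in for a real stemmer.
    // The word/root pairs come from the examples in this README.
    private static final Map<String, String> ROOTS = new HashMap<>();
    static {
        ROOTS.put("العالمين", "علم");
        ROOTS.put("عالم", "علم");
        ROOTS.put("علم", "علم");
    }

    // root -> ids of the documents that contain a word with that root
    private final Map<String, List<Integer>> postings = new HashMap<>();

    /** Removes Arabic diacritics (U+064B..U+0652) and the superscript alef (U+0670). */
    static String stripDiacritics(String text) {
        return text.replaceAll("[\\u064B-\\u0652\\u0670]", "");
    }

    /** Normalizes a word, then maps it to its root; unknown words map to themselves. */
    static String toRoot(String word) {
        String normalized = stripDiacritics(word);
        return ROOTS.getOrDefault(normalized, normalized);
    }

    /** Indexes every whitespace-separated word of the text under its root. */
    void add(int docId, String text) {
        for (String word : text.trim().split("\\s+")) {
            List<Integer> ids = postings.computeIfAbsent(toRoot(word), k -> new ArrayList<>());
            if (!ids.contains(docId)) {
                ids.add(docId);
            }
        }
    }

    /** Returns the ids of documents that share a root with the query word. */
    List<Integer> search(String queryWord) {
        return postings.getOrDefault(toRoot(queryWord), new ArrayList<>());
    }

    public static void main(String[] args) {
        RootIndexSketch index = new RootIndexSketch();
        index.add(1, "الْحَمْدُ لِلَّهِ رَبِّ الْعَالَمِينَ");
        System.out.println(stripDiacritics("الْعَالَمِينَ"));
        System.out.println(index.search("عالم"));
        System.out.println(index.search("علم"));
    }
}
```

Because both the indexed word and the query word are reduced to the same root before lookup, the queries "عالم" and "علم" land on the same postings list. The real analyzer derives roots morphologically rather than from a fixed table.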

Installation

Maven

<dependency>
  <groupId>com.github.msarhan</groupId>
  <artifactId>lucene-arabic-analyzer</artifactId>
  <version>[VERSION]</version>
</dependency>

Usage

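import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
// Also import ArabicRootExtractorAnalyzer from the lucene-arabic-analyzer artifact.

//Initialize the index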
Directory index = new RAMDirectory();
Analyzer analyzer = new ArabicRootExtractorAnalyzer();
IndexWriterConfig config = new IndexWriterConfig(analyzer);
IndexWriter writer = new IndexWriter(index, config);

Document doc = new Document();
doc.add(new StringField("number", "1", Field.Store.YES));
doc.add(new TextField("title", "بِسْمِ اللَّهِ الرَّحْمَنِ الرَّحِيمِ", Field.Store.YES));
writer.addDocument(doc);

doc = new Document();
doc.add(new StringField("number", "2", Field.Store.YES));
doc.add(new TextField("title", "الْحَمْدُ لِلَّهِ رَبِّ الْعَالَمِينَ", Field.Store.YES));
writer.addDocument(doc);

doc = new Document();
doc.add(new StringField("number", "3", Field.Store.YES));
doc.add(new TextField("title", "الرَّحْمَنِ الرَّحِيمِ", Field.Store.YES));
writer.addDocument(doc);
writer.close();
//~

//Query the index
String queryStr = "راحم";
Query query = new QueryParser("title", analyzer)
    .parse(queryStr);

int hitsPerPage = 5;
IndexReader reader = DirectoryReader.open(index);
IndexSearcher searcher = new IndexSearcher(reader);
TopDocs docs = searcher.search(query, hitsPerPage, Sort.INDEXORDER);

ScoreDoc[] hits = docs.scoreDocs;
//~

//Print results
System.out.println("Found " + hits.length + " hits:");
for (ScoreDoc hit : hits) {
    int docId = hit.doc;
    Document d = searcher.doc(docId);
    System.out.printf("\t(%s): %s\n", d.get("number"), d.get("title"));
}
//~

Usage of `ArabicRootExtractorStemmer`

ArabicRootExtractorStemmer stemmer = new ArabicRootExtractorStemmer();

assertTrue(stemmer.stem("الرَّحْمَنِ").stream().anyMatch(s -> s.equals("رحم")));
assertTrue(stemmer.stem("الْعَالَمِينَ").stream().anyMatch(s -> s.equals("علم")));
assertTrue(stemmer.stem("الْمُؤْمِنِينَ").stream().anyMatch(s -> s.equals("ءمن")));
assertTrue(stemmer.stem("يَتَنَازَعُونَ").stream().anyMatch(s -> s.equals("نزع")));

Integration with Elasticsearch

To use this analyzer with Elasticsearch, use the elasticsearch-arabic-analyzer plugin.

Building

# Install the AlKhalil jar files in your local Maven repository
cd alkhalil && ./maven-install.sh

# The resulting jar file will include the AlKhalil dependencies
mvn package
