jolicode / Emoji Search

Licence: other

😄 Emoji synonyms to build your own emoji-capable search engine (elasticsearch, solr)

Emoji, flags and emoticons support for Elasticsearch

Add support for emoji and flags in any Lucene compatible search engine!

If you wish to search 🍩 to find donuts in your documents, you have come to the right place. This project offers synonym files ready to use in an Elasticsearch analyzer.

Requirements to index emoji in Elasticsearch

Version	Requirements
Elasticsearch >= 6.7	The standard tokenizer now understands emoji 🎉 thanks to Lucene 7.7.0, so no plugin is needed.
Elasticsearch >= 6.4 and < 6.7	You need to install the official ICU plugin. See our blog post about this change.
Elasticsearch < 6.4	You need our custom ICU Tokenizer plugin; see our blog post (2016).

Run the following test to verify that you get 4 EMOJI tokens:

GET _analyze
{
  "text": ["🍩 🇫🇷 👩‍🚒 🚣🏾‍♀"]
}

Synonyms, flags and emoticons

What you need to search with emoji is a way to expand them to words that can match searches and documents, in your language. That's the goal of the synonym dictionaries.

We build Solr / Lucene compatible synonym files in all languages supported by Unicode CLDR so you can set them up in an analyzer. They look like this:

👩‍🚒 => 👩‍🚒, firefighter, firetruck, woman
👩‍✈ => 👩‍✈, pilot, plane, woman
🥓 => 🥓, bacon, meat, food
🥔 => 🥔, potato, vegetable, food
😅 => 😅, cold, face, open, smile, sweat
😆 => 😆, face, laugh, mouth, open, satisfied, smile
🚎 => 🚎, bus, tram, trolley
🇫🇷 => 🇫🇷, france
🇬🇧 => 🇬🇧, united kingdom
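
To make the format concrete, here is a minimal Python sketch of how such "source => targets" lines can be read and applied. It is only a simplified model: the helper names parse_synonym_line and expand are hypothetical, and a real synonym filter also tracks token positions and tokenizes multi-word targets, which this sketch ignores.

```python
def parse_synonym_line(line):
    """Parse one explicit-mapping line ("source => target1, target2")."""
    source, _, targets = line.partition("=>")
    return source.strip(), [t.strip() for t in targets.split(",") if t.strip()]


def expand(tokens, synonyms):
    """Replace each token by its synonym list, leaving unknown tokens alone."""
    expanded = []
    for token in tokens:
        expanded.extend(synonyms.get(token, [token]))
    return expanded


# Two lines copied from the sample above.
lines = [
    "🥓 => 🥓, bacon, meat, food",
    "🇫🇷 => 🇫🇷, france",
]
synonyms = dict(parse_synonym_line(line) for line in lines)

print(expand(["i", "love", "🥓"], synonyms))
# → ['i', 'love', '🥓', 'bacon', 'meat', 'food']
```

Because the emoji itself is kept in the target list, a document containing 🥓 stays findable both by the emoji and by the words.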

For emoticons, use this mapping with a char_filter to replace emoticons with emoji.
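
As a rough illustration of what that replacement step does to the text before tokenization, the Python sketch below swaps emoticons for emoji. The three entries and the replace_emoticons helper are assumptions made for the example; the actual mapping file may contain different pairs.

```python
import re

# Hypothetical entries for illustration; the real emoticons.txt may differ.
EMOTICON_TO_EMOJI = {
    ":)": "🙂",
    ":-)": "🙂",
    "<3": "❤",
}

# Try longer emoticons first so a short key never shadows a longer one.
_pattern = re.compile(
    "|".join(re.escape(e) for e in sorted(EMOTICON_TO_EMOJI, key=len, reverse=True))
)


def replace_emoticons(text):
    """Swap each known emoticon for its emoji before tokenization."""
    return _pattern.sub(lambda m: EMOTICON_TO_EMOJI[m.group(0)], text)


print(replace_emoticons("nice :-) <3"))
```

Once emoticons are rewritten this way, the emoji synonym file can expand them like any other emoji.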

Installation

Download the emoji and emoticon files you want from this repository and store them in PATH_ES/config/analysis (or anywhere Elasticsearch can read).

config
├── analysis
│   ├── cldr-emoji-annotation-synonyms-en.txt
│   └── emoticons.txt
├── elasticsearch.yml
...

Use them like this (this is a complete English example for Elasticsearch >= 6.7):

PUT /tweets
{
  "settings": {
    "analysis": {
      "filter": {
        "english_emoji": {
          "type": "synonym",
          "synonyms_path": "analysis/cldr-emoji-annotation-synonyms-en.txt" 
        },
        "emoji_variation_selector_filter": {
          "type": "pattern_replace",
          "pattern": "\\uFE0E|\\uFE0F",
          "replacement": ""
        },
        "english_stop": {
          "type":       "stop",
          "stopwords":  "_english_"
        },
        "english_keywords": {
          "type":       "keyword_marker",
          "keywords":   ["example"]
        },
        "english_stemmer": {
          "type":       "stemmer",
          "language":   "english"
        },
        "english_possessive_stemmer": {
          "type":       "stemmer",
          "language":   "possessive_english"
        }
      },
      "analyzer": {
        "english_with_emoji": {
          "tokenizer": "standard",
          "filter": [
            "english_possessive_stemmer",
            "lowercase",
            "emoji_variation_selector_filter",
            "english_emoji",
            "english_stop",
            "english_keywords",
            "english_stemmer"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "content": {
        "type": "text",
        "analyzer": "english_with_emoji"
      }
    }
  }
}
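
The emoji_variation_selector_filter in the settings above strips the variation selectors U+FE0E and U+FE0F from tokens, presumably so that an emoji matches its synonym entry whether or not the selector is present. The effect can be sketched in Python with the same pattern; strip_variation_selectors is a hypothetical helper name, not part of Elasticsearch.

```python
import re

# Same pattern as the filter: U+FE0E (text style) and U+FE0F (emoji style).
VARIATION_SELECTORS = re.compile("\uFE0E|\uFE0F")


def strip_variation_selectors(token):
    """Remove variation selectors so both presentations share one form."""
    return VARIATION_SELECTORS.sub("", token)


# U+2764 (heavy black heart) followed by the emoji presentation selector.
heart_with_selector = "\u2764\uFE0F"

print(len(heart_with_selector), len(strip_variation_selectors(heart_with_selector)))
# → 2 1
```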

You can now test the result with:

GET tweets/_analyze
{
  "field": "content",
  "text": "🍩 🇫🇷 👩‍🚒 🚣🏾‍♀"
}

How to contribute

Build from CLDR SVN

You will need:

PHP CLI
PHP zip and curl extensions

Edit the tag in tools/build-released.php and run php tools/build-released.php.

Update emoticons

Run php tools/build-emoticon.php.

Licenses

Emoji data courtesy of CLDR. See unicode-license.txt for details. Some modifications are done on the data, see here. Emoticon data based on https://github.com/wooorm/emoticon/ (MIT).

This repository is distributed under the MIT License. Feel free to use and contribute as you please!
