
sing1ee / Elasticsearch Jieba Plugin

Licence: MIT

Jieba analysis plugin for Elasticsearch 7.0.0, 6.4.0, 6.0.0, 5.4.0, 5.3.0, 5.2.2, 5.2.1, 5.2, 5.1.2, 5.1.1

Programming language: Java

Labels: elasticsearch, dict, jieba


elasticsearch-jieba-plugin

jieba analysis plugin for elasticsearch: 7.7.0, 7.4.2, 7.3.0, 7.0.0, 6.4.0, 6.0.0, 5.4.0, 5.3.0, 5.2.2, 5.2.1, 5.2.0, 5.1.2, 5.1.1

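Features

Simple modifications are enough to adapt the plugin to different ES versions.

Dictionaries can be added dynamically, without restarting ES.

Notes on using jieba_index and jieba_search.

Support for new tokenizers.

If you are on ES 6.4.0, use the latest code from the 6.4.0 branch or from the master branch, or download the 6.4.1 release. Upgrading is strongly recommended!

The 6.4.1 release fixes the PositionIncrement problem. For details, see "ES分词PositionIncrement解析" (an analysis of PositionIncrement in ES tokenization).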

Version mapping

Branch	Tag	Elasticsearch version	Release link
7.7.0	tag v7.7.1	v7.7.0	Download: v7.7.0
7.4.2	tag v7.4.2	v7.4.2	Download: v7.4.2
7.3.0	tag v7.3.0	v7.3.0	Download: v7.3.0
7.0.0	tag v7.0.0	v7.0.0	Download: v7.0.0
6.4.0	tag v6.4.1	v6.4.0	Download: v6.4.1
6.4.0	tag v6.4.0	v6.4.0	Download: v6.4.0
6.0.0	tag v6.0.0	v6.0.0	Download: v6.0.1
5.4.0	tag v5.4.0	v5.4.0	Download: v5.4.0
5.3.0	tag v5.3.0	v5.3.0	Download: v5.3.0
5.2.2	tag v5.2.2	v5.2.2	Download: v5.2.2
5.2.1	tag v5.2.1	v5.2.1	Download: v5.2.1
5.2	tag v5.2.0	v5.2.0	Download: v5.2.0
5.1.2	tag v5.1.2	v5.1.2	Download: v5.1.2
5.1.1	tag v5.1.1	v5.1.1	Download: v5.1.1

More details

Choose the source code version that matches your Elasticsearch version, then clone and build it.
Run:

git clone https://github.com/sing1ee/elasticsearch-jieba-plugin.git --recursive
cd elasticsearch-jieba-plugin
./gradlew clean pz

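Copy the zip file to the plugins directory. Here ${path.home} stands for the Elasticsearch home directory, and the version in the file name should match the one you built:

cp build/distributions/elasticsearch-jieba-plugin-5.1.2.zip ${path.home}/plugins

Unzip it there and remove the zip file:

cd ${path.home}/plugins
unzip elasticsearch-jieba-plugin-5.1.2.zip
rm elasticsearch-jieba-plugin-5.1.2.zip

Start Elasticsearch from ${path.home}:

./bin/elasticsearch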

Custom User Dict

Put your dict file, with the suffix .dict, into ${path.home}/plugins/jieba/dic. The file should look like this, one word and its frequency per line:

小清新 3
百搭 3
显瘦 3
隨身碟 100
your_word word_freq
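
Before dropping a file into the dic directory, it can help to check each line against the layout shown above. The following Python sketch assumes every entry is a word, whitespace, and an integer frequency, as in the sample. The function name parse_dict_lines is hypothetical and not part of the plugin, and the plugin itself may accept other variants.

```python
def parse_dict_lines(lines):
    """Split 'word freq' lines into (word, freq) entries and a list of bad lines."""
    entries, errors = [], []
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # blank lines are ignored
        parts = line.split()
        # Assumption: exactly two fields, the second a non-negative integer.
        if len(parts) != 2 or not parts[1].isdecimal():
            errors.append((lineno, raw))
            continue
        entries.append((parts[0], int(parts[1])))
    return entries, errors


sample = ["小清新 3", "百搭 3", "显瘦 3", "隨身碟 100", "missing_frequency"]
entries, errors = parse_dict_lines(sample)
print(entries)
print(errors)
```

Lines reported in errors are the ones worth fixing by hand before the file is picked up.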

Using stopwords

Find stopwords.txt in ${path.home}/plugins/jieba/dic.
Create a folder named stopwords under ${path.home}/config:

mkdir -p ${path.home}/config/stopwords

Copy stopwords.txt into the folder you just created:

cp ${path.home}/plugins/jieba/dic/stopwords.txt ${path.home}/config/stopwords
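
The same two steps can be scripted. This is a minimal Python sketch, assuming the directory layout described above; install_stopwords is a hypothetical helper name, and es_home is whatever directory ${path.home} refers to on your machine.

```python
import shutil
from pathlib import Path


def install_stopwords(es_home):
    """Copy the plugin's stopwords.txt into <es_home>/config/stopwords."""
    es_home = Path(es_home)
    source = es_home / "plugins" / "jieba" / "dic" / "stopwords.txt"
    target_dir = es_home / "config" / "stopwords"
    target_dir.mkdir(parents=True, exist_ok=True)  # same effect as mkdir -p
    # shutil.copy returns the path of the copied file inside target_dir.
    return Path(shutil.copy(source, target_dir))


# Usage (the path is a placeholder, not a real default):
# install_stopwords("/path/to/elasticsearch")
```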

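Create the index. The jieba_synonym filter below reads synonyms/synonyms.txt, which Elasticsearch resolves relative to its config directory, so that file needs to exist as well: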

PUT http://localhost:9200/jieba_index

{
  "settings": {
    "analysis": {
      "filter": {
        "jieba_stop": {
          "type":        "stop",
          "stopwords_path": "stopwords/stopwords.txt"
        },
        "jieba_synonym": {
          "type":        "synonym",
          "synonyms_path": "synonyms/synonyms.txt"
        }
      },
      "analyzer": {
        "my_ana": {
          "tokenizer": "jieba_index",
          "filter": [
            "lowercase",
            "jieba_stop",
            "jieba_synonym"
          ]
        }
      }
    }
  }
}
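
The request above can also be sent from a script. The sketch below builds the same settings body in Python and shows one way to send it with the standard library. The helper names build_index_settings and create_index are hypothetical, and the call that talks to the server is left commented out because it needs a running node with the plugin installed.

```python
import json
import urllib.request


def build_index_settings(stop_path="stopwords/stopwords.txt",
                         synonym_path="synonyms/synonyms.txt"):
    """Return the index settings from the example above as a Python dict."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "jieba_stop": {"type": "stop", "stopwords_path": stop_path},
                    "jieba_synonym": {"type": "synonym", "synonyms_path": synonym_path},
                },
                "analyzer": {
                    "my_ana": {
                        "tokenizer": "jieba_index",
                        "filter": ["lowercase", "jieba_stop", "jieba_synonym"],
                    }
                },
            }
        }
    }


def create_index(base_url, index_name, settings):
    """Send PUT <base_url>/<index_name> with the settings as a JSON body."""
    request = urllib.request.Request(
        f"{base_url}/{index_name}",
        data=json.dumps(settings).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


settings = build_index_settings()
print(json.dumps(settings, indent=2))
# create_index("http://localhost:9200", "jieba_index", settings)  # needs a running node
```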

Test the analyzer:

POST http://localhost:9200/jieba_index/_analyze
{
  "analyzer" : "my_ana",
  "text" : "黄河之水天上来"
}

The response looks like this:

{
    "tokens": [
        {
            "token": "黄河",
            "start_offset": 0,
            "end_offset": 2,
            "type": "word",
            "position": 0
        },
        {
            "token": "黄河之水天上来",
            "start_offset": 0,
            "end_offset": 7,
            "type": "word",
            "position": 0
        },
        {
            "token": "之水",
            "start_offset": 2,
            "end_offset": 4,
            "type": "word",
            "position": 1
        },
        {
            "token": "天上",
            "start_offset": 4,
            "end_offset": 6,
            "type": "word",
            "position": 2
        },
        {
            "token": "上来",
            "start_offset": 5,
            "end_offset": 7,
            "type": "word",
            "position": 2
        }
    ]
}
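
Several tokens in this response share a position, which suggests that jieba_index emits overlapping segmentations of the same span. The Python sketch below groups the tokens by position and checks that each token's offsets slice the right substring out of the input. The function names are hypothetical, and the token list is copied from the response above (the type field is omitted).

```python
from collections import defaultdict

TEXT = "黄河之水天上来"
TOKENS = [
    {"token": "黄河", "start_offset": 0, "end_offset": 2, "position": 0},
    {"token": "黄河之水天上来", "start_offset": 0, "end_offset": 7, "position": 0},
    {"token": "之水", "start_offset": 2, "end_offset": 4, "position": 1},
    {"token": "天上", "start_offset": 4, "end_offset": 6, "position": 2},
    {"token": "上来", "start_offset": 5, "end_offset": 7, "position": 2},
]


def group_by_position(tokens):
    """Map each position to the list of tokens emitted at that position."""
    groups = defaultdict(list)
    for tok in tokens:
        groups[tok["position"]].append(tok["token"])
    return dict(sorted(groups.items()))


def offsets_match(text, tokens):
    """Check that text[start:end] equals the token text for every token.

    Python indexes by code point, which matches the offsets here because
    every character in the sample is in the Basic Multilingual Plane.
    """
    return all(text[t["start_offset"]:t["end_offset"]] == t["token"] for t in tokens)


for position, words in group_by_position(TOKENS).items():
    print(position, words)
print(offsets_match(TEXT, TOKENS))
```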

NOTE

This plugin was migrated from jieba-solr.

Roadmap

I will add support for more analyzers:

Stanford Chinese analyzer
Fudan NLP analyzer
...

If you have ideas, please create an issue and we can work on them together.
