shikeio / Elasticsearch Analysis Hanlp
Licence: apache-2.0
Stars: ✭ 39
Programming Languages
java
Projects that are alternatives of or similar to Elasticsearch Analysis Hanlp
Elastik Nearest Neighbors
Go to: https://github.com/alexklibisz/elastiknn
Stars: ✭ 249 (+538.46%)
Mutual labels: elasticsearch-plugin
vector-search-plugin
Elasticsearch plugin for fast nearest neighbours of vectors (Similar use as FAISS)
Stars: ✭ 102 (+161.54%)
Mutual labels: elasticsearch-plugin
Alerting
📟 Open Distro for Elasticsearch Alerting Plugin
Stars: ✭ 259 (+564.1%)
Mutual labels: elasticsearch-plugin
elasticsearch-langfield
This plugin provides a useful feature for multi-language
Stars: ✭ 13 (-66.67%)
Mutual labels: elasticsearch-plugin
rosette-elasticsearch-plugin
Document Enrichment plugin for Elasticsearch
Stars: ✭ 25 (-35.9%)
Mutual labels: elasticsearch-plugin
elasticsearch-dynamic-synonym
Elasticsearch Plugin for Dynamic Synonym Token Filter.
Stars: ✭ 38 (-2.56%)
Mutual labels: elasticsearch-plugin
Mirage
🎨 GUI for simplifying Elasticsearch Query DSL
Stars: ✭ 2,143 (+5394.87%)
Mutual labels: elasticsearch-plugin
Elasticsearch Readonlyrest Plugin
Free Elasticsearch security plugin and Kibana security plugin: super-easy Kibana multi-tenancy, Encryption, Authentication, Authorization, Auditing
Stars: ✭ 917 (+2251.28%)
Mutual labels: elasticsearch-plugin
elasticsearch-sudachi
The Japanese analysis plugin for elasticsearch
Stars: ✭ 129 (+230.77%)
Mutual labels: elasticsearch-plugin
elasticsearch-analysis-synonym
NGramSynonymTokenizer for Elasticsearch
Stars: ✭ 25 (-35.9%)
Mutual labels: elasticsearch-plugin
docker-curator
docker images for elasticsearch curator
Stars: ✭ 23 (-41.03%)
Mutual labels: elasticsearch-plugin
elasticsearch plugin
Nodeos plugin for archiving blockchain data into Elasticsearch.
Stars: ✭ 57 (+46.15%)
Mutual labels: elasticsearch-plugin
reactivesearch-api
API Gateway for Elasticsearch with declarative querying and out-of-the-box access controls
Stars: ✭ 146 (+274.36%)
Mutual labels: elasticsearch-plugin
Elasticsearch
Elasticsearch is a real-time distributed search and analytics engine.
Stars: ✭ 23 (-41.03%)
Mutual labels: elasticsearch-plugin
Elasticsearch Hq
Monitoring and Management Web Application for ElasticSearch instances and clusters.
Stars: ✭ 4,832 (+12289.74%)
Mutual labels: elasticsearch-plugin
Emoji Search
😄 Emoji synonyms to build your own emoji-capable search engine (elasticsearch, solr)
Stars: ✭ 184 (+371.79%)
Mutual labels: elasticsearch-plugin
elasticsearch-approximate-nearest-neighbor
Plugin to integrate approximate nearest neighbor(ANN) search with Elasticsearch
Stars: ✭ 53 (+35.9%)
Mutual labels: elasticsearch-plugin
Elasticsearch Analysis Dynamic Synonym
Elasticsearch plugin for hot-reloading synonyms; supports updates from local files and from remote files over HTTP, and fixes several bugs.
Stars: ✭ 30 (-23.08%)
Mutual labels: elasticsearch-plugin
Gem
💎 GUI for Data Modeling with Elasticsearch
Stars: ✭ 654 (+1576.92%)
Mutual labels: elasticsearch-plugin
elasticsearch-keyboard-layout
Elasticsearch plugin for keyboard layout suggestions
Stars: ✭ 21 (-46.15%)
Mutual labels: elasticsearch-plugin
Important
Thanks to these great projects:
Package
The com.hankcs.lucene package is copied from hanlp-lucene-plugin.
Issue
The custom dictionary cannot be used on JDK 9, so targetCompatibility has been changed to 1.8.
All published releases were built on JDK 9.
Build and Install
Install lib
gradle mvn
Import HanLP data
- Download the HanLP data; see HanLP Releases.
- In the config, change ${data.root} to your own HanLP data root directory.
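The placeholder substitution described above can be scripted. The following is a minimal sketch, not part of the plugin: the function names are hypothetical, and the config file path is left to the caller because the README does not name it.

```python
from pathlib import Path

# Literal placeholder named in the README.
PLACEHOLDER = "${data.root}"


def set_data_root(config_text: str, data_root: str) -> str:
    """Return config_text with every ${data.root} replaced by data_root."""
    if PLACEHOLDER not in config_text:
        raise ValueError("placeholder ${data.root} not found in config")
    return config_text.replace(PLACEHOLDER, data_root)


def rewrite_config(config_path: str, data_root: str) -> None:
    """Rewrite a config file in place (config_path is supplied by the caller)."""
    path = Path(config_path)
    original = path.read_text(encoding="utf-8")
    path.write_text(set_data_root(original, data_root), encoding="utf-8")
```

Raising an error when the placeholder is absent makes a second, accidental run visible instead of silently doing nothing.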
Modify Plugin Security Policy
Edit ${elasticsearchHome}/config/jvm.options and add this line at the end:
-Djava.security.policy=file://${elasticsearchHome}/plugins/analysis-hanlp/plugin-security.policy
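If the edit needs to be repeatable (for example in a provisioning script), the option can be appended only when it is missing. This is a sketch under the assumption that jvm.options holds one option per line; the helper names are hypothetical.

```python
def policy_option(es_home: str) -> str:
    """Build the option line from the README for a given Elasticsearch home."""
    return (
        f"-Djava.security.policy=file://{es_home}"
        "/plugins/analysis-hanlp/plugin-security.policy"
    )


def add_policy_option(jvm_options: str, es_home: str) -> str:
    """Append the policy option to the jvm.options text unless already present."""
    option = policy_option(es_home)
    lines = jvm_options.splitlines()
    if option in (line.strip() for line in lines):
        return jvm_options  # already configured, leave the text untouched
    lines.append(option)
    return "\n".join(lines) + "\n"
```

Calling add_policy_option twice with the same arguments returns the same text the second time, so the script is safe to re-run.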
Index and Highlight
Two analyzers are supported:
- HanLPAnalyzer: standard analyzer, alias hanlp
- HanLPIndexAnalyzer: index analyzer, alias hanlp-index
Test Analyzer
GET /_analyze
{
  "analyzer": "hanlp-index",
  "text": ["中华人民共和国", "地大物博"]
}
Response is:
{
  "tokens": [
    {
      "token": "中华人民共和国",
      "start_offset": 0,
      "end_offset": 7,
      "type": "ns",
      "position": 0
    },
    {
      "token": "中华人民",
      "start_offset": 0,
      "end_offset": 4,
      "type": "nz",
      "position": 1
    },
    {
      "token": "中华",
      "start_offset": 0,
      "end_offset": 2,
      "type": "nz",
      "position": 2
    },
    {
      "token": "华人",
      "start_offset": 1,
      "end_offset": 3,
      "type": "n",
      "position": 3
    },
    {
      "token": "人民共和国",
      "start_offset": 2,
      "end_offset": 7,
      "type": "nz",
      "position": 4
    },
    {
      "token": "人民",
      "start_offset": 2,
      "end_offset": 4,
      "type": "n",
      "position": 5
    },
    {
      "token": "共和国",
      "start_offset": 4,
      "end_offset": 7,
      "type": "n",
      "position": 6
    },
    {
      "token": "共和",
      "start_offset": 4,
      "end_offset": 6,
      "type": "n",
      "position": 7
    },
    {
      "token": "地大物博",
      "start_offset": 8,
      "end_offset": 12,
      "type": "nz",
      "position": 8
    },
    {
      "token": "地大",
      "start_offset": 8,
      "end_offset": 10,
      "type": "nz",
      "position": 9
    }
  ]
}
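The offsets in this response can be checked against the input. The sketch below assumes that the two array values are analyzed as one stream with an offset gap of 1 between them (the Lucene default), which is consistent with the second value starting at offset 8 rather than 7. The helper names are hypothetical, and Python string indices only agree with Elasticsearch offsets here because every character is in the Basic Multilingual Plane.

```python
def join_values(values, offset_gap=1):
    """Model an array field as one string with a gap between the values."""
    return (" " * offset_gap).join(values)


def find_offset_mismatches(values, tokens, offset_gap=1):
    """Return the tokens whose offsets do not slice out their own text."""
    text = join_values(values, offset_gap)
    return [
        (token, start, end)
        for token, start, end in tokens
        if text[start:end] != token
    ]


values = ["中华人民共和国", "地大物博"]

# (token, start_offset, end_offset) copied from the response above.
tokens = [
    ("中华人民共和国", 0, 7),
    ("中华人民", 0, 4),
    ("中华", 0, 2),
    ("华人", 1, 3),
    ("人民共和国", 2, 7),
    ("人民", 2, 4),
    ("共和国", 4, 7),
    ("共和", 4, 6),
    ("地大物博", 8, 12),
    ("地大", 8, 10),
]

mismatches = find_offset_mismatches(values, tokens)
```

An empty mismatches list means every token's offsets point back at the token's own characters, which is what offset-based highlighting relies on.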
Mapping
PUT test/_mapping/test
{
  "properties": {
    "content": {
      "type": "text",
      "analyzer": "hanlp-index",
      "search_analyzer": "hanlp-index",
      "index_options": "offsets"
    }
  }
}
Index Document
PUT /test/test/1
{
  "content": ["中华人民共和国", "地大物博"]
}
Highlight
POST /test/test/_search
{
  "query": {
    "match": {
      "content": "中华"
    }
  },
  "highlight": {
    "pre_tags": [
      "<tag1>"
    ],
    "post_tags": [
      "</tag1>"
    ],
    "fields": {
      "content": {}
    }
  }
}
Response is:
{
  "took": 384,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "test",
        "_type": "test",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "content": [
            "中华人民共和国",
            "地大物博"
          ]
        },
        "highlight": {
          "content": [
            "<tag1>中华</tag1>人民共和国"
          ]
        }
      }
    ]
  }
}
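To see how the highlighted fragment relates to the stored offsets, the following simplified sketch wraps matched character spans in the configured tags. It is an illustration only, not how Elasticsearch implements highlighting, and the function name is hypothetical. The span (0, 2) is the offset pair of the token 中华 from the analyzer output.

```python
def wrap_spans(text, spans, pre_tag, post_tag):
    """Wrap each (start, end) span of text in pre_tag/post_tag.

    Spans are processed in order; a span that overlaps one already
    emitted is skipped so the tags never nest.
    """
    pieces = []
    cursor = 0
    for start, end in sorted(spans):
        if start < cursor:
            continue  # overlaps the previous span
        pieces.append(text[cursor:start])
        pieces.append(pre_tag + text[start:end] + post_tag)
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


fragment = wrap_spans("中华人民共和国", [(0, 2)], "<tag1>", "</tag1>")
```

With the inputs above, fragment equals the highlight value in the response.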