medcl / Elasticsearch Analysis Stconvert
Licence: apache-2.0
STConvert is an analyzer that converts Chinese characters between Traditional and Simplified, in both directions.
STConvert Analysis for Elasticsearch
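STConvert is an analyzer that converts Chinese characters between Traditional and Simplified. [Simplified/Traditional Chinese conversion] [Simplified to Traditional] [Traditional to Simplified] [Simplified/Traditional query expansion]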
You can download the pre-built package from the release page.
The plugin includes analyzer: stconvert, tokenizer: stconvert, token-filter: stconvert, and char-filter: stconvert.
Supported config:
- convert_type: default s2t. Optional values:
  - s2t: convert characters from Simplified Chinese to Traditional Chinese
  - t2s: convert characters from Traditional Chinese to Simplified Chinese
- keep_both: default false
- delimiter: default ,
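The two convert_type directions can be illustrated with a minimal Python sketch. It is not the plugin's implementation: the two-entry lookup table and the convert function are hypothetical stand-ins for the plugin's full conversion dictionary, and keep_both and delimiter are left out because their output format is not described here.

```python
# Toy lookup table built from the characters used in this README's example text.
# ASSUMPTION: the real plugin uses a full dictionary; this table is illustrative only.
T2S = {"國": "国", "際": "际"}
S2T = {simplified: traditional for traditional, simplified in T2S.items()}


def convert(text: str, convert_type: str = "s2t") -> str:
    """Map each character through the table selected by convert_type."""
    if convert_type == "s2t":
        table = S2T
    elif convert_type == "t2s":
        table = T2S
    else:
        raise ValueError(f"unsupported convert_type: {convert_type!r}")
    # Characters that are not in the table pass through unchanged.
    return "".join(table.get(ch, ch) for ch in text)


print(convert("国际國際", "t2s"))  # 国际国际
print(convert("国际國際", "s2t"))  # 國際國際
```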
Custom example:
PUT /stconvert/
{
  "settings" : {
    "analysis" : {
      "analyzer" : {
        "tsconvert" : {
          "tokenizer" : "tsconvert"
        }
      },
      "tokenizer" : {
        "tsconvert" : {
          "type" : "stconvert",
          "delimiter" : "#",
          "keep_both" : false,
          "convert_type" : "t2s"
        }
      },
      "filter": {
        "tsconvert" : {
          "type" : "stconvert",
          "delimiter" : "#",
          "keep_both" : false,
          "convert_type" : "t2s"
        }
      },
      "char_filter" : {
        "tsconvert" : {
          "type" : "stconvert",
          "convert_type" : "t2s"
        }
      }
    }
  }
}
Analyze tests
GET stconvert/_analyze
{
  "tokenizer" : "keyword",
  "filter" : ["lowercase"],
  "char_filter" : ["tsconvert"],
  "text" : "国际國際"
}
Output:
{
  "tokens": [
    {
      "token": "国际国际",
      "start_offset": 0,
      "end_offset": 4,
      "type": "word",
      "position": 0
    }
  ]
}
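The same analyze test can be issued from a script. The sketch below uses only the Python standard library. The function names and the base URL are assumptions for illustration, not part of the plugin, and the request uses POST, which the _analyze endpoint accepts alongside GET.

```python
import json
import urllib.request


def build_analyze_request(base_url: str, index: str, text: str) -> urllib.request.Request:
    """Build the _analyze request from the test above (hypothetical helper)."""
    body = {
        "tokenizer": "keyword",
        "filter": ["lowercase"],
        "char_filter": ["tsconvert"],  # char filter defined in the custom example
        "text": text,
    }
    return urllib.request.Request(
        url=f"{base_url}/{index}/_analyze",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_analyze(request: urllib.request.Request) -> list:
    """Send the request and return the token strings from the response."""
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return [token["token"] for token in payload["tokens"]]
```

Assuming a node is reachable at http://localhost:9200 with the plugin installed and the stconvert index created as above, run_analyze(build_analyze_request("http://localhost:9200", "stconvert", "国际國際")) should return the token shown in the output above.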
Normalizer usage
DELETE index

PUT index
{
  "settings": {
    "analysis": {
      "char_filter": {
        "tsconvert": {
          "type": "stconvert",
          "convert_type": "t2s"
        }
      },
      "normalizer": {
        "my_normalizer": {
          "type": "custom",
          "char_filter": [
            "tsconvert"
          ],
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "foo": {
        "type": "keyword",
        "normalizer": "my_normalizer"
      }
    }
  }
}

PUT index/_doc/1
{
  "foo": "國際"
}

PUT index/_doc/2
{
  "foo": "国际"
}

GET index/_search
{
  "query": {
    "term": {
      "foo": "国际"
    }
  }
}

GET index/_search
{
  "query": {
    "term": {
      "foo": "國際"
    }
  }
}
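Both searches are expected to match both documents, because Elasticsearch applies a keyword field's normalizer to the stored value at index time and to the term at query time. The Python sketch below mimics that behaviour in memory. It is an illustration only: the two-entry table is a hypothetical stand-in for the plugin's dictionary, and normalize and term_search are made-up helper names.

```python
from typing import List

# ASSUMPTION: toy t2s table standing in for the plugin's full dictionary.
T2S = {"國": "国", "際": "际"}


def normalize(value: str) -> str:
    """Mimic my_normalizer: the t2s char filter runs first, then lowercase."""
    converted = "".join(T2S.get(ch, ch) for ch in value)
    return converted.lower()


# Index time: each keyword value is stored in its normalized form.
docs = {"1": "國際", "2": "国际"}
index = {doc_id: normalize(value) for doc_id, value in docs.items()}


def term_search(term: str) -> List[str]:
    """Term query on the normalized field: the query term is normalized too."""
    wanted = normalize(term)
    return sorted(doc_id for doc_id, stored in index.items() if stored == wanted)


print(term_search("国际"))  # ['1', '2']
print(term_search("國際"))  # ['1', '2']
```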