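Step 1: Install Elasticsearch 6.0.1. Download Elasticsearch and the IK analyzer.
Both IK and Elasticsearch ship as prebuilt packages, so nothing needs compiling: extract them and edit the configuration files.
For details, see my separate installation post: /weixin_38822045/article/details/85612242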
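Step 2: Download the pinyin analyzer
The pinyin analyzer is not prebuilt, so it has to be packaged with Maven (for example, with IDEA's Maven package action).
Pinyin analyzer: /medcl/elasticsearch-analysis-pinyin
Chinese (IK) analyzer: /medcl/elasticsearch-analysis-ik
Choose the version that matches your Elasticsearch version and download it through the Zip link.
Extract the archive locally. It is a Maven source project, so import it and run mvn package.
When packaging finishes, the built zip is in the target/releases directory.
Copy the built zip, extract it, and upload the result to the plugins folder of Elasticsearch.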
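Elasticsearch refuses to load a plugin that was built for a different Elasticsearch version, so it can help to check the packaged plugin before copying it into plugins. A minimal Python sketch, assuming the extracted plugin contains a plugin-descriptor.properties file with an elasticsearch.version entry; the function names and the sample descriptor text are illustrative and not taken from this post:

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def built_for(descriptor_text, es_version):
    """Return True if the descriptor targets exactly es_version."""
    props = parse_properties(descriptor_text)
    return props.get("elasticsearch.version") == es_version


# Hypothetical descriptor content, for illustration only.
sample = """
# Elasticsearch plugin descriptor file
name=analysis-pinyin
version=6.0.1
elasticsearch.version=6.0.1
"""

print(built_for(sample, "6.0.1"))
print(built_for(sample, "6.1.0"))
```

In practice the text would be read from the plugin-descriptor.properties file inside the extracted plugin directory rather than from an inline string.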
As a non-root user, start Elasticsearch in the background: ./bin/elasticsearch -d
Testing with Postman
Step 1: Test the IK analyzer
The Chinese tokenization test succeeds.
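The request used for this test is not shown in the post. As a rough sketch of what such a check might look like, the snippet below builds (but does not send) a call to the _analyze API. The host is the one used elsewhere in this post; ik_max_word is one of the two analyzers the IK plugin registers (the other is ik_smart), and the sample text is an assumption:

```python
import json
import urllib.request

ES = "http://192.168.1.129:9200"  # host used elsewhere in this post


def analyze_request(text, analyzer="ik_max_word"):
    """Build a POST /_analyze request for the given analyzer and text."""
    body = json.dumps({"analyzer": analyzer, "text": text}, ensure_ascii=False)
    return urllib.request.Request(
        ES + "/_analyze",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = analyze_request("明天你好")  # sample text is an assumption
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually send it (needs the running cluster):
# print(urllib.request.urlopen(req).read().decode("utf-8"))
```

The same body can be pasted into Postman as a POST to /_analyze; the response lists the tokens the analyzer produced.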
Step 2: Test the pinyin analyzer
①: Create the index with its analysis settings
PUT 192.168.1.129:9200/search_text
{
  "settings": {
    "index": {
      "max_result_window": 10000000
    },
    "refresh_interval": "5s",
    "number_of_shards": 1,
    "number_of_replicas": 1,
    "analysis": {
      "filter": {
        "edge_ngram_filter": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 50
        },
        "pinyin_full_filter": {
          "type": "pinyin",
          "keep_first_letter": false,
          "keep_separate_first_letter": false,
          "keep_full_pinyin": true,
          "keep_original": false,
          "limit_first_letter_length": 50,
          "lowercase": true
        },
        "pinyin_simple_filter": {
          "type": "pinyin",
          "keep_first_letter": true,
          "keep_separate_first_letter": false,
          "keep_full_pinyin": false,
          "keep_original": false,
          "limit_first_letter_length": 50,
          "lowercase": true
        }
      },
      "analyzer": {
        "pinyiSimpleIndexAnalyzer": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": ["pinyin_simple_filter", "edge_ngram_filter", "lowercase"]
        },
        "pinyiFullIndexAnalyzer": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": ["pinyin_full_filter", "lowercase"]
        }
      }
    }
  }
}
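To make the two analyzer chains easier to picture, here is a small Python approximation of what they emit at index time. It is only a sketch: the PINYIN table is a hypothetical lookup covering just the sample names used in this post, and the real pinyin filter also deals with polyphonic characters, non-Chinese text, and token positions, none of which is modelled here.

```python
# Hypothetical syllable table covering only the sample names in this post.
PINYIN = {"明": "ming", "天": "tian", "你": "ni", "好": "hao",
          "命": "ming", "早": "zao", "上": "shang", "学": "xue"}


def edge_ngrams(token, min_gram=1, max_gram=50):
    """Prefixes of token, from min_gram up to max_gram characters."""
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


def simple_index_tokens(name):
    """Approximate pinyiSimpleIndexAnalyzer: first letters, then prefixes."""
    first_letters = "".join(PINYIN[ch][0] for ch in name)
    return [t.lower() for t in edge_ngrams(first_letters)]


def full_index_tokens(name):
    """Approximate pinyiFullIndexAnalyzer: one full syllable per character."""
    return [PINYIN[ch].lower() for ch in name]


print(simple_index_tokens("明天你好"))  # ['m', 'mt', 'mtn', 'mtnh']
print(full_index_tokens("明天你好"))    # ['ming', 'tian', 'ni', 'hao']
```

Because the keyword tokenizer passes the whole name through as a single token, the first-letter chain ends up with prefixes of the full initial string, which is what makes prefix-style lookups on the initials possible.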
②: Set up the type and define the mapping of the name field
The fields parameter:
{
  "properties": {
    "name": {
      "type": "keyword",
      "fields": {
        "fpy": {
          "type": "text",
          "index": true,
          "analyzer": "pinyiFullIndexAnalyzer"
        },
        "spy": {
          "type": "text",
          "index": true,
          "analyzer": "pinyiSimpleIndexAnalyzer"
        }
      }
    }
  }
}
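With fields, the same name value is indexed three ways: as an exact keyword, and as two analyzed sub-fields that queries address by dotted path. The Python sketch below walks the mapping above and lists those paths; queryable_paths is a hypothetical helper written for illustration, not part of any Elasticsearch client:

```python
mapping = {
    "properties": {
        "name": {
            "type": "keyword",
            "fields": {
                "fpy": {"type": "text", "index": True,
                        "analyzer": "pinyiFullIndexAnalyzer"},
                "spy": {"type": "text", "index": True,
                        "analyzer": "pinyiSimpleIndexAnalyzer"},
            },
        }
    }
}


def queryable_paths(mapping):
    """Map each field and sub-field path to its (type, analyzer)."""
    paths = {}
    for field, spec in mapping["properties"].items():
        paths[field] = (spec["type"], spec.get("analyzer"))
        for sub, sub_spec in spec.get("fields", {}).items():
            paths[field + "." + sub] = (sub_spec["type"], sub_spec.get("analyzer"))
    return paths


for path, (ftype, analyzer) in queryable_paths(mapping).items():
    print(path, ftype, analyzer)
```

The paths name.fpy and name.spy are the ones used in the search requests later in this post.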
Insert some documents to test the name field
Inserting the data
PUT 192.168.1.129:9200/search_text/list/1
{
"name":"明天你好"
}
PUT 192.168.1.129:9200/search_text/list/2
{
"name":"天命"
}
PUT 192.168.1.129:9200/search_text/list/3
{
"name":"你好明天"
}
PUT 192.168.1.129:9200/search_text/list/4
{
"name":"明天早上上学"
}
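The four PUT requests could also be sent in one call through the _bulk API. The Python sketch below only builds the newline-delimited payload; the bulk_body helper is hypothetical, and the payload would be POSTed to /_bulk with the Content-Type application/x-ndjson:

```python
import json

# The four sample documents from this post, keyed by document id.
names = {1: "明天你好", 2: "天命", 3: "你好明天", 4: "明天早上上学"}


def bulk_body(index, doc_type, docs):
    """Build the newline-delimited JSON payload expected by POST /_bulk."""
    lines = []
    for doc_id, name in docs.items():
        action = {"index": {"_index": index, "_type": doc_type,
                            "_id": str(doc_id)}}
        lines.append(json.dumps(action))
        lines.append(json.dumps({"name": name}, ensure_ascii=False))
    return "\n".join(lines) + "\n"  # the payload must end with a newline


payload = bulk_body("search_text", "list", names)
print(payload)
```

Each document takes two lines, an action line followed by the document source, so the four names produce eight lines.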
③: Test the pinyin search
Method 1: analyzed (match) query
POST http://192.168.1.129:9200/search_text/list/_search
{"query": {"match": {"name.fpy": {"query": "mingtian"}}}}
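The query returns results.
The test above shows that a match query for mingtian is analyzed into separate pinyin syllables, and a document matches on any one of them. To return only documents that contain mingtian as a contiguous phrase, change the query to: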
{"query": {"match_phrase": {"name.fpy": {"query": "mingtian"}}}}
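That is, simply replace match with match_phrase in the query above.
The difference between match and match_phrase
match: the query text is analyzed as well, and a document is returned as long as at least one term from the analyzed query matches a term in the analyzed field.
Example: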
{"query": {"match": {"content": "java spark"}}}
Every document whose content contains java or spark is returned.
{"query": {"match_phrase": {"content": "java spark"}}}
This query returns only documents whose content contains both java and spark, adjacent and in that order.
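The difference can be reproduced offline with a few lines of Python. This is a simplified model, not how Elasticsearch is implemented: it ignores scoring, treats match as a plain OR over terms and match_phrase as an exact adjacent-in-order check, and the token lists for the sample names are assumed rather than read from a cluster.

```python
def match(doc_tokens, query_tokens):
    """match (default OR): true if any query term occurs in the field."""
    return any(term in doc_tokens for term in query_tokens)


def match_phrase(doc_tokens, query_tokens):
    """match_phrase: query terms must occur adjacent and in order."""
    n = len(query_tokens)
    return any(doc_tokens[i:i + n] == query_tokens
               for i in range(len(doc_tokens) - n + 1))


# Assumed name.fpy tokens for the four sample documents in this post.
docs = {
    "明天你好": ["ming", "tian", "ni", "hao"],
    "天命": ["tian", "ming"],
    "你好明天": ["ni", "hao", "ming", "tian"],
    "明天早上上学": ["ming", "tian", "zao", "shang", "shang", "xue"],
}
query = ["ming", "tian"]  # assumed analysis of the query text "mingtian"

print([name for name, toks in docs.items() if match(toks, query)])
print([name for name, toks in docs.items() if match_phrase(toks, query)])
print(match_phrase(["spark", "java"], ["java", "spark"]))  # False: wrong order
```

Under these assumptions, match accepts all four names, while match_phrase drops 天命 because its syllables appear in the order tian, ming.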
Elasticsearch pinyin first-letter search
{"query": {"match": {"name.spy": {"query": "MTNH"}}}}
As shown above, searching by pinyin first letters works.