Installing and using the elasticsearch-analysis-ansj tokenizer
2022-07-04 22:13:23
1. Modify the pom.xml: set the Elasticsearch version and add the ansj_seg dependency
<elasticsearch.version>1.7.1</elasticsearch.version>
<dependency>
    <groupId>org.ansj</groupId>
    <artifactId>ansj_seg</artifactId>
    <classifier>min</classifier>
    <version>2.0.8</version>
    <scope>compile</scope>
</dependency>
2. Build the plugin
mvn assembly:assembly
3. Install the plugin
elasticsearch-1.7.1\bin>plugin -u file:///C:\Users\Administrator\Desktop\elasticsearch-analysis-ansj\target\releases\elasticsearch-analysis-ansj-1.x.1-release.zip -i ansj
4. Configure the ansj analyzers
index:
  analysis:
    analyzer:
      index_ansj:
        type: ansj_index
      query_ansj:
        type: ansj_query
      ik:
        alias: [news_analyzer_ik, ik_analyzer]
        type: org.elasticsearch.index.analysis.IkAnalyzerProvider
      mmseg:
        alias: [news_analyzer, mmseg_analyzer]
        type: org.elasticsearch.index.analysis.MMsegAnalyzerProvider
index.analysis.analyzer.default.type: "ansj_index"
For the full set of options, see elasticsearch.yml.example.
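The configuration above mixes a nested block (index: analysis: analyzer: ...) with a dotted key (index.analysis.analyzer.default.type). Elasticsearch treats both spellings as the same kind of setting path. The sketch below only illustrates that mapping for the two ansj entries; flatten is a hypothetical helper written for this example, not Elasticsearch's own settings loader, and it ignores list values such as alias.

```python
def flatten(settings, prefix=""):
    """Turn a nested settings dict into dotted setting paths (illustration only)."""
    flat = {}
    for key, value in settings.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into nested sections, extending the dotted path.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


# The two ansj analyzers from the configuration above, as a nested dict.
nested = {
    "index": {
        "analysis": {
            "analyzer": {
                "index_ansj": {"type": "ansj_index"},
                "query_ansj": {"type": "ansj_query"},
            }
        }
    }
}

flat = flatten(nested)
for path, value in sorted(flat.items()):
    print(f"{path}: {value}")
```

Read this way, index.analysis.analyzer.default.type is simply one more entry under the same analyzer section, naming ansj_index as the default analyzer type.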
5. Testing and usage
- Index-time analysis
http://127.0.0.1:9200/articles/_analyze?analyzer=ansj_index&text=我们是中国人
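Note: articles is the index name. Apart from articles, the rest of the request URL is fixed. analyzer=ansj_index selects the index-time analyzer, and the value after text= is the content to be indexed.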
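A browser percent-encodes the Chinese text in this URL automatically, but a script has to do it explicitly. The following is a minimal sketch of building the same request URL with Python's standard library. build_analyze_url is a hypothetical helper named for this example, and the host and port defaults are simply the ones used in the URL above.

```python
from urllib.parse import urlencode


def build_analyze_url(index, analyzer, text, host="127.0.0.1", port=9200):
    """Build an _analyze URL with the query string percent-encoded as UTF-8."""
    query = urlencode({"analyzer": analyzer, "text": text})
    return f"http://{host}:{port}/{index}/_analyze?{query}"


# Same request as above: index "articles", analyzer "ansj_index".
url = build_analyze_url("articles", "ansj_index", "我们是中国人")
print(url)
```

Passing "ansj_query" as the analyzer argument yields the corresponding query-time request.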
Output:
{
  "tokens": [
    {
      "token": "我们",
      "start_offset": 0,
      "end_offset": 2,
      "type": "word",
      "position": 1
    },
    {
      "token": "是",
      "start_offset": 2,
      "end_offset": 3,
      "type": "word",
      "position": 2
    },
    {
      "token": "中国",
      "start_offset": 3,
      "end_offset": 5,
      "type": "word",
      "position": 3
    },
    {
      "token": "人",
      "start_offset": 5,
      "end_offset": 6,
      "type": "word",
      "position": 4
    }
  ]
}
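To make the response fields concrete, here is a small sketch that parses the output above and checks that each token's start_offset and end_offset slice the token back out of the original text. The response body is copied from the output above in compact form. extract_tokens and offsets_match are hypothetical helper names, and the offset check assumes text whose characters each occupy one offset unit, which holds for this sentence.

```python
import json

# Response body copied from the output above, written compactly.
response_body = """
{"tokens": [
  {"token": "我们", "start_offset": 0, "end_offset": 2, "type": "word", "position": 1},
  {"token": "是", "start_offset": 2, "end_offset": 3, "type": "word", "position": 2},
  {"token": "中国", "start_offset": 3, "end_offset": 5, "type": "word", "position": 3},
  {"token": "人", "start_offset": 5, "end_offset": 6, "type": "word", "position": 4}
]}
"""


def extract_tokens(body):
    """Return the token strings in the order the analyzer emitted them."""
    return [entry["token"] for entry in json.loads(body)["tokens"]]


def offsets_match(text, body):
    """Check that every [start_offset, end_offset) slice of text equals its token."""
    return all(
        text[entry["start_offset"]:entry["end_offset"]] == entry["token"]
        for entry in json.loads(body)["tokens"]
    )


tokens = extract_tokens(response_body)
print(tokens)  # ['我们', '是', '中国', '人']
print(offsets_match("我们是中国人", response_body))  # True
```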
- Query-time analysis
http://127.0.0.1:9200/articles/_analyze?analyzer=ansj_query&text=我们是中国人
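Note: articles is the index name. Apart from articles, the rest of the request URL is fixed. analyzer=ansj_query selects the query-time analyzer, and the value after text= is the content to be queried.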
Output:
{
  "tokens": [
    {
      "token": "我们",
      "start_offset": 0,
      "end_offset": 2,
      "type": "word",
      "position": 1
    },
    {
      "token": "是",
      "start_offset": 2,
      "end_offset": 3,
      "type": "word",
      "position": 2
    },
    {
      "token": "中国",
      "start_offset": 3,
      "end_offset": 5,
      "type": "word",
      "position": 3
    },
    {
      "token": "人",
      "start_offset": 5,
      "end_offset": 6,
      "type": "word",
      "position": 4
    }
  ]
}
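For this sentence the two analyzers return the same tokens. Checking whether that holds for other text means issuing both requests and comparing the results, which might be sketched as follows. analyze and same_segmentation are hypothetical helpers, the base URL is the one used in the requests above, and the opener parameter exists only so the HTTP call can be swapped out when no Elasticsearch node is running.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def analyze(index, analyzer, text, base="http://127.0.0.1:9200", opener=urlopen):
    """Send a GET to the _analyze endpoint and return the token strings."""
    query = urlencode({"analyzer": analyzer, "text": text})
    url = f"{base}/{index}/_analyze?{query}"
    with opener(url) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return [entry["token"] for entry in payload["tokens"]]


def same_segmentation(index, text, opener=urlopen):
    """True if ansj_index and ansj_query split the text into the same tokens."""
    at_index_time = analyze(index, "ansj_index", text, opener=opener)
    at_query_time = analyze(index, "ansj_query", text, opener=opener)
    return at_index_time == at_query_time


# Usage, with an Elasticsearch node and the plugin running locally:
#   same_segmentation("articles", "我们是中国人")
```

When the two analyzers segment a piece of text differently, terms produced at query time may not match the terms stored at index time, so a check like this can be a quick sanity test on sample text.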