ElasticSearch in Practice
程序员文章站
2022-07-05 14:38:16
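Introduction to ElasticSearch (version 6.0 and above)
ElasticSearch is a search server built on Lucene. It provides a distributed, multi-tenant full-text search engine that delivers near-real-time, stable, reliable, and fast search. It is well suited to collecting and analyzing large volumes of data.
Checking the ElasticSearch version
http://<ES node IP>:9200/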
{
"name": "xx",
"cluster_name": "xxxx",
"cluster_uuid": "xxxxxxxxxxxxxxxxxx",
"version": {
"number": "6.2.4",
"build_hash": "ccec39f",
"build_date": "2018-04-12T20:37:28.497551Z",
"build_snapshot": false,
"lucene_version": "7.2.1",
"minimum_wire_compatibility_version": "5.6.0",
"minimum_index_compatibility_version": "5.0.0"
},
"tagline": "You Know, for Search"
}
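When a script has to branch on the server version, the version.number field of the response above can be extracted with the standard library alone. The following is only a sketch: the class EsVersion and its methods are hypothetical names, and it uses a regular expression instead of a JSON parser, which is adequate only for this simple response shape.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EsVersion {
    // Matches the "number" entry of the root response, e.g. "number": "6.2.4"
    private static final Pattern NUMBER =
            Pattern.compile("\"number\"\\s*:\\s*\"(\\d+)\\.(\\d+)\\.(\\d+)\"");

    /** Returns the version string such as "6.2.4", or null if none is found. */
    static String parseNumber(String rootResponseJson) {
        Matcher m = NUMBER.matcher(rootResponseJson);
        return m.find() ? m.group(1) + "." + m.group(2) + "." + m.group(3) : null;
    }

    /** Returns the major version (6 for "6.2.4"), or -1 if none is found. */
    static int majorVersion(String rootResponseJson) {
        Matcher m = NUMBER.matcher(rootResponseJson);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    public static void main(String[] args) {
        String sample = "{ \"version\": { \"number\": \"6.2.4\", \"lucene_version\": \"7.2.1\" } }";
        System.out.println(parseNumber(sample));        // prints 6.2.4
        System.out.println(majorVersion(sample) >= 6);  // prints true
    }
}
```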
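ElasticSearch core concepts
Near real-time:
There is a short delay (about 1 second by default) between writing a document and that document becoming searchable. Searches and analytics on ES typically return within seconds.
Cluster:
A cluster consists of one or more nodes. Which cluster a node belongs to is determined by one setting, the cluster name (default: elasticsearch).
Node:
A node is assigned a random name by default and, by default, joins the cluster named "elasticsearch".
Index:
An index holds a collection of documents with a similar structure. It is roughly equivalent to a database (DB).
Type:
A type groups the documents of an index that share the same fields. It is roughly equivalent to a table in a database. Note that an index created in ES 6.x may contain only one type; multiple types per index were possible only before 6.0.
Document:
A document is the smallest unit of data that ES indexes and returns; for example, one document can hold one customer record. It is roughly equivalent to a row in a database table.
Field (column):
A field is the smallest unit inside a document. A document has multiple fields, and each field holds one data value. It is roughly equivalent to a table column.
mapping:
A mapping defines how data is stored in an index: the data type of each field, whether it is stored, and whether it is analyzed (tokenized). It is roughly equivalent to the column definitions and constraints of a database table.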
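e.g.: example (written for ES 6.x; the ik_max_word analyzer assumes the IK analysis plugin is installed)
client.indices.putMapping({
index : 'blog', // database name
type : 'article', // table name
body : {
article: {
properties: {
id: { // table column
type: 'keyword', // mapping (column constraint): data type; keyword values are not analyzed
store: true, // mapping (column constraint): whether the value is stored
},
title: { // table column
type: 'text', // mapping (column constraint): data type; 'string' was replaced by 'text' and 'keyword'
analyzer: 'ik_max_word', // mapping (column constraint): whether analyzed; ik_max_word: IK tokenizer
store: false, // mapping (column constraint): whether the value is stored
},
content: { // table column
type: 'text', // mapping (column constraint): data type
analyzer: 'ik_max_word', // mapping (column constraint): whether analyzed; ik_max_word: IK tokenizer
store: true, // mapping (column constraint): whether the value is stored
}
}
}
}
});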
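Listing the indices (databases) in ES
ES:
http://<ES node IP>:9200/_cat/indices?v
MySQL:
show databases
Listing the types (tables) in an ES index (database)
ES:
http://es-node:9200/<index name>/_mapping?pretty=true
MySQL:
use <database name>
show tables
Viewing ES shards (index shards)
http://es-node:9200/_cat/shards
This shows which shards the data is distributed across.
Viewing the definition of an ES type (table)
GET <index name>/_mapping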
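group by (grouping)
sql:
select team, count(*) as player_count from player group by team;
es:
TermsAggregationBuilder teamAgg = AggregationBuilders.terms("teamKey").field("team.keyword");
Note: teamKey is an arbitrary name for the aggregation; it is used later to read the result.
team.keyword: team is the field to group by, and keyword is its un-analyzed sub-field. Grouping generally needs this sub-field, because an analyzed text field cannot be aggregated on by default.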
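For comparison with the Java builder, the same grouping can be written as a search request body and sent to <index name>/_search over HTTP. The sketch below only assembles that JSON with the standard library. The class TermsAggBody and the method build are hypothetical helper names, and the sketch assumes the names passed in need no JSON escaping.

```java
final class TermsAggBody {
    /**
     * Builds a search body that groups by one field.
     * "size": 0 suppresses the hits so that only the buckets are returned.
     */
    static String build(String aggName, String field) {
        return "{\"size\":0,\"aggs\":{\"" + aggName
                + "\":{\"terms\":{\"field\":\"" + field + "\"}}}}";
    }

    public static void main(String[] args) {
        // prints {"size":0,"aggs":{"teamKey":{"terms":{"field":"team.keyword"}}}}
        System.out.println(build("teamKey", "team.keyword"));
    }
}
```

Each bucket in the response carries a key (the team) and a doc_count, which plays the role of count(*) in the SQL statement.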
group by on multiple fields (note: the author has not used this)
sql:
select team, position, count(*) as pos_count from player group by team, position;
es:
TermsAggregationBuilder teamAgg = AggregationBuilders.terms("player_count").field("team.keyword");
TermsAggregationBuilder posAgg = AggregationBuilders.terms("pos_count").field("position.keyword");
sbuilder.addAggregation(teamAgg.subAggregation(posAgg)); // sbuilder: the search request builder
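Grouping by two fields corresponds to one terms aggregation nested inside another in the request body. As a rough sketch, the helper below composes such a nested body as a JSON string. NestedTermsBody, terms, and searchBody are hypothetical names, the .keyword sub-fields are an assumption about the mapping, and no JSON escaping is done.

```java
final class NestedTermsBody {
    /**
     * Returns one terms aggregation as a JSON member:
     * "name":{"terms":{"field":"..."}} with an optional nested "aggs" object.
     */
    static String terms(String name, String field, String childOrNull) {
        String child = childOrNull == null ? "" : ",\"aggs\":{" + childOrNull + "}";
        return "\"" + name + "\":{\"terms\":{\"field\":\"" + field + "\"}" + child + "}";
    }

    /** Wraps aggregation members in a search body that returns no hits. */
    static String searchBody(String aggs) {
        return "{\"size\":0,\"aggs\":{" + aggs + "}}";
    }

    public static void main(String[] args) {
        String pos = terms("pos_count", "position.keyword", null);
        String team = terms("player_count", "team.keyword", pos);
        System.out.println(searchBody(team));
    }
}
```

In the response, every team bucket then contains its own list of position buckets, each with a doc_count.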
count/max/min/sum/avg (note: the field used by max/min/sum/avg must be of a numeric type)
sql:
select team, count(*) as player_count,max(age),min(age),sum(age),avg(age) from player group by team;
es:
// grouping
TermsAggregationBuilder team = AggregationBuilders.terms("teamKey").field("team.keyword");
// count
ValueCountAggregationBuilder valueCountAggregationBuilder = AggregationBuilders.count("count_age").field("age");
// max
MaxAggregationBuilder maxAggregationBuilder = AggregationBuilders.max("max_age").field("age");
// min
MinAggregationBuilder minAggregationBuilder = AggregationBuilders.min("min_age").field("age");
// sum
SumAggregationBuilder sumAggregationBuilder = AggregationBuilders.sum("sum_age").field("age");
// avg
AvgAggregationBuilder avgAggregationBuilder = AggregationBuilders.avg("avg_age").field("age");
// attach the metrics to the grouping; each sub-aggregation needs its own unique name
team.subAggregation(valueCountAggregationBuilder);
team.subAggregation(maxAggregationBuilder);
team.subAggregation(minAggregationBuilder);
team.subAggregation(sumAggregationBuilder);
team.subAggregation(avgAggregationBuilder);
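To make clear what these five metrics mean for each team bucket, the sketch below computes the same numbers over an in-memory list with Java streams. It does not call Elasticsearch; the classes TeamAgeStats and Player and the sample data are made up for illustration.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class TeamAgeStats {
    static final class Player {
        final String team;
        final int age;
        Player(String team, int age) { this.team = team; this.age = age; }
    }

    /** Groups players by team and collects count/min/max/sum/avg of age per group. */
    static Map<String, IntSummaryStatistics> statsByTeam(List<Player> players) {
        return players.stream().collect(Collectors.groupingBy(
                (Player p) -> p.team,                          // group by team
                TreeMap::new,                                  // keep teams sorted by name
                Collectors.summarizingInt((Player p) -> p.age)));
    }

    public static void main(String[] args) {
        List<Player> players = Arrays.asList(
                new Player("A", 20), new Player("A", 30), new Player("B", 25));
        statsByTeam(players).forEach((team, s) -> System.out.println(
                team + " count=" + s.getCount() + " max=" + s.getMax()
                        + " min=" + s.getMin() + " sum=" + s.getSum()
                        + " avg=" + s.getAverage()));
    }
}
```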
order by
sql:
select team, sum(salary) as total_salary from player group by team order by total_salary desc;
ES:
TermsAggregationBuilder team = AggregationBuilders.terms("teamKey").field("team.keyword");
SumAggregationBuilder sumAggregationBuilder = AggregationBuilders.sum("total_salary").field("salary");
team.subAggregation(sumAggregationBuilder);
team.order(BucketOrder.aggregation("total_salary", false)); // false = descending
limit
sql:
select team, sum(salary) as total_salary from player group by team order by total_salary desc limit 1000;
ES:
TermsAggregationBuilder team = AggregationBuilders.terms("teamKey").field("team.keyword");
SumAggregationBuilder sumAggregationBuilder = AggregationBuilders.sum("total_salary").field("salary");
team.subAggregation(sumAggregationBuilder);
team.order(BucketOrder.aggregation("total_salary", false)).size(1000); // return at most 1000 buckets
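The combined effect of the order and size settings can be illustrated without a cluster. The following plain-Java sketch groups rows by team, sums the salary, sorts the totals in descending order, and keeps the first N groups. TopTeamsBySalary, Row, and the sample rows are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class TopTeamsBySalary {
    static final class Row {
        final String team;
        final long salary;
        Row(String team, long salary) { this.team = team; this.salary = salary; }
    }

    static Map<String, Long> topTeams(List<Row> rows, int limit) {
        // group by team, sum(salary)
        Map<String, Long> totals = rows.stream().collect(Collectors.groupingBy(
                (Row r) -> r.team, Collectors.summingLong((Row r) -> r.salary)));
        // order by total desc, limit N; LinkedHashMap keeps the sorted order
        Map<String, Long> ordered = new LinkedHashMap<>();
        totals.entrySet().stream()
                .sorted((a, b) -> Long.compare(b.getValue(), a.getValue()))
                .limit(limit)
                .forEachOrdered(e -> ordered.put(e.getKey(), e.getValue()));
        return ordered;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("A", 100), new Row("B", 300), new Row("A", 50), new Row("C", 200));
        System.out.println(topTeams(rows, 2)); // prints {B=300, C=200}
    }
}
```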
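where conditions
Compound query / multi-condition query / boolean query
// queryBuilder1 to queryBuilder3 stand for any QueryBuilder
BoolQueryBuilder boolQuery = new BoolQueryBuilder();
boolQuery.must(queryBuilder1); // the document must match this clause; equivalent to AND
boolQuery.mustNot(queryBuilder2); // the document must not match this clause; equivalent to NOT
boolQuery.should(queryBuilder3); // equivalent to OR: when there is no must clause, at least one should clause has to match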
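The matching rules of the three clause types can be modeled with ordinary predicates. The sketch below is a simplified model and not the Elasticsearch implementation: it ignores filter clauses and scoring, assumes the default behavior in which should clauses are required only when no must clause is present, and BoolSemantics is a hypothetical class name.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

final class BoolSemantics {
    /**
     * Returns true when the document satisfies every must clause and no mustNot clause,
     * and, if there is no must clause but there are should clauses, at least one should clause.
     */
    static <T> boolean matches(T doc, List<Predicate<T>> must,
                               List<Predicate<T>> mustNot, List<Predicate<T>> should) {
        boolean allMust = must.stream().allMatch(p -> p.test(doc));
        boolean noneMustNot = mustNot.stream().noneMatch(p -> p.test(doc));
        boolean shouldOk = !must.isEmpty() || should.isEmpty()
                || should.stream().anyMatch(p -> p.test(doc));
        return allMust && noneMustNot && shouldOk;
    }

    public static void main(String[] args) {
        Predicate<String> isA = s -> s.equals("A");
        Predicate<String> isB = s -> s.equals("B");
        List<Predicate<String>> none = Collections.emptyList();
        // should only: behaves like OR
        System.out.println(matches("B", none, none, Arrays.asList(isA, isB))); // prints true
        System.out.println(matches("C", none, none, Arrays.asList(isA, isB))); // prints false
        // mustNot: behaves like NOT
        System.out.println(matches("A", none, Collections.singletonList(isA), none)); // prints false
    }
}
```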
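Exact query (the value must match completely)
// Un-analyzed query. Argument 1: field name; argument 2: value to search for. The value is not analyzed, so against a text field indexed with the default analyzer it can match only a single Chinese character or a single English word.
QueryBuilder queryBuilder = QueryBuilders.termQuery("fieldName", "fieldValue");
// Analyzed query; the value is tokenized with the field's analyzer (the standard analyzer by default)
QueryBuilder queryBuilder2 = QueryBuilders.matchQuery("fieldName", "fieldValue");
Matching multiple values or fields
// Un-analyzed query. Argument 1: field name; remaining arguments: the values to search for. The same single-character / single-word restriction applies.
QueryBuilder termsQueryBuilder = QueryBuilders.termsQuery("fieldName", "fieldValue1", "fieldValue2...");
// Analyzed query across several fields, tokenized with each field's analyzer
QueryBuilder multiMatchQueryBuilder = QueryBuilders.multiMatchQuery("fieldValue", "fieldName1", "fieldName2", "fieldName3");
// Matches all documents, which is the same as setting no query condition
QueryBuilder matchAllQueryBuilder = QueryBuilders.matchAllQuery();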
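Fuzzy query (the value only needs to be contained)
// The five common fuzzy-style queries are listed below
// 1. Query-string query: matches documents whose field contains the query terms; the text is parsed with the query-string syntax, so wildcards can be written inside it
QueryBuilders.queryStringQuery("fieldValue").field("fieldName");
// 2. More-like-this query, commonly used to recommend similar content; if no field name is given, all fields are used by default
QueryBuilders.moreLikeThisQuery(new String[] {"fieldName"}, new String[] {"text to match"}, null);
// 3. Prefix query: on an un-analyzed field it matches the prefix of the whole field value
QueryBuilders.prefixQuery("fieldName.keyword", "fieldValue");
// 4. Fuzzy query: fuzziness is the maximum edit distance, that is, the number of single-character insertions, deletions, or substitutions allowed between the search term and an indexed term. For example, this matches documents whose hotelName is "tel" with one letter added before or after it
QueryBuilders.fuzzyQuery("hotelName.keyword", "tel").fuzziness(Fuzziness.ONE);
// 5. Wildcard query: * matches any string, ? matches any single character
QueryBuilders.wildcardQuery("fieldName.keyword", "ctr*"); // first the field name, then the string containing the wildcard characters
QueryBuilders.wildcardQuery("fieldName.keyword", "c?r?");
Note: before running a fuzzy query, check the field's mapping. If the field has a keyword sub-field, query that sub-field (fieldName.keyword). Otherwise Chinese text and uppercase letters will not match as expected, because the analyzed text field is split into tokens and lowercased while these queries do not analyze the search value.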
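The meaning of the two wildcard characters can be tried out locally. The sketch below translates a wildcard pattern into a java.util.regex pattern and matches it against a whole value, in the same way a wildcard query is compared with an un-analyzed keyword value. It is an illustration only: WildcardDemo is a hypothetical class, and escape sequences inside the pattern are not handled.

```java
import java.util.regex.Pattern;

final class WildcardDemo {
    /** Converts a wildcard pattern (* = any string, ? = exactly one character) into a regex. */
    static Pattern toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                sb.append(".*");
            } else if (c == '?') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c))); // literal character
            }
        }
        return Pattern.compile(sb.toString(), Pattern.DOTALL);
    }

    /** True when the whole value matches the wildcard pattern, case-sensitively. */
    static boolean matches(String wildcard, String value) {
        return toRegex(wildcard).matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("ctr*", "ctrl"));  // prints true
        System.out.println(matches("c?r?", "card"));  // prints true
        System.out.println(matches("c?r?", "cards")); // prints false: ? is exactly one character
        System.out.println(matches("ctr*", "Ctrl"));  // prints false: matching is case-sensitive
    }
}
```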
Range query
// Closed interval
QueryBuilder queryBuilder0 = QueryBuilders.rangeQuery("fieldName").from("fieldValue1").to("fieldValue2");
// Open interval; includeUpper and includeLower default to true, that is, the bounds are included
QueryBuilder queryBuilder1 = QueryBuilders.rangeQuery("fieldName").from("fieldValue1").to("fieldValue2").includeUpper(false).includeLower(false);
// Greater than
QueryBuilder queryBuilder2 = QueryBuilders.rangeQuery("fieldName").gt("fieldValue");
// Greater than or equal to
QueryBuilder queryBuilder3 = QueryBuilders.rangeQuery("fieldName").gte("fieldValue");
// Less than
QueryBuilder queryBuilder4 = QueryBuilders.rangeQuery("fieldName").lt("fieldValue");
// Less than or equal to
QueryBuilder queryBuilder5 = QueryBuilders.rangeQuery("fieldName").lte("fieldValue");
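The difference between the closed and open variants comes down to whether each bound itself counts as a match. A minimal sketch of that rule for numeric values follows; RangeDemo and inRange are hypothetical names, and the code only mirrors the includeLower / includeUpper flags rather than calling Elasticsearch.

```java
final class RangeDemo {
    /** Mirrors from/to with the includeLower and includeUpper flags for numeric values. */
    static boolean inRange(long value, long from, long to,
                           boolean includeLower, boolean includeUpper) {
        boolean aboveLower = includeLower ? value >= from : value > from;
        boolean belowUpper = includeUpper ? value <= to : value < to;
        return aboveLower && belowUpper;
    }

    public static void main(String[] args) {
        System.out.println(inRange(10, 10, 20, true, true));   // prints true: closed interval
        System.out.println(inRange(10, 10, 20, false, false)); // prints false: open interval
        System.out.println(inRange(15, 10, 20, false, false)); // prints true
    }
}
```

In these terms, gt and lt correspond to a bound with the include flag set to false, and gte and lte to a bound with the flag set to true.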
Aggregation queries
(1) Count the values of a field
ValueCountAggregationBuilder vcb = AggregationBuilders.count("count_uid").field("uid");
(2) Count the distinct values of a field (with a small error margin)
CardinalityAggregationBuilder cb = AggregationBuilders.cardinality("distinct_count_uid").field("uid");
(3) Filter inside an aggregation
FilterAggregationBuilder fab = AggregationBuilders.filter("uid_filter", QueryBuilders.queryStringQuery("uid:001"));
(4) Group by a field
TermsAggregationBuilder tb = AggregationBuilders.terms("group_name").field("name");
(5) Sum
SumAggregationBuilder sumBuilder = AggregationBuilders.sum("sum_price").field("price");
(6) Average
AvgAggregationBuilder ab = AggregationBuilders.avg("avg_price").field("price");
(7) Maximum
MaxAggregationBuilder mb = AggregationBuilders.max("max_price").field("price");
(8) Minimum
MinAggregationBuilder min = AggregationBuilders.min("min_price").field("price");
(9) Group by date interval (an interval has to be set; DAY is only an example)
DateHistogramAggregationBuilder dhb = AggregationBuilders.dateHistogram("dh").field("date").dateHistogramInterval(DateHistogramInterval.DAY);
(10) Fetch the documents inside an aggregation
TopHitsAggregationBuilder thb = AggregationBuilders.topHits("top_result");
(11) Nested aggregation
NestedAggregationBuilder nb = AggregationBuilders.nested("nested_path", "quests");
(12) Reverse nested aggregation
AggregationBuilders.reverseNested("reverse_nested").path("kps");