Elasticsearch (024): Common ES field mapping types: the join type
程序员文章站
2022-04-23 15:45:50
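Overview of the join type
Background
Motivating question: in "a certain headline-news app", news articles and their comments form a one-to-many relationship. How should this data be stored in ES 6.x, and how can it be searched and aggregated efficiently?
1. Background of the new join type in ES 6.x
- In MySQL, multi-table relationships are handled with LEFT JOIN, JOIN, and similar operations.
- ES 5.x implemented such relationships with parent-child documents, which behave somewhat like a database join. This relied on ES 5.x allowing multiple types under a single index.
- ES 6.x allows only a single type per index.
- How to implement a join in ES 6.x therefore became an open question.
ES 6.x introduced the join type, mainly to address relationships like multi-table joins in MySQL.
2. Introducing the join type
The join type still uses parent-child relationships within a single index to provide something similar to a multi-table join in MySQL.
3. Mapping definition for the join type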
PUT my_index
{
  "mappings": {
    "docs": {
      "properties": {
        "id": {
          "type": "long"
        },
        "my_join_field": { <1>
          "type": "join",
          "eager_global_ordinals": true,
          "relations": {
            "question": "answer" <2>
          }
        },
        "text": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
    }
  }
}
<1> The name of the join field.
<2> Declares question as the parent of answer.
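As a rough illustration of how the relations entry can be read, the sketch below rebuilds the same mapping as a Python dict and inverts it into a child-to-parent lookup. The helper name parent_of is hypothetical and not part of any Elasticsearch client; nothing here talks to a cluster.

```python
import json

# The mapping from this section as a Python dict (callouts removed).
mapping = {
    "mappings": {
        "docs": {
            "properties": {
                "id": {"type": "long"},
                "my_join_field": {
                    "type": "join",
                    "eager_global_ordinals": True,
                    "relations": {"question": "answer"},
                },
                "text": {
                    "type": "text",
                    "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
                },
            }
        }
    }
}


def parent_of(relations):
    """Hypothetical helper: invert {parent: child-or-children} into {child: parent}."""
    lookup = {}
    for parent, children in relations.items():
        # A relation value may be a single name or a list of names.
        if isinstance(children, str):
            children = [children]
        for child in children:
            lookup[child] = parent
    return lookup


relations = mapping["mappings"]["docs"]["properties"]["my_join_field"]["relations"]
child_to_parent = parent_of(relations)
print(child_to_parent)
print(json.dumps(mapping, indent=2))  # request body for `PUT my_index`
```

The printed JSON is the body that would be sent with PUT my_index; the lookup simply makes explicit that answer documents point at question documents.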
4. Inserting parent documents
PUT my_index/docs/1?refresh
{
  "text": "This is a question",
  "my_join_field": {
    "name": "question"
  }
}
PUT my_index/docs/2?refresh
{
  "text": "This is a another question",
  "my_join_field": {
    "name": "question"
  }
}
PUT my_index/docs/_bulk?refresh
{"index": {"_id": 3}}
{"id":3, "text": "question 3333", "my_join_field": {"name": "question"}}
{"index": {"_id": 4}}
{"id":4, "text": "question 4444", "my_join_field": {"name": "question"}}
These documents all have the parent type "question".
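The bulk request above is newline-delimited JSON: one action line followed by one source line per document. A minimal sketch of producing such a body from Python follows; bulk_body is a hypothetical helper name, and the sketch only builds the string without sending it.

```python
import json


def bulk_body(docs):
    """Hypothetical helper: serialize (doc_id, source) pairs as bulk NDJSON."""
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_id": doc_id}}))  # action line
        lines.append(json.dumps(source))                      # source line
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


parents = [
    (3, {"id": 3, "text": "question 3333", "my_join_field": {"name": "question"}}),
    (4, {"id": 4, "text": "question 4444", "my_join_field": {"name": "question"}}),
]
body = bulk_body(parents)
print(body, end="")
```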
5. Inserting child documents
PUT my_index/docs/5?routing=1&refresh <1>
{
  "text": "This is an answer",
  "my_join_field": {
    "name": "answer", <2>
    "parent": "1" <3>
  }
}
PUT my_index/docs/6?routing=1&refresh
{
  "text": "This is another answer",
  "my_join_field": {
    "name": "answer",
    "parent": "1"
  }
}
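<1> The routing value is mandatory, because parent and child documents must be indexed on the same shard.
<2> "answer" is the join name of this child document; it marks the document as a child.
<3> The ID of this child document's parent: 1.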
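To make the three callouts concrete, here is a small sketch that assembles the method, path, and body for one child document. The function child_request and its parameters are hypothetical names chosen for illustration; it assumes the two-level question/answer relation used in this article, where the routing value equals the parent ID.

```python
import json
from urllib.parse import urlencode


def child_request(index, doc_type, doc_id, parent_id, text,
                  relation="answer", join_field="my_join_field"):
    """Hypothetical helper: build (method, path, body) for indexing one child."""
    # Route by the parent ID so the child lands on the parent's shard.
    query = urlencode({"routing": parent_id, "refresh": "true"})
    path = f"/{index}/{doc_type}/{doc_id}?{query}"
    body = {
        "text": text,
        join_field: {"name": relation, "parent": str(parent_id)},
    }
    return "PUT", path, body


method, path, body = child_request("my_index", "docs", 5, 1, "This is an answer")
print(method, path)
print(json.dumps(body, indent=2))
```

Deriving both the routing parameter and the parent field from the same argument avoids the easy mistake of setting one and forgetting the other.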
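6. Other constraints on the join type
- Only one join field mapping is allowed per index.
- Parent and child documents must be indexed on the same shard; this means the same routing value must be provided when getting, deleting, or updating a child document.
- An element (a relation name such as question) can have multiple children but only one parent.
- A new relation can be added to an existing join field.
- A child can be added to an existing element, but only if that element is already a parent.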
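The same-shard constraint follows from how a document is assigned to a shard: the routing value is hashed and taken modulo the number of primary shards. The sketch below imitates that idea only; Elasticsearch uses a Murmur3 hash, and CRC32 is substituted here purely as a stand-in, so the shard numbers it prints are not the ones a real cluster would choose. The shard count of 5 is an assumption matching the ES 6.x default.

```python
import zlib

NUM_SHARDS = 5  # assumed: the ES 6.x default number of primary shards


def shard_for(routing, num_shards=NUM_SHARDS):
    """Illustrative stand-in for shard = hash(routing) % num_primary_shards.

    CRC32 replaces the Murmur3 hash that Elasticsearch actually uses.
    """
    return zlib.crc32(str(routing).encode("utf-8")) % num_shards


parent_shard = shard_for("1")            # parent 1 is routed by its own _id
child_with_routing = shard_for("1")      # child 5 indexed with ?routing=1
child_without_routing = shard_for("5")   # child 5 routed by its own _id instead

print(parent_shard, child_with_routing, child_without_routing)
```

With an explicit routing value the child always hashes to the parent's shard; without it the child is placed by its own ID and may end up elsewhere.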
7. Searching and aggregating with the join type
7.1 Searching all documents
GET my_index/docs/_search
The response is:
{
  "took": 145,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 6,
    "max_score": 1,
    "hits": [
      {
        "_index": "my_index",
        "_type": "docs",
        "_id": "4",
        "_score": 1,
        "_source": {
          "id": 4,
          "text": "question 4444",
          "my_join_field": {
            "name": "question"
          }
        }
      },
      {
        "_index": "my_index",
        "_type": "docs",
        "_id": "2",
        "_score": 1,
        "_source": {
          "text": "This is a another question",
          "my_join_field": {
            "name": "question"
          }
        }
      },
      {
        "_index": "my_index",
        "_type": "docs",
        "_id": "1",
        "_score": 1,
        "_source": {
          "text": "This is a question",
          "my_join_field": {
            "name": "question"
          }
        }
      },
      {
        "_index": "my_index",
        "_type": "docs",
        "_id": "5",
        "_score": 1,
        "_routing": "1",
        "_source": {
          "text": "This is an answer",
          "my_join_field": {
            "name": "answer",
            "parent": "1"
          }
        }
      },
      {
        "_index": "my_index",
        "_type": "docs",
        "_id": "6",
        "_score": 1,
        "_routing": "1",
        "_source": {
          "text": "This is another answer",
          "my_join_field": {
            "name": "answer",
            "parent": "1"
          }
        }
      },
      {
        "_index": "my_index",
        "_type": "docs",
        "_id": "3",
        "_score": 1,
        "_source": {
          "id": 3,
          "text": "question 3333",
          "my_join_field": {
            "name": "question"
          }
        }
      }
    ]
  }
}
7.2 Finding child documents by parent document
GET my_index/docs/_search
{
  "query": {
    "has_parent": {
      "parent_type": "question",
      "query": {
        "match": {
          "text": "this is"
        }
      }
    }
  }
}
Response:
{
  "took": 161,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "my_index",
        "_type": "docs",
        "_id": "5",
        "_score": 1,
        "_routing": "1",
        "_source": {
          "text": "This is an answer",
          "my_join_field": {
            "name": "answer",
            "parent": "1"
          }
        }
      },
      {
        "_index": "my_index",
        "_type": "docs",
        "_id": "6",
        "_score": 1,
        "_routing": "1",
        "_source": {
          "text": "This is another answer",
          "my_join_field": {
            "name": "answer",
            "parent": "1"
          }
        }
      }
    ]
  }
}
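The has_parent query returns children, not parents: the inner query is evaluated against parent documents, and the hits are the children attached to the parents that matched. The in-memory sketch below mimics that behavior over the six documents of this article. It is a simplification under stated assumptions: the matches function is a crude stand-in for a match query (lowercased whitespace tokens, any term may match), and the DOCS layout is hypothetical.

```python
# The six documents of this article, keyed by _id (hypothetical in-memory layout).
DOCS = {
    "1": {"text": "This is a question", "join": {"name": "question"}},
    "2": {"text": "This is a another question", "join": {"name": "question"}},
    "3": {"text": "question 3333", "join": {"name": "question"}},
    "4": {"text": "question 4444", "join": {"name": "question"}},
    "5": {"text": "This is an answer", "join": {"name": "answer", "parent": "1"}},
    "6": {"text": "This is another answer", "join": {"name": "answer", "parent": "1"}},
}


def matches(text, query):
    """Crude stand-in for a `match` query: any query term occurs in the text."""
    terms = set(text.lower().split())
    return any(term in terms for term in query.lower().split())


def has_parent(docs, parent_type, query):
    """Return IDs of children whose parent has `parent_type` and matches `query`."""
    hits = []
    for doc_id, doc in docs.items():
        parent_id = doc["join"].get("parent")
        if parent_id is None:
            continue  # parents themselves are never returned
        parent = docs[parent_id]
        if parent["join"]["name"] == parent_type and matches(parent["text"], query):
            hits.append(doc_id)
    return sorted(hits)


print(has_parent(DOCS, "question", "this is"))  # ['5', '6']
```

Document 2 also matches the inner query, but it has no children, so it contributes nothing to the result.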
7.3 Finding parent documents by child document
GET my_index/docs/_search
{
  "query": {
    "has_child": {
      "type": "answer",
      "query": {
        "match": {
          "text": "this is"
        }
      }
    }
  }
}
Response:
{
  "took": 286,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "my_index",
        "_type": "docs",
        "_id": "1",
        "_score": 1,
        "_source": {
          "text": "This is a question",
          "my_join_field": {
            "name": "question"
          }
        }
      }
    ]
  }
}
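The has_child query works in the opposite direction: the inner query runs against child documents, and each parent with at least one matching child is returned once. A minimal in-memory sketch of that logic follows, again using a simplified term matcher rather than a real match query; the ROWS layout and function names are hypothetical.

```python
# (id, relation name, parent id or None, text) for the documents in this article.
ROWS = [
    ("1", "question", None, "This is a question"),
    ("2", "question", None, "This is a another question"),
    ("3", "question", None, "question 3333"),
    ("4", "question", None, "question 4444"),
    ("5", "answer", "1", "This is an answer"),
    ("6", "answer", "1", "This is another answer"),
]


def matches(text, query):
    """Crude stand-in for a `match` query: any query term occurs in the text."""
    terms = set(text.lower().split())
    return any(term in terms for term in query.lower().split())


def has_child(rows, child_type, query):
    """Return IDs of parents that have at least one matching child of `child_type`."""
    parent_ids = set()
    for _doc_id, name, parent, text in rows:
        if name == child_type and parent is not None and matches(text, query):
            parent_ids.add(parent)  # a set, so each parent is reported once
    return sorted(parent_ids)


print(has_child(ROWS, "answer", "this is"))  # ['1']
```

Both answers match, yet the result holds a single hit because they share parent 1, which agrees with the total of 1 in the response above.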
7.4 Finding the child documents of a given parent ID
GET /my_index/docs/_search
{
  "query": {
    "parent_id": {
      "type": "answer",
      "id": "1"
    }
  }
}
Response:
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.13353139,
    "hits": [
      {
        "_index": "my_index",
        "_type": "docs",
        "_id": "5",
        "_score": 0.13353139,
        "_routing": "1",
        "_source": {
          "text": "This is an answer",
          "my_join_field": {
            "name": "answer",
            "parent": "1"
          }
        }
      },
      {
        "_index": "my_index",
        "_type": "docs",
        "_id": "6",
        "_score": 0.13353139,
        "_routing": "1",
        "_source": {
          "text": "This is another answer",
          "my_join_field": {
            "name": "answer",
            "parent": "1"
          }
        }
      }
    ]
  }
}
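Note that in the parent_id query the type field names the child relation (answer), while id is the parent's ID. The sketch below builds the same request body from Python and adds a client-side guard against passing a parent relation name by mistake. The helper parent_id_query and the RELATIONS constant are hypothetical; the guard is an illustration, not a description of how Elasticsearch itself validates the query.

```python
import json

# Relations as declared in the mapping of this article.
RELATIONS = {"question": "answer"}


def parent_id_query(child_type, parent_id, relations=RELATIONS):
    """Hypothetical helper: build a parent_id search body for one parent."""
    children = set()
    for value in relations.values():
        # A relation value may be a single child name or a list of names.
        children.update([value] if isinstance(value, str) else value)
    if child_type not in children:
        raise ValueError(f"{child_type!r} is not a child relation")
    return {"query": {"parent_id": {"type": child_type, "id": str(parent_id)}}}


body = parent_id_query("answer", 1)
print(json.dumps(body, indent=2))
```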
7.5 Aggregations
Aggregations are not covered here; see the later chapter on aggregations for details.