Elasticsearch (024): Common ES field mapping types: the join type

程序员文章站 (Programmer Article Station), 2022-04-23 15:45:50

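Overview of the join type

Background

Motivating question: in a headline-news app, news articles and their comments have a one-to-many relationship. How should this data be stored in ES 6.x, and how can it be searched and aggregated efficiently?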

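1. Background of the join type in ES 6.x

  • In MySQL, multi-table association is done with LEFT JOIN, JOIN, and similar constructs.
  • In ES 5.x, parent-child documents provided multi-table association, similar to a database JOIN; the implementation relied on ES 5.x supporting multiple types under one index.
  • In ES 6.x, each index supports only a single type.
  • How to implement a join in ES 6.x therefore became a point of attention.

ES 6.x provides the join type for this purpose, mainly to handle scenarios similar to multi-table association in MySQL.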

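2. Introduction to the join type

The join type still works within a single index and uses a parent-child relationship to achieve something similar to multi-table association in MySQL.

3. Mapping definition for the join type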

PUT my_index
{
  "mappings": {
    "docs": {
      "properties": {
          "id": {
            "type": "long"
          },
          "my_join_field": { <1>
            "type": "join",
            "eager_global_ordinals": true,
            "relations": {
              "question": "answer" <2>
            }
          },
          "text": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
    }
  }
}

<1> The name of the join field.

<2> Declares question as the parent of answer.
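
The mapping body above can also be assembled in code. The following is a minimal Python sketch, assuming a hypothetical helper named build_join_mapping; it only constructs the JSON body for the PUT my_index request and does not contact a cluster.

```python
import json


def build_join_mapping(type_name, join_field, parent, child):
    """Build an ES 6.x-style mapping body with a single join field.

    Hypothetical helper for illustration; not part of any client library.
    """
    return {
        "mappings": {
            type_name: {
                "properties": {
                    join_field: {
                        "type": "join",
                        # Key is the parent relation name, value is the child.
                        "relations": {parent: child},
                    }
                }
            }
        }
    }


mapping = build_join_mapping("docs", "my_join_field", "question", "answer")
print(json.dumps(mapping, indent=2))
```

The printed JSON is a reduced form of the mapping shown above, without the id and text fields.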


4. Inserting parent documents

PUT my_index/docs/1?refresh
{
  "text": "This is a question",
  "my_join_field": {
    "name": "question" 
  }
}

PUT my_index/docs/2?refresh
{
  "text": "This is a another question",
  "my_join_field": {
    "name": "question"
  }
}

PUT my_index/docs/_bulk?refresh
{"index": {"_id": 3}}
{"id":3, "text": "question 3333", "my_join_field": {"name": "question"}}
{"index": {"_id": 4}}
{"id":4, "text": "question 4444", "my_join_field": {"name": "question"}}

These documents all belong to the parent relation "question".
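
The _bulk request body is newline-delimited JSON: one action line, then one source line per document, with a newline at the end. Below is a minimal Python sketch of building such a payload for parent documents. The helper name bulk_parent_payload is an assumption made for illustration.

```python
import json


def bulk_parent_payload(docs, join_field="my_join_field", parent_name="question"):
    """Return a newline-delimited _bulk body that indexes parent documents.

    Hypothetical helper; each input dict must contain an "id" key.
    """
    lines = []
    for doc in docs:
        # Action line: tells _bulk to index the document under this _id.
        lines.append(json.dumps({"index": {"_id": doc["id"]}}))
        # Source line: the document plus its join field marking it as a parent.
        body = dict(doc)
        body[join_field] = {"name": parent_name}
        lines.append(json.dumps(body))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


payload = bulk_parent_payload([
    {"id": 3, "text": "question 3333"},
    {"id": 4, "text": "question 4444"},
])
print(payload, end="")
```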

5. Inserting child documents

PUT my_index/docs/5?routing=1&refresh <1>
{
  "text": "This is an answer",
  "my_join_field": {
    "name": "answer", <2>
    "parent": "1" <3>
  }
}

PUT my_index/docs/6?routing=1&refresh
{
  "text": "This is another answer",
  "my_join_field": {
    "name": "answer",
    "parent": "1"
  }
}

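<1> The routing value is mandatory, because parent and child documents must be indexed on the same shard.

<2> "answer" is the join name for this document and marks it as a child document.

<3> The ID of this child document's parent document: 1.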
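
To make the relationship between the routing value and the parent ID explicit, here is a minimal Python sketch that derives both from one argument. The helper name child_index_request is an assumption; it only builds the request line and body and sends nothing.

```python
import json
from urllib.parse import urlencode


def child_index_request(index, type_name, doc_id, parent_id, text,
                        join_field="my_join_field", child_name="answer"):
    """Build method, path, and body for indexing one child document.

    Hypothetical helper: the routing value is always set to the parent ID,
    so the child is sent to the shard that holds its parent.
    """
    query = urlencode({"routing": parent_id, "refresh": "true"})
    path = f"/{index}/{type_name}/{doc_id}?{query}"
    body = {
        "text": text,
        join_field: {"name": child_name, "parent": parent_id},
    }
    return "PUT", path, body


method, path, body = child_index_request(
    "my_index", "docs", "5", "1", "This is an answer")
print(method, path)
print(json.dumps(body, indent=2))
```

Deriving routing and parent from the same argument avoids the case where the two values drift apart and the child is written to a different shard than its parent.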


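6. Other constraints when using the join type

  • Only one join field mapping is allowed per index.
  • Parent and child documents must be indexed on the same shard; this means the same routing value must be supplied when getting, deleting, or updating a child document.
  • A document can have multiple children but only one parent.
  • New relations can be added to an existing join field.
  • A child can be added to an existing element only if that element is already a parent.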
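
The same-shard constraint follows from how a document is assigned to a shard: a hash of the routing value, modulo the number of primary shards. The Python sketch below illustrates the idea only. It uses zlib.crc32 as a stand-in hash, which is an assumption for illustration (Elasticsearch uses its own hash function), and the shard count of 5 is taken from the search responses in this article.

```python
import zlib


def shard_for(routing, num_shards=5):
    """Pick a shard from a routing value.

    Illustrative only: crc32 is a stand-in, not the hash Elasticsearch uses.
    """
    return zlib.crc32(str(routing).encode("utf-8")) % num_shards


# Parent document 1 is routed by its own ID.
parent_shard = shard_for("1")
# Child document 5 indexed with routing=1 follows its parent.
child_shard_with_routing = shard_for("1")
# Without explicit routing, the child would be routed by its own ID instead,
# which is not guaranteed to be the parent's shard.
child_shard_default = shard_for("5")

print(parent_shard, child_shard_with_routing, child_shard_default)
```

Because the same routing value always hashes to the same shard, passing the parent ID as routing on every child operation keeps the whole family together.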

7. Searching and aggregating with the join type

7.1 Searching all documents

GET my_index/docs/_search

The result is:

{
  "took": 145,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 6,
    "max_score": 1,
    "hits": [
      {
        "_index": "my_index",
        "_type": "docs",
        "_id": "4",
        "_score": 1,
        "_source": {
          "id": 4,
          "text": "question 4444",
          "my_join_field": {
            "name": "question"
          }
        }
      },
      {
        "_index": "my_index",
        "_type": "docs",
        "_id": "2",
        "_score": 1,
        "_source": {
          "text": "This is a another question",
          "my_join_field": {
            "name": "question"
          }
        }
      },
      {
        "_index": "my_index",
        "_type": "docs",
        "_id": "1",
        "_score": 1,
        "_source": {
          "text": "This is a question",
          "my_join_field": {
            "name": "question"
          }
        }
      },
      {
        "_index": "my_index",
        "_type": "docs",
        "_id": "5",
        "_score": 1,
        "_routing": "1",
        "_source": {
          "text": "This is an answer",
          "my_join_field": {
            "name": "answer",
            "parent": "1"
          }
        }
      },
      {
        "_index": "my_index",
        "_type": "docs",
        "_id": "6",
        "_score": 1,
        "_routing": "1",
        "_source": {
          "text": "This is another answer",
          "my_join_field": {
            "name": "answer",
            "parent": "1"
          }
        }
      },
      {
        "_index": "my_index",
        "_type": "docs",
        "_id": "3",
        "_score": 1,
        "_source": {
          "id": 3,
          "text": "question 3333",
          "my_join_field": {
            "name": "question"
          }
        }
      }
    ]
  }
}
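
In this response, parents and children come back as one flat list, and the relationship is visible only through my_join_field in each _source. The minimal Python sketch below regroups such hits into a parent-to-children map. The helper name group_children is an assumption, and the hits are abridged from the response above to the fields the helper needs.

```python
from collections import defaultdict


def group_children(hits, join_field="my_join_field"):
    """Map each parent _id to the list of its child _ids.

    Hypothetical helper; assumes the join field uses the object form
    ({"name": ..., "parent": ...}) as in this article.
    """
    tree = defaultdict(list)
    for hit in hits:
        join = hit["_source"][join_field]
        if "parent" in join:
            # Child document: attach it to its parent's entry.
            tree[join["parent"]].append(hit["_id"])
        else:
            # Parent document: make sure it appears even with no children.
            tree.setdefault(hit["_id"], [])
    return dict(tree)


# Abridged hits, in the same order as the response above.
hits = [
    {"_id": "4", "_source": {"my_join_field": {"name": "question"}}},
    {"_id": "2", "_source": {"my_join_field": {"name": "question"}}},
    {"_id": "1", "_source": {"my_join_field": {"name": "question"}}},
    {"_id": "5", "_source": {"my_join_field": {"name": "answer", "parent": "1"}}},
    {"_id": "6", "_source": {"my_join_field": {"name": "answer", "parent": "1"}}},
    {"_id": "3", "_source": {"my_join_field": {"name": "question"}}},
]

tree = group_children(hits)
print(tree)
```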

7.2 Finding child documents by parent

GET my_index/docs/_search
{
  "query": {
    "has_parent": {
      "parent_type": "question",
      "query": {
        "match": {
          "text": "this is"
        }
      }
    }
  }
}

Result set:

{
  "took": 161,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "my_index",
        "_type": "docs",
        "_id": "5",
        "_score": 1,
        "_routing": "1",
        "_source": {
          "text": "This is an answer",
          "my_join_field": {
            "name": "answer",
            "parent": "1"
          }
        }
      },
      {
        "_index": "my_index",
        "_type": "docs",
        "_id": "6",
        "_score": 1,
        "_routing": "1",
        "_source": {
          "text": "This is another answer",
          "my_join_field": {
            "name": "answer",
            "parent": "1"
          }
        }
      }
    ]
  }
}

7.3 Finding parent documents by child

GET my_index/docs/_search
{
  "query": {
    "has_child": {
      "type": "answer",
      "query": {
        "match": {
          "text": "this is"
        }
      }
    }    
  }
}

Result set:

{
  "took": 286,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "my_index",
        "_type": "docs",
        "_id": "1",
        "_score": 1,
        "_source": {
          "text": "This is a question",
          "my_join_field": {
            "name": "question"
          }
        }
      }
    ]
  }
}
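
The semantics of the has_parent query in 7.2 and the has_child query in 7.3 can be reproduced in memory over the six sample documents. The Python sketch below is a simplification under stated assumptions: the function names are hypothetical, and matches is a stand-in for the match query that lowercases, splits on whitespace, and accepts a document if any query term appears.

```python
# The six sample documents from this article, keyed by _id.
DOCS = {
    "1": {"text": "This is a question", "join": {"name": "question"}},
    "2": {"text": "This is a another question", "join": {"name": "question"}},
    "3": {"text": "question 3333", "join": {"name": "question"}},
    "4": {"text": "question 4444", "join": {"name": "question"}},
    "5": {"text": "This is an answer", "join": {"name": "answer", "parent": "1"}},
    "6": {"text": "This is another answer", "join": {"name": "answer", "parent": "1"}},
}


def matches(text, query):
    """Simplified stand-in for a match query: any query term may match."""
    tokens = set(text.lower().split())
    return any(term in tokens for term in query.lower().split())


def has_parent(docs, parent_type, query):
    """Return IDs of children whose parent matches the query."""
    parent_ids = {
        doc_id for doc_id, doc in docs.items()
        if doc["join"]["name"] == parent_type and matches(doc["text"], query)
    }
    return sorted(
        doc_id for doc_id, doc in docs.items()
        if doc["join"].get("parent") in parent_ids
    )


def has_child(docs, child_type, query):
    """Return IDs of parents that have at least one matching child."""
    return sorted({
        doc["join"]["parent"] for doc in docs.values()
        if doc["join"]["name"] == child_type and matches(doc["text"], query)
    })


print(has_parent(DOCS, "question", "this is"))  # ['5', '6']
print(has_child(DOCS, "answer", "this is"))     # ['1']
```

Questions 1 and 2 both match "this is", but only question 1 has children, so the has_parent result contains just documents 5 and 6, in line with the result set in 7.2.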

7.4 Finding the child documents of a given parent ID

GET /my_index/docs/_search
{
  "query": {
    "parent_id": {
      "type": "answer",
      "id": "1"
    }
  }
}

Result set:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.13353139,
    "hits": [
      {
        "_index": "my_index",
        "_type": "docs",
        "_id": "5",
        "_score": 0.13353139,
        "_routing": "1",
        "_source": {
          "text": "This is an answer",
          "my_join_field": {
            "name": "answer",
            "parent": "1"
          }
        }
      },
      {
        "_index": "my_index",
        "_type": "docs",
        "_id": "6",
        "_score": 0.13353139,
        "_routing": "1",
        "_source": {
          "text": "This is another answer",
          "my_join_field": {
            "name": "answer",
            "parent": "1"
          }
        }
      }
    ]
  }
}
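
Because all children of parent 1 were indexed with routing=1, a parent_id search can pass the same routing value so that only the shard holding that family is queried. The request in this article does not do this; the Python sketch below adds it as an optional variation. The helper name parent_id_search is an assumption, and nothing is sent to a cluster.

```python
import json
from urllib.parse import urlencode


def parent_id_search(index, type_name, child_type, parent_id, use_routing=True):
    """Build method, path, and body for a parent_id query.

    Hypothetical helper; with use_routing=True the parent ID is also passed
    as the search routing value.
    """
    path = f"/{index}/{type_name}/_search"
    if use_routing:
        path += "?" + urlencode({"routing": parent_id})
    body = {"query": {"parent_id": {"type": child_type, "id": parent_id}}}
    return "GET", path, body


method, path, body = parent_id_search("my_index", "docs", "answer", "1")
print(method, path)
print(json.dumps(body, indent=2))
```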

7.5 Aggregations

Aggregations are not covered in detail here; see the later chapter on aggregations for detailed usage.
