Elasticsearch mapping, part 1: introducing mapping

程序员文章站 (Programmer Article Station), 2024-03-02 20:18:58

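A mapping is one of the two main parts of an index definition (the other is settings). It specifies how documents and the fields they contain are stored and indexed. A mapping itself has two parts: meta-fields and properties/fields. The meta-fields are _field_names, _ignored, _id, _index, _meta, _routing, _source, and _type (deprecated in version 6.0). Properties/fields specify each field's data type, whether it is indexed, how it is indexed, whether and how it is analyzed, and so on. They are not covered here and will be discussed separately later.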

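At the index level you can limit the maximum number of fields in an index, the maximum depth of a field, the number of distinct nested mappings, the number of nested objects in a single document, and the maximum length of a field name: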
index.mapping.total_fields.limit

The maximum number of fields in an index. Field and object mappings, as well as field aliases count towards this limit. The default value is 1000.

index.mapping.depth.limit

The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc. The default is 20.

index.mapping.nested_fields.limit

The maximum number of distinct nested mappings in an index, defaults to 50.

index.mapping.nested_objects.limit

The maximum number of nested JSON objects within a single document across all nested types, defaults to 10000.

index.mapping.field_name_length.limit

Setting for the maximum length of a field name. The default value is Long.MAX_VALUE (no limit). This setting isn’t really something that addresses mappings explosion but might still be useful if you want to limit the field length. It usually shouldn’t be necessary to set this setting. The default is okay unless a user starts to add a huge number of fields with really long names.
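
As a minimal sketch of how one of these limits might be changed on an existing index, the request below raises index.mapping.total_fields.limit through the update index settings API. It assumes an index named my-index (the name used in the examples in this article) already exists; the value 2000 is only illustrative.

```console
PUT /my-index/_settings
{
  "index.mapping.total_fields.limit": 2000
}
```

Raising the limit is usually a stopgap. If the field count keeps growing, the mapping design itself probably deserves a second look.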

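Mappings are either dynamic or explicit. With dynamic mapping, you do not have to define fields explicitly when creating the index: Elasticsearch maps each field that appears in a document to a data type according to default or user-defined rules. Dynamic mapping can be disabled by setting "dynamic": false when the index is created, either at the document (root) level or on any field that has inner fields.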

For example, create an index:
 

PUT /my-index
{
  "mappings": {
    "properties": {
      "age":    { "type": "integer","ignore_malformed": true },
      "grade":{"type": "long","ignore_malformed": true},
      "email":  { "type": "keyword"  }, 
      "name":   { "type": "text"  }  ,
      "body":{
        "type": "object",
        "properties": {
          "addr1":{"type":"keyword"},
          "addr2":{
            "type":"object",
            "properties":{
              "street":{"type":"text","fields":{"raw":{"type":"keyword"}}}
            },
            "dynamic":true
          }
        }
      }
    },
    "dynamic":false
  }
}
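
To see the effect of the two "dynamic" settings in the index above, one could index a document containing fields that are not in the mapping and then inspect the mapping again. This is only a sketch: the document ID, the field names hobby and zip, and all values are made up for illustration.

```console
PUT /my-index/_doc/1
{
  "name": "Alice",
  "hobby": "chess",
  "body": {
    "addr2": {
      "street": "1 Main St",
      "zip": "10001"
    }
  }
}

GET /my-index/_mapping
```

Because the root level sets "dynamic": false, hobby should remain in _source but should not be indexed or added to the mapping. body.addr2 sets "dynamic": true, so body.addr2.zip should appear in the mapping with a dynamically chosen type.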

You can add a field to an existing mapping, for example:
 

POST /my-index/_mapping
{
  "properties": {
    "nation":{"type": "keyword"}
  }
}

You can also update the mapping of an existing field, for example:
 

POST /my-index/_mapping
{
  "properties": {
    "nation":{
      "type": "keyword",
      "fields": {
        "value1":{
          "type":"text"
        }
      }
    }
  }
}

POST my-index/_mapping
{
  "properties": {
    "body":{
        "type": "object",
        "properties": {
          "addr3":{"type":"keyword"}
        }
      }
  }
}

PUT /my-index/_mapping
{
  "properties": {
    "employee-id": {
      "type": "keyword",
      "ignore_above": 10
    }
  }
}

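However, an existing field can only be updated in the following three ways, each illustrated by one of the examples above (the last request changes ignore_above only if employee-id is already mapped; against the index created earlier it simply adds the field):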

  • You can add new properties to an object field.
  • You can use the fields mapping parameter to enable multi-fields.
  • You can change the value of the ignore_above mapping parameter.

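Note that because segments are immutable, data that has already been indexed does not change even when a field's mapping is modified. Worse, the change may invalidate that data. The best approach is therefore to create a new index with the new mapping and then reindex the data into it. This requires the original index to store _source.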
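
A minimal sketch of that approach, assuming a hypothetical destination index named my-index-v2 and a made-up mapping change (age becomes long); a real migration would carry over the full mapping:

```console
PUT /my-index-v2
{
  "mappings": {
    "properties": {
      "age": { "type": "long" }
    }
  }
}

POST /_reindex
{
  "source": { "index": "my-index" },
  "dest":   { "index": "my-index-v2" }
}
```

The reindex API reads each document's _source from the source index and indexes it into the destination, which is why _source must be enabled on the original index.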

View the index mapping:
 

GET my-index
GET /my-index/_mapping

View the mapping of a specific field. Note the body.addr1 notation, which reflects the fact that Elasticsearch stores object fields internally in flattened form:
 

GET my-index/_mapping/field/body.addr1
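
The same dotted path is used when querying an inner field. As a sketch, the term query below targets body.addr1 directly; the value "Beijing" is only an illustrative assumption about the indexed data.

```console
GET /my-index/_search
{
  "query": {
    "term": { "body.addr1": "Beijing" }
  }
}
```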