Elasticsearch operations

程序员文章站 2022-07-09 18:49:36


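Contents

Cluster setup (version 7.4.1) and configuration
Linux tuning
Creating and tuning indices and mappings
Common Elasticsearch commands
Cross-cluster data migration
Safe cluster restart
Moving shards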

Cluster setup (version 7.4.1) and configuration

The cluster consists of three machines: a, b and c.
a:
Edit config/elasticsearch.yml on a so that it reads as follows:

# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
# Cluster name
cluster.name: my-application
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
# Make this node master-eligible
node.master: true
# Node name
node.name: node-1
#
#discovery.zen.minimum_master_nodes: 3
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
# Data directory; create it manually and give the Elasticsearch user access
path.data: /opt/soft/data
#
# Path to log files:
#
# Log directory; create it manually and give the Elasticsearch user access
path.logs: /opt/soft/log
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
# Bind to all local addresses
network.host: 0.0.0.0
#
# Set a custom port for HTTP:
# HTTP port for client requests
http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
# All nodes in the cluster; 9300 is the transport port used for node-to-node communication
discovery.seed_hosts:  ["10.209.5.87:9300","10.209.5.88:9300","10.209.5.89:9300"]
#discovery.zen.ping.unicast.hosts: ["10.209.5.79","10.209.5.80","10.209.5.78"]
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
# Master-eligible nodes used to bootstrap the cluster on its first startup
cluster.initial_master_nodes: ["node-1"]
#
# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
# These two settings enable cross-origin requests (CORS)
http.cors.enabled: true
http.cors.allow-origin: "*"
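# Whitelist for reindex-from-remote (data migration). Without it reindex from another cluster is rejected.
# This node may pull data only from the hosts listed here, usually nodes of other clusters.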
reindex.remote.whitelist: ["10.209.5.84:9200","10.209.5.78:9200","10.209.1.48:9200","10.209.1.35:5200","10.47.187.45:5200","10.47.195.38:5200"]
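# Snapshot repository directories for cold-data archives. An archived directory can be compressed
# and moved to another machine. Create the directories manually and give the Elasticsearch user access.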
path.repo: ["/opt/soft/es_backups/backups", "/opt/soft/es_backups/longterm_backups"]

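b:
elasticsearch.yml on b
Everything else is the same as on a; change only:
# Commented out. In 7.x node.master defaults to true, so the node stays
# master-eligible unless it is explicitly set to false.
#node.master: true
# Node name
node.name: node-2

c:
elasticsearch.yml on c
Everything else is the same as on a; change only:
# Commented out (the same note as for b applies)
#node.master: true
# Node name
node.name: node-3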

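Set the JVM heap size on every machine (a machine with 64 GB of RAM is used as the example)
Edit config/jvm.options. The heap must not exceed 31 GB, and should preferably be no more than 50% of the machine's memory.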
-Xms30g
-Xmx30g
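
The sizing rule above can be written down as a small helper. This is a minimal sketch, not part of the original setup: the function names are made up for illustration, and the 31 GB cap is simply the limit quoted above. For a 64 GB machine it yields 31g, while the 30g used in this article stays slightly below that.

```python
from typing import List


def recommended_heap_gb(total_ram_gb: int, cap_gb: int = 31) -> int:
    """Return half of the machine's RAM in GB, capped at cap_gb."""
    if total_ram_gb < 2:
        raise ValueError("need at least 2 GB of RAM")
    return min(total_ram_gb // 2, cap_gb)


def jvm_heap_options(total_ram_gb: int) -> List[str]:
    """Build the two heap lines for config/jvm.options."""
    heap = recommended_heap_gb(total_ram_gb)
    # Xms and Xmx get the same value so the heap is never resized at runtime.
    return [f"-Xms{heap}g", f"-Xmx{heap}g"]
```

Calling jvm_heap_options(64) gives the two lines to paste into jvm.options.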

Linux tuning

Disable swap so that memory paging does not degrade performance (swapoff -a lasts only until the next reboot)

swapoff -a

vim /etc/security/limits.conf

# append at the end of the file
* soft nofile 65535
* hard nofile 131072
* soft nproc 4096
* hard nproc 4096

vim /etc/sysctl.conf

vm.max_map_count=262145

# reload the settings

sysctl -p
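
Before starting a node it can help to confirm that the limits actually took effect for the current user. The following is a rough sketch under a few assumptions: the thresholds are the minimums that Elasticsearch's bootstrap checks enforce (65535 open files, 4096 threads, vm.max_map_count of 262144), and the function names are hypothetical. current_limits() only works on Linux.

```python
ES_MIN_MAX_MAP_COUNT = 262144  # vm.max_map_count
ES_MIN_NOFILE = 65535          # open file descriptors
ES_MIN_NPROC = 4096            # threads / processes


def find_limit_problems(max_map_count, nofile, nproc):
    """Return a list of human-readable problems; empty if all limits suffice."""
    problems = []
    if max_map_count < ES_MIN_MAX_MAP_COUNT:
        problems.append(f"vm.max_map_count is {max_map_count}, need >= {ES_MIN_MAX_MAP_COUNT}")
    if nofile < ES_MIN_NOFILE:
        problems.append(f"nofile is {nofile}, need >= {ES_MIN_NOFILE}")
    if nproc < ES_MIN_NPROC:
        problems.append(f"nproc is {nproc}, need >= {ES_MIN_NPROC}")
    return problems


def current_limits():
    """Read the live values for the calling user (Linux only)."""
    import resource

    def soft(kind):
        value = resource.getrlimit(kind)[0]
        # An unlimited setting is reported as RLIM_INFINITY.
        return float("inf") if value == resource.RLIM_INFINITY else value

    with open("/proc/sys/vm/max_map_count") as handle:
        max_map_count = int(handle.read())
    return max_map_count, soft(resource.RLIMIT_NOFILE), soft(resource.RLIMIT_NPROC)
```

Run find_limit_problems(*current_limits()) as the user that will start Elasticsearch, since the limits in limits.conf apply per login session.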

Elasticsearch refuses to start as root
# add a user

useradd esuser

# switch to that user

su esuser

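Starting the node:
First make sure the firewall allows ports 9200 and 9300.
Run this command in the directory where Elasticsearch was unpacked: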

./bin/elasticsearch -d

Creating and tuning indices and mappings

Create the index es_persist_3

URL

PUT http://ip:port/es_persist_3

JSON

{
  "settings": {
    "number_of_shards": "12",
    "number_of_replicas": "1",
    "index.translog.durability": "async",
    "index.translog.sync_interval": "60s",
    "index.translog.flush_threshold_size": "1024mb"
  }
}
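
The same request can be issued from a script. Below is a minimal sketch using only the Python standard library; the function name and its defaults are assumptions made for illustration, and the base URL is a placeholder. Note what the settings trade away: with async translog durability the translog is fsynced only every sync_interval, so a crash can lose up to that much acknowledged data in exchange for higher indexing throughput.

```python
import json
import urllib.request


def create_index_request(base_url, index, shards, replicas,
                         sync_interval="60s", flush_threshold="1024mb"):
    """Build (but do not send) the PUT request that creates an index."""
    body = {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
            "index.translog.durability": "async",
            "index.translog.sync_interval": sync_interval,
            "index.translog.flush_threshold_size": flush_threshold,
        }
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

Passing the returned object to urllib.request.urlopen() sends it to the cluster.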

Create the mapping for es_persist_3

URL

POST http://ip:port/es_persist_3/_mapping

JSON

{
  "properties": {
    "servCode": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    },
    "httpMethod": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    },
    "type": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    },
    "servVersionProxyType": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    },
    "exceptionStack": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    },
    "exceptionTime": {
      "type": "date"
    },
    "@version": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    },
    "host": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    },
    "pAppName": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    },
    "id": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    },
    "receiveSize": {
      "type": "long"
    },
    "authType": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    },
    "externalTime": {
      "type": "long"
    },
    "cAppName": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    },
    "returnSize": {
      "type": "long"
    },
    "authCode": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    },
    "statusDesc": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    },
    "platformTime": {
      "type": "long"
    },
    "servName": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    },
    "componentPort": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    },
    "esbId": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    },
    "responseSize": {
      "type": "long"
    },
    "message": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    },
    "logTime": {
      "type": "date"
    },
    "tags": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    },
    "receiveTime": {
      "type": "long"
    },
    "@timestamp": {
      "type": "date"
    },
    "messageList": {
      "properties": {
        "sizeX": {
          "type": "text",
          "fields": {
            "keyword": {
              "ignore_above": 256,
              "type": "keyword"
            }
          }
        },
        "serialNumber": {
          "type": "text",
          "fields": {
            "keyword": {
              "ignore_above": 256,
              "type": "keyword"
            }
          }
        },
        "header": {
          "type": "text",
          "fields": {
            "keyword": {
              "ignore_above": 256,
              "type": "keyword"
            }
          }
        },
        "time": {
          "type": "long"
        },
        "body": {
          "type": "text",
          "fields": {
            "keyword": {
              "ignore_above": 256,
              "type": "keyword"
            }
          }
        },
        "type": {
          "type": "text",
          "fields": {
            "keyword": {
              "ignore_above": 256,
              "type": "keyword"
            }
          }
        },
        "url": {
          "type": "text",
          "fields": {
            "keyword": {
              "ignore_above": 256,
              "type": "keyword"
            }
          }
        }
      }
    },
    "componentHost": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    },
    "cAppCode": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    },
    "fromIp": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    },
    "complete": {
      "type": "boolean"
    },
    "requestSize": {
      "type": "long"
    },
    "logtime": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    },
    "callTime": {
      "type": "long"
    },
    "pAppCode": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    },
    "statusCode": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    }
  }
}
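
Almost every field in the mapping above repeats the same text-plus-keyword definition, so the body can also be generated from plain lists of field names. This is one possible sketch; the helper names are hypothetical, and nested objects such as messageList would need a second call whose result is placed under that field.

```python
import json


def text_with_keyword(ignore_above=256):
    """A text field with a keyword sub-field, as used throughout the mapping."""
    return {
        "type": "text",
        "fields": {"keyword": {"ignore_above": ignore_above, "type": "keyword"}},
    }


def build_properties(text_fields=(), long_fields=(), date_fields=(), boolean_fields=()):
    """Assemble a _mapping request body from field-name lists grouped by type."""
    props = {}
    for name in text_fields:
        props[name] = text_with_keyword()
    for name in long_fields:
        props[name] = {"type": "long"}
    for name in date_fields:
        props[name] = {"type": "date"}
    for name in boolean_fields:
        props[name] = {"type": "boolean"}
    return {"properties": props}
```

For example, json.dumps(build_properties(text_fields=["servCode", "host"], long_fields=["callTime"]), indent=2) prints a body ready to send to the _mapping endpoint.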

Create the index es_persist_4

URL

PUT http://ip:port/es_persist_4

JSON

{
  "settings": {
    "number_of_shards": "2",
    "number_of_replicas": "1",
    "index.translog.durability": "async",
    "index.translog.sync_interval": "30s",
    "index.translog.flush_threshold_size": "248mb"
  }
}

Create the mapping for es_persist_4

URL

POST http://ip:port/es_persist_4/_mapping

JSON

{
  "properties": {
    "servCode": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    },
    "componentHost": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    },
    "exceptionCount": {
      "type": "long"
    },
    "sumCallTime": {
      "type": "long"
    },
    "maxCallTime": {
      "type": "long"
    },
    "cAppCode": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    },
    "minCallTime": {
      "type": "long"
    },
    "startTime": {
      "type": "long"
    },
    "endTime": {
      "type": "long"
    },
    "sumFlowSize": {
      "type": "long"
    },
    "totalCount": {
      "type": "long"
    },
    "servVersionProxyType": {
      "type": "text",
      "fields": {
        "keyword": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    }
  }
}

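Common Elasticsearch commands

Delete an index. This physically removes all of the index's data.
URL

DELETE http://ip:port/<index name>

Close an index. It still occupies disk space, but it cannot be read or written while closed.
URL

POST http://ip:port/<index name>/_close

Open an index. It occupies disk space and, once open, serves reads and writes normally.
URL

POST http://ip:port/<index name>/_open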
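
The three commands differ only in HTTP method and path suffix, which a small lookup table captures. A sketch with made-up names; it builds the request without sending it, so nothing is deleted by accident.

```python
import urllib.request

# action -> (HTTP method, path suffix appended to the index name)
INDEX_ACTIONS = {
    "delete": ("DELETE", ""),
    "close": ("POST", "/_close"),
    "open": ("POST", "/_open"),
}


def index_action_request(base_url, index, action):
    """Build the request for deleting, closing or opening an index."""
    if action not in INDEX_ACTIONS:
        raise ValueError(f"unknown action: {action}")
    method, suffix = INDEX_ACTIONS[action]
    url = f"{base_url.rstrip('/')}/{index}{suffix}"
    return urllib.request.Request(url=url, method=method)
```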

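Cross-cluster data migration

Migration with reindex

Cluster b pulls data from cluster a into cluster b (the configuration file of cluster b must whitelist cluster a; see reindex.remote.whitelist in the cluster setup section).
query selects the documents to copy. The example below copies the data for one month; omit query to copy everything.
"version_type": "internal" overwrites documents in the destination that have the same id.
size is the batch size. Too large a value can cause errors; too small a value makes the copy slow.
wait_for_completion=false runs the operation asynchronously in the background.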

POST http://bip:bport/_reindex?wait_for_completion=false

{
  "source": {
    "index": "a的索引",
    "remote": {
      "host": "http://aip:aport"
    },
    "size": 1000,
    "query": {
      "range": {
        "receiveTime": {
          "gte": 1635696000000,
          "lt": 1638287999000
        }
      }
    }
  },
  "dest": {
    "index": "b的索引",
    "version_type": "internal"
  }
}
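
The two epoch-millisecond bounds in the request above correspond to November 2021 in UTC+8. Computing them by hand is error-prone, so here is a sketch that derives them and assembles the request body. The function names and the UTC+8 default are assumptions for illustration. It uses the first instant of the following month as the exclusive upper bound, which also covers the final second of the month.

```python
from datetime import datetime, timedelta, timezone


def month_range_ms(year, month, utc_offset_hours=8):
    """Return (start, end) of a calendar month in epoch milliseconds; end is exclusive."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
    start = datetime(year, month, 1, tzinfo=tz)
    end = datetime(next_year, next_month, 1, tzinfo=tz)
    return int(start.timestamp() * 1000), int(end.timestamp() * 1000)


def reindex_body(source_index, dest_index, remote_host, field, year, month, batch_size=1000):
    """Build the _reindex body that copies one month of documents from a remote cluster."""
    start_ms, end_ms = month_range_ms(year, month)
    return {
        "source": {
            "index": source_index,
            "remote": {"host": remote_host},
            "size": batch_size,  # documents per batch
            "query": {"range": {field: {"gte": start_ms, "lt": end_ms}}},
        },
        "dest": {"index": dest_index, "version_type": "internal"},
    }
```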

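Cancelling a reindex

Use this to stop a reindex that has not finished. Documents already copied stay in the destination; the remaining ones are not copied.

POST _tasks/node_id:task_id/_cancel

Checking reindex progress (shows node_id:task_id, document counts and so on)

GET _tasks?detailed=true&actions=*reindex

Once the migration has stopped or finished, it no longer appears in this list. The .tasks index, which Elasticsearch creates automatically to store task results, is visible in the data browser; open the task document for details. If it contains no error, the migration succeeded. Errors are almost always caused by batches that are too large, so reduce size and retry.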
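
The task listing can be turned into a completion fraction per task. The sketch below assumes the response has the usual nodes → tasks → status layout, where a reindex status carries total, created, updated and deleted counters; the function name is hypothetical and the response is expected as an already-parsed dict.

```python
def reindex_progress(tasks_response):
    """Map each task id to the fraction of documents processed so far."""
    progress = {}
    for node in tasks_response.get("nodes", {}).values():
        for task_id, task in node.get("tasks", {}).items():
            status = task.get("status", {})
            total = status.get("total", 0)
            done = (status.get("created", 0)
                    + status.get("updated", 0)
                    + status.get("deleted", 0))
            # total is 0 until the first batch has been fetched
            progress[task_id] = done / total if total else 0.0
    return progress
```

The keys of the returned dict are the node_id:task_id values needed by the cancel command.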

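Backing up cold data as snapshots on disk

A snapshot repository can be compressed into an archive and moved to another machine, which frees disk space on this cluster. The repository directory must be listed in elasticsearch.yml (see the configuration file in the cluster setup section). Put it on a volume with plenty of free space so that large snapshots do not fail,
and make sure the Elasticsearch user has permission on the directory.
path.repo: ["/opt/soft/es_backups/backups", "/opt/soft/es_backups/longterm_backups"]

1. Create a repository named my_backup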

PUT /_snapshot/my_backup

{
  "type": "fs",
  "settings": {
    "location": "/opt/soft/es_backups/backups/my_backup"
  }
}

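Once a snapshot is taken, the repository directory my_backup is created under /opt/soft/es_backups/backups.

2. Save a snapshot of the index b_index into the repository my_backup, under the name snapshot_1
wait_for_completion=true makes the call synchronous; set it to false to run asynchronously. Separate multiple indices with commas.

PUT /_snapshot/my_backup/snapshot_1?wait_for_completion=true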

{
  "indices": "b_index"
}

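Snapshots after snapshot_1 are incremental:
run the same command again with a new snapshot name such as snapshot_2, and only the changes are stored.

3. Moving the data out
Pack the my_backup repository:

tar -zcvf my_backup.tar.gz my_backup

Move the archive to another machine, then delete the directory.
To be more thorough, delete the repository as well and create a fresh repository for every round of snapshots, preferably with a date suffix in its name.

DELETE /_snapshot/my_backup

Once the archive has been moved out, the index itself can be deleted to free its disk space.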
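
The date-suffix suggestion can be automated. A small sketch, assuming the base directory from path.repo used in this article; the helper names are made up for illustration.

```python
from datetime import date


def dated_name(prefix, day):
    """Append a YYYYMMDD suffix, e.g. my_backup_20220709."""
    return f"{prefix}_{day:%Y%m%d}"


def repository_body(base_dir, repo_name):
    """Body for PUT /_snapshot/<repo_name>: a shared-filesystem repository."""
    return {
        "type": "fs",
        "settings": {"location": f"{base_dir.rstrip('/')}/{repo_name}"},
    }


def snapshot_body(indices):
    """Body for PUT /_snapshot/<repo>/<snapshot>: indices joined with commas."""
    return {"indices": ",".join(indices)}
```

With repo = dated_name("my_backup", date.today()), the repository is registered at /_snapshot/<repo> and later archived as <repo>.tar.gz, so each archive name records when it was taken.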

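4. Restoring the data
When restoring on a new machine, that machine also needs the repository directory.
Copy the archive over and unpack it inside the snapshot directory. Unpacking creates a my_backup directory, whose name must match the location configured on the new machine.
Then run the following.
Create the repository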

PUT /_snapshot/my_backup

{
  "type": "fs",
  "settings": {
    "location": "/opt/soft/es_backups/backups/my_backup"
  }
}

Read the snapshot and recreate the index

POST /_snapshot/my_backup/snapshot_1/_restore?wait_for_completion=true

{
  "indices": "b_index"
}

Safe cluster restart

Before shutting down, run

PUT _cluster/settings

{
  "persistent": {
    "cluster.routing.allocation.enable": "none"
  }
}

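After the restart, run the following to re-enable allocation. Use the persistent scope here as well: a transient value would only mask the persistent "none", which would come back after the next full cluster restart.

PUT _cluster/settings

{
  "persistent": {
    "cluster.routing.allocation.enable": "all"
  }
}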
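
Both restart steps send the same setting with a different value, so one helper can produce either body. This is a sketch with a hypothetical function name. It always writes to the persistent scope so that the value set before shutdown and the value set afterwards replace each other.

```python
import json

ALLOWED_VALUES = {"all", "primaries", "new_primaries", "none"}


def allocation_setting(value):
    """Body for PUT _cluster/settings that switches shard allocation on or off."""
    if value not in ALLOWED_VALUES:
        raise ValueError(f"unsupported value: {value}")
    return {"persistent": {"cluster.routing.allocation.enable": value}}


before_shutdown = json.dumps(allocation_setting("none"))
after_restart = json.dumps(allocation_setting("all"))
```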

Moving shards

POST _cluster/reroute

{
  "commands": [
    {
      "move": {
        "index": "aindex",
        "shard": 2,
        "from_node": "node-1",
        "to_node": "node-3"
      }
    }
  ]
}
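
When several shards have to be moved, the commands list can be built programmatically. A minimal sketch; the function name and the tuple layout are assumptions for illustration.

```python
def move_shards_body(moves):
    """Build a _cluster/reroute body from (index, shard, from_node, to_node) tuples."""
    commands = []
    for index, shard, from_node, to_node in moves:
        if from_node == to_node:
            raise ValueError("source and target node must differ")
        commands.append({
            "move": {
                "index": index,
                "shard": shard,
                "from_node": from_node,
                "to_node": to_node,
            }
        })
    return {"commands": commands}
```

move_shards_body([("aindex", 2, "node-1", "node-3")]) reproduces the request above.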
