
ELK RPM Installation and Configuration

Components involved:

1) Filebeat. The log collection component. In testing it proved simple to use and consumes fewer resources than Flume. Its resource usage is not self-tuning, however, so some parameters need adjustment; Filebeat consumes both memory and CPU, so watch it carefully.

2) Kafka. A popular message queue that provides storage plus buffering in the log pipeline. Too many Kafka topics cause severe performance problems, so categorize the collected data; for larger deployments, split it across separate Kafka clusters. Kafka has modest CPU requirements, while large memory and fast disks significantly improve its performance.

3) Logstash. Mainly used for filtering and reshaping data. This component is greedy and consumes a lot of resources, so never co-locate it with application processes. It is, however, a stateless compute node and can be scaled out on demand.

4) Elasticsearch. Stores very large volumes of log data. Keep individual indices from growing too large: index by day or by month depending on volume, which also makes deletion easier.

5) Kibana. A visualization component that integrates tightly with Elasticsearch.

The more of these components you chain together, the smoother the whole pipeline becomes; the two variants assembled in this article are sketched below.
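
Ports and the topic name in the sketch come from the configurations that follow:

Direct:      Filebeat -> Logstash (beats input, port 8031) -> Elasticsearch -> Kibana
Kafka mode:  Filebeat -> Kafka (topic "logs") -> Logstash (kafka input) -> Elasticsearch -> Kibana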


I. Elasticsearch

1. Installation

# rpm -ivh elasticsearch-7.11.2-x86_64.rpm

2. Edit the Elasticsearch configuration file

# vi /etc/elasticsearch/elasticsearch.yml

cluster.name: ycyt-es

node.name: node-2

path.data: /home/elk/es-data

path.logs: /home/elk/es-logs

network.host: 192.101.11.161

http.port: 9200

discovery.seed_hosts: ["192.101.11.159", "192.101.11.161", "192.101.11.233"]

cluster.initial_master_nodes: ["node-1", "node-2", "node-3"]
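
Because path.data and path.logs point to non-default locations, the directories must exist and be writable by the elasticsearch user that the RPM creates before the service starts. A minimal sketch, assuming the paths above:

# mkdir -p /home/elk/es-data /home/elk/es-logs
# chown -R elasticsearch:elasticsearch /home/elk/es-data /home/elk/es-logs

The other two nodes (node-1 and node-3, on the remaining addresses listed in discovery.seed_hosts) use the same file with only node.name and network.host changed.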


3. Start Elasticsearch

# systemctl start elasticsearch
# systemctl enable elasticsearch
# systemctl status elasticsearch

4. Verification
Open a browser and enter the machine's IP address plus the port:

http://192.101.11.233:9200

http://192.101.11.233:9200/_cat/nodes

http://192.101.11.233:9200/_cat/indices?v

Verify that Elasticsearch is receiving data from the Filebeat -> Logstash pipeline with the following query:
http://192.101.11.233:9200/filebeat-*/_search?pretty
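
The same checks can be scripted with curl from any host that can reach the cluster; a minimal sketch against the node above:

# curl http://192.101.11.233:9200                            # name, cluster_name and version banner
# curl http://192.101.11.233:9200/_cat/nodes?v               # all three nodes should be listed
# curl "http://192.101.11.233:9200/_cluster/health?pretty"   # overall cluster status
# curl "http://192.101.11.233:9200/filebeat-*/_search?pretty"

Note that with the index naming used in the Logstash output below ("%{[file_name]}-%{+YYYY.MM.dd}"), the indices are named after the source log file rather than filebeat-*, so adjust the last query accordingly.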

II. Kibana

1. Installation

# rpm -ivh kibana-7.11.2-x86_64.rpm

2. Configuration
# vi /etc/kibana/kibana.yml
server.port: 5601

server.host: "192.101.11.231"

server.name: "ycyt-kibana"

elasticsearch.hosts: ["http://192.101.11.159:9200","http://192.101.11.161:9200","http://192.101.11.233:9200"]

i18n.locale: "zh-CN"

xpack.encryptedSavedObjects.encryptionKey: encryptedSavedObjects12345678909876543210
xpack.security.encryptionKey: encryptionKeysecurity12345678909876543210
xpack.reporting.encryptionKey: encryptionKeyreporting12345678909876543210

xpack.reporting.capture.browser.chromium.disableSandbox: true
xpack.reporting.capture.browser.chromium.proxy.enabled: false
xpack.reporting.enabled: false


3. Start
# systemctl start kibana
# systemctl enable kibana
# systemctl status kibana

4. Open Kibana to confirm it started successfully, then search and view the data

http://192.101.11.231:5601
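
Kibana also exposes a status endpoint that is convenient for scripted checks; a minimal sketch:

# curl http://192.101.11.231:5601/api/status

If the page does not load, journalctl -u kibana is the first place to look.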

III. Logstash

1. Installation

# rpm -ivh logstash-7.11.2-x86_64.rpm

2. Create the pipeline configuration

# cd /etc/logstash/conf.d

# touch ycyt_tpl_01.conf
# vi ycyt_tpl_01.conf

input {
  beats {
    port => 8031
  }
}
filter {

  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:log_date} \[%{DATA:app.thread}\] %{LOGLEVEL:app.level}%{SPACE}*\[%{DATA:app.class}\] %{DATA:app.java_file}:%{DATA:app.code_line} - %{GREEDYDATA:app.message}"}
#    remove_field => ["message","log_date"]
  }

  date {
#    match => [ "log_date", "yyyy-MM-dd HH:mm:ss,SSS" ]
    match => [ "log_date", "ISO8601" ]
    target => "@timestamp"
  }

#  mutate {
#    gsub => ["log.file.path", "[\\]", "/"] 
#  }

  ruby {
    code => "
      # Safe field reference: returns nil instead of raising when [log][file][path] is missing
      path = event.get('[log][file][path]')
      puts format('path = %<path>s', path: path)
      if !path.nil? && !path.empty?
        file_full_name = path.split('/')[-1]
        event.set('file_full_name', file_full_name)
        event.set('file_name', file_full_name.split('.')[0])
      else
        # Fall back to the shipper's name; Filebeat 7.x reports it under [agent][name]
        event.set('file_full_name', event.get('[agent][name]'))
        event.set('file_name', event.get('[agent][name]'))
      end
    "
  }

#  geoip {
#    source => "clientIp"
#  }
}

output {
  elasticsearch {
    hosts => ["192.101.11.230:9200","192.101.11.232:9200"]
    index => "%{[file_name]}-%{+YYYY.MM.dd}"
#    index => "%{@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
  }
  stdout { codec => rubydebug }
}
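
For reference, the grok pattern above assumes application log lines shaped roughly like the following made-up example (the exact layout depends on the application's logback/log4j pattern):

2022-07-13 10:49:40,123 [http-nio-8080-exec-1] INFO  [com.example.DemoController] DemoController.java:42 - request handled

This yields log_date, app.thread, app.level, app.class, app.java_file, app.code_line and app.message, while the ruby filter derives file_name from log.file.path to build the index name.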

3. Start
# systemctl start logstash
# systemctl enable logstash
# systemctl status logstash
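
Before relying on the service, the pipeline file can be syntax-checked from the command line; a sketch, assuming the default RPM install location:

# /usr/share/logstash/bin/logstash --path.settings /etc/logstash -f /etc/logstash/conf.d/ycyt_tpl_01.conf --config.test_and_exit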


IV. Filebeat

1. Installation

# rpm -ivh /opt/filebeat-7.11.2-x86_64.rpm

2. Configuration

# cd /etc/filebeat

# vi filebeat.yml
###################### Filebeat Configuration Example #########################

# This file is an example configuration file highlighting only the most common
# options. The filebeat.reference.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/filebeat/index.html

# For more available modules and options, please see the filebeat.reference.yml sample
# configuration file.

# ============================== Filebeat inputs ===============================

filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.

- type: log

  # Change to true to enable this input configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /home/admin/logs/ycyt/*.log
    #- c:\programdata\elasticsearch\logs\*

  # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  #exclude_lines: ['^DBG']

  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list.
  #include_lines: ['^ERR', '^WARN']

  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  #exclude_files: ['.gz$']

  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering
  #fields:
  #  level: debug
  #  review: 1

  ### Multiline options

  # Multiline can be used for log messages spanning multiple lines. This is common
  # for Java Stack Traces or C-Line Continuation

  # The regexp Pattern that has to be matched. The example pattern matches all lines starting with [
  #multiline.pattern: ^\[
  multiline.pattern: ^.{24}\[

  # Defines if the pattern set under pattern should be negated or not. Default is false.
  multiline.negate: true

  # Match can be set to "after" or "before". It is used to define if lines should be append to a pattern
  # that was (not) matched before or after or as long as a pattern is not matched based on negate.
  # Note: After is the equivalent to previous and before is the equivalent to next in Logstash
  multiline.match: after
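
  # With application log lines that begin with a 23-character timestamp plus a space
  # (e.g. "2022-07-13 10:49:40,123 [main] INFO ..."), the 25th character is '[', so
  # ^.{24}\[ matches only the first line of an event; continuation lines such as stack
  # traces are appended to the previous event (negate: true, match: after).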

# filestream is an experimental input. It is going to replace log input in the future.
- type: filestream

  # Change to true to enable this input configuration.
  enabled: false

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /var/log/*.log
    #- c:\programdata\elasticsearch\logs\*

  # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  #exclude_lines: ['^DBG']

  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list.
  #include_lines: ['^ERR', '^WARN']

  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  #prospector.scanner.exclude_files: ['.gz$']

  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering
  #fields:
  #  level: debug
  #  review: 1

# ============================== Filebeat modules ==============================

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: false

  # Period on which files under path should be checked for changes
  #reload.period: 10s

# ======================= Elasticsearch template setting =======================

setup.template.settings:
  index.number_of_shards: 1
  #index.codec: best_compression
  #_source.enabled: false


# ================================== General ===================================

# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:

# The tags of the shipper are included in their own field with each
# transaction published.
#tags: ["service-X", "web-tier"]

# Optional fields that you can specify to add additional information to the
# output.
#fields:
#  env: staging

# ================================= Dashboards =================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here or by using the `setup` command.
#setup.dashboards.enabled: false

# The URL from where to download the dashboards archive. By default this URL
# has a value which is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:

# =================================== Kibana ===================================

# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:

  # Kibana Host
  # Scheme and port can be left out and will be set to the default (http and 5601)
  # In case you specify an additional path, the scheme is required: http://localhost:5601/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
  #host: "localhost:5601"

  # Kibana Space ID
  # ID of the Kibana Space into which the dashboards should be loaded. By default,
  # the Default Space will be used.
  #space.id:

# =============================== Elastic Cloud ================================

# These settings simplify using Filebeat with the Elastic Cloud (https://cloud.elastic.co/).

# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
#cloud.id:

# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
#cloud.auth:

# ================================== Outputs ===================================

# Configure what output to use when sending the data collected by the beat.

# ---------------------------- Elasticsearch Output ----------------------------
#output.elasticsearch:
  # Array of hosts to connect to.
#  hosts: ["localhost:9200"]

  # Protocol - either `http` (default) or `https`.
  #protocol: "https"

  # Authentication credentials - either API key or username/password.
  #api_key: "id:api_key"
  #username: "elastic"
  #password: "changeme"

# ------------------------------ Logstash Output -------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["localhost:8031"]

  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"

# ================================= Processors =================================
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~

# ================================== Logging ===================================

# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
#logging.level: debug

# At debug level, you can selectively enable logging only for some components.
# To enable all selectors use ["*"]. Examples of other selectors are "beat",
# "publisher", "service".
#logging.selectors: ["*"]

# ============================= X-Pack Monitoring ==============================
# Filebeat can export internal metrics to a central Elasticsearch monitoring
# cluster.  This requires xpack monitoring to be enabled in Elasticsearch.  The
# reporting is disabled by default.

# Set to true to enable the monitoring reporter.
#monitoring.enabled: false

# Sets the UUID of the Elasticsearch cluster under which monitoring data for this
# Filebeat instance will appear in the Stack Monitoring UI. If output.elasticsearch
# is enabled, the UUID is derived from the Elasticsearch cluster referenced by output.elasticsearch.
#monitoring.cluster_uuid:

# Uncomment to send the metrics to Elasticsearch. Most settings from the
# Elasticsearch output are accepted here as well.
# Note that the settings should point to your Elasticsearch *monitoring* cluster.
# Any setting that is not set is automatically inherited from the Elasticsearch
# output configuration, so if you have the Elasticsearch output configured such
# that it is pointing to your Elasticsearch monitoring cluster, you can simply
# uncomment the following line.
#monitoring.elasticsearch:

# ============================== Instrumentation ===============================

# Instrumentation support for the filebeat.
#instrumentation:
    # Set to true to enable instrumentation of filebeat.
    #enabled: false

    # Environment in which filebeat is running on (eg: staging, production, etc.)
    #environment: ""

    # APM Server hosts to report instrumentation results to.
    #hosts:
    #  - http://localhost:8200

    # API Key for the APM Server(s).
    # If api_key is set then secret_token will be ignored.
    #api_key:

    # Secret token for the APM Server(s).
    #secret_token:


# ================================= Migration ==================================

# This allows to enable 6.7 migration aliases
#migration.6_to_7.enabled: true


3. Start

# systemctl start filebeat
# systemctl enable filebeat
# systemctl status filebeat
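
Filebeat can validate both its configuration file and the connection to the configured output before (or after) the service is started; a minimal sketch:

# filebeat test config -c /etc/filebeat/filebeat.yml
# filebeat test output -c /etc/filebeat/filebeat.yml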


##################################### Using Kafka mode #########################################

For Kafka cluster setup, see:
https://www.iteye.com/blog/user/maosheng/blog/2520386


Logstash configuration:

# cd /etc/logstash/conf.d
# vi ycyt_tpl_01.conf

input {
  kafka {
        enable_auto_commit => true
        auto_commit_interval_ms => "1000"
        codec => "json"
        bootstrap_servers => "192.101.11.159:9092,192.101.11.161:9092,192.101.11.231:9092"
        topics => ["logs"]
  }
}

filter {

  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:log_date} \[%{DATA:app.thread}\] %{LOGLEVEL:app.level}%{SPACE}*\[%{DATA:app.class}\] %{DATA:app.java_file}:%{DATA:app.code_line} - %{GREEDYDATA:app.message}"}
#    remove_field => ["message","log_date"]
  }

  date {
#    match => [ "log_date", "yyyy-MM-dd HH:mm:ss,SSS" ]
    match => [ "log_date", "ISO8601" ]
    target => "@timestamp"
  }

#  mutate {
#    gsub => ["log.file.path", "[\\]", "/"] 
#  }

  ruby {
    code => "
      # Safe field reference: returns nil instead of raising when [log][file][path] is missing
      path = event.get('[log][file][path]')
      puts format('path = %<path>s', path: path)
      if !path.nil? && !path.empty?
        file_full_name = path.split('/')[-1]
        event.set('file_full_name', file_full_name)
        event.set('file_name', file_full_name.split('.')[0])
      else
        # Fall back to the shipper's name; Filebeat 7.x reports it under [agent][name]
        event.set('file_full_name', event.get('[agent][name]'))
        event.set('file_name', event.get('[agent][name]'))
      end
    "
  }

#  geoip {
#    source => "clientIp"
#  }
}

output {
  elasticsearch {
    hosts => ["192.101.11.159:9200","192.101.11.161:9200","192.101.11.233:9200"]
    index => "%{[file_name]}-%{+YYYY.MM.dd}"
#    index => "%{@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
  }
  stdout { codec => rubydebug }
}
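
To confirm that events are actually reaching Kafka, the topic can be tailed with the console consumer shipped with Kafka; a sketch, assuming a hypothetical /opt/kafka installation path:

# /opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server 192.101.11.159:9092 --topic logs --from-beginning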


Filebeat configuration:

# cd /etc/filebeat
# vi filebeat.yml

# ============================== Filebeat inputs ===============================

filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.

- type: log

  # Change to true to enable this input configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /home/admin/logs/ycyt/*.log
    #- c:\programdata\elasticsearch\logs\*

  # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  #exclude_lines: ['^DBG']

  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list.
  #include_lines: ['^ERR', '^WARN']

  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  #exclude_files: ['.gz$']

  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering
  #fields:
  #  level: debug
  #  review: 1

  ### Multiline options

  # Multiline can be used for log messages spanning multiple lines. This is common
  # for Java Stack Traces or C-Line Continuation

  # The regexp Pattern that has to be matched. The example pattern matches all lines starting with [
  #multiline.pattern: ^\[
  multiline.pattern: ^.{24}\[

  # Defines if the pattern set under pattern should be negated or not. Default is false.
  multiline.negate: true

  # Match can be set to "after" or "before". It is used to define if lines should be append to a pattern
  # that was (not) matched before or after or as long as a pattern is not matched based on negate.
  # Note: After is the equivalent to previous and before is the equivalent to next in Logstash
  multiline.match: after

# filestream is an experimental input. It is going to replace log input in the future.
- type: filestream

  # Change to true to enable this input configuration.
  enabled: false

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /var/log/*.log
    #- c:\programdata\elasticsearch\logs\*

  # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  #exclude_lines: ['^DBG']

  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list.
  #include_lines: ['^ERR', '^WARN']

  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  #prospector.scanner.exclude_files: ['.gz$']

  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering
  #fields:
  #  level: debug
  #  review: 1

# ============================== Filebeat modules ==============================

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: false

  # Period on which files under path should be checked for changes
  #reload.period: 10s

# ======================= Elasticsearch template setting =======================

setup.template.settings:
  index.number_of_shards: 1
  #index.codec: best_compression
  #_source.enabled: false


# =================================== Kibana ===================================

# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:

  # Kibana Host
  # Scheme and port can be left out and will be set to the default (http and 5601)
  # In case you specify an additional path, the scheme is required: http://localhost:5601/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
  #host: "localhost:5601"

  # Kibana Space ID
  # ID of the Kibana Space into which the dashboards should be loaded. By default,
  # the Default Space will be used.
  #space.id:

# ================================== Kafka Output ===================================

# Configure what output to use when sending the data collected by the beat.

output.kafka:
   enabled: true
   hosts: ["192.101.11.159:9092","192.101.11.161:9092","192.101.11.231:9092"]
   topic: 'logs'
   #version: '0.10.2.0'
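   # Note: the 'logs' topic must already exist on the brokers, or automatic topic
   # creation (auto.create.topics.enable) must be allowed; otherwise publishing fails.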

# ---------------------------- Elasticsearch Output ----------------------------
#output.elasticsearch:
  # Array of hosts to connect to.
#  hosts: ["localhost:9200"]

  # Protocol - either `http` (default) or `https`.
  #protocol: "https"

  # Authentication credentials - either API key or username/password.
  #api_key: "id:api_key"
  #username: "elastic"
  #password: "changeme"

# ------------------------------ Logstash Output -------------------------------
#output.logstash:
  # The Logstash hosts
  # hosts: ["localhost:8031"]

  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"

# ================================= Processors =================================
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~

##############################################################################