
Configuring DingTalk Alerts for Prometheus

Programmer Article Station (程序员文章站), 2022-03-03 20:33:19

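1. Upload the installation packages

1. Upload the binary release tarballs (the latest versions at the time of writing) and extract them
tar xf alertmanager-0.20.0-rc.0.linux-amd64.tar.gz
tar xf prometheus-webhook-dingtalk-0.3.0.linux-amd64.tar.gz
2. Rename the extracted directories
mv alertmanager-0.20.0-rc.0.linux-amd64 alertmanager
mv prometheus-webhook-dingtalk-0.3.0.linux-amd64 prometheus-webhook-dingtalk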

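2. Start the DingTalk plugin

Create a custom robot in a DingTalk group and copy its webhook URL; guides for this step are easy to find online. Then start the plugin from inside the prometheus-webhook-dingtalk directory:

nohup ./prometheus-webhook-dingtalk --ding.profile="ops_dingding=<your DingTalk webhook URL>" &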
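
The profile name passed to --ding.profile (ops_dingding here) becomes part of the URL that Alertmanager later posts to. As a minimal sketch of that relationship, assuming the plugin listens on port 8060 and uses the /dingtalk/<profile>/send path seen in the Alertmanager configuration in this article, the endpoint can be derived as follows. The helper name dingtalk_send_url is hypothetical and not part of the plugin.

```python
from urllib.parse import quote


def dingtalk_send_url(host: str, profile: str, port: int = 8060) -> str:
    """Build the plugin endpoint Alertmanager should post to for a profile.

    Assumes the /dingtalk/<profile>/send layout used in this article
    (prometheus-webhook-dingtalk 0.3.0) and port 8060.
    """
    # Percent-encode the profile name so it is always a single path segment.
    return f"http://{host}:{port}/dingtalk/{quote(profile, safe='')}/send"


print(dingtalk_send_url("10.10.9.200", "ops_dingding"))
```

If the profile name here and the one in the webhook URL differ, the alert has no matching profile to be delivered through, so it is worth keeping both in one place.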

3. Configure Alertmanager

# 1. Configuration file
vim alertmanager.yml
global:
  resolve_timeout: 5m
route:
  receiver: webhook
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  group_by: [alertname]
  routes:
  - receiver: webhook
    group_wait: 10s
    match:
      team: node
receivers:
- name: webhook
  webhook_configs:
  - url: http://10.10.9.200:8060/dingtalk/ops_dingding/send # DingTalk plugin address; ops_dingding must match the profile name given when starting the plugin
    send_resolved: true
  
# 2. Start Alertmanager
nohup ./alertmanager --config.file=alertmanager.yml &
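
To make the routing above easier to reason about, here is a simplified Python sketch of how the route tree picks settings for an alert. It is a model under stated assumptions, not Alertmanager's implementation: it handles only exact `match` labels, lets a child route inherit anything it does not set from its parent, and ignores `match_re` and `continue`. The function name effective_route is hypothetical.

```python
# Mirrors the route block of alertmanager.yml in this article.
ROUTE = {
    "receiver": "webhook",
    "group_wait": "30s",
    "group_interval": "5m",
    "repeat_interval": "4h",
    "group_by": ["alertname"],
    "routes": [
        {"receiver": "webhook", "group_wait": "10s", "match": {"team": "node"}},
    ],
}


def effective_route(route, labels, inherited=None):
    """Return the settings that apply to an alert with the given labels."""
    # Start from the parent's settings, then overlay this route's own values.
    settings = dict(inherited or {})
    settings.update({k: v for k, v in route.items() if k not in ("routes", "match")})
    # The first child whose match labels all equal the alert's labels wins.
    for child in route.get("routes", []):
        wanted = child.get("match", {})
        if all(labels.get(name) == value for name, value in wanted.items()):
            return effective_route(child, labels, settings)
    return settings


# An alert labelled team=node takes the child route (shorter group_wait).
print(effective_route(ROUTE, {"team": "node"})["group_wait"])
# An alert without a team label falls back to the top-level route.
print(effective_route(ROUTE, {"status": "warning"})["group_wait"])
```

Note that the alerting rule in section 4 sets `status: warning` and no `team` label, so in this model it is handled by the top-level route. Both routes point at the same receiver, so the notification still reaches DingTalk; only the wait time differs.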

4. Configure Prometheus alerting rules

#1. Define the alerting rule
vim rules.yml
groups:
    - name: test-rule
      rules:
      - alert: 主机状态  # alert name, "host status"
        expr: up == 0
        for: 2m
        labels:
          status: warning
        annotations:
          summary: "{{$labels.instance}}:服务器关闭"  # "<instance>: server down"
          description: "{{$labels.instance}}:服务器关闭"  # "<instance>: server down"

#2. Edit the Prometheus configuration so the alerts take effect
vim prometheus.yml
# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets: ["10.10.9.200:9093"] # Alertmanager address
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "rules.yml" # alerting rules file
  # - "second_rules.yml"

#3. Restart Prometheus

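5. Verify that the configuration works

1. Stop the monitored node target
2. The alert message received in DingTalk (主机状态 means "host status", 服务器关闭 means "server down")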
[FIRING:1] 主机状态
Labels

alertname: 主机状态
instance: linux
job: node_export
status: warning
Annotations

description: linux:服务器关闭
summary: linux:服务器关闭
Source: http://test:9090/graph?g0.expr=up+%3D%3D+0&g0.tab=1
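
Stopping a real target is not the only way to exercise the notification path. As a hedged alternative, the sketch below builds a test alert with the same labels as the message above and posts it to Alertmanager's v2 HTTP API (`POST /api/v2/alerts`). The address 10.10.9.200:9093 is the one used in this article; the function names build_test_alert and post_alerts are hypothetical, and the label values are copied from the sample message rather than produced by Prometheus.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_test_alert(instance: str) -> list:
    """Build a one-element alert list shaped like the message shown above."""
    text = f"{instance}:服务器关闭"  # same annotation text as rules.yml
    return [{
        "labels": {
            "alertname": "主机状态",
            "instance": instance,
            "job": "node_export",
            "status": "warning",
        },
        "annotations": {"summary": text, "description": text},
        # RFC 3339 timestamp in UTC marking when the alert started.
        "startsAt": datetime.now(timezone.utc).isoformat(),
    }]


def post_alerts(alertmanager: str, alerts: list) -> int:
    """POST alerts to Alertmanager and return the HTTP status code."""
    request = urllib.request.Request(
        f"http://{alertmanager}/api/v2/alerts",
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

Calling `post_alerts("10.10.9.200:9093", build_test_alert("linux"))` from a host that can reach Alertmanager should, if routing and the plugin are set up as described, produce a DingTalk message after the configured group_wait.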

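Prometheus alert states
· Inactive: the alert condition is not met; nothing is happening.
· Pending: the threshold has been crossed, but the condition has not yet held for the required duration (the `for` field in the rule).
· Firing: the threshold has been crossed and the condition has held for the required duration. The alert is sent to the notification pipeline, processed, and delivered to the receivers. Requiring the condition to hold across several evaluations before alerting cuts down on notification noise.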
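
The three states can be summarised as a small function of how long the rule expression has been true. This is an illustrative sketch rather than Prometheus code: the function name alert_state and its parameters are hypothetical, and times are plain seconds. With the rule from section 4 (`for: 2m`), for_seconds would be 120.

```python
from typing import Optional


def alert_state(active_since: Optional[float], now: float, for_seconds: float) -> str:
    """Classify an alert given when its expression first became true.

    active_since is None while the expression is false.
    """
    if active_since is None:
        return "Inactive"
    # The condition is true but has not yet held for the full `for` duration.
    if now - active_since < for_seconds:
        return "Pending"
    return "Firing"


# A target that went down at t=0, checked against `for: 2m`.
for t in (30, 90, 120, 300):
    print(t, alert_state(0, t, 120))
```

A target that recovers before the two minutes are up goes straight from Pending back to Inactive, and no notification is sent.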

Original article: https://blog.csdn.net/weixin_43999932/article/details/107608046