Prometheus Alerting Rule Configuration

August 11, 2020

DevOps

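What Alertmanager does

Rule files live in the prometheus/rules directory and usually end in .rules or .yaml.

After adding a shuang_test.rules file under prometheus/rules and restarting Prometheus, the new rule shows up on this page: http://192.168.100.254:9090/alerts?search=

Notification settings live in alertmanager/alertmanager.yml. Think of it as the delivery side of alerting: who receives an alert and through which channel.

Prometheus AlertManager的使用（邮件和微信告警）

Image from "AlertManager的使用（邮件和微信告警）"

Prometheus pulls metrics from each target. When a rule's threshold is reached (rules are the .yaml or .rules files under rules/), Prometheus pushes the alert to Alertmanager. Alertmanager renders the alert through its templates and routes it to the configured users and receivers (email, DingTalk, WeCom, and so on), which is how the notification gets delivered.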

https://blog.csdn.net/IT_ZRS/article/details/129932353

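Are Prometheus rules YAML files or .rules files?

Prometheus rules have been written in two formats over time:

YAML format:

This is the format Prometheus 2.0 and later expects. Rules are defined in YAML, usually in a file ending in .yaml or .yml. For example: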

groups:
- name: example
  rules:

  - record: job_failed_duration_seconds_total
    expr: sum(increase(job_failed_duration_seconds_total[5m])) by (job) 

- name: example2
  rules:

  - record: job_success_duration_seconds_count
    expr: sum(increase(job_success_duration_seconds_count[5m])) by (job)

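Legacy Prometheus rule language:

This is the native rule format of Prometheus 1.x, usually saved with a .rules suffix. For example:

job_failed_duration_seconds_total = sum(increase(job_failed_duration_seconds_total[5m])) by (job)

job_success_duration_seconds_count = sum(increase(job_success_duration_seconds_count[5m])) by (job)

Summary:

YAML is the format to use, and one file can define several rule groups.
The legacy format was dropped in Prometheus 2.0, so a 2.x server needs YAML content even in a file named .rules.
The extension itself does not matter: Prometheus loads whatever the rule_files patterns in prometheus.yml match. Common choices are .yaml, .yml, and .rules.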

A working rule file looks like this:

groups:
- name: shuang1
  rules:
  - alert: shuang1
    expr: max(ontology_block_height{job="polaris"})  >= 1
    for: 10s
    labels:
      severity: page
    annotations:
      summary: "shuang-test: Instance {{ $labels.instance }} polaris block height diff too much"
      description: "{{ $labels.instance }} polaris height diff: {{ $value }}"

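For testing, a condition such as up{job="diuwallet"} == 1 is handy: it holds for as long as the target is reachable (up reports whether a target is online).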

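How are Prometheus and node_exporter related? Is node_exporter required?

Prometheus and node_exporter are related but independent components:

Prometheus is a widely used monitoring and alerting tool. It collects time series mainly by scraping (pulling) targets, with the Pushgateway available for jobs that have to push. It provides the PromQL query language and lets you define alerting rules.
node_exporter is one exporter in the Prometheus ecosystem. It exposes hardware and OS metrics of a host, such as CPU, memory, and disk usage.

How the two fit together:

node_exporter provides host metrics, and Prometheus obtains them by scraping node_exporter's metrics endpoint.
node_exporter does not have to be paired with Prometheus. It only exposes an HTTP metrics endpoint, so any monitoring system that can scrape that format can read it.
Prometheus can also scrape many other kinds of targets besides node_exporter.

So there is no hard dependency between them:

Prometheus runs fine without node_exporter.
node_exporter can feed other monitoring systems.
Used together, they form a common and capable monitoring and alerting setup.

In short, whether to run them together depends on what you need: they work well as a pair, and each can also be used alone.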

What does this configuration mean?

groups:
- name: dappnodeheight
  rules:
  - alert: dappnodeheight
    expr: rate(ontology_block_height_mainnet[5m]) <= 0.001
    for: 1m
    labels:
      severity: page
    annotations:
      summary: "dappnode blk height not grow!"
      description: "growth rate  (current value: {{ $value }})"

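This YAML defines a Prometheus alerting rule:

groups: defines one rule group, named dappnodeheight
rules: the group contains a single rule
alert: the alert name is dappnodeheight
expr: the alert expression; the condition holds when the per-second growth rate of ontology_block_height_mainnet over the last 5 minutes is <= 0.001
for: the condition has to hold continuously for 1 minute before the alert fires
labels: attaches severity: page to the alert
annotations: the text carried by the alert
- summary: a short description
- description: a longer description; {{ $value }} inserts the current value of the expression

In short, this rule detects that the ontology mainnet block height has almost stopped growing. If the growth rate over 5 minutes is very small, the network is assumed to have a problem and an alert is sent. The alert text includes the current growth rate.

Prometheus alerting rules allow fairly complex logic: expr and for together detect problems that persist, and annotations customize the alert text.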
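To make the threshold in expr more concrete, here is a minimal Python sketch of what a rate over a 5-minute window boils down to. It is a simplification and not how Prometheus computes it: the real rate() also handles counter resets and extrapolates to the window edges. The sample values and the names simple_rate and is_stalled are made up for illustration.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, value)


def simple_rate(samples: List[Sample]) -> float:
    """Per-second increase between the first and last sample in the window.

    Simplification: no counter-reset handling and no extrapolation,
    both of which the real PromQL rate() performs.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples in the window")
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)


def is_stalled(samples: List[Sample], threshold: float = 0.001) -> bool:
    """Mirror of the rule condition: rate(...[5m]) <= 0.001."""
    return simple_rate(samples) <= threshold


# Hypothetical 5-minute windows of block-height samples, one per minute.
growing = [(0, 1000), (60, 1003), (120, 1007), (180, 1010), (240, 1014), (300, 1018)]
stuck = [(0, 1018), (60, 1018), (120, 1018), (180, 1018), (240, 1018), (300, 1018)]

print(is_stalled(growing))  # False: 18 blocks in 300 s is well above 0.001/s
print(is_stalled(stuck))    # True: the height did not move at all
```

A threshold of 0.001 per second means fewer than 0.3 new blocks in 5 minutes, so in practice the rule fires only when the height has stopped moving.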

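What is the difference between groups and rules in the rule file?

When configuring alerting rules in Prometheus, groups and rules differ as follows:

groups: defines sets of related rules. One file can contain several groups.
rules: the list of individual rules inside a group. Each group can contain several rules.

For example: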

groups:
- name: host_alerts 
  rules:

  - alert: HostOutOfMemory
    expr: node_memory_MemAvailable < 1048576
    for: 5m

  - alert: HostHighLoad 
    expr: node_load5 > 3
    for: 2m
  
- name: db_alerts
  rules:
  
  - alert: HighQPS
    expr: rate(mysql_queries[5m]) > 100
    for: 10m

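This example defines two groups:

host_alerts: two alerting rules on host metrics
db_alerts: one alerting rule on database QPS

groups keeps related rules together so that they are easier to manage; rules holds the actual alert conditions.

So in a Prometheus rule file, groups is about organization and rules is about the alerts themselves. Used together they keep the alerting setup readable and manageable.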

http://43.134.68.68:9090/graph?g0.expr=rate(ontology_block_height_mainnet%5B5m%5D)&g0.tab=1&g0.stacked=0&g0.show_exemplars=0&g0.range_input=1h

rate(ontology_block_height_mainnet[5m])

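Does Prometheus need a restart after adding a rule?

No. When rule files change, Prometheus does not have to be restarted, but it does have to be told to reload its configuration; it does not pick up the change by itself.

There are two ways to trigger a reload at runtime:

Send SIGHUP to the Prometheus process:

kill -HUP <pid of Prometheus process>

On SIGHUP, Prometheus reloads its configuration file and the rule files it references.

Send an HTTP POST to the /-/reload endpoint:

curl -X POST http://localhost:9090/-/reload

This triggers the same reload. The endpoint is only available when Prometheus was started with --web.enable-lifecycle.

The rule files themselves are listed under rule_files in prometheus.yml, not passed as a startup flag:

rule_files:
  - "/path/to/rules/*.yaml"

A new file that matches the pattern is loaded on the next reload.

So modified rule files can be applied without restarting the server, and new alerting rules take effect shortly after a reload.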

http://43.134.68.68:9090/rules

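If a rule does not behave as expected, open http://43.134.68.68:9090/rules, click its expr, and Execute it to see the actual values.

One pitfall: an expr can return one series per machine. Pick min, max, or avg across them before comparing with the threshold.

For example, without min the result looks like this: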
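A small Python sketch of the difference, using made-up instance names and values: a bare comparison keeps every series that passes (one alert per machine), while wrapping the metric in min first leaves at most one series.

```python
from typing import Dict

# Hypothetical instant vector: one sample per instance.
block_height: Dict[str, float] = {
    "node-a:9090": 1500.0,
    "node-b:9090": 1498.0,
    "node-c:9090": 0.0,  # a node that is stuck at height 0
}


def compare_each(vector: Dict[str, float], threshold: float) -> Dict[str, float]:
    """Like `metric >= threshold`: keeps every series that passes."""
    return {inst: v for inst, v in vector.items() if v >= threshold}


def compare_min(vector: Dict[str, float], threshold: float) -> Dict[str, float]:
    """Like `min(metric) >= threshold`: one aggregated series or none."""
    lowest = min(vector.values())
    return {"<aggregated>": lowest} if lowest >= threshold else {}


print(compare_each(block_height, 1))  # two series -> two separate alerts
print(compare_min(block_height, 1))   # {} -> no alert, because one node is at 0
```

Which aggregator is right depends on the question being asked: min answers "is every machine above the threshold", max answers "is at least one machine above it".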

Sending alerts by email

Reference: prometheus使用 (十三) QQ邮件告警

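# Global settings
global:
  resolve_timeout: 5m     # how long after an alert stops being reported before it is declared resolved
  smtp_from: '2665483426@qq.com'          # sender address
  smtp_smarthost: 'smtp.qq.com:465'       # QQ Mail SMTP host and port
  smtp_auth_username: '2665483426@qq.com'   # mailbox address
  smtp_auth_password: 'vxehxwuwnwlreaih'    # mailbox authorization code
  smtp_require_tls: false                   # do not require STARTTLS
  smtp_hello: 'qq.com'
 
# Routing
route:            
  group_by: ['alertname']    # labels used to group alerts;
                             # these are label keys, and alerts with equal values for them land in the same group
  group_wait: 5s             # how long to wait before sending the first notification for a new group, so more alerts can join the batch
  group_interval: 5s         # how long to wait before notifying about new alerts in a group that was already notified
  repeat_interval: 5m        # how long to wait before repeating a notification that was already sent
  receiver: 'email'           # receiver of the root route
 
# Receivers
receivers:                         # who receives alerts (the targets that routes point to)
- name: 'email'                    # the name referenced by receiver in route
  email_configs:                   # deliver by email
  - to: '2665483426@qq.com'        # destination address
    send_resolved: true            # also send a mail when the alert resolves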

Settings for NetEase (163) Mail:

global:
  resolve_timeout: 5m
  smtp_from: '68985@163.com'            # sender address, your choice
  smtp_smarthost: 'smtp.163.com:465'
  smtp_auth_username: '68985@163.com'  # your own mailbox
  smtp_auth_password: 'MSABYJVCZKKGRPIE'   # the authorization code obtained above, not the mailbox login password
  smtp_require_tls: false
  smtp_hello: '163.com'

Sending alerts to IM tools

Sending to Feishu/Lark

https://www.yoyoask.com/?p=4514

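Judging from the subsections under 使用Receiver接收告警信息, email, Slack, and WeCom are all fairly easy to integrate directly.

Integrating DingTalk, Feishu, and similar tools takes extra work.

https://open.larksuite.com/open-apis/bot/v2/hook/bb2943b9-59f2-4063-8b47-414f158b33c8

Create a new group chat and click to add a bot.

To send Prometheus Alertmanager alerts to Lark (Feishu), you can use Alertmanager's webhook configuration. Alertmanager can send notifications to external systems, including delivery to a custom target through a webhook.

The general steps for configuring Alertmanager to send alerts to Lark are:

Make sure you have created a bot in Lark and obtained its Webhook URL.
Edit the Alertmanager configuration file (usually alertmanager.yml).
Add a new receiver under receivers that defines the delivery to Lark. For example: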

receivers:
  - name: 'lark-webhook'
    webhook_configs:
      - url: 'https://api.lark.com/your-webhook-url'

Replace url with the actual Webhook URL of your Lark bot.

In the route section, route the alerts you want to send to Lark to the receiver you just defined. For example:

route:
  receiver: 'lark-webhook'
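  # add further routing rules here...

Save and close the configuration file.
Restart the Alertmanager service so that the configuration takes effect.

After these steps, Alertmanager sends matching alerts to the Webhook URL of the Lark bot.

These are only general steps; the exact configuration depends on your needs and environment, so consult the Alertmanager and Lark documentation for further adjustment.

In practice the steps above do not work as written. Alertmanager supports Slack and WeCom natively, but DingTalk and Feishu/Lark need custom code that converts the Alertmanager payload into the format Lark expects.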

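Sending to Slack

slack: (see prometheus报警—通过slack发送报警内容)

For example, send a request directly to https://open.larksuite.com/open-apis/bot/v2/hook/bb2943b9-59f2-4063-8b47-414f158b33c8

The project most used in the community is PrometheusAlert. This article sets up Feishu alerting with it: 通过alertmanager和PrometheusAlert实现飞书监控告警. This one is similar: prometheus配置邮件及飞书告警

Some people also write their own converter from the Alertmanager payload to the Feishu request body: prometheus实战之五：飞书通知告警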
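The core of such a converter is small. The sketch below assumes two things that should be checked against the respective documentation: that Alertmanager's webhook body carries an alerts list whose items have status, labels, and annotations, and that a Lark custom bot accepts a text message of the form {"msg_type": "text", "content": {"text": ...}}. The function name alertmanager_to_lark is made up. Receiving the POST from Alertmanager and forwarding the result to the bot URL is left out.

```python
from typing import Any, Dict


def alertmanager_to_lark(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Turn an Alertmanager webhook body into a Lark text-message body.

    One line per alert: "[STATUS] alertname: summary".
    """
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        lines.append(
            "[{status}] {name}: {summary}".format(
                status=alert.get("status", "unknown").upper(),
                name=labels.get("alertname", "<no alertname>"),
                summary=annotations.get("summary", annotations.get("description", "")),
            )
        )
    return {"msg_type": "text", "content": {"text": "\n".join(lines)}}


# Trimmed, hypothetical example of what Alertmanager would POST.
example = {
    "status": "firing",
    "alerts": [
        {
            "status": "firing",
            "labels": {"alertname": "shuang1", "severity": "page"},
            "annotations": {"summary": "polaris block height diff too much"},
        }
    ],
}

print(alertmanager_to_lark(example))
```

A real bridge would wrap this in a small HTTP service that Alertmanager's webhook_configs url points at; projects such as PrometheusAlert do this and more.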

Sending to WeCom

Integrating WeCom

https://leehao.me/prometheus-%E5%8F%91%E9%80%81%E4%BC%81%E4%B8%9A%E5%BE%AE%E4%BF%A1/

global:
  resolve_timeout: 10m
  wechat_api_url: 'https://qyapi.weixin.qq.com/cgi-bin/'
  wechat_api_secret: 'the app secret, shown on the app settings page'
  wechat_api_corp_id: 'the corp ID, shown on the company settings page'
templates:
- '/etc/alertmanager/config/*.tmpl'
route:
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 12h
  routes:
  - receiver: 'wechat'
    continue: true
inhibit_rules:
- source_match:
receivers:
- name: 'wechat'
  wechat_configs:
  - send_resolved: false
    corp_id: 'the corp ID, shown on the company settings page'
    to_user: '@all'
    to_party: ' PartyID1 | PartyID2 '
    message: '{{ template "wechat.default.message" . }}'
    agent_id: 'the app AgentId, shown on the app settings page'
    api_secret: 'the app secret, shown on the app settings page'

https://work.weixin.qq.com/wework_admin/frame#apps/createApiApp

The corp ID can be found here: https://work.weixin.qq.com/wework_admin/frame#profile/enterprise

agent_id: the app's AgentId, shown on the app settings page

https://www.cnblogs.com/miaocbin/p/13706164.html

Start Alertmanager directly and check the log:

{"caller":"wechat.go:178","incident":"{}/{}:{alertname=\"shuang1\"}","integration":"wechat","level":"debug","response":"{\"errcode\":60020,\"errmsg\":\"not allow to access from your ip, hint: [1692253047612133639172792], from ip: 61.169.64.116, more info at https://open.work.weixin.qq.com/devtool/query?e=60020\"}","ts":"2023-08-17T06:17:27.510Z"}
{"attempts":1,"caller":"notify.go:743","component":"dispatcher","integration":"wechat[0]","level":"debug","msg":"Notify success","receiver":"wechat","ts":"2023-08-17T06:17:27.510Z"}

https://developer.work.weixin.qq.com/devtool/query?e=60020

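The errcode 60020 in the log ("not allow to access from your ip") means WeCom rejected the request because of the caller's IP address. In the end this configuration worked: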

global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.163.com:465'
  smtp_from: 'xxxxxx@163.com' # mailbox that sends the alert emails
  smtp_auth_username: 'xxxxx@163.com'
  smtp_auth_password: 'xxxxxx'    # the mailbox authorization code, not the login password
  smtp_require_tls: false
  smtp_hello: '163.com'
  # WeCom settings
  wechat_api_url: 'https://qyapi.weixin.qq.com/cgi-bin/'
  wechat_api_secret: 'xzxzxa-xzxzx'  # the app secret, shown on the app settings page
  wechat_api_corp_id: 'asasaasasasas'   # the corp ID, shown on the company settings page


route:
  group_by: ['alertname']
  group_wait: 5s #30s
  group_interval: 5m #5m
  repeat_interval: 30m #1h
  receiver: 'default-receiver'
  routes: # child routes
   - receiver: 'k8smail'
     continue: true
   - receiver: 'ont-slack'
     continue: true
   - receiver: 'wechat'
     continue: true

receivers:
- name: 'default-receiver'
  email_configs:
  - to: 'xxxx@126.com' # mailbox that receives alert emails
    send_resolved: true
- name: 'k8smail'
  email_configs:
  - to: 'xxxxx@163.com' # mailbox that receives alert emails
    send_resolved: true
  - to: 'xxxxx@gmail.com' # mailbox that receives alert emails (cuishuang)
    send_resolved: true
- name: 'ont-slack'
  slack_configs:
   - send_resolved: true
     api_url: https://hooks.slack.com/services/xxx/xxx/xxxx   # the URL you were given when installing the webhook app
     channel: '#monitoring'   # channel to post to
     text: "{{ range .Alerts }} {{ .Annotations.description}}\n {{end}} @{{ .CommonAnnotations.username}} <{{.CommonAnnotations.link}}| click here>"
     title: "{{.CommonAnnotations.summary}}"
     title_link: "{{.CommonAnnotations.link}}"
- name: 'wechat'
  wechat_configs:
  - send_resolved: false
    corp_id: 'xxxx'
    to_user: '@all'
   # to_party: ' PartyID1 | PartyID2 '
    message: '{{ template "wechat.default.message" . }}'
  #  message: 'Alert triggered, please check'   
    agent_id: '1000002'  # the app AgentId, shown on the app settings page
    api_secret: 'YpZ-OgtocBlUGnTB89oGUrSwHMeyT29Hf0lGxYGxqGc'  # the app secret, shown on the app settings page

inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']

How do I write the route section of alertmanager.yml when there are several receivers?

In alertmanager.yml you can define several receivers and use the route section to decide which alerts go to which receiver. Here is a simple example.

route:
  receiver: 'default-receiver'
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h 
  routes:
  - match:
      severity: critical
    receiver: 'critical-receiver'
  - match_re:
      service: ^(foo1|foo2|baz)$
    receiver: 'service-foo-receiver'
  - match:
      service: bar
    receiver: 'service-bar-receiver'

receivers:
- name: 'default-receiver'
  email_configs:
  - to: 'default-email@example.com'

- name: 'critical-receiver'
  email_configs:
  - to: 'critical-email@example.com'

- name: 'service-foo-receiver'
  email_configs:
  - to: 'service-foo-email@example.com'

- name: 'service-bar-receiver'
  email_configs:
  - to: 'service-bar-email@example.com'

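In this configuration, alerts are routed to different receivers according to their labels and the matching rules. Alerts with severity "critical" go to 'critical-receiver', alerts whose service is 'foo1', 'foo2', or 'baz' go to 'service-foo-receiver', and alerts whose service is 'bar' go to 'service-bar-receiver'. Everything that matches none of these goes to 'default-receiver'.

This is only a basic example. A real configuration can have more complex routing rules and receivers; for instance, one receiver can combine several kinds of integration (email, webhook, PagerDuty, and so on).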

Checking whether an alert has triggered

http://192.168.100.254:9090/alerts?search=

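Inactive: the rule's condition is not met.
Pending: the threshold has been reached (think of it as charging up), but the alert only becomes Firing after the condition has held for the configured duration.
Firing: the alert is active and has been sent.

Another description of the same states:

https://blog.csdn.net/IT_ZRS/article/details/129932353

inactive: nothing abnormal.
pending: the threshold has been crossed, but not yet for the duration set in the rule's for field.
firing: the threshold has been crossed for the full duration. The alert is sent to Alertmanager, which delivers it through its templates to email, DingTalk, and so on.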
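The three states can be modelled with a few lines of Python. This is only a sketch of the behaviour described above, not Prometheus's own code; the class name AlertState and the timeline are made up. Each call to evaluate stands for one rule evaluation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertState:
    for_seconds: float                     # the rule's `for` duration
    active_since: Optional[float] = None   # when the condition first became true

    def evaluate(self, condition_true: bool, now: float) -> str:
        if not condition_true:
            # Condition no longer holds: the pending timer is reset.
            self.active_since = None
            return "inactive"
        if self.active_since is None:
            self.active_since = now
        if now - self.active_since >= self.for_seconds:
            return "firing"
        return "pending"


rule = AlertState(for_seconds=60)
# (time in seconds, did expr return a result at that evaluation?)
timeline = [(0, True), (30, True), (60, True), (90, False), (120, True)]
states = [rule.evaluate(cond, t) for t, cond in timeline]
print(states)  # ['pending', 'pending', 'firing', 'inactive', 'pending']
```

The last two entries show why a flapping condition may never notify anyone: every time the condition drops out, the for timer starts again from zero.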

Why does only email, or only Slack, receive the alert? How can both receive it?

Most likely the route section, that is the forwarding rules, is wrong.

route:
  group_by: ['alertname']
  group_wait: 5s #30s
  group_interval: 5s #5m
  repeat_interval: 1m #1h
  receiver: 'k8smail'
  routes: # child routes
   - receiver: 'my-slack'

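With the configuration above, why does k8smail receive nothing while my-slack does?

This question is about Alertmanager's routing configuration: why every alert goes to the 'my-slack' receiver and none to 'k8smail'.

In this configuration the root route names 'k8smail' as its receiver, and there is one child route whose receiver is 'my-slack'. Alertmanager's rule is that an alert which matches a child route is sent to that child route's receiver, not to the root route's receiver.

The child route has no matching condition (match or match_re), so every alert matches it. That is why all alerts go to 'my-slack' and none to 'k8smail'.

If only certain alerts should go to 'my-slack', give the child route a matching condition. For example: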

route:
  group_by: ['alertname']
  group_wait: 5s
  group_interval: 5s
  repeat_interval: 1m
  receiver: 'k8smail'
  routes: # child routes
   - match:
       severity: critical
     receiver: 'my-slack'

In this example only alerts with severity 'critical' go to 'my-slack'; all others go to 'k8smail'. Without a matching condition on the child route, every alert matches it, which is why everything ended up in 'my-slack'.

Configuring several receivers under routes in Alertmanager

The routes section of Alertmanager can hold several child routes, each with its own receiver, so alerts can be delivered to different receivers.

For example:

routes:
  - receiver: 'email-receiver'
    # other matching rules for this route

  - receiver: 'slack-receiver'

  - receiver: 'pagerduty-receiver'

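This defines three child routes. By default an alert goes only to the first child route it matches. None of these routes has a matcher, so every alert matches email-receiver and the other two receive nothing.

The continue field changes this. With continue: true on a route, matching carries on to the next sibling route after a match: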

routes:
  - receiver: 'email-receiver'
    # other matching rules for this route
    continue: true

  - receiver: 'slack-receiver'

  - receiver: 'pagerduty-receiver'

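Now an alert is sent to email-receiver and then also to slack-receiver. slack-receiver does not set continue, so matching stops there and pagerduty-receiver still receives nothing.

match and match_re can additionally be used to match on alert labels, so that different alerts go to different receivers.

In short, routes gives fine-grained control over where alerts are delivered.

My first attempt was to list both receivers as child routes: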

route:
  group_by: ['alertname']
  group_wait: 5s #30s
  group_interval: 5s #5m
  repeat_interval: 1m #1h
  receiver: 'k8smail'
  routes: # child routes
   - receiver: 'my-slack'
   - receiver: 'k8smail'

Full configuration:

global:
  resolve_timeout: 5m
  smtp_from: '68985@163.com'            # sender address, your choice
  smtp_smarthost: 'smtp.163.com:465'
  smtp_auth_username: '68985@163.com'  # your own mailbox
  smtp_auth_password: 'xxxxxxx'   # the authorization code obtained above, not the mailbox login password
  smtp_require_tls: false
  smtp_hello: '163.com'

route:
  group_by: ['alertname']
  group_wait: 5s #30s
  group_interval: 5s #5m
  repeat_interval: 1m #1h
  receiver: 'k8smail'
  routes: # child routes
   - receiver: 'my-slack'
   - receiver: 'k8smail'


receivers:
- name: 'k8smail'
  email_configs:
  - to: 'xxxxx@gmail.com' # mailbox that receives alert emails
    send_resolved: true
  - to: 'yyyyyy@gmail.com' # mailbox that receives alert emails
    send_resolved: true
- name: 'my-slack'
  slack_configs:
   - send_resolved: true
     api_url: https://hooks.slack.com/services/T04ASAT9J90/aaaaa/xxxxxx   # the URL you were given when installing the webhook app
     channel: '#monitoring'   # channel to post to
     text: "{{ range .Alerts }} {{ .Annotations.description}}\n {{end}} @{{ .CommonAnnotations.username}} <{{.CommonAnnotations.link}}| click here>"
     title: "{{.CommonAnnotations.summary}}"
     title_link: "{{.CommonAnnotations.link}}"


inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']

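How Alertmanager picks the receiver for an alert:

The root route must name a receiver; there is no global default_receiver setting and no fallback to the first receiver in the list.
Child routes are tried in order, and an alert goes to the first child route it matches.
The root route's receiver is used only when no child route matches.

routes: # child routes
 - receiver: 'my-slack'
 - receiver: 'k8smail'

Here every alert matches the first child route, and by default matching stops there, so k8smail never receives anything.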

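Making Alertmanager routes keep matching:

Alertmanager's routes support a continue field that controls whether an alert keeps being matched against later routes.

If continue is set to true, the alert is also matched against the routes that follow: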

routes:
- match:
    severity: critical 
  receiver: team-critical
  continue: true

- match:
    severity: warning
  receiver: team-warning

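The first route matches alerts whose severity is critical and sends them to team-critical. Because continue is true, such an alert is then also checked against the routes that follow.

With these two matchers the outcome does not actually change: a critical alert does not match the second route, and a warning alert matches only the second one. continue produces a second notification only when a later route matches the same alert, for example a route without any match block.

With continue left at false (the default):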

routes:
- match:
    severity: critical
  receiver: team-critical 

- match:
    severity: warning
  receiver: team-warning

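an alert is sent only to the first route it matches, and matching stops there.

A critical alert goes to team-critical and is never checked against team-warning.

So continue controls whether an alert keeps being matched against later routes. It defaults to false, which means only the first matching route's receiver is notified; setting continue: true on a route is how one alert reaches several receivers.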
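The matching order is easier to see in code. The Python sketch below models the routing tree under simplifying assumptions: only exact match conditions (no match_re), and a Route class and resolve function that are made up for this illustration rather than taken from Alertmanager.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Route:
    receiver: str
    match: Dict[str, str] = field(default_factory=dict)  # empty = matches everything
    continue_: bool = False
    routes: List["Route"] = field(default_factory=list)

    def matches(self, labels: Dict[str, str]) -> bool:
        return all(labels.get(k) == v for k, v in self.match.items())


def resolve(route: Route, labels: Dict[str, str]) -> List[str]:
    """Receivers an alert is delivered to, walking child routes in order."""
    if not route.matches(labels):
        return []
    found: List[str] = []
    for child in route.routes:
        hit = resolve(child, labels)
        found.extend(hit)
        if hit and not child.continue_:
            break  # first match wins unless continue is set
    # A route's own receiver is used only if no child route matched.
    return found or [route.receiver]


alert = {"alertname": "shuang1", "severity": "page"}

first_try = Route("k8smail", routes=[Route("my-slack"), Route("k8smail")])
print(resolve(first_try, alert))  # ['my-slack']

with_continue = Route(
    "default-receiver",
    routes=[Route("k8smail", continue_=True), Route("ont-slack", continue_=True)],
)
print(resolve(with_continue, alert))  # ['k8smail', 'ont-slack']
```

The second result also shows that default-receiver gets nothing once a child route without matchers exists: the root receiver is only a fallback.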

https://www.cnblogs.com/binliubiao/p/13465140.html

https://www.coderdocument.com/docs/prometheus/v2.14/alerting/configuration.html

https://www.bookstack.cn/read/prometheus-manual/alerting-configuration.md

The configuration that Alertmanager has loaded can be viewed at http://192.168.100.254:9093/#/status

So the final configuration that delivers to both Slack and email is:

global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.163.com:465'
  smtp_from: 'xxxx@163.com' # mailbox that sends the alert emails
  smtp_auth_username: 'xxxx@163.com'
  smtp_auth_password: 'xxxxxxxx'    # the mailbox authorization code, not the login password
  smtp_require_tls: false
  smtp_hello: '163.com'


route:
  group_by: ['alertname']
  group_wait: 5s #30s
  group_interval: 5s #5m
  repeat_interval: 10s #1h
  receiver: 'default-receiver'
  routes: # child routes
   - receiver: 'k8smail'
     continue: true
   - receiver: 'ont-slack'
     continue: true

receivers:
- name: 'default-receiver'
  email_configs:
  - to: 'xxxx@126.com' # mailbox that receives alert emails
    send_resolved: true
- name: 'k8smail'
  email_configs:
  - to: 'xxxxx@yeah.net' # mailbox that receives alert emails
    send_resolved: true
  - to: 'xxxxx@gmail.com' # mailbox that receives alert emails (cuishuang)
    send_resolved: true
- name: 'ont-slack'   # must match the receiver name used under routes
  slack_configs:
   - send_resolved: true
     api_url: https://hooks.slack.com/services/xxxxxxxx/xxxxxxx   # the URL you were given when installing the webhook app
     channel: '#monitoring'   # channel to post to
     text: "{{ range .Alerts }} {{ .Annotations.description}}\n {{end}} @{{ .CommonAnnotations.username}} <{{.CommonAnnotations.link}}| click here>"
     title: "{{.CommonAnnotations.summary}}"
     title_link: "{{.CommonAnnotations.link}}"


inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']

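These pages are very handy:

http://192.168.100.254:9090/alerts?search= shows when an alert triggers and whether it is pending or firing

http://43.134.68.68:9090/rules lists the rules; click an expr to query its current value

http://192.168.100.254:9093/#/status shows the loaded Alertmanager configuration; less useful in practice

The log-level setting described here did not work well for me: https://www.cnblogs.com/winstom/p/11940570.html

I still could not tell where the log was being written.

What works is to skip supervisord, start the binary directly, and read the log output on the command line:

./alertmanager --log.level=debug --log.format=json

To check whether one of yesterday's alerts was actually sent, and if not why (for example it stayed pending and never reached firing), query the ALERTS series: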

http://43.134.68.68:9090/graph?g0.expr=ALERTS&g0.tab=0&g0.stacked=0&g0.show_exemplars=0&g0.range_input=1d

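Prometheus Blackbox Exporter is a tool for actively probing and monitoring network services. It is a component of the Prometheus ecosystem for monitoring the availability and performance of networked applications.

Blackbox Exporter ships a set of built-in probers for tasks such as HTTP, TCP, ICMP, and DNS checks. It sends a request to the target service each time it is scraped and collects metrics about the result, from which you can read response time, availability, and related network performance figures.

Its configuration is flexible: you can define several probe modules, and Prometheus is pointed at them through its normal scrape configuration. Probe results are exposed as metrics to the Prometheus server, where they can be queried and analysed with PromQL.

With Blackbox Exporter you can monitor all kinds of network services, such as web servers, databases, and message queues, and be notified when something goes wrong. It is well suited to watching the availability and performance of services from the outside.

Note that Blackbox Exporter is for active probing. An application that already integrates a Prometheus client library can expose its own metrics directly.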

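Blackbox Exporter's own configuration has no expr field. expr belongs to Prometheus rule files, where it holds the PromQL expression of an alerting or recording rule. What a probe does is defined by a module in the Blackbox Exporter configuration, and what it probes is the target that Prometheus passes in at scrape time.

A module names a prober (http, tcp, icmp, dns, and so on) together with its options. Prometheus then scrapes the exporter's /probe endpoint with a module and a target parameter. What the target looks like depends on the prober:

For an HTTP probe, the target is a URL such as http://example.com. Blackbox Exporter sends an HTTP request to it and records metrics such as response time and status code.
For a TCP probe, the target is a host and port such as example.com:80. Blackbox Exporter tries to open a TCP connection to it and records whether and how quickly that succeeded.
For a DNS probe, the target is the DNS server to query, and the name to look up is set in the module. Blackbox Exporter runs the query and records its duration and outcome.

expr comes in afterwards, on the Prometheus side: an alerting rule with an expression such as probe_success == 0 fires when a probe fails.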

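In Blackbox Exporter, a probe is one execution of a network check. Probing is the exporter's core function.

A probe is an active test of whether a target service is available and how it performs. In Blackbox Exporter it is carried out by one of the built-in probers, each of which handles a different kind of check.

Each prober targets a specific protocol or type of service. Commonly used ones are:

HTTP prober: sends an HTTP request to the target URL and records response time, status code, content matches, and similar metrics.
TCP prober: opens a TCP connection to the target host and port and records connection time and success.
ICMP prober: sends an ICMP echo request (ping) to test whether the target host is reachable and records the round-trip time.
DNS prober: runs a DNS query and records its duration and outcome.

The prober field of a module in the Blackbox Exporter configuration selects the prober, and the rest of the module defines the details of the check. The prober runs with those parameters and produces metrics, which Prometheus scrapes and makes available for queries, alerting, and analysis.

In short, a probe is how Blackbox Exporter performs a network check: different probers cover different services and protocols and deliver the resulting metrics for monitoring and analysis.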

ALERTS

Network probing: Blackbox Exporter

https://yunlzheng.gitbook.io/prometheus-book/part-ii-prometheus-jin-jie/exporter/commonly-eporter-usage/install_blackbox_exporter

https://cloud.tencent.com/developer/article/2183169?cps_key=1d358d18a7a17b4a6df8d67a62fd3d3d

https://cloud.tencent.com/developer/news/620963?cps_key=1d358d18a7a17b4a6df8d67a62fd3d3d

https://segmentfault.com/a/1190000039680793

https://www.voidking.com/dev-prometheus-blackbox-exporter/

The following one is deployed in k8s with the operator:

https://alert.tooltaskon.xyz/#/alerts

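Alert severity levels

https://yunlzheng.gitbook.io/prometheus-book/parti-prometheus-ji-chu/alert/alert-manager-route

Label-based alert distribution

In a rules file, each alerting rule can carry a severity label that marks how serious the alert is.

A common set of severity levels is the following (any other label name works just as well):

critical: a serious failure that needs immediate action.
error: an error that may affect normal service.
warning: a warning that should be reviewed regularly.
info: informational; the service is not affected, but it is worth knowing about.

The polaris block-height-difference rule you defined uses severity: page; it could be classed as warning.

The rule file could then look like this: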

groups:
- name: critical
  rules:
  - alert: cpu_usage_too_high
    expr: 100 - (avg by(instance) (rate(node_cpu{mode="idle"}[5m])) * 100) > 90
    for: 10m
    labels:
      severity: critical
- name: error
  rules:
  - alert: …
    labels:
      severity: error
- name: warning
  rules:
  - alert: polaris_block_height_diff
    expr: …
    labels:
      severity: warning
- name: info
  rules:
  - alert: …
    labels:
      severity: info

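This sorts rules of different levels into separate rule groups by severity, so that notification and handling policies can differ per level.

Besides severity, rule groups can be split further by attributes such as how often an alert triggers, so that similar rules are managed together.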

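In real deployments alert handling is rarely this simple. Different severity levels can call for completely different handling, so the root route can hold further child routes, each of which uses label matching to decide how an alert is handled.

There are two ways to match an alert. The first is exact string comparison: a match rule checks that the alert has a label labelname whose value equals labelvalue. The second is regular expressions: a match_re rule checks that the value of the alert's label satisfies the given regular expression.

group_wait:

Alertmanager can group notifications, merging several alerts into one. group_by defines the grouping: alerts that have the same values for the label names listed in group_by are merged into a single notification to the receiver.

To collect more related alerts into one notification, group_wait sets a waiting period: alerts that arrive for the same group within that period are merged into the one notification sent to the receiver.

group_interval:

group_interval sets the interval between successive notifications for the same group.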
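As a rough illustration of group_by, the Python sketch below buckets alerts by the values of the listed labels. It covers only the grouping key; the timing behaviour of group_wait and group_interval is not modelled, and the alert data and the function name group_alerts are made up.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Labels = Dict[str, str]


def group_alerts(alerts: List[Labels], group_by: List[str]) -> Dict[Tuple[str, ...], List[Labels]]:
    """Bucket alerts by the values of the group_by labels (missing label -> "")."""
    groups: Dict[Tuple[str, ...], List[Labels]] = defaultdict(list)
    for labels in alerts:
        key = tuple(labels.get(name, "") for name in group_by)
        groups[key].append(labels)
    return dict(groups)


alerts = [
    {"alertname": "hostCpuUsageAlert", "instance": "a:9100"},
    {"alertname": "hostCpuUsageAlert", "instance": "b:9100"},
    {"alertname": "hostMemUsageAlert", "instance": "a:9100"},
]

for key, members in group_alerts(alerts, ["alertname"]).items():
    print(key, len(members))
# ('hostCpuUsageAlert',) 2
# ('hostMemUsageAlert',) 1
```

With group_by: ['alertname'] the two CPU alerts arrive in one notification; grouping by ['alertname', 'instance'] instead would produce three separate notifications.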

It is actually very simple. In the Prometheus rule, add one more line under labels:, for example severity_level: critical

Then, in Alertmanager's alertmanager.yml, add a child route with a new receiver and a match on that label, for example:

route:
  group_by: ['alertname']
  group_wait: 0s
  group_interval: 1m
  repeat_interval: 1m
  receiver: 'default-receiver'
  routes: # child routes
   - receiver: 'k8smail'
     continue: true
   - receiver: 'ont-slack'
     continue: true
   - receiver: 'wechat'
     continue: true
   - receiver: 'critical-notifications' # this line and the ones below are new
     continue: true
     match:
       severity_level: critical

Then add the mailboxes (or other targets) to notify under receivers:, for example

receivers:
- name: 'default-receiver'
  email_configs:
  - to: 'cuishuang@ont.io' # mailbox that receives alert emails (in practice it never gets any)
    send_resolved: true

- name: 'critical-notifications' # this line and the ones below are new
  email_configs:
  - to: 'xxxx@qq.com' # mailbox that receives alert emails (Zhang San)
    send_resolved: true
  - to: 'xxxx@163.com' # mailbox that receives alert emails (Li Si)
    send_resolved: true

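Silences and inhibition

These are two different things.

Inhibition

Inhibition suppresses notifications for an alert while certain other alerts are firing, which helps prevent alert storms. It is configured in alertmanager.yaml, in the dedicated inhibit_rules section.

For example, when the network in a data center fails, every service becomes unreachable and produces a flood of service-down alerts. Those alerts do not point at the real problem; the one that matters is the network-failure alert. While it is firing, notifications for the service-down alerts should be inhibited.

Another example: a host in the cluster crashes and triggers the NodeDown alert, whose rule sets severity=critical. Because the host is down, every service and middleware on it becomes unavailable and raises its own alert. With the inhibit rule below, any other alert with severity=critical whose node label has the same value as the NodeDown alert is taken to be caused by NodeDown, and its notifications to receivers are inhibited.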

- source_match:
    alertname: NodeDown
    severity: critical
  target_match:
    severity: critical
  equal:
    - node

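From "Prometheus 之 Alertmanager告警抑制与静默"

In other words, once an alert has fired, inhibition stops notifications for the other alerts that it causes.

Say one server goes down and many middleware services on it each have their own alert: without inhibition I would receive a pile of useless notifications.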
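The NodeDown rule can be traced through with a short Python sketch. It is a simplified model for illustration, with made-up alerts and function names: exact-match conditions only, and an alert is never treated as inhibiting itself.

```python
from typing import Dict, List

Labels = Dict[str, str]

# Same content as the inhibit rule above.
RULE = {
    "source_match": {"alertname": "NodeDown", "severity": "critical"},
    "target_match": {"severity": "critical"},
    "equal": ["node"],
}


def matches(labels: Labels, wanted: Labels) -> bool:
    return all(labels.get(k) == v for k, v in wanted.items())


def is_inhibited(target: Labels, firing: List[Labels], rule=RULE) -> bool:
    """True if some other firing alert matches the source side and agrees on `equal`."""
    if not matches(target, rule["target_match"]):
        return False
    for source in firing:
        if source is target or not matches(source, rule["source_match"]):
            continue
        if all(source.get(l) == target.get(l) for l in rule["equal"]):
            return True
    return False


node_down = {"alertname": "NodeDown", "severity": "critical", "node": "n1"}
mysql_down = {"alertname": "MysqlDown", "severity": "critical", "node": "n1"}
redis_down = {"alertname": "RedisDown", "severity": "critical", "node": "n2"}
firing = [node_down, mysql_down, redis_down]

for alert in firing:
    print(alert["alertname"], "inhibited" if is_inhibited(alert, firing) else "notified")
# NodeDown notified
# MysqlDown inhibited
# RedisDown notified
```

RedisDown still notifies because it runs on a different node; the equal list is what ties a muted alert to the host that actually went down.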

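This article is thorough and good: https://www.cnblogs.com/gered/p/13496950.html#_label1_1

Working out by hand which route matches what is a real test of one's logic and not something people should have to do, so Prometheus provides a routing tree editor: https://prometheus.io/webtools/alerting/routing-tree-editor/

Silences

A silence simply mutes notifications, for example when you do not want to receive a particular alert for 2h. It is set in the Alertmanager web UI (xxxx:9093).

amtool can also create a silence:

amtool silence add --comment "hello world" --duration "2h" alertname="node-up" instance="192.168.1.20:9100" job="node" severity="1" team="node"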

https://blog.csdn.net/qq_42883074/article/details/115554668

List existing silences:

amtool silence query

For more options, see amtool silence --help

The web UI is more intuitive, though.

https://blog.csdn.net/IT_ZRS/article/details/129932353

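What does a silence mean in Alertmanager?

In monitoring and alerting systems, a silence temporarily stops the notifications that an alert would otherwise produce. Normally a triggered alert leads to a notification such as an email, an SMS, or a message to a central alert management system. Sometimes it is useful to mute those notifications for a while, to avoid needless noise or while a known problem is being worked on.

Alertmanager provides silences for this. A silence is defined by matchers on alert labels, and only alerts that match them are muted.

Typical situations for a silence:

Maintenance windows: during maintenance or an upgrade, mute the related alerts to avoid false alarms and interruptions.
Known issues: for a known problem or failure, stop the repeated notifications until it is fixed.
Debugging and testing: mute certain alerts while debugging and verifying system behaviour.

Silences make it possible to control notifications flexibly when the situation calls for it, which makes operations and troubleshooting calmer.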

Is hostCpuUsageAlert a default Prometheus alerting rule? How do I stop or silence it?

hostCpuUsageAlert is not a default Prometheus rule. Prometheus ships without any built-in alerting rules; every rule comes from the rule files that the server has been configured to load.

If a rule named hostCpuUsageAlert exists in your setup and you want to stop its notifications, there are three options:

Remove or edit the rule: find hostCpuUsageAlert in the rule files, delete it or change its expr, and reload Prometheus.

Silence it: create a silence in the Alertmanager UI, or with amtool, using the matcher alertname="hostCpuUsageAlert" and a duration such as 1h. A silence needs no change to alertmanager.yml and no restart.

Route it to a receiver that sends nothing: add a receiver without any notification configs to alertmanager.yml, plus a child route that matches the alert.

receivers:
  - name: 'silence'
route:
  receiver: 'default-receiver'
  routes:
    - match:
        alertname: 'hostCpuUsageAlert'
      receiver: 'silence'

Here silence is a receiver without email_configs or any other integration, so alerts routed to it are dropped. default-receiver stands for whatever receiver the root route already uses.

inhibit_rules is not the tool for this job. An inhibit rule has only source and target matchers (source_match, target_match) and an equal list; it has no duration or apply_to field, and it mutes a target alert only while a matching source alert is firing.

After editing alertmanager.yml, restart or reload Alertmanager so that the change takes effect:

systemctl restart alertmanager

Keep a backup of the configuration before changing it, and check the Alertmanager documentation for the version you run.

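How do I make the suppression last forever?

An inhibit rule has no duration field, and "forever" is not a valid value anywhere in it. The rule stays in force for as long as it is in alertmanager.yml, and it mutes the target alerts whenever a matching source alert is firing.

A silence, by contrast, always has an end time. For a long-lasting one, pick a long duration, for example one year:

amtool silence add --comment "expected high CPU on machine-1" --duration "8760h" alertname="high-cpu-usage" instance="machine-1"

The matchers and the comment here are examples only.

Muting an alert for a long time deserves care, because nobody will be notified while the silence is active. Before doing so, make sure that no important alert ends up being ignored for a long period.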

Silencing one alert on one machine in Alertmanager

To silence a specific alert on a specific machine:

Find the labels: a silence selects alerts by their labels, so first check which labels identify the machine and the alert.

For example, the machine might be identified by instance="machine-1" and the alert by alertname="high-cpu-usage".

Create the silence: use the Silences page of the Alertmanager UI or amtool, with one matcher per label and a duration.

amtool silence add --comment "known high CPU on machine-1" --duration "1h" alertname="high-cpu-usage" instance="machine-1"

The silence applies only to alerts that match every matcher, so the same alert on other machines still notifies.

No change to alertmanager.yml and no restart is needed. The silence takes effect immediately and expires on its own after the duration (1 hour here).

inhibit_rules in alertmanager.yml cannot express this: an inhibit rule has no duration or apply_to field, and it only mutes a target alert while a matching source alert is firing.

groups:
- name: hostStatsAlert
  rules:
  - alert: hostCpuUsageAlert
    expr: node_load1 > 4
    for: 1m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} CPU usage high"
      description: "{{ $labels.instance }} CPU usage above 85% (current value: {{ $value }})"
  - alert: hostMemUsageAlert
    expr: (node_memory_numa_MemTotal - node_memory_MemAvailable_bytes)/node_memory_numa_MemTotal > 0.85
    for: 1m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} MEM usage high"
      description: "{{ $labels.instance }} MEM usage above 85% (current value: {{ $value }})"

What does this mean? How can I make sure the alert triggers?

http://43.134.68.68:9090/graph?g0.expr=node_load1&g0.tab=0&g0.stacked=0&g0.show_exemplars=0&g0.range_input=1h

Confirmed…

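Silences are created at runtime, in the Alertmanager UI at http://43.134.68.68:9093/#/silences or with amtool; they are not part of alertmanager.yaml.

Inhibition is configured in alertmanager.yaml.

Silences and inhibition serve different purposes.

Take an alert such as memory above 80%: if that is normal on one particular machine and needs no action, a silence fits (one that lasts a long time).

Inhibition is about the relationship between alerts: while one alert is firing, notifications for certain other alerts are held back (for example, while a critical alert is firing, the lower-level warning alerts that match the rule below send no notifications).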

inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']

https://www.cnblogs.com/gered/p/15946822.html

Adding redis-exporter to the k8s deployment

/home/ubuntu/prom/redismonitor

1950 2023-08-31 10:16:35 ls
1951 2023-08-31 10:16:41 k -n taskon get pod -o wide |grep mysql
1952 2023-08-31 10:17:03 ls
1953 2023-08-31 10:17:12 k -n taskon get servicemonitor

1954 2023-08-31 10:17:53 k -n taskon get servicerule
1955 2023-08-31 10:18:57 cd pro
1956 2023-08-31 10:18:59 ls
1957 2023-08-31 10:19:01 cd prom/
1958 2023-08-31 10:19:01 ls
1959 2023-08-31 10:19:04 cat rule.yaml
1960 2023-08-31 10:19:13 k -n taskon get PrometheusRule

1961 2023-08-31 10:19:19 k -n monitoring get PrometheusRule (a rule still needs to be added here later)

1962 2023-08-31 10:19:34 k -n monitoring get PrometheusRule mysql-connection-rules -o yaml

k -n monitoring get PrometheusRule mysql-connection-rules -o yaml：

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  annotations:
    prometheus-operator-validated: "true"
  creationTimestamp: "2023-06-09T07:16:40Z"
  generation: 1
  labels:
    release: prometheus
  name: mysql-connection-rules
  namespace: monitoring
  resourceVersion: "9620168964"
  uid: bd11d900-d79c-40ad-ae3c-e43ddbcc3f8a
spec:
  groups:
  - name: mysqlconnectionrul
    rules:
    - alert: RequestTooHigh
      annotations:
        description: mysql qps too much
        summary: mysql qps too high
      expr: |
        rate(mysql_global_status_queries[10m]) > 2000
      for: 15m
      labels:
        cluster: taskon
        severity: critical

1963 2023-08-31 10:21:08 ls
1964 2023-08-31 10:27:21 mkdir redismonitor
1965 2023-08-31 10:27:23 cd redismonitor/
1966 2023-08-31 10:27:23 ls
1967 2023-08-31 10:27:31 vim redisdeploy.yaml
1968 2023-08-31 10:29:51 ~/conn.sh
1969 2023-08-31 10:30:50 ls
1970 2023-08-31 10:30:52 vim redisdeploy.yaml
1971 2023-08-31 10:36:09 ls
1972 2023-08-31 10:36:14 k -n taskon create -f redisdeploy.yaml
1973 2023-08-31 10:36:21 vim redisdeploy.yaml
1974 2023-08-31 10:36:34 k -n taskon delete -f redisdeploy.yaml
1975 2023-08-31 10:36:39 k -n taskon create -f redisdeploy.yaml

1976 2023-08-31 10:36:46 ls
1977 2023-08-31 10:36:50 k -n taskon get pod -o wide
1978 2023-08-31 10:36:55 k -n taskon get pod -o wide |grep export
1979 2023-08-31 10:37:06 k -n taskon logs taskon-redis-buy-exporter-5c5c58578d-cdgph

1980 2023-08-31 10:37:23 curl 10.16.1.128:9121/metrics |grep redis

1981 2023-08-31 10:37:30 ls
1982 2023-08-31 10:37:34 cd ../
1983 2023-08-31 10:37:34 ls
1984 2023-08-31 10:37:42 vim metric-collector.yaml
1985 2023-08-31 10:38:14 k -n taskon get servicemonitor
1986 2023-08-31 10:38:27 k -n taskon get servicemonitor mysql-exporter
1987 2023-08-31 10:38:29 k -n taskon get servicemonitor mysql-exporter -o yaml

Output of k -n taskon get servicemonitor mysql-exporter -o yaml:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  creationTimestamp: "2023-06-09T07:03:18Z"
  generation: 1
  labels:
    team: diu
  name: mysql-exporter
  namespace: taskon
  resourceVersion: "9619780695"
  uid: 7002691d-66e6-49ca-abd6-f12212e16558
spec:
  endpoints:
  - port: exporter
  selector:
    matchLabels:
      app: mysql-exporter

1988 2023-08-31 10:38:46 ls
1989 2023-08-31 10:38:59 cd redismonitor/
1990 2023-08-31 10:38:59 ls
1991 2023-08-31 10:39:06 vim servicemonitor.yaml
1992 2023-08-31 10:39:44 cat redisdeploy.yaml
1993 2023-08-31 10:39:54 fg
1994 2023-08-31 10:40:17 k -n taskon create -f servicemonitor.yaml
1995 2023-08-31 10:40:19 ls
1996 2023-08-31 10:40:24 k -n taskon get pod -o wide
1997 2023-08-31 10:42:08 k -n taskon get pod -o wide |grep redis
1998 2023-08-31 10:43:51 ls
1999 2023-08-31 10:44:49 k -n taskon get pod -o wide |grep exporter

Two YAML files need to be created:

cat redisdeploy.yaml

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: taskon-redis-buy-exporter
  namespace: taskon
  labels:
    app: taskon-redis-buy-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: taskon-redis-buy-exporter
  template:
    metadata:
      labels:
        app: taskon-redis-buy-exporter
    spec:
      containers:
      - name: taskon-redis-buy-exporter
        image: oliver006/redis_exporter:latest
        imagePullPolicy: IfNotPresent
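        # Add the Redis settings here, e.g. address and password
        # When monitoring a Redis outside the k8s cluster, the redis.addr value needs the redis:// prefix, as in the commented-out line below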
        # args: ["-redis.addr", "redis://10.128.27.22:6379", "-redis.password", "123456@redis"]
        args: ["-redis.addr", "redis://172.22.0.112:6379", "-redis.password", "xxxxxx"]
        ports:
        - containerPort: 9121

---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: taskon-redis-buy-exporter
  name: taskon-redis-buy-exporter
  namespace: taskon
spec:
  type: ClusterIP
  ports:
  - name: metrics
    port: 9121
    protocol: TCP
    targetPort: 9121
  selector:
    app: taskon-redis-buy-exporter
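
The manifest above keeps the Redis password in plain text in args. redis_exporter also reads REDIS_ADDR and REDIS_PASSWORD from the environment, so one alternative is to store the password in a Secret. A sketch only: the Secret name taskon-redis-buy-auth and its key password are made up for illustration.

```yaml
---
apiVersion: v1
kind: Secret
metadata:
  name: taskon-redis-buy-auth   # hypothetical name
  namespace: taskon
type: Opaque
stringData:
  password: xxxxxx              # placeholder, as in the original args
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: taskon-redis-buy-exporter
  namespace: taskon
  labels:
    app: taskon-redis-buy-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: taskon-redis-buy-exporter
  template:
    metadata:
      labels:
        app: taskon-redis-buy-exporter
    spec:
      containers:
      - name: taskon-redis-buy-exporter
        image: oliver006/redis_exporter:latest
        imagePullPolicy: IfNotPresent
        env:
        # replaces "-redis.addr" from args
        - name: REDIS_ADDR
          value: redis://172.22.0.112:6379
        # replaces "-redis.password"; the value comes from the Secret
        - name: REDIS_PASSWORD
          valueFrom:
            secretKeyRef:
              name: taskon-redis-buy-auth
              key: password
        ports:
        - containerPort: 9121
```

The Service and the ServiceMonitor stay the same, since neither the pod labels nor the port change.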

cat servicemonitor.yaml


apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    team: diu
  name: redis-exporter-servicemonitor
  namespace: taskon
spec:
  endpoints:
  - port: metrics
  selector:
    matchLabels:
      app: taskon-redis-buy-exporter
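
A ServiceMonitor is only scraped if the Prometheus custom resource selects it. The team: diu label here was copied from the existing mysql-exporter ServiceMonitor, presumably because the Prometheus resource selects on it. The relevant part of such a resource might look like the following; the resource name and all selector values are assumptions, not taken from this cluster:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus              # hypothetical name
  namespace: monitoring
spec:
  # pick up ServiceMonitors carrying this label
  serviceMonitorSelector:
    matchLabels:
      team: diu
  # an empty selector means: look in every namespace, not just monitoring
  serviceMonitorNamespaceSelector: {}
  # pick up PrometheusRule objects carrying this label
  ruleSelector:
    matchLabels:
      release: prometheus
```

The real selectors can be checked with k -n monitoring get prometheus -o yaml. Note also that port: metrics in the ServiceMonitor refers to the Service port's name (name: metrics), not to the port number.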

Find a Redis dashboard template for Grafana online

https://grafana.tooltaskon.xyz/dashboards

https://cloud.tencent.com/developer/article/1706610?cps_key=1d358d18a7a17b4a6df8d67a62fd3d3d

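https://grafana.com/search/?term=redis&type=dashboard Pick any of them. You can download the JSON file and import it, or copy the dashboard ID.

https://grafana.com/grafana/dashboards/763-redis-dashboard-for-prometheus-redis-exporter-1-x/ The JSON can be downloaded from here.

In the end I used this one. Select the instance 172.22.0.112:6379.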

https://grafana.tooltaskon.xyz/d/xDLNRKUWz/redis-dashboard-for-prometheus-redis-exporter-helm-stable-redis-ha?orgId=1&refresh=30s&var-DS_PROMETHEUS=Prometheus&var-namespace=taskon&var-pod_name=taskon-redis-buy-exporter-5c5c58578d-cdgph&var-instance=172.22.0.112:6379&from=now-15m&to=now

In the dashboard settings you can choose JSON Model to view the dashboard's JSON.

Configuring the rules:

$ cat redis-rule.yaml

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    release: prometheus
  name: redis-rules
  namespace: monitoring
spec:
  groups:
  - name: redis-alert-rule
    rules:
    - alert: ClientsTooMuch
      annotations:
        summary: redis_connected_clients too much
        description: redis clients too much
      expr: |
        redis_connected_clients > 200
      for: 15m
      labels:
        cluster: "taskon"
        severity: critical

    - alert: MemoryUsageTooHigh
      annotations:
        summary: redis_memory_used_bytes too much, more than 70%
        description: redis_memory_used_bytes too much
      expr: |
        100 * (redis_memory_used_bytes{}  / redis_memory_max_bytes{} ) > 70
      for: 15m
      labels:
        cluster: "taskon"
        severity: critical
    
    - alert: MissesTooMuch
      annotations:
        summary: redis_keyspace_misses_total, more than 1
        description: redis_keyspace_misses_total too high
      expr: |
        irate(redis_keyspace_misses_total{}[5m]) > 1
      for: 15m
      labels:
        cluster: "taskon"
        severity: critical

Then run k apply -f redis-rule.yaml

Go to https://prom.tooltaskon.xyz/alerts?search= and check whether the new alert rules have appeared.
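
One thing to watch with MemoryUsageTooHigh: redis_memory_max_bytes is 0 when Redis has no maxmemory configured, and in PromQL dividing by zero gives +Inf, so the expression would stay above 70 permanently. A guarded variant is sketched below; it has not been tested against this cluster. It also swaps irate for rate in MissesTooMuch, since the Prometheus documentation recommends rate for alerting. ClientsTooMuch is unchanged.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    release: prometheus
  name: redis-rules
  namespace: monitoring
spec:
  groups:
  - name: redis-alert-rule
    rules:
    - alert: ClientsTooMuch
      annotations:
        summary: redis_connected_clients too much
        description: redis clients too much
      expr: |
        redis_connected_clients > 200
      for: 15m
      labels:
        cluster: "taskon"
        severity: critical

    - alert: MemoryUsageTooHigh
      annotations:
        summary: redis memory usage above 70% of maxmemory
        description: "{{ $labels.instance }} uses {{ $value }}% of maxmemory"
      # "and ... > 0" drops instances without a maxmemory limit,
      # where the division would otherwise yield +Inf
      expr: |
        100 * (redis_memory_used_bytes / redis_memory_max_bytes) > 70
        and redis_memory_max_bytes > 0
      for: 15m
      labels:
        cluster: "taskon"
        severity: critical

    - alert: MissesTooMuch
      annotations:
        summary: redis keyspace misses above 1 per second
        description: redis_keyspace_misses_total too high
      # rate averages over the whole 5m window; irate only uses the last two samples
      expr: |
        rate(redis_keyspace_misses_total[5m]) > 1
      for: 15m
      labels:
        cluster: "taskon"
        severity: critical
```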

https://blog.csdn.net/HYZX_9987/article/details/101023503

MySQL dashboard JSON (found online)

{
  "annotations": {
    "list": [
      {
        "builtIn": 1,
        "datasource": {
          "type": "datasource",
          "uid": "grafana"
        },
        "enable": true,
        "hide": false,
        "iconColor": "#e0752d",
        "limit": 100,
        "name": "PMM Annotations",
        "showIn": 0,
        "tags": [
          "pmm_annotation"
        ],
        "type": "tags"
      }
    ]
  },
  "description": "Dashboard from Percona Monitoring and Management project. ",
  "editable": true,
  "fiscalYearStartMonth": 0,
  "gnetId": 7362,
  "graphTooltip": 1,
  "id": 28,
  "links": [
    {
      "icon": "dashboard",
      "includeVars": true,
      "keepTime": true,
      "tags": [
        "QAN"
      ],
      "targetBlank": false,
      "title": "Query Analytics",
      "type": "link",
      "url": "/graph/dashboard/db/_pmm-query-analytics"
    },
    {
      "asDropdown": true,
      "includeVars": true,
      "keepTime": true,
      "tags": [
        "OS"
      ],
      "targetBlank": false,
      "title": "OS",
      "type": "dashboards"
    },
    {
      "asDropdown": true,
      "includeVars": true,
      "keepTime": true,
      "tags": [
        "MySQL"
      ],
      "targetBlank": false,
      "title": "MySQL",
      "type": "dashboards"
    },
    {
      "asDropdown": true,
      "includeVars": true,
      "keepTime": true,
      "tags": [
        "MongoDB"
      ],
      "targetBlank": false,
      "title": "MongoDB",
      "type": "dashboards"
    },
    {
      "asDropdown": true,
      "includeVars": true,
      "keepTime": true,
      "tags": [
        "HA"
      ],
      "targetBlank": false,
      "title": "HA",
      "type": "dashboards"
    },
    {
      "asDropdown": true,
      "includeVars": true,
      "keepTime": true,
      "tags": [
        "Cloud"
      ],
      "targetBlank": false,
      "title": "Cloud",
      "type": "dashboards"
    },
    {
      "asDropdown": true,
      "includeVars": true,
      "keepTime": true,
      "tags": [
        "Insight"
      ],
      "targetBlank": false,
      "title": "Insight",
      "type": "dashboards"
    },
    {
      "asDropdown": true,
      "includeVars": true,
      "keepTime": true,
      "tags": [
        "PMM"
      ],
      "targetBlank": false,
      "title": "PMM",
      "type": "dashboards"
    }
  ],
  "liveNow": false,
  "panels": [
    {
      "collapsed": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 0
      },
      "id": 382,
      "panels": [],
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "refId": "A"
        }
      ],
      "type": "row"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "description": "**MySQL Uptime**\n\nThe amount of time since the last restart of the MySQL server process.",
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "thresholds"
          },
          "decimals": 1,
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "rgba(245, 54, 54, 0.9)",
                "value": null
              },
              {
                "color": "rgba(237, 129, 40, 0.89)",
                "value": 300
              },
              {
                "color": "rgba(50, 172, 45, 0.97)",
                "value": 3600
              }
            ]
          },
          "unit": "s"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 4,
        "w": 6,
        "x": 0,
        "y": 1
      },
      "id": 12,
      "interval": "$interval",
      "links": [],
      "maxDataPoints": 100,
      "options": {
        "colorMode": "value",
        "graphMode": "none",
        "justifyMode": "auto",
        "orientation": "horizontal",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "textMode": "auto"
      },
      "pluginVersion": "9.5.3",
      "targets": [
        {
          "calculatedInterval": "10m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "mysql_global_status_uptime{instance=\"$host\"}",
          "format": "time_series",
          "interval": "5m",
          "intervalFactor": 1,
          "legendFormat": "",
          "metric": "",
          "refId": "A",
          "step": 300
        }
      ],
      "title": "MySQL Uptime",
      "type": "stat"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "description": "**Current QPS**\n\nBased on the queries reported by MySQL's ``SHOW STATUS`` command, it is the number of statements executed by the server within the last second. This variable includes statements executed within stored programs, unlike the Questions variable. It does not count \n``COM_PING`` or ``COM_STATISTICS`` commands.",
      "fieldConfig": {
        "defaults": {
          "color": {
            "fixedColor": "rgb(31, 120, 193)",
            "mode": "fixed"
          },
          "decimals": 2,
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "rgba(245, 54, 54, 0.9)",
                "value": null
              },
              {
                "color": "rgba(237, 129, 40, 0.89)",
                "value": 35
              },
              {
                "color": "rgba(50, 172, 45, 0.97)",
                "value": 75
              }
            ]
          },
          "unit": "short"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 4,
        "w": 6,
        "x": 6,
        "y": 1
      },
      "id": 13,
      "interval": "$interval",
      "links": [
        {
          "targetBlank": true,
          "title": "MySQL Server Status Variables",
          "url": "https://dev.mysql.com/doc/refman/5.7/en/server-status-variables.html#statvar_Queries"
        }
      ],
      "maxDataPoints": 100,
      "options": {
        "colorMode": "none",
        "graphMode": "area",
        "justifyMode": "auto",
        "orientation": "horizontal",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "textMode": "auto"
      },
      "pluginVersion": "9.5.3",
      "targets": [
        {
          "calculatedInterval": "10m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_queries{instance=\"$host\"}[$interval]) or irate(mysql_global_status_queries{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "",
          "metric": "",
          "refId": "A",
          "step": 20
        }
      ],
      "title": "Current QPS",
      "type": "stat"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "description": "**InnoDB Buffer Pool Size**\n\nInnoDB maintains a storage area called the buffer pool for caching data and indexes in memory.  Knowing how the InnoDB buffer pool works, and taking advantage of it to keep frequently accessed data in memory, is one of the most important aspects of MySQL tuning. The goal is to keep the working set in memory. In most cases, this should be between 60%-90% of available memory on a dedicated database host, but depends on many factors.",
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "thresholds"
          },
          "decimals": 0,
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "rgba(50, 172, 45, 0.97)",
                "value": null
              },
              {
                "color": "rgba(237, 129, 40, 0.89)",
                "value": 90
              },
              {
                "color": "rgba(245, 54, 54, 0.9)",
                "value": 95
              }
            ]
          },
          "unit": "bytes"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 4,
        "w": 6,
        "x": 12,
        "y": 1
      },
      "id": 51,
      "interval": "$interval",
      "links": [
        {
          "targetBlank": true,
          "title": "Tuning the InnoDB Buffer Pool Size",
          "url": "https://www.percona.com/blog/2015/06/02/80-ram-tune-innodb_buffer_pool_size/"
        }
      ],
      "maxDataPoints": 100,
      "options": {
        "colorMode": "none",
        "graphMode": "none",
        "justifyMode": "auto",
        "orientation": "horizontal",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "textMode": "auto"
      },
      "pluginVersion": "9.5.3",
      "targets": [
        {
          "calculatedInterval": "10m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "mysql_global_variables_innodb_buffer_pool_size{instance=\"$host\"}",
          "format": "time_series",
          "interval": "5m",
          "intervalFactor": 1,
          "legendFormat": "",
          "metric": "",
          "refId": "A",
          "step": 300
        }
      ],
      "title": "InnoDB Buffer Pool Size",
      "type": "stat"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "description": "**InnoDB Buffer Pool Size % of Total RAM**\n\nInnoDB maintains a storage area called the buffer pool for caching data and indexes in memory.  Knowing how the InnoDB buffer pool works, and taking advantage of it to keep frequently accessed data in memory, is one of the most important aspects of MySQL tuning. The goal is to keep the working set in memory. In most cases, this should be between 60%-90% of available memory on a dedicated database host, but depends on many factors.",
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "thresholds"
          },
          "decimals": 0,
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "rgba(245, 54, 54, 0.9)",
                "value": null
              },
              {
                "color": "rgba(237, 129, 40, 0.89)",
                "value": 40
              },
              {
                "color": "rgba(50, 172, 45, 0.97)",
                "value": 80
              }
            ]
          },
          "unit": "percent"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 4,
        "w": 6,
        "x": 18,
        "y": 1
      },
      "id": 52,
      "interval": "$interval",
      "links": [
        {
          "targetBlank": true,
          "title": "Tuning the InnoDB Buffer Pool Size",
          "url": "https://www.percona.com/blog/2015/06/02/80-ram-tune-innodb_buffer_pool_size/"
        }
      ],
      "maxDataPoints": 100,
      "options": {
        "colorMode": "value",
        "graphMode": "area",
        "justifyMode": "auto",
        "orientation": "horizontal",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "textMode": "auto"
      },
      "pluginVersion": "9.5.3",
      "targets": [
        {
          "calculatedInterval": "10m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "(mysql_global_variables_innodb_buffer_pool_size{instance=\"$host\"} * 100) / on (instance) node_memory_MemTotal_bytes{instance=\"$host\"}",
          "format": "time_series",
          "interval": "5m",
          "intervalFactor": 1,
          "legendFormat": "",
          "metric": "",
          "refId": "A",
          "step": 300
        }
      ],
      "title": "Buffer Pool Size of Total RAM",
      "type": "stat"
    },
    {
      "collapsed": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 5
      },
      "id": 383,
      "panels": [],
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "refId": "A"
        }
      ],
      "title": "Connections",
      "type": "row"
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "decimals": 0,
      "description": "**Max Connections** \n\nMax Connections is the maximum permitted number of simultaneous client connections. By default, this is 151. Increasing this value increases the number of file descriptors that mysqld requires. If the required number of descriptors are not available, the server reduces the value of Max Connections.\n\nmysqld actually permits Max Connections + 1 clients to connect. The extra connection is reserved for use by accounts that have the SUPER privilege, such as root.\n\nMax Used Connections is the maximum number of connections that have been in use simultaneously since the server started.\n\nConnections is the number of connection attempts (successful or not) to the MySQL server.",
      "editable": true,
      "error": false,
      "fill": 2,
      "fillGradient": 0,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 12,
        "x": 0,
        "y": 6
      },
      "height": "250px",
      "hiddenSeries": false,
      "id": 92,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": false,
        "max": true,
        "min": true,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [
        {
          "targetBlank": true,
          "title": "MySQL Server System Variables",
          "url": "https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_max_connections"
        }
      ],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "9.5.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [
        {
          "alias": "Max Connections",
          "fill": 0
        }
      ],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "max(max_over_time(mysql_global_status_threads_connected{instance=\"$host\"}[$interval])  or mysql_global_status_threads_connected{instance=\"$host\"} )",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Connections",
          "metric": "",
          "refId": "A",
          "step": 20
        },
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "mysql_global_status_max_used_connections{instance=\"$host\"}",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Max Used Connections",
          "metric": "",
          "refId": "C",
          "step": 20,
          "target": ""
        },
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "mysql_global_variables_max_connections{instance=\"$host\"}",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Max Connections",
          "metric": "",
          "refId": "B",
          "step": 20,
          "target": ""
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "MySQL Connections",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "cumulative"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "short",
          "label": "",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "label": "",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "decimals": 2,
      "description": "**MySQL Active Threads**\n\nThreads Connected is the number of open connections, while Threads Running is the number of threads not sleeping.",
      "editable": true,
      "error": false,
      "fill": 2,
      "fillGradient": 0,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 12,
        "x": 12,
        "y": 6
      },
      "hiddenSeries": false,
      "id": 10,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": true,
        "max": true,
        "min": true,
        "rightSide": false,
        "show": true,
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "9.5.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [
        {
          "alias": "Peak Threads Running",
          "color": "#E24D42",
          "lines": false,
          "pointradius": 1,
          "points": true
        },
        {
          "alias": "Peak Threads Connected",
          "color": "#1F78C1"
        },
        {
          "alias": "Avg Threads Running",
          "color": "#EAB839"
        }
      ],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "max_over_time(mysql_global_status_threads_connected{instance=\"$host\"}[$interval]) or\nmax_over_time(mysql_global_status_threads_connected{instance=\"$host\"}[5m])",
          "format": "time_series",
          "hide": false,
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Peak Threads Connected",
          "metric": "",
          "refId": "A",
          "step": 20
        },
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "max_over_time(mysql_global_status_threads_running{instance=\"$host\"}[$interval]) or\nmax_over_time(mysql_global_status_threads_running{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Peak Threads Running",
          "metric": "",
          "refId": "B",
          "step": 20
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "expr": "avg_over_time(mysql_global_status_threads_running{instance=\"$host\"}[$interval]) or \navg_over_time(mysql_global_status_threads_running{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Avg Threads Running",
          "refId": "C",
          "step": 20
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "MySQL Client Thread Activity",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": [
          "total"
        ]
      },
      "yaxes": [
        {
          "format": "short",
          "label": "Threads",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "label": "",
          "logBase": 1,
          "min": 0,
          "show": false
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "collapsed": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 13
      },
      "id": 384,
      "panels": [],
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "refId": "A"
        }
      ],
      "title": "Table Locks",
      "type": "row"
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "description": "**MySQL Questions**\n\nThe number of statements executed by the server. This includes only statements sent to the server by clients and not statements executed within stored programs, unlike the Queries used in the QPS calculation. \n\nThis variable does not count the following commands:\n* ``COM_PING``\n* ``COM_STATISTICS``\n* ``COM_STMT_PREPARE``\n* ``COM_STMT_CLOSE``\n* ``COM_STMT_RESET``",
      "editable": true,
      "error": false,
      "fill": 2,
      "fillGradient": 0,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 12,
        "x": 0,
        "y": 14
      },
      "hiddenSeries": false,
      "id": 53,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": false,
        "max": true,
        "min": true,
        "rightSide": false,
        "show": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [
        {
          "targetBlank": true,
          "title": "MySQL Queries and Questions",
          "url": "https://www.percona.com/blog/2014/05/29/how-mysql-queries-and-questions-are-measured/"
        }
      ],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "9.5.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_questions{instance=\"$host\"}[$interval]) or irate(mysql_global_status_questions{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Questions",
          "metric": "",
          "refId": "A",
          "step": 20
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "MySQL Questions",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "decimals": 2,
      "description": "**MySQL Thread Cache**\n\nThe thread_cache_size variable sets how many threads the server should cache to reuse. When a client disconnects, the client's threads are put in the cache if the cache is not full. It is autosized in MySQL 5.6.8 and above (capped to 100). Requests for threads are satisfied by reusing threads taken from the cache if possible, and only when the cache is empty is a new thread created.\n\n* *Threads_created*: The number of threads created to handle connections.\n* *Threads_cached*: The number of threads in the thread cache.",
      "editable": true,
      "error": false,
      "fill": 2,
      "fillGradient": 0,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 12,
        "x": 12,
        "y": 14
      },
      "hiddenSeries": false,
      "id": 11,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": false,
        "max": true,
        "min": true,
        "rightSide": false,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [
        {
          "title": "Tuning information",
          "url": "https://dev.mysql.com/doc/refman/5.6/en/server-system-variables.html#sysvar_thread_cache_size"
        }
      ],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "9.5.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [
        {
          "alias": "Threads Created",
          "fill": 0
        }
      ],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "mysql_global_variables_thread_cache_size{instance=\"$host\"}",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Thread Cache Size",
          "metric": "",
          "refId": "B",
          "step": 20
        },
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "mysql_global_status_threads_cached{instance=\"$host\"}",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Threads Cached",
          "metric": "",
          "refId": "C",
          "step": 20
        },
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_threads_created{instance=\"$host\"}[$interval]) or irate(mysql_global_status_threads_created{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Threads Created",
          "metric": "",
          "refId": "A",
          "step": 20
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "MySQL Thread Cache",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "collapsed": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 21
      },
      "id": 385,
      "panels": [],
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "refId": "A"
        }
      ],
      "title": "Temporary Objects",
      "type": "row"
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "decimals": 2,
      "editable": true,
      "error": false,
      "fill": 2,
      "fillGradient": 0,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 12,
        "x": 0,
        "y": 22
      },
      "hiddenSeries": false,
      "id": 22,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": false,
        "max": true,
        "min": true,
        "rightSide": false,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "9.5.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_created_tmp_tables{instance=\"$host\"}[$interval]) or irate(mysql_global_status_created_tmp_tables{instance=\"$host\"}[5m])",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Created Tmp Tables",
          "metric": "",
          "refId": "A",
          "step": 20
        },
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_created_tmp_disk_tables{instance=\"$host\"}[$interval]) or irate(mysql_global_status_created_tmp_disk_tables{instance=\"$host\"}[5m])",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Created Tmp Disk Tables",
          "metric": "",
          "refId": "B",
          "step": 20
        },
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_created_tmp_files{instance=\"$host\"}[$interval]) or irate(mysql_global_status_created_tmp_files{instance=\"$host\"}[5m])",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Created Tmp Files",
          "metric": "",
          "refId": "C",
          "step": 20
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "MySQL Temporary Objects",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "decimals": 2,
      "description": "**MySQL Select Types**\n\nAs with most relational databases, selecting based on indexes is more efficient than scanning an entire table's data. Here we see the counters for selects not done with indexes.\n\n* ***Select Scan*** is how many queries caused full table scans, in which all the data in the table had to be read and either discarded or returned.\n* ***Select Range*** is how many queries used a range scan, which means MySQL scanned all rows in a given range.\n* ***Select Full Join*** is the number of joins that are not joined on an index, this is usually a huge performance hit.",
      "editable": true,
      "error": false,
      "fill": 2,
      "fillGradient": 0,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 12,
        "x": 12,
        "y": 22
      },
      "height": "250px",
      "hiddenSeries": false,
      "id": 311,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": false,
        "hideZero": true,
        "max": true,
        "min": true,
        "rightSide": false,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "9.5.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_select_full_join{instance=\"$host\"}[$interval]) or irate(mysql_global_status_select_full_join{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Select Full Join",
          "metric": "",
          "refId": "A",
          "step": 20
        },
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_select_full_range_join{instance=\"$host\"}[$interval]) or irate(mysql_global_status_select_full_range_join{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Select Full Range Join",
          "metric": "",
          "refId": "B",
          "step": 20
        },
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_select_range{instance=\"$host\"}[$interval]) or irate(mysql_global_status_select_range{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Select Range",
          "metric": "",
          "refId": "C",
          "step": 20
        },
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_select_range_check{instance=\"$host\"}[$interval]) or irate(mysql_global_status_select_range_check{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Select Range Check",
          "metric": "",
          "refId": "D",
          "step": 20
        },
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_select_scan{instance=\"$host\"}[$interval]) or irate(mysql_global_status_select_scan{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Select Scan",
          "metric": "",
          "refId": "E",
          "step": 20
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "MySQL Select Types",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "collapsed": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 29
      },
      "id": 386,
      "panels": [],
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "refId": "A"
        }
      ],
      "title": "Sorts",
      "type": "row"
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "decimals": 2,
      "description": "**MySQL Sorts**\n\nDue to a query's structure, order, or other requirements, MySQL sorts the rows before returning them. For example, if a table is ordered 1 to 10 but you want the results reversed, MySQL then has to sort the rows to return 10 to 1.\n\nThis graph also shows when sorts had to scan a whole table or a given range of a table in order to return the results and which could not have been sorted via an index.",
      "editable": true,
      "error": false,
      "fill": 2,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 12,
        "x": 0,
        "y": 30
      },
      "id": 30,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": false,
        "hideZero": true,
        "max": true,
        "min": true,
        "rightSide": false,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_sort_rows{instance=\"$host\"}[$interval]) or irate(mysql_global_status_sort_rows{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Sort Rows",
          "metric": "",
          "refId": "A",
          "step": 20
        },
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_sort_range{instance=\"$host\"}[$interval]) or irate(mysql_global_status_sort_range{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Sort Range",
          "metric": "",
          "refId": "B",
          "step": 20
        },
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_sort_merge_passes{instance=\"$host\"}[$interval]) or irate(mysql_global_status_sort_merge_passes{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Sort Merge Passes",
          "metric": "",
          "refId": "C",
          "step": 20
        },
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_sort_scan{instance=\"$host\"}[$interval]) or irate(mysql_global_status_sort_scan{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Sort Scan",
          "metric": "",
          "refId": "D",
          "step": 20
        }
      ],
      "thresholds": [],
      "title": "MySQL Sorts",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "decimals": 2,
      "description": "**MySQL Slow Queries**\n\nSlow queries are defined as queries being slower than the long_query_time setting. For example, if you have long_query_time set to 3, all queries that take longer than 3 seconds to complete will show on this graph.",
      "editable": true,
      "error": false,
      "fill": 2,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 12,
        "x": 12,
        "y": 30
      },
      "id": 48,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": false,
        "max": true,
        "min": true,
        "show": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_slow_queries{instance=\"$host\"}[$interval]) or irate(mysql_global_status_slow_queries{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Slow Queries",
          "metric": "",
          "refId": "A",
          "step": 20
        }
      ],
      "thresholds": [],
      "title": "MySQL Slow Queries",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "cumulative"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "short",
          "label": "",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "label": "",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "collapsed": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 37
      },
      "id": 387,
      "panels": [],
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "refId": "A"
        }
      ],
      "title": "Aborted",
      "type": "row"
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "decimals": 2,
      "description": "**Aborted Connections**\n\nWhen a given host connects to MySQL and the connection is interrupted in the middle (for example due to bad credentials), MySQL keeps that info in a system table (since 5.6 this table is exposed in performance_schema).\n\nIf the amount of failed requests without a successful connection reaches the value of max_connect_errors, mysqld assumes that something is wrong and blocks the host from further connection.\n\nTo allow connections from that host again, you need to issue the ``FLUSH HOSTS`` statement.",
      "editable": true,
      "error": false,
      "fill": 2,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 12,
        "x": 0,
        "y": 38
      },
      "id": 47,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": false,
        "max": true,
        "min": true,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_aborted_connects{instance=\"$host\"}[$interval]) or irate(mysql_global_status_aborted_connects{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Aborted Connects (attempts)",
          "metric": "",
          "refId": "A",
          "step": 20
        },
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_aborted_clients{instance=\"$host\"}[$interval]) or irate(mysql_global_status_aborted_clients{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Aborted Clients (timeout)",
          "metric": "",
          "refId": "B",
          "step": 20,
          "target": ""
        }
      ],
      "thresholds": [],
      "title": "MySQL Aborted Connections",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "cumulative"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "short",
          "label": "",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "label": "",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "decimals": 2,
      "description": "**Table Locks**\n\nMySQL takes a number of different locks for varying reasons. In this graph we see how many Table level locks MySQL has requested from the storage engine. In the case of InnoDB, many times the locks could actually be row locks as it only takes table level locks in a few specific cases.\n\nIt is most useful to compare Locks Immediate and Locks Waited. If Locks waited is rising, it means you have lock contention. Otherwise, Locks Immediate rising and falling is normal activity.",
      "editable": true,
      "error": false,
      "fill": 2,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 12,
        "x": 12,
        "y": 38
      },
      "id": 32,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": false,
        "max": true,
        "min": true,
        "rightSide": false,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_table_locks_immediate{instance=\"$host\"}[$interval]) or irate(mysql_global_status_table_locks_immediate{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Table Locks Immediate",
          "metric": "",
          "refId": "A",
          "step": 20
        },
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_table_locks_waited{instance=\"$host\"}[$interval]) or irate(mysql_global_status_table_locks_waited{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Table Locks Waited",
          "metric": "",
          "refId": "B",
          "step": 20
        }
      ],
      "thresholds": [],
      "title": "MySQL Table Locks",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "collapsed": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 45
      },
      "id": 388,
      "panels": [],
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "refId": "A"
        }
      ],
      "title": "Network",
      "type": "row"
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "decimals": 2,
      "description": "**MySQL Network Traffic**\n\nHere we can see how much network traffic is generated by MySQL. Outbound is network traffic sent from MySQL and Inbound is network traffic MySQL has received.",
      "editable": true,
      "error": false,
      "fill": 6,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 12,
        "x": 0,
        "y": 46
      },
      "id": 9,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": false,
        "max": true,
        "min": true,
        "rightSide": false,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": true,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_bytes_received{instance=\"$host\"}[$interval]) or irate(mysql_global_status_bytes_received{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Inbound",
          "metric": "",
          "refId": "A",
          "step": 20
        },
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_bytes_sent{instance=\"$host\"}[$interval]) or irate(mysql_global_status_bytes_sent{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Outbound",
          "metric": "",
          "refId": "B",
          "step": 20
        }
      ],
      "thresholds": [],
      "title": "MySQL Network Traffic",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "Bps",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "none",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": true,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "decimals": 2,
      "description": "**MySQL Network Usage Hourly**\n\nHere we can see how much network traffic is generated by MySQL per hour. You can use the bar graph to compare data sent by MySQL and data received by MySQL.",
      "editable": true,
      "error": false,
      "fill": 6,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 12,
        "x": 12,
        "y": 46
      },
      "height": "250px",
      "id": 381,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": false,
        "max": true,
        "min": true,
        "rightSide": false,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": false,
      "linewidth": 2,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": true,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "increase(mysql_global_status_bytes_received{instance=\"$host\"}[1h])",
          "format": "time_series",
          "interval": "1h",
          "intervalFactor": 1,
          "legendFormat": "Received",
          "metric": "",
          "refId": "A",
          "step": 3600
        },
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "increase(mysql_global_status_bytes_sent{instance=\"$host\"}[1h])",
          "format": "time_series",
          "interval": "1h",
          "intervalFactor": 1,
          "legendFormat": "Sent",
          "metric": "",
          "refId": "B",
          "step": 3600
        }
      ],
      "thresholds": [],
      "timeFrom": "24h",
      "title": "MySQL Network Usage Hourly",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "bytes",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "none",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "collapsed": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 53
      },
      "id": 389,
      "panels": [],
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "refId": "A"
        }
      ],
      "title": "Memory",
      "type": "row"
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "decimals": 0,
      "description": "***System Memory***: Total Memory for the system.\\\n***InnoDB Buffer Pool Data***: InnoDB maintains a storage area called the buffer pool for caching data and indexes in memory.\\\n***TokuDB Cache Size***: Similar in function to the InnoDB Buffer Pool,  TokuDB will allocate 50% of the installed RAM for its own cache.\\\n***Key Buffer Size***: Index blocks for MYISAM tables are buffered and are shared by all threads. key_buffer_size is the size of the buffer used for index blocks.\\\n***Adaptive Hash Index Size***: When InnoDB notices that some index values are being accessed very frequently, it builds a hash index for them in memory on top of B-Tree indexes.\\\n ***Query Cache Size***: The query cache stores the text of a SELECT statement together with the corresponding result that was sent to the client. The query cache has huge scalability problems in that only one thread can do an operation in the query cache at the same time.\\\n***InnoDB Dictionary Size***: The data dictionary is InnoDB ‘s internal catalog of tables. InnoDB stores the data dictionary on disk, and loads entries into memory while the server is running.\\\n***InnoDB Log Buffer Size***: The MySQL InnoDB log buffer allows transactions to run without having to write the log to disk before the transactions commit.",
      "editable": true,
      "error": false,
      "fill": 6,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 24,
        "x": 0,
        "y": 54
      },
      "id": 50,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": false,
        "hideEmpty": true,
        "hideZero": true,
        "max": true,
        "min": true,
        "rightSide": true,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [
        {
          "title": "Detailed descriptions about metrics",
          "url": "https://www.percona.com/doc/percona-monitoring-and-management/dashboard.mysql-overview.html#mysql-internal-memory-overview"
        }
      ],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [
        {
          "alias": "System Memory",
          "fill": 0,
          "stack": false
        }
      ],
      "spaceLength": 10,
      "stack": true,
      "steppedLine": false,
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "expr": "node_memory_MemTotal_bytes{instance=\"$host\"}",
          "format": "time_series",
          "intervalFactor": 2,
          "legendFormat": "System Memory",
          "refId": "G",
          "step": 4
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "expr": "mysql_global_status_innodb_page_size{instance=\"$host\"} * on (instance) mysql_global_status_buffer_pool_pages{instance=\"$host\",state=\"data\"}",
          "format": "time_series",
          "hide": false,
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "InnoDB Buffer Pool Data",
          "refId": "A",
          "step": 20
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "expr": "mysql_global_variables_innodb_log_buffer_size{instance=\"$host\"}",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "InnoDB Log Buffer Size",
          "refId": "D",
          "step": 20
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "expr": "mysql_global_variables_innodb_additional_mem_pool_size{instance=\"$host\"}",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 2,
          "legendFormat": "InnoDB Additional Memory Pool Size",
          "refId": "H",
          "step": 40
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "expr": "mysql_global_status_innodb_mem_dictionary{instance=\"$host\"}",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "InnoDB Dictionary Size",
          "refId": "F",
          "step": 20
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "expr": "mysql_global_variables_key_buffer_size{instance=\"$host\"}",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Key Buffer Size",
          "refId": "B",
          "step": 20
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "expr": "mysql_global_variables_query_cache_size{instance=\"$host\"}",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Query Cache Size",
          "refId": "C",
          "step": 20
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "expr": "mysql_global_status_innodb_mem_adaptive_hash{instance=\"$host\"}",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Adaptive Hash Index Size",
          "refId": "E",
          "step": 20
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "expr": "mysql_global_variables_tokudb_cache_size{instance=\"$host\"}",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "TokuDB Cache Size",
          "refId": "I",
          "step": 20
        }
      ],
      "thresholds": [],
      "title": "MySQL Internal Memory Overview",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "bytes",
          "label": "",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "collapsed": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 61
      },
      "id": 390,
      "panels": [],
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "refId": "A"
        }
      ],
      "title": "Command, Handlers, Processes",
      "type": "row"
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "decimals": 2,
      "description": "**Top Command Counters**\n\nThe Com_{{xxx}} statement counter variables indicate the number of times each xxx statement has been executed. There is one status variable for each type of statement. For example, Com_delete and Com_update count [``DELETE``](https://dev.mysql.com/doc/refman/5.7/en/delete.html) and [``UPDATE``](https://dev.mysql.com/doc/refman/5.7/en/update.html) statements, respectively. Com_delete_multi and Com_update_multi are similar but apply to [``DELETE``](https://dev.mysql.com/doc/refman/5.7/en/delete.html) and [``UPDATE``](https://dev.mysql.com/doc/refman/5.7/en/update.html) statements that use multiple-table syntax.",
      "editable": true,
      "error": false,
      "fill": 2,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 24,
        "x": 0,
        "y": 62
      },
      "id": 14,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": false,
        "hideEmpty": false,
        "hideZero": false,
        "max": true,
        "min": true,
        "rightSide": true,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [
        {
          "title": "Server Status Variables (Com_xxx)",
          "url": "https://dev.mysql.com/doc/refman/5.7/en/server-status-variables.html#statvar_Com_xxx"
        }
      ],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "topk(5, rate(mysql_global_status_commands_total{instance=\"$host\"}[$interval])>0) or topk(5, irate(mysql_global_status_commands_total{instance=\"$host\"}[5m])>0)",
          "format": "time_series",
          "hide": false,
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Com_{{ command }}",
          "metric": "",
          "refId": "B",
          "step": 20
        }
      ],
      "thresholds": [],
      "title": "Top Command Counters",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": true,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "decimals": 2,
      "description": "**Top Command Counters Hourly**\n\nThe Com_{{xxx}} statement counter variables indicate the number of times each xxx statement has been executed. There is one status variable for each type of statement. For example, Com_delete and Com_update count [``DELETE``](https://dev.mysql.com/doc/refman/5.7/en/delete.html) and [``UPDATE``](https://dev.mysql.com/doc/refman/5.7/en/update.html) statements, respectively. Com_delete_multi and Com_update_multi are similar but apply to [``DELETE``](https://dev.mysql.com/doc/refman/5.7/en/delete.html) and [``UPDATE``](https://dev.mysql.com/doc/refman/5.7/en/update.html) statements that use multiple-table syntax.",
      "editable": true,
      "error": false,
      "fill": 6,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 24,
        "x": 0,
        "y": 69
      },
      "id": 39,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": false,
        "max": true,
        "min": true,
        "rightSide": true,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": false,
      "linewidth": 2,
      "links": [
        {
          "title": "Server Status Variables (Com_xxx)",
          "url": "https://dev.mysql.com/doc/refman/5.7/en/server-status-variables.html#statvar_Com_xxx"
        }
      ],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": true,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "topk(5, increase(mysql_global_status_commands_total{instance=\"$host\"}[1h])>0)",
          "format": "time_series",
          "interval": "1h",
          "intervalFactor": 1,
          "legendFormat": "Com_{{ command }}",
          "metric": "",
          "refId": "A",
          "step": 3600
        }
      ],
      "thresholds": [],
      "timeFrom": "24h",
      "title": "Top Command Counters Hourly",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "decimals": 2,
      "description": "**MySQL Handlers**\n\nHandler statistics are internal statistics on how MySQL is selecting, updating, inserting, and modifying rows, tables, and indexes.\n\nThis is in fact the layer between the Storage Engine and MySQL.\n\n* `read_rnd_next` is incremented when the server performs a full table scan and this is a counter you don't really want to see with a high value.\n* `read_key` is incremented when a read is done with an index.\n* `read_next` is incremented when the storage engine is asked to 'read the next index entry'. A high value means a lot of index scans are being done.",
      "editable": true,
      "error": false,
      "fill": 2,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 24,
        "x": 0,
        "y": 76
      },
      "id": 8,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": false,
        "hideZero": true,
        "max": true,
        "min": true,
        "rightSide": true,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_handlers_total{instance=\"$host\", handler!~\"commit|rollback|savepoint.*|prepare\"}[$interval]) or irate(mysql_global_status_handlers_total{instance=\"$host\", handler!~\"commit|rollback|savepoint.*|prepare\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "{{ handler }}",
          "metric": "",
          "refId": "J",
          "step": 20
        }
      ],
      "thresholds": [],
      "title": "MySQL Handlers",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "decimals": 2,
      "editable": true,
      "error": false,
      "fill": 2,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 24,
        "x": 0,
        "y": 83
      },
      "id": 28,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": false,
        "hideZero": true,
        "max": true,
        "min": true,
        "rightSide": true,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_handlers_total{instance=\"$host\", handler=~\"commit|rollback|savepoint.*|prepare\"}[$interval]) or irate(mysql_global_status_handlers_total{instance=\"$host\", handler=~\"commit|rollback|savepoint.*|prepare\"}[5m])",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "{{ handler }}",
          "metric": "",
          "refId": "A",
          "step": 20
        }
      ],
      "thresholds": [],
      "title": "MySQL Transaction Handlers",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "decimals": 2,
      "editable": true,
      "error": false,
      "fill": 0,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 24,
        "x": 0,
        "y": 90
      },
      "id": 40,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": false,
        "hideZero": true,
        "max": true,
        "min": false,
        "rightSide": true,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "mysql_info_schema_threads{instance=\"$host\"}",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "{{ state }}",
          "metric": "",
          "refId": "A",
          "step": 20
        }
      ],
      "thresholds": [],
      "title": "Process States",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": true,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "decimals": 2,
      "editable": true,
      "error": false,
      "fill": 6,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 24,
        "x": 0,
        "y": 97
      },
      "id": 49,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": false,
        "hideZero": true,
        "max": true,
        "min": false,
        "rightSide": true,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": false,
      "linewidth": 2,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": true,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "topk(5, avg_over_time(mysql_info_schema_threads{instance=\"$host\"}[1h]))",
          "interval": "1h",
          "intervalFactor": 1,
          "legendFormat": "{{ state }}",
          "metric": "",
          "refId": "A",
          "step": 3600
        }
      ],
      "thresholds": [],
      "timeFrom": "24h",
      "title": "Top Process States Hourly",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "collapsed": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 104
      },
      "id": 391,
      "panels": [],
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "refId": "A"
        }
      ],
      "title": "Query Cache",
      "type": "row"
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "decimals": 2,
      "description": "**MySQL Query Cache Memory**\n\nThe query cache has huge scalability problems in that only one thread can do an operation in the query cache at the same time. This serialization is true not only for SELECTs, but also for INSERT/UPDATE/DELETE.\n\nThis also means that the larger the `query_cache_size` is set to, the slower those operations become. In concurrent environments, the MySQL Query Cache quickly becomes a contention point, decreasing performance. MariaDB and AWS Aurora have done work to try and eliminate the query cache contention in their flavors of MySQL, while MySQL 8.0 has eliminated the query cache feature.\n\nThe recommended settings for most environments is to set:\n  ``query_cache_type=0``\n  ``query_cache_size=0``\n\nNote that while you can dynamically change these values, to completely remove the contention point you have to restart the database.",
      "editable": true,
      "error": false,
      "fill": 2,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 12,
        "x": 0,
        "y": 105
      },
      "id": 46,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": false,
        "max": true,
        "min": true,
        "rightSide": false,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "mysql_global_status_qcache_free_memory{instance=\"$host\"}",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Free Memory",
          "metric": "",
          "refId": "F",
          "step": 20
        },
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "mysql_global_variables_query_cache_size{instance=\"$host\"}",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Query Cache Size",
          "metric": "",
          "refId": "E",
          "step": 20
        }
      ],
      "thresholds": [],
      "title": "MySQL Query Cache Memory",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "bytes",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "decimals": 2,
      "description": "**MySQL Query Cache Activity**\n\nThe query cache has huge scalability problems in that only one thread can do an operation in the query cache at the same time. This serialization is true not only for SELECTs, but also for INSERT/UPDATE/DELETE.\n\nThis also means that the larger the `query_cache_size` is set to, the slower those operations become. In concurrent environments, the MySQL Query Cache quickly becomes a contention point, decreasing performance. MariaDB and AWS Aurora have done work to try and eliminate the query cache contention in their flavors of MySQL, while MySQL 8.0 has eliminated the query cache feature.\n\nThe recommended settings for most environments is to set:\n``query_cache_type=0``\n``query_cache_size=0``\n\nNote that while you can dynamically change these values, to completely remove the contention point you have to restart the database.",
      "editable": true,
      "error": false,
      "fill": 2,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 12,
        "x": 12,
        "y": 105
      },
      "height": "",
      "id": 45,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": false,
        "max": true,
        "min": true,
        "rightSide": false,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_qcache_hits{instance=\"$host\"}[$interval]) or irate(mysql_global_status_qcache_hits{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Hits",
          "metric": "",
          "refId": "B",
          "step": 20
        },
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_qcache_inserts{instance=\"$host\"}[$interval]) or irate(mysql_global_status_qcache_inserts{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Inserts",
          "metric": "",
          "refId": "C",
          "step": 20
        },
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_qcache_not_cached{instance=\"$host\"}[$interval]) or irate(mysql_global_status_qcache_not_cached{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Not Cached",
          "metric": "",
          "refId": "D",
          "step": 20
        },
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_qcache_lowmem_prunes{instance=\"$host\"}[$interval]) or irate(mysql_global_status_qcache_lowmem_prunes{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Prunes",
          "metric": "",
          "refId": "F",
          "step": 20
        },
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "mysql_global_status_qcache_queries_in_cache{instance=\"$host\"}",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Queries in Cache",
          "metric": "",
          "refId": "E",
          "step": 20
        }
      ],
      "thresholds": [],
      "title": "MySQL Query Cache Activity",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "collapsed": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 112
      },
      "id": 392,
      "panels": [],
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "refId": "A"
        }
      ],
      "title": "Files and Tables",
      "type": "row"
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "decimals": 2,
      "editable": true,
      "error": false,
      "fill": 2,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 12,
        "x": 0,
        "y": 113
      },
      "id": 43,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": false,
        "max": true,
        "min": true,
        "rightSide": false,
        "show": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_opened_files{instance=\"$host\"}[$interval]) or irate(mysql_global_status_opened_files{instance=\"$host\"}[5m])",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Openings",
          "metric": "",
          "refId": "A",
          "step": 20
        }
      ],
      "thresholds": [],
      "title": "MySQL File Openings",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "decimals": 2,
      "editable": true,
      "error": false,
      "fill": 2,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 12,
        "x": 12,
        "y": 113
      },
      "id": 41,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": false,
        "max": true,
        "min": true,
        "rightSide": false,
        "show": true,
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "mysql_global_status_open_files{instance=\"$host\"}",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Open Files",
          "metric": "",
          "refId": "A",
          "step": 20
        },
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "mysql_global_variables_open_files_limit{instance=\"$host\"}",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Open Files Limit",
          "metric": "",
          "refId": "D",
          "step": 20
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "expr": "mysql_global_status_innodb_num_open_files{instance=\"$host\"}",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "InnoDB Open Files",
          "refId": "B",
          "step": 20
        }
      ],
      "thresholds": [],
      "title": "MySQL Open Files",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "collapsed": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 120
      },
      "id": 393,
      "panels": [],
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "refId": "A"
        }
      ],
      "title": "Table Openings",
      "type": "row"
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "decimals": 2,
      "description": "**MySQL Table Open Cache Status**\n\nThe recommendation is to set the `table_open_cache_instances` to a loose correlation to virtual CPUs, keeping in mind that more instances means the cache is split more times. If you have a cache set to 500 but it has 10 instances, each cache will only have 50 cached.\n\nThe `table_definition_cache` and `table_open_cache` can be left as default as they are auto-sized MySQL 5.6 and above (ie: do not set them to any value).",
      "editable": true,
      "error": false,
      "fill": 2,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 12,
        "x": 0,
        "y": 121
      },
      "id": 44,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": false,
        "max": true,
        "min": true,
        "rightSide": false,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [
        {
          "title": "Server Status Variables (table_open_cache)",
          "url": "http://dev.mysql.com/doc/refman/5.6/en/server-system-variables.html#sysvar_table_open_cache"
        }
      ],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [
        {
          "alias": "Table Open Cache Hit Ratio",
          "yaxis": 2
        }
      ],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(mysql_global_status_opened_tables{instance=\"$host\"}[$interval]) or irate(mysql_global_status_opened_tables{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Openings",
          "metric": "",
          "refId": "A",
          "step": 20
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "expr": "rate(mysql_global_status_table_open_cache_hits{instance=\"$host\"}[$interval]) or irate(mysql_global_status_table_open_cache_hits{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Hits",
          "refId": "B",
          "step": 20
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "expr": "rate(mysql_global_status_table_open_cache_misses{instance=\"$host\"}[$interval]) or irate(mysql_global_status_table_open_cache_misses{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Misses",
          "refId": "C",
          "step": 20
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "expr": "rate(mysql_global_status_table_open_cache_overflows{instance=\"$host\"}[$interval]) or irate(mysql_global_status_table_open_cache_overflows{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Misses due to Overflows",
          "refId": "D",
          "step": 20
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "expr": "(rate(mysql_global_status_table_open_cache_hits{instance=\"$host\"}[$interval]) or irate(mysql_global_status_table_open_cache_hits{instance=\"$host\"}[5m]))/((rate(mysql_global_status_table_open_cache_hits{instance=\"$host\"}[$interval]) or irate(mysql_global_status_table_open_cache_hits{instance=\"$host\"}[5m]))+(rate(mysql_global_status_table_open_cache_misses{instance=\"$host\"}[$interval]) or irate(mysql_global_status_table_open_cache_misses{instance=\"$host\"}[5m])))",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Table Open Cache Hit Ratio",
          "refId": "E",
          "step": 20
        }
      ],
      "thresholds": [],
      "title": "MySQL Table Open Cache Status",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "percentunit",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "decimals": 2,
      "description": "**MySQL Open Tables**\n\nThe recommendation is to set the `table_open_cache_instances` to a loose correlation to virtual CPUs, keeping in mind that more instances means the cache is split more times. If you have a cache set to 500 but it has 10 instances, each cache will only have 50 cached.\n\nThe `table_definition_cache` and `table_open_cache` can be left as default as they are auto-sized MySQL 5.6 and above (ie: do not set them to any value).",
      "editable": true,
      "error": false,
      "fill": 2,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 12,
        "x": 12,
        "y": 121
      },
      "id": 42,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": false,
        "max": true,
        "min": true,
        "rightSide": false,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [
        {
          "title": "Server Status Variables (table_open_cache)",
          "url": "http://dev.mysql.com/doc/refman/5.6/en/server-system-variables.html#sysvar_table_open_cache"
        }
      ],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "mysql_global_status_open_tables{instance=\"$host\"}",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Open Tables",
          "metric": "",
          "refId": "B",
          "step": 20
        },
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "mysql_global_variables_table_open_cache{instance=\"$host\"}",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Table Open Cache",
          "metric": "",
          "refId": "C",
          "step": 20
        }
      ],
      "thresholds": [],
      "title": "MySQL Open Tables",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "collapsed": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 128
      },
      "id": 394,
      "panels": [],
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "refId": "A"
        }
      ],
      "title": "MySQL Table Definition Cache",
      "type": "row"
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "decimals": 2,
      "description": "**MySQL Table Definition Cache**\n\nThe recommendation is to set the `table_open_cache_instances` to a loose correlation to virtual CPUs, keeping in mind that more instances means the cache is split more times. If you have a cache set to 500 but it has 10 instances, each cache will only have 50 cached.\n\nThe `table_definition_cache` and `table_open_cache` can be left as default as they are auto-sized MySQL 5.6 and above (ie: do not set them to any value).",
      "editable": true,
      "error": false,
      "fill": 2,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 24,
        "x": 0,
        "y": 129
      },
      "id": 54,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": false,
        "max": true,
        "min": true,
        "rightSide": false,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [
        {
          "title": "Server Status Variables (table_open_cache)",
          "url": "http://dev.mysql.com/doc/refman/5.6/en/server-system-variables.html#sysvar_table_open_cache"
        }
      ],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [
        {
          "alias": "Opened Table Definitions",
          "yaxis": 2
        }
      ],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "mysql_global_status_open_table_definitions{instance=\"$host\"}",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Open Table Definitions",
          "metric": "",
          "refId": "B",
          "step": 20
        },
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "mysql_global_variables_table_definition_cache{instance=\"$host\"}",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Table Definitions Cache Size",
          "metric": "",
          "refId": "C",
          "step": 20
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "expr": "rate(mysql_global_status_opened_table_definitions{instance=\"$host\"}[$interval]) or irate(mysql_global_status_opened_table_definitions{instance=\"$host\"}[5m])",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Opened Table Definitions",
          "refId": "A",
          "step": 20
        }
      ],
      "thresholds": [],
      "title": "MySQL Table Definition Cache",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "collapsed": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 136
      },
      "id": 395,
      "panels": [],
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "refId": "A"
        }
      ],
      "title": "System Charts",
      "type": "row"
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "editable": true,
      "error": false,
      "fill": 2,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 8,
        "x": 0,
        "y": 137
      },
      "id": 31,
      "legend": {
        "alignAsTable": false,
        "avg": true,
        "current": false,
        "hideEmpty": false,
        "max": false,
        "min": false,
        "rightSide": false,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2s",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(node_vmstat_pgpgin{instance=\"$host\"}[$interval]) * 1024 or irate(node_vmstat_pgpgin{instance=\"$host\"}[5m]) * 1024",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Page In",
          "metric": "",
          "refId": "A",
          "step": 20,
          "target": ""
        },
        {
          "calculatedInterval": "2s",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(node_vmstat_pgpgout{instance=\"$host\"}[$interval]) * 1024 or irate(node_vmstat_pgpgout{instance=\"$host\"}[5m]) * 1024",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Page Out",
          "metric": "",
          "refId": "B",
          "step": 20,
          "target": ""
        }
      ],
      "thresholds": [],
      "title": "I/O Activity",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "Bps",
          "label": "",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "bytes",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "editable": true,
      "error": false,
      "fill": 6,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 8,
        "x": 8,
        "y": 137
      },
      "height": "250px",
      "id": 37,
      "legend": {
        "alignAsTable": false,
        "avg": true,
        "current": false,
        "hideEmpty": false,
        "max": false,
        "min": false,
        "rightSide": false,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": true,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2s",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "node_memory_MemTotal_bytes{instance=\"$host\"} - (node_memory_MemFree_bytes{instance=\"$host\"} + node_memory_Buffers{instance=\"$host\"} + node_memory_Cached{instance=\"$host\"})",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Used",
          "metric": "",
          "refId": "A",
          "step": 20,
          "target": ""
        },
        {
          "calculatedInterval": "2s",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "node_memory_MemFree_bytes{instance=\"$host\"}",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Free",
          "metric": "",
          "refId": "B",
          "step": 20,
          "target": ""
        },
        {
          "calculatedInterval": "2s",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "node_memory_Buffers{instance=\"$host\"}",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Buffers",
          "metric": "",
          "refId": "D",
          "step": 20,
          "target": ""
        },
        {
          "calculatedInterval": "2s",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "node_memory_Cached{instance=\"$host\"}",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Cached",
          "metric": "",
          "refId": "E",
          "step": 20,
          "target": ""
        }
      ],
      "thresholds": [],
      "title": "Memory Distribution",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "bytes",
          "label": "",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "bytes",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {
        "Load 1m": "#58140C",
        "Max Core Utilization": "#bf1b00",
        "iowait": "#e24d42",
        "nice": "#1f78c1",
        "softirq": "#806eb7",
        "system": "#eab839",
        "user": "#508642"
      },
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "editable": true,
      "error": false,
      "fill": 6,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 8,
        "x": 16,
        "y": 137
      },
      "height": "",
      "id": 2,
      "legend": {
        "alignAsTable": false,
        "avg": true,
        "current": false,
        "hideEmpty": true,
        "hideZero": true,
        "max": false,
        "min": false,
        "rightSide": false,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [
        {
          "alias": "Max Core Utilization",
          "lines": false,
          "pointradius": 1,
          "points": true,
          "stack": false
        },
        {
          "alias": "Load 1m",
          "color": "#58140C",
          "fill": 2,
          "stack": false,
          "yaxis": 2
        }
      ],
      "spaceLength": 10,
      "stack": true,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2s",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "clamp_max(((avg by (mode) ( (clamp_max(rate(node_cpu_seconds_total{instance=\"$host\",mode!=\"idle\"}[$interval]),1)) or (clamp_max(irate(node_cpu_seconds_total{instance=\"$host\",mode!=\"idle\"}[5m]),1)) ))*100 or (avg_over_time(node_cpu_seconds_total_average{instance=~\"$host\", mode!=\"total\", mode!=\"idle\"}[$interval]) or avg_over_time(node_cpu_seconds_total_average{instance=~\"$host\", mode!=\"total\", mode!=\"idle\"}[5m]))),100)",
          "format": "time_series",
          "hide": false,
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "{{ mode }}",
          "metric": "",
          "refId": "A",
          "step": 20
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "expr": "clamp_max(max by () (sum  by (cpu) ( (clamp_max(rate(node_cpu_seconds_total{instance=\"$host\",mode!=\"idle\",mode!=\"iowait\"}[$interval]),1)) or (clamp_max(irate(node_cpu_seconds_total{instance=\"$host\",mode!=\"idle\",mode!=\"iowait\"}[5m]),1)) ))*100,100)",
          "format": "time_series",
          "hide": true,
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Max Core Utilization",
          "refId": "B",
          "step": 20
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "expr": "node_load1{instance=\"$host\"}",
          "format": "time_series",
          "hide": false,
          "intervalFactor": 2,
          "legendFormat": "Load 1m",
          "refId": "C"
        }
      ],
      "thresholds": [],
      "title": "CPU Usage / Load",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "decimals": 1,
          "format": "percent",
          "label": "",
          "logBase": 1,
          "max": 100,
          "min": 0,
          "show": true
        },
        {
          "format": "none",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "decimals": 2,
      "editable": true,
      "error": false,
      "fill": 2,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 8,
        "x": 0,
        "y": 144
      },
      "height": "250px",
      "id": 36,
      "legend": {
        "alignAsTable": false,
        "avg": true,
        "current": false,
        "hideEmpty": true,
        "hideZero": true,
        "max": false,
        "min": false,
        "rightSide": false,
        "show": true,
        "total": false,
        "values": true
      },
      "lines": false,
      "linewidth": 2,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 1,
      "points": true,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "sum((rate(node_disk_read_time_seconds_total{device!~\"dm-.+\", instance=\"$host\"}[$interval]) / rate(node_disk_reads_completed_total{device!~\"dm-.+\", instance=\"$host\"}[$interval])) or (irate(node_disk_read_time_seconds_total{device!~\"dm-.+\", instance=\"$host\"}[5m]) / irate(node_disk_reads_completed_total{device!~\"dm-.+\", instance=\"$host\"}[5m]))\nor avg_over_time(aws_rds_read_latency_average{instance=\"$host\"}[$interval]) or avg_over_time(aws_rds_read_latency_average{instance=\"$host\"}[5m]))",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Read",
          "metric": "",
          "refId": "A",
          "step": 20,
          "target": ""
        },
        {
          "calculatedInterval": "2m",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "sum((rate(node_disk_write_time_seconds_total{device!~\"dm-.+\", instance=\"$host\"}[$interval]) / rate(node_disk_writes_completed_total{device!~\"dm-.+\", instance=\"$host\"}[$interval])) or (irate(node_disk_write_time_seconds_total{device!~\"dm-.+\", instance=\"$host\"}[5m]) / irate(node_disk_writes_completed_total{device!~\"dm-.+\", instance=\"$host\"}[5m])) or \navg_over_time(aws_rds_write_latency_average{instance=\"$host\"}[$interval]) or avg_over_time(aws_rds_write_latency_average{instance=\"$host\"}[5m]))",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Write",
          "metric": "",
          "refId": "B",
          "step": 20,
          "target": ""
        }
      ],
      "thresholds": [],
      "title": "Disk Latency",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "ms",
          "label": "",
          "logBase": 2,
          "show": true
        },
        {
          "format": "ms",
          "label": "",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "editable": true,
      "error": false,
      "fill": 2,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 8,
        "x": 8,
        "y": 144
      },
      "height": "250px",
      "id": 21,
      "legend": {
        "alignAsTable": false,
        "avg": true,
        "current": false,
        "hideEmpty": false,
        "max": false,
        "min": false,
        "rightSide": false,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [
        {
          "alias": "Outbound",
          "transform": "negative-Y"
        }
      ],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2s",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "sum(rate(node_network_receive_bytes_total{instance=\"$host\", device!=\"lo\"}[$interval])) or sum(irate(node_network_receive_bytes_total{instance=\"$host\", device!=\"lo\"}[5m])) or sum(max_over_time(rdsosmetrics_network_rx{instance=\"$host\"}[$interval])) or sum(max_over_time(rdsosmetrics_network_rx{instance=\"$host\"}[5m])) ",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Inbound",
          "metric": "",
          "refId": "B",
          "step": 20,
          "target": ""
        },
        {
          "calculatedInterval": "2s",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "sum(rate(node_network_transmit_bytes_total{instance=\"$host\", device!=\"lo\"}[$interval])) or sum(irate(node_network_transmit_bytes_total{instance=\"$host\", device!=\"lo\"}[5m])) or\nsum(max_over_time(rdsosmetrics_network_tx{instance=\"$host\"}[$interval])) or sum(max_over_time(rdsosmetrics_network_tx{instance=\"$host\"}[5m]))",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Outbound",
          "metric": "",
          "refId": "A",
          "step": 20,
          "target": ""
        }
      ],
      "thresholds": [],
      "title": "Network Traffic",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "Bps",
          "label": "",
          "logBase": 1,
          "show": true
        },
        {
          "format": "bytes",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "editable": true,
      "error": false,
      "fill": 2,
      "grid": {},
      "gridPos": {
        "h": 7,
        "w": 8,
        "x": 16,
        "y": 144
      },
      "id": 38,
      "legend": {
        "alignAsTable": false,
        "avg": true,
        "current": false,
        "hideEmpty": false,
        "max": false,
        "min": false,
        "rightSide": false,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 2,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "calculatedInterval": "2s",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(node_vmstat_pswpin{instance=\"$host\"}[$interval]) * 4096 or irate(node_vmstat_pswpin{instance=\"$host\"}[5m]) * 4096",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Swap In (Reads)",
          "metric": "",
          "refId": "A",
          "step": 20,
          "target": ""
        },
        {
          "calculatedInterval": "2s",
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "datasourceErrors": {},
          "errors": {},
          "expr": "rate(node_vmstat_pswpout{instance=\"$host\"}[$interval]) * 4096 or irate(node_vmstat_pswpout{instance=\"$host\"}[5m]) * 4096",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "Swap Out (Writes)",
          "metric": "",
          "refId": "B",
          "step": 20,
          "target": ""
        }
      ],
      "thresholds": [],
      "title": "Swap Activity",
      "tooltip": {
        "msResolution": false,
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "Bps",
          "label": "",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "bytes",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    }
  ],
  "refresh": "1m",
  "schemaVersion": 38,
  "style": "dark",
  "tags": [
    "Percona",
    "MySQL"
  ],
  "templating": {
    "list": [
      {
        "allFormat": "glob",
        "auto": true,
        "auto_count": 200,
        "auto_min": "1s",
        "current": {
          "selected": false,
          "text": "auto",
          "value": "$__auto_interval_interval"
        },
        "datasource": "prometheus",
        "hide": 0,
        "includeAll": false,
        "label": "Interval",
        "multi": false,
        "multiFormat": "glob",
        "name": "interval",
        "options": [
          {
            "selected": true,
            "text": "auto",
            "value": "$__auto_interval_interval"
          },
          {
            "selected": false,
            "text": "1s",
            "value": "1s"
          },
          {
            "selected": false,
            "text": "5s",
            "value": "5s"
          },
          {
            "selected": false,
            "text": "1m",
            "value": "1m"
          },
          {
            "selected": false,
            "text": "5m",
            "value": "5m"
          },
          {
            "selected": false,
            "text": "1h",
            "value": "1h"
          },
          {
            "selected": false,
            "text": "6h",
            "value": "6h"
          },
          {
            "selected": false,
            "text": "1d",
            "value": "1d"
          }
        ],
        "query": "1s,5s,1m,5m,1h,6h,1d",
        "refresh": 2,
        "skipUrlSync": false,
        "type": "interval"
      },
      {
        "allFormat": "glob",
        "current": {
          "selected": false,
          "text": "10.16.0.7:9104",
          "value": "10.16.0.7:9104"
        },
        "datasource": {
          "type": "prometheus",
          "uid": "prometheus"
        },
        "definition": "",
        "hide": 0,
        "includeAll": false,
        "label": "Host",
        "multi": false,
        "multiFormat": "regex values",
        "name": "host",
        "options": [],
        "query": "label_values(mysql_up, instance)",
        "refresh": 1,
        "refresh_on_load": false,
        "regex": "",
        "skipUrlSync": false,
        "sort": 1,
        "type": "query",
        "useTags": false
      }
    ]
  },
  "time": {
    "from": "now-12h",
    "to": "now"
  },
  "timepicker": {
    "collapse": false,
    "enable": true,
    "hidden": false,
    "notice": false,
    "now": true,
    "refresh_intervals": [
      "5s",
      "10s",
      "30s",
      "1m",
      "5m",
      "15m",
      "30m",
      "1h",
      "2h",
      "1d"
    ],
    "status": "Stable",
    "time_options": [
      "5m",
      "15m",
      "1h",
      "6h",
      "12h",
      "24h",
      "2d",
      "7d",
      "30d"
    ],
    "type": "timepicker"
  },
  "timezone": "browser",
  "title": "MySQL Overview",
  "uid": "MQWgroiiz",
  "version": 2,
  "weekStart": ""
}

Original link: https://dashen.tech/2020/08/11/Prometheus告警规则配置/

Copyright notice: Please credit the source when reposting.

清澄秋爽

