fix(ROX-31121): Monitoring deployment from Prometheus Community chart #1807
tommartensen merged 2 commits into master
A single-node development cluster (infra-pr-1807) was allocated in production infra for this PR.
CI will attempt to deploy
🔌 You can connect to this cluster with:
🛠️ And pull infractl from the deployed dev infra-server with:
🚲 You can then use the dev infra instance e.g.:
Further Development
☕ If you make changes, you can commit and push and CI will take care of updating the development cluster.
🚀 If you only modify configuration (chart/infra-server/configuration) or templates (chart/infra-server/{static,templates}), you can get a faster update with:
Logs
Logs for the development infra depending on your @redhat.com authuser:
Or:
Replace Bitnami kube-prometheus with Prometheus Community kube-prometheus-stack (83.5.1), refresh requirements.lock, and rewrite monitoring-values.yaml for the new subchart: same resource limits, PVC-backed Prometheus, disabled node/kube-state/kube control-plane scrapers, Slack routing via alertmanagerConfigSelector, tuned defaultRules for disabled targets, no custom image repos, and Grafana off.
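For orientation, the kind of overrides described above might look roughly like the following fragment of monitoring-values.yaml. This is a sketch, not the file from this PR: it assumes the subchart is referenced under its default name kube-prometheus-stack (no alias), and the resource figures, storage size, label selector, and the exact set of disabled scrapers and rule groups are hypothetical placeholders.

```yaml
# Hypothetical sketch of monitoring-values.yaml; all concrete values are placeholders.
kube-prometheus-stack:            # assumes the dependency is not aliased
  grafana:
    enabled: false                # Grafana off
  nodeExporter:
    enabled: false                # node scraper disabled
  kubeStateMetrics:
    enabled: false                # kube-state-metrics disabled
  kubeControllerManager:
    enabled: false                # control-plane scrapers disabled
  kubeScheduler:
    enabled: false
  kubeEtcd:
    enabled: false
  kubeProxy:
    enabled: false
  defaultRules:
    rules:                        # turn off rule groups whose targets are disabled
      etcd: false
      kubeControllerManager: false
      kubeProxy: false
      kubeStateMetrics: false
      nodeExporterAlerting: false
      nodeExporterRecording: false
  prometheus:
    prometheusSpec:
      resources:                  # placeholder limits, not the values from this PR
        limits:
          cpu: "1"
          memory: 2Gi
      storageSpec:                # PVC-backed Prometheus
        volumeClaimTemplate:
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 10Gi     # placeholder size
  alertmanager:
    alertmanagerSpec:
      alertmanagerConfigSelector: # selects the AlertmanagerConfig carrying the Slack route
        matchLabels:
          alertmanagerConfig: slack   # hypothetical label
```

With a selector like this, the Slack receiver itself would live in a separate AlertmanagerConfig resource carrying the matching label, rather than inline in the chart values.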
Why: The Bitnami kube-prometheus chart is outdated, and its legacy images were deleted from Docker Hub.
Also includes a fix for #1804 if NO_MONITORING is unset.
I tested the monitoring stack in a separate GKE cluster and it produced a test alert in Slack: https://redhat-internal.slack.com/archives/C06L0LDBEGZ/p1776413056761179