In a previous post, we added a PrometheusRule for Argo that fired an alert when an Argo Workflow failed, and we were able to see that alert firing in AlertManager. AlertManager is the Prometheus component responsible for sending notifications when an alert is firing.
In a previous post, we went through how to make smaller Git commits. Sometimes during code review or development we find a commit that would be easier to understand if it were split into multiple commits. We can combine what was discussed in the previous post with Git’s reset command to split an existing commit.
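As a rough sketch of the reset-based approach (not necessarily the exact steps from the post), the script below builds a throwaway repository, creates one commit that mixes two unrelated changes, and then splits it in two. The file names, commit messages, and user identity are hypothetical and exist only for illustration.

```shell
#!/bin/sh
set -eu

# Work in a throwaway repository so the example does not touch real work.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo "base" > README.md
git add README.md
git commit -q -m "Initial commit"

# One commit that mixes two unrelated changes (hypothetical files).
echo "parser" > parser.txt
echo "docs" > docs.txt
git add parser.txt docs.txt
git commit -q -m "Add parser and docs"

# Undo the last commit but keep its changes in the working tree, unstaged.
# A plain `git reset` defaults to --mixed, which is what we want here.
git reset -q HEAD~1

# Re-commit the same changes one piece at a time.
git add parser.txt
git commit -q -m "Add parser"
git add docs.txt
git commit -q -m "Add docs"

# The history now shows three commits instead of two.
git log --oneline
```

When the changes to separate live in the same file, `git add -p` can stage individual hunks instead of whole files.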
There are lots of great articles on the benefits of making smaller Git commits: they are easier to review, they make it easier to identify the source of a bug later with git bisect, and they make merge conflicts easier to resolve. These articles tend to skim over how to actually make smaller commits.
This post builds on top of Viewing Argo’s Prometheus metrics and assumes you have a Kubernetes cluster running Argo and Prometheus.
In the previous post, a ServiceMonitor was created to tell Prometheus how to pull metrics from Argo’s workflow-controller-metrics service. Now, we’ll add a PrometheusRule that fires an alert when any Argo Workflow fails.
Argo is a workflow manager for Kubernetes. Prometheus is a monitoring tool for collecting metrics. Argo may be configured to expose an endpoint for Prometheus to collect metrics. kind is a tool for running a Kubernetes cluster inside a Docker container, which makes it great for local development and testing.