Argo is a workflow manager for Kubernetes. Prometheus is a monitoring tool for collecting metrics. Argo may be configured to expose an endpoint for Prometheus to collect metrics. kind is a tool for running a Kubernetes cluster inside a Docker container, which makes it great for local development and testing.
When I first started playing with Argo and its Prometheus metrics, there wasn't a quick resource for figuring it out. On top of that, using kind can make things a bit more challenging, since the whole cluster is running inside a Docker container.
Updated (December 06, 2020):
- Use Argo v2.11.8 instead of v2.7.2
- Use kube-prometheus v0.6.0 instead of v0.3.0
- Use kind v0.9.0 instead of v0.7.0
- Use kubectl v1.19.4 instead of v1.17.4
Install kind
For this we'll be using kind v0.9.0, which can be installed from its GitHub Releases page.
For 64bit Linux, this can be done via:
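Something like the following should work, assuming the kind-linux-amd64 asset from the v0.9.0 release and /usr/local/bin as the install location:

```sh
# Download the kind v0.9.0 binary for 64-bit Linux and put it on the PATH
curl -Lo ./kind https://github.com/kubernetes-sigs/kind/releases/download/v0.9.0/kind-linux-amd64
chmod +x ./kind
sudo mv ./kind /usr/local/bin/kind
```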
Create kind cluster
Creating a Kubernetes cluster is relatively easy with kind. First make sure Docker is running. Then create a cluster via:
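With Docker up, the default cluster is enough for this walkthrough:

```sh
# Create a local cluster named "kind" using kind's default node image
kind create cluster
```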
This will create a v1.19.1 Kubernetes cluster, since that's the default node image shipped with kind v0.9.0.
Install kubectl
kubectl will help with deploying new resources to our cluster. We’ll be using kubectl v1.19.4. The Kubernetes docs have instructions for multiple operating systems.
For 64bit Linux, the instructions are:
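Roughly the following, assuming the versioned download URL pattern the Kubernetes docs used at the time:

```sh
# Download kubectl v1.19.4 for 64-bit Linux and install it on the PATH
curl -LO https://storage.googleapis.com/kubernetes-release/release/v1.19.4/bin/linux/amd64/kubectl
chmod +x ./kubectl
sudo mv ./kubectl /usr/local/bin/kubectl
```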
Deploy Argo
Argo's Workflow Controller is responsible for detecting and executing Workflows. The controller may be configured to watch either every namespace in a Kubernetes cluster or a single namespace. We'll use a namespace-scoped Argo installation.
A namespace-scoped Argo may be installed to the argo namespace via:
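Something like the following, assuming the namespace-install.yaml manifest under the v2.11.8 tag of the argoproj/argo repository:

```sh
# Create the argo namespace and apply the namespace-scoped install manifest
kubectl create namespace argo
kubectl apply -n argo -f https://raw.githubusercontent.com/argoproj/argo/v2.11.8/manifests/namespace-install.yaml
```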
By default, Argo uses the default service account in the namespace where a Workflow runs. Usually, the default service account doesn't have enough rights to observe the resources created by a Workflow. For the sake of this tutorial, we can bind the argo namespace's default service account to the cluster-admin role. In a production environment, you'd want to create a new service account and configure Argo to use that instead of running as cluster-admin.
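One way to create that binding (the binding name itself is arbitrary):

```sh
# Grant the argo namespace's default service account cluster-admin rights (tutorial only!)
kubectl create rolebinding argo-default-admin \
  --clusterrole=cluster-admin \
  --serviceaccount=argo:default \
  --namespace argo
```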
Configure Argo to work in kind
kind clusters don't use Docker, which is the default container runtime executor used by Argo's Workflow Controller; kind uses containerd instead. Fortunately, the Workflow Controller's executor is configurable, and Argo documents the options pretty well in its workflow-controller-configmap docs.
We'll set this configuration in the Workflow Controller's ConfigMap.
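Here I'm assuming the k8sapi executor; the kubelet or pns executors would also avoid the Docker socket:

```sh
# Switch the Workflow Controller from the docker executor to k8sapi
kubectl patch configmap workflow-controller-configmap -n argo \
  --type merge \
  -p '{"data":{"containerRuntimeExecutor":"k8sapi"}}'
```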
Install Argo CLI
Argo has a CLI that aids with submitting Argo Workflows. We'll be using v2.11.8, which can also be installed from its GitHub Releases page.
For 64bit Linux, this can be done via:
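Something like the following, assuming the gzipped argo-linux-amd64 asset on the v2.11.8 release page (check the release assets if the name differs):

```sh
# Download and install the Argo CLI v2.11.8 for 64-bit Linux
curl -sLO https://github.com/argoproj/argo/releases/download/v2.11.8/argo-linux-amd64.gz
gunzip argo-linux-amd64.gz
chmod +x argo-linux-amd64
sudo mv ./argo-linux-amd64 /usr/local/bin/argo
```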
Run hello-world Argo Workflow
We'll want to execute an Argo Workflow so that there is data for Prometheus to collect. Argo provides several example Workflows; we'll use the hello-world example to keep things simple.
The hello-world Workflow may be executed by running:
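For example, submitting the example manifest straight from the Argo repository:

```sh
# Submit the hello-world example Workflow to the argo namespace and watch it run
argo submit -n argo --watch \
  https://raw.githubusercontent.com/argoproj/argo/v2.11.8/examples/hello-world.yaml
```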
Deploy Prometheus
The CoreOS organization has created a project called kube-prometheus, which can be used to deploy Prometheus within a Kubernetes cluster.
We'll use v0.6.0 of kube-prometheus, which can be deployed via:
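Roughly the kube-prometheus quickstart, assuming the repository's home under the prometheus-operator organization:

```sh
# Clone kube-prometheus at v0.6.0
git clone --branch v0.6.0 https://github.com/prometheus-operator/kube-prometheus.git
cd kube-prometheus

# Install the CRDs and the Prometheus Operator first, then wait for the CRDs to register
kubectl create -f manifests/setup
until kubectl get servicemonitors --all-namespaces ; do date; sleep 1; echo ""; done

# Install the rest of the monitoring stack (Prometheus, Alertmanager, Grafana, exporters)
kubectl create -f manifests/
```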
Configure Prometheus RBAC
The kube-prometheus installation sets up RBAC for Prometheus so that Prometheus may access resources in the default, kube-system, and monitoring namespaces. Our Argo installation is in the argo namespace, so we’ll need to add additional RBAC permissions for Prometheus to access the argo resources.
First, we’ll create a prometheus-k8s role in the argo namespace:
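The role only needs read access to the resources Prometheus discovers, mirroring what kube-prometheus grants itself in the default, kube-system, and monitoring namespaces:

```sh
# Allow Prometheus to discover services, endpoints, and pods in the argo namespace
kubectl create role prometheus-k8s -n argo \
  --verb=get,list,watch \
  --resource=services,endpoints,pods
```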
Next we’ll need to bind this role to Prometheus’ service account:
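This assumes the prometheus-k8s service account that kube-prometheus creates in the monitoring namespace:

```sh
# Bind the new role to Prometheus' service account
kubectl create rolebinding prometheus-k8s -n argo \
  --role=prometheus-k8s \
  --serviceaccount=monitoring:prometheus-k8s
```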
Create workflow-controller-metrics ServiceMonitor
Before we start collecting metrics from Argo, we'll want to label the workflow-controller-metrics service so that the ServiceMonitor we create next can select it. To do that, run:
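The label key and value below are an arbitrary choice; they just need to match the ServiceMonitor's selector:

```sh
# Label the metrics service so a ServiceMonitor can select it
kubectl label service workflow-controller-metrics -n argo app=workflow-controller-metrics
```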
We’ll need to create a ServiceMonitor so that Prometheus knows to scrape the workflow-controller-metrics service. The easiest way is to create a YAML file with the following contents:
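A sketch of the manifest, assuming the service's metrics port is named metrics and reusing the label applied above:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: workflow-controller-metrics
  namespace: argo
  labels:
    app: workflow-controller-metrics
spec:
  endpoints:
  - port: metrics
  selector:
    matchLabels:
      app: workflow-controller-metrics
```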
We'll assume this YAML file is created at ~/workflow-controller-metrics-servicemonitor.yaml.
We can then create the ServiceMonitor via:
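For example:

```sh
# Create the ServiceMonitor (its namespace is set in the manifest)
kubectl create -f ~/workflow-controller-metrics-servicemonitor.yaml
```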
It can take up to 5 minutes for the Prometheus Operator to detect a new ServiceMonitor and reload Prometheus' configuration. Fortunately, we can watch the logs of Prometheus' config reloader container to see when it picks up the new configuration.
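Assuming the container is named prometheus-config-reloader inside the prometheus-k8s-0 pod:

```sh
# Follow the config reloader's logs to see when the new scrape config lands
kubectl logs -f -n monitoring prometheus-k8s-0 -c prometheus-config-reloader
```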
After a few minutes you should see a log message indicating the configuration reload has been triggered.
View Argo metrics on Prometheus dashboard
First, we’ll want to expose the Prometheus dashboard on a local port so that we can view the dashboard while it’s running in the Kubernetes cluster created by kind.
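Port-forwarding the prometheus-k8s service from the monitoring namespace works well for this:

```sh
# Forward local port 9090 to the Prometheus service
kubectl port-forward -n monitoring svc/prometheus-k8s 9090
```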
Now navigate to http://localhost:9090 in your browser.
Click the "Status" dropdown and then click on "Service Discovery." You should see an argo/workflow-controller-metrics/0 entry. If you expand this entry, you'll notice that the workflow-controller-metrics target's labels are picked up, while the labels for the other services in the argo namespace are all dropped.
If you then click the "Status" dropdown and then click on "Targets," you'll see argo/workflow-controller-metrics/0 at the top. From there we can see that workflow-controller-metrics is up, along with details such as how long ago Prometheus last scraped the workflow-controller's metrics.
Back on the "Graph" page, you can enter an expression like argo_workflows_count{} and see how many Argo workflows are in each of the following states: Error, Failed, Pending, Running, Skipped, and Succeeded. The metric with the label status=Succeeded should have a value of 1 because of our successful hello-world workflow.
From here, you’re able to use Prometheus’ dashboard to explore metrics created by Argo’s workflow-controller.