Viewing Argo’s Prometheus metrics in a kind cluster

Argo is a workflow manager for Kubernetes. Prometheus is a monitoring tool that collects metrics from instrumented services. Argo can be configured to expose a metrics endpoint for Prometheus to scrape. kind is a tool for running a Kubernetes cluster inside a Docker container, which makes it great for local development and testing.

When I first started playing with Argo and its Prometheus metrics, there wasn’t a quick resource that covered how to wire everything together. On top of that, kind can make things more challenging since the cluster runs inside a Docker container.

Updated (December 06, 2020):

  • Use Argo v2.11.8 instead of v2.7.2
  • Use kube-prometheus v0.6.0 instead of v0.3.0
  • Use kind v0.9.0 instead of v0.7.0
  • Use kubectl v1.19.4 instead of v1.17.4

Install kind

For this we’ll be using kind v0.9.0, which can be installed from its GitHub releases page.

For 64bit Linux, this can be done via:

curl https://github.com/kubernetes-sigs/kind/releases/download/v0.9.0/kind-linux-amd64 \
  --location \
  --output ~/kind

chmod +x ~/kind

~/kind version

Create kind cluster

Creating a Kubernetes cluster is relatively easy with kind. First make sure Docker is running. Then create a cluster via:

~/kind create cluster

This will create a v1.19.1 Kubernetes cluster.
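To confirm the cluster came up, you can list kind’s clusters and the Docker container the control plane runs in. This is standard kind/Docker usage; the cluster name defaults to kind, so the container is named kind-control-plane:

```shell
# List clusters managed by kind; the default cluster is named "kind"
~/kind get clusters

# The cluster's control plane runs as a Docker container
docker ps --filter name=kind-control-plane
```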

Install kubectl

kubectl will help with deploying new resources to our cluster. We’ll be using kubectl v1.19.4. The Kubernetes docs have instructions for multiple operating systems.

For 64bit Linux, the instructions are:

curl https://storage.googleapis.com/kubernetes-release/release/v1.19.4/bin/linux/amd64/kubectl \
  --location \
  --output ~/kubectl

chmod +x ~/kubectl

~/kubectl version

Deploy Argo

Argo’s Workflow Controller is responsible for detecting new Workflows and executing them. The controller may be configured to observe all namespaces in a Kubernetes cluster or a single namespace. We’ll use a namespace-scoped Argo installation.

A namespace-scoped Argo may be installed to the argo namespace via:

~/kubectl create namespace argo

~/kubectl create \
  --filename https://raw.githubusercontent.com/argoproj/argo/v2.11.8/manifests/namespace-install.yaml \
  --namespace argo

~/kubectl wait deployment workflow-controller \
  --for condition=Available \
  --namespace argo

By default, Argo will use the default service account in the namespace where a Workflow runs. Usually, the default service account doesn’t have enough rights to observe resources created by a Workflow. We can bind the argo namespace’s default service account to the cluster-admin role for the sake of this tutorial. In a production environment, you’d want to create a new service account and configure Argo to use that instead of running as cluster-admin.

~/kubectl create rolebinding default-admin \
  --clusterrole cluster-admin \
  --namespace argo \
  --serviceaccount=argo:default
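To sanity-check the binding, kubectl can impersonate the service account and ask whether it now holds a permission the role grants. This is a standard `kubectl auth can-i` check; the impersonated name follows the `system:serviceaccount:<namespace>:<name>` convention:

```shell
# Should print "yes" once the rolebinding above is in place
~/kubectl auth can-i create pods \
  --namespace argo \
  --as system:serviceaccount:argo:default
```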

Configure Argo to work in kind

A kind cluster does not use Docker, the default container runtime assumed by Argo’s Workflow Controller; kind uses containerd instead. Fortunately, the Workflow Controller’s executor is configurable, and Argo documents the options well in its workflow-controller-configmap docs.

We’ll switch the executor to pns (Process Namespace Sharing), which doesn’t depend on Docker, by patching the Workflow Controller’s configmap:

~/kubectl patch configmap workflow-controller-configmap \
  --namespace argo \
  --patch '{"data": {"containerRuntimeExecutor": "pns"}}' \
  --type merge
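If you want to double-check that the patch landed, you can read the key back out of the configmap with a plain jsonpath query; it should print pns:

```shell
# Prints the configured executor; expect "pns" after the patch above
~/kubectl get configmap workflow-controller-configmap \
  --namespace argo \
  --output jsonpath='{.data.containerRuntimeExecutor}'
```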

Install Argo CLI

Argo has a CLI that aids with submitting Argo Workflows. We’ll be using v2.11.8, which can also be installed from its GitHub releases page.

For 64bit Linux, this can be done via:

curl https://github.com/argoproj/argo/releases/download/v2.11.8/argo-linux-amd64.gz \
  --location \
  --output ~/argo.gz

gunzip ~/argo.gz

chmod +x ~/argo

~/argo version

Run hello-world Argo Workflow

We’ll want to execute an Argo Workflow so that there is data for Prometheus metrics. Argo has several examples. We’ll use the hello-world example to keep this simple.

The hello-world Workflow may be executed by running:

~/argo submit https://raw.githubusercontent.com/argoproj/argo/v2.11.8/examples/hello-world.yaml \
  --namespace=argo \
  --watch
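Once the Workflow completes, you can list Workflows and pull the pod’s logs through the CLI. This is standard `argo list`/`argo logs` usage; the generated Workflow name will differ on your machine, so hello-world-abcde below is a placeholder:

```shell
# List Workflows in the argo namespace; hello-world should show Succeeded
~/argo list --namespace argo

# Fetch logs for the completed Workflow; replace hello-world-abcde
# with the generated name from the listing above
~/argo logs hello-world-abcde --namespace argo
```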

Deploy Prometheus

The CoreOS organization has created a project called kube-prometheus, which can be used to deploy Prometheus within a Kubernetes cluster.

We’ll use v0.6.0 of kube-prometheus which can be deployed via:

git clone https://github.com/coreos/kube-prometheus.git ~/kube-prometheus

cd ~/kube-prometheus

git checkout v0.6.0

~/kubectl create --filename ~/kube-prometheus/manifests/setup/
until ~/kubectl get servicemonitors --all-namespaces ; do sleep 1; done
~/kubectl create --filename ~/kube-prometheus/manifests/
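Before moving on, it’s worth waiting for Prometheus itself to come up. Assuming the labels from the kube-prometheus v0.6.0 manifests (the Prometheus pods carry app=prometheus), something like this works:

```shell
# Wait for the Prometheus pods to become Ready (label assumed from
# kube-prometheus v0.6.0 manifests; adjust if your labels differ)
~/kubectl wait pod \
  --for condition=Ready \
  --selector app=prometheus \
  --namespace monitoring \
  --timeout 300s
```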

Configure Prometheus RBAC

The kube-prometheus installation sets up RBAC for Prometheus so that Prometheus may access resources in the default, kube-system, and monitoring namespaces. Our Argo installation is in the argo namespace, so we’ll need to add additional RBAC permissions for Prometheus to access the argo resources.

First, we’ll create a prometheus-k8s role in the argo namespace:

~/kubectl create role prometheus-k8s \
  --namespace argo \
  --resource services,endpoints,pods \
  --verb get,list,watch

Next we’ll need to bind this role to Prometheus' service account:

~/kubectl create rolebinding prometheus-k8s \
  --namespace argo \
  --role prometheus-k8s \
  --serviceaccount monitoring:prometheus-k8s
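As with the Argo rolebinding earlier, you can verify that Prometheus’ service account can now read the argo namespace by impersonating it with `kubectl auth can-i`:

```shell
# Should print "yes" for the verbs granted by the prometheus-k8s role
~/kubectl auth can-i list endpoints \
  --namespace argo \
  --as system:serviceaccount:monitoring:prometheus-k8s
```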

Create workflow-controller-metrics ServiceMonitor

Before Prometheus can scrape metrics from Argo, we’ll want to label the workflow-controller-metrics service so that the ServiceMonitor we create next can select it. To do that, run:

~/kubectl label service workflow-controller-metrics app=workflow-controller \
  --namespace argo

We’ll need to create a ServiceMonitor so that Prometheus knows to scrape the workflow-controller-metrics service. The easiest way is to create a YAML file with the following contents:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: workflow-controller-metrics
  namespace: argo
spec:
  endpoints:
    - port: metrics
  namespaceSelector:
    matchNames:
      - argo
  selector:
    matchLabels:
      app: workflow-controller

We’ll assume this YAML file is created at ~/workflow-controller-metrics-servicemonitor.yaml. We can then create the ServiceMonitor via:

~/kubectl create \
  --filename ~/workflow-controller-metrics-servicemonitor.yaml

It can take up to 5 minutes for Prometheus to detect a new ServiceMonitor. Fortunately, we can watch the logs of Prometheus’ config reloader container to see when it picks up the new configuration.

~/kubectl logs prometheus-k8s-0 prometheus-config-reloader \
  --follow \
  --namespace monitoring

After a few minutes you should see a log message indicating the configuration reload has been triggered.

View Argo metrics on Prometheus dashboard

First, we’ll want to expose the Prometheus dashboard on a local port so that we can view the dashboard while it’s running in the Kubernetes cluster created by kind.

~/kubectl port-forward service/prometheus-k8s 9090 \
  --namespace monitoring

Now navigate to http://localhost:9090 in your browser. Click the “Status” dropdown and then click on “Service Discovery.” You should see an argo/workflow-controller-metrics/0 entry. If you expand this entry, you’ll notice the workflow-controller-metrics target’s labels being picked up, while the argo-server service’s labels are all dropped.

If you then click the “Status” dropdown and then click on “Targets,” you’ll see argo/workflow-controller-metrics/0 at the top. From there we can see that workflow-controller-metrics is up and running, along with how long ago Prometheus last scraped the workflow-controller’s metrics.

Back on the “Graph” page, you can enter an expression like argo_workflows_count{} to see how many Argo Workflows are in each of the following states: Error, Failed, Pending, Running, Skipped, and Succeeded. The series with the label status=Succeeded should have a value of 1 because of our successful hello-world Workflow.
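The same query can be run against Prometheus’ HTTP API while the port-forward is active, which is handy for scripting. The /api/v1/query endpoint is part of Prometheus’ stable HTTP API:

```shell
# Query the Prometheus HTTP API for the same metric; returns JSON
curl 'http://localhost:9090/api/v1/query' \
  --get \
  --data-urlencode 'query=argo_workflows_count{status="Succeeded"}'
```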

From here, you’re able to use Prometheus' dashboard to explore metrics created by Argo’s workflow-controller.