Distributed Tracing with Jaeger on Kubernetes

Introduction
Kubernetes is an open source orchestrator for deploying and managing containerized applications, including microservices.
Distributed tracing, also called distributed request tracing, is a method used to profile and monitor applications, especially those built using a microservices architecture. Distributed tracing helps pinpoint where failures occur and what causes poor performance.

In this article, we will set up a distributed tracing system using Jaeger for a Spring Boot application.

Key Terms in Distributed Tracing
Span

A span is a logical unit of work in Jaeger. Each span records the following:

  • Operation Name
  • Start Time of Operation
  • Duration of Operation

Trace
A trace is the data or execution path through the system. It can be thought of as a directed acyclic graph of spans.

Baggage
Baggage consists of key-value pairs that are added to the span context and propagated throughout the trace. An external process can inject baggage by setting the special HTTP header jaeger-baggage on a request.

Jaeger Architecture

Jaeger Components

Jaeger Client
Application tracing starts with the Jaeger client, a language-specific library (the Jaeger Java client in the case of a Spring Boot application) that can initialize its configuration from environment variables such as JAEGER_SERVICE_NAME, JAEGER_AGENT_HOST, JAEGER_AGENT_PORT and JAEGER_PROPAGATION.

Jaeger Agent
The Jaeger agent is a daemon that receives spans from Jaeger clients over UDP, batches them, and forwards them to the collectors.

Jaeger Collector
The Jaeger collector receives traces from Jaeger agents, runs them through a processing pipeline, and stores them in the specified storage backend.
The Jaeger collector is stateless and can be scaled to any number of instances on demand.

Jaeger Query
Jaeger Query is a service that retrieves traces from storage and formats them for display in the UI.

Jaeger Setup
Jaeger operator — Installation

Before deploying Jaeger, you need kubectl credentials that allow you to create and modify policies, deploy services, and create service account, custom resource definition, role and role binding objects.

kubectl create namespace observability
kubectl create -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/crds/jaegertracing.io_jaegers_crd.yaml
kubectl create -n observability -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/service_account.yaml
kubectl create -n observability -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/operator.yaml

For Standalone Server
kubectl create -n observability -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/role.yaml
kubectl create -n observability -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/role_binding.yaml

For Cluster Environment
kubectl create -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/cluster_role.yaml
kubectl create -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/cluster_role_binding.yaml

By default, the Jaeger Operator watches only the namespace where it is installed. To watch all namespaces, first download the operator.yaml file and set the WATCH_NAMESPACE value to an empty string.
File Name: operator.yaml
Entry:
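
A minimal sketch of what this entry might look like, assuming the operator Deployment in operator.yaml lists WATCH_NAMESPACE under the container's env. The container name below is an assumption, and all other fields of the file are omitted.

```yaml
# operator.yaml (fragment of the operator Deployment)
# Assumption: the container is named jaeger-operator.
spec:
  template:
    spec:
      containers:
        - name: jaeger-operator
          env:
            - name: WATCH_NAMESPACE
              value: ""   # empty string: watch all namespaces
```

Watching all namespaces pairs with the cluster role and cluster role binding from the "For Cluster Environment" commands, since the operator then needs cluster-wide permissions.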

Jaeger Instance — Installation
Creating a Jaeger instance is tied to a deployment strategy. The strategy is defined in the custom resource file and determines the architecture used for the Jaeger backend.

Deployment Strategy
AllInOne Strategy
This is the default strategy. The agent, collector and query service are packaged together and use in-memory storage by default.
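
As an illustration, a custom resource for this strategy can be very small. The sketch below assumes the jaegertracing.io/v1 API served by the operator installed earlier; the instance name is hypothetical.

```yaml
# Minimal Jaeger custom resource using the all-in-one strategy
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: simplest          # hypothetical instance name
spec:
  strategy: allInOne      # the default, so this line may be omitted
```

Because storage is in memory, traces are lost when the pod restarts, which makes this strategy a fit for development and testing rather than production.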

Production Strategy
Here the backend storage is Cassandra or Elasticsearch, and the agent can be installed either as a DaemonSet or as a sidecar. The query and collector services are configured with a supported storage type. By default, the agent is installed as a sidecar. To install the agent as a DaemonSet, set the agent strategy to DaemonSet.
File Name: jaeger.yaml
Entry:
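
A sketch of what jaeger.yaml might contain for the production strategy with the agent as a DaemonSet and Elasticsearch as storage. The instance name is an assumption chosen to match the my-jaeger-query service used later in this article, and the Elasticsearch URL is a hypothetical in-cluster endpoint that should be replaced with your own.

```yaml
# jaeger.yaml
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: my-jaeger            # assumption: yields the my-jaeger-query service
spec:
  strategy: production
  agent:
    strategy: DaemonSet      # omit to fall back to the sidecar default
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: http://elasticsearch:9200   # hypothetical endpoint
```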

If the agent strategy is not specified, the Jaeger agent is deployed as a sidecar by default.

Streaming Strategy
This strategy is designed to work together with the production strategy by adding a streaming capability (Kafka) that sits between the collector and the backend storage. This reduces the pressure on the backend storage under high load.
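
A rough sketch of a custom resource for the streaming strategy, in which the collector produces spans to Kafka and an ingester consumes them and writes to storage. The instance name, topic name, broker address and Elasticsearch URL are all hypothetical placeholders.

```yaml
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: my-jaeger-streaming                 # hypothetical instance name
spec:
  strategy: streaming
  collector:
    options:
      kafka:
        producer:
          topic: jaeger-spans               # hypothetical topic
          brokers: my-kafka-brokers.kafka:9092   # hypothetical broker address
  ingester:
    options:
      kafka:
        consumer:
          topic: jaeger-spans               # must match the producer topic
          brokers: my-kafka-brokers.kafka:9092
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: http://elasticsearch:9200   # hypothetical endpoint
```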

Reference
https://www.jaegertracing.io/docs/1.21/operator/#production-strategy

In our sample application, we have deployed the Jaeger agent as a DaemonSet and used the production strategy with Elasticsearch as the storage backend.

What is DaemonSet?
A DaemonSet ensures that a copy of a pod runs on each eligible node in the cluster.
When we add a node to the cluster, the DaemonSet automatically deploys a pod onto the new node.

Prerequisites

  • A Kubernetes cluster created with AKS, EKS or GKE
  • kubectl installed
  • Elasticsearch installed

STEP 1 : Create Namespace

kubectl create namespace jaeger

Verify that the namespace was created by executing the following command:
kubectl get namespaces

STEP 2 : Deploy Jaeger Agent as DaemonSet (Backend: Elasticsearch)

kubectl apply -f jaeger.yaml -n jaeger

Note: Jaeger can be deployed on any properly configured Kubernetes cluster, in the cloud or on premises.

Verify that the services are running by executing the following command:
kubectl get svc -n jaeger

Verify that the pods were created by executing the following command:
kubectl get pods -n jaeger

To access the Jaeger UI, we need to expose the Jaeger query service as a LoadBalancer. Execute the following command to change the service type, then run kubectl get svc -n jaeger again to read the External-IP assigned to the service.

kubectl patch service my-jaeger-query --patch '{"spec":{"type":"LoadBalancer"}}' -n jaeger

Use the External-IP to access the Jaeger UI in a browser. Make sure the URL includes port 16686, on which the Jaeger query service listens, for example http://<External-IP>:16686.

Enabling Jaeger Tracing in Application

STEP 1: Add Client Libraries in pom.xml
Add the following dependencies to your pom.xml:

  • opentracing-api
  • opentracing-spring-cloud-starter
  • jaeger-client

STEP 2: Add a bean to initialize the tracer in main class

STEP 3: Add imports for jaeger tracing

Screenshot for span
Please find the code snippet below for enabling span in your application.

The application needs to propagate the appropriate HTTP headers for traces.
Set JAEGER_PROPAGATION to b3 under env in your application deployment YAML file.
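
A minimal sketch of that setting, shown as a fragment of the application container's spec. The container name is hypothetical and the rest of the Deployment is omitted.

```yaml
# Fragment of the application Deployment (spec.template.spec)
containers:
  - name: my-spring-boot-app     # hypothetical container name
    env:
      - name: JAEGER_PROPAGATION
        value: b3                # propagate trace context using B3 headers
```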

Jaeger agent as sidecar
To enable the Jaeger agent as a sidecar, we need to provide the following annotation under the metadata section of the application deployment YAML file.
"sidecar.jaegertracing.io/inject": "true"
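
In context, the annotation might be placed as in the sketch below. The Deployment name is hypothetical, and the spec section is left out because it is unchanged.

```yaml
# Fragment of the application Deployment (spec omitted)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-spring-boot-app                      # hypothetical name
  annotations:
    "sidecar.jaegertracing.io/inject": "true"   # value must be a quoted string
```

With this annotation in place, the operator injects an agent container into the application's pods, so the client can reach the agent on localhost.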

Jaeger agent as DaemonSet
Set JAEGER_AGENT_HOST under env in your application deployment YAML file so that it resolves to the IP of the node the pod runs on.
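
One way to do this, sketched here as an assumption rather than the exact configuration used, is the Kubernetes downward API, which exposes the node IP to the pod as status.hostIP. The container and service names are hypothetical.

```yaml
# Fragment of the application Deployment (spec.template.spec)
containers:
  - name: my-spring-boot-app           # hypothetical container name
    env:
      - name: JAEGER_SERVICE_NAME
        value: my-spring-boot-app      # hypothetical service name shown in the UI
      - name: JAEGER_AGENT_HOST
        valueFrom:
          fieldRef:
            fieldPath: status.hostIP   # IP of the node running this pod
```

Each pod then sends its spans to the agent running on its own node, which is the pod the DaemonSet placed there.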


Conclusion

We have successfully enabled distributed tracing with Jaeger for our application in a cluster environment. Tracing is essential for finding the root cause of issues and for understanding the correlation and context of requests passing between microservices.
Without it, teams can be blind to what is happening in their production environment when there is a performance issue or other error.