Distributed Tracing

Tracking requests across microservices with Jaeger and Zipkin

In a microservices architecture, a single user request may travel through dozens of services. Distributed tracing follows each request across service boundaries, so you can see where time is spent and where failures occur.

How Tracing Works


  User Request Flow:
  ┌───────┐    ┌───────┐    ┌───────┐    ┌───────┐
  │Gateway│───▶│ Auth  │───▶│ Order │───▶│Payment│
  └───┬───┘    └───┬───┘    └───┬───┘    └───┬───┘
      │            │            │            │
      └────────────┼────────────┼────────────┘
                   ▼
  ┌──────────────────────────────────────────────┐
  │              Trace Timeline                  │
  │  ├─ Gateway ──────────────────────────────   │
  │    ├─ Auth ────────                          │
  │                   ├─ Order ───────────       │
  │                              ├─ Payment ──   │
  │                                              │
  │  Total: 250ms  |  Bottleneck: Payment: 120ms │
  └──────────────────────────────────────────────┘
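
The timeline above can be modelled in a few lines of Python. This is a minimal sketch, not how any particular tracer stores data: the `Span` class, the span IDs, the parent/child links, and every timing except the 250 ms total and the 120 ms Payment span are assumptions made for illustration. It computes each span's "self time" (its duration minus the time spent in its direct children) and reports the largest one as the bottleneck.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    name: str
    span_id: str
    parent_id: Optional[str]  # None marks the root span
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def self_times(spans):
    """Map span name -> duration minus the durations of its direct children.

    Assumes the children of a span do not overlap each other in time.
    """
    result = {}
    for span in spans:
        child_total = sum(
            c.duration_ms for c in spans if c.parent_id == span.span_id
        )
        result[span.name] = span.duration_ms - child_total
    return result


# Hypothetical trace: only the 250 ms total and the 120 ms Payment span
# come from the diagram; the other numbers are made up.
TRACE = [
    Span("Gateway", "a", None, 0, 250),
    Span("Auth", "b", "a", 10, 50),
    Span("Order", "c", "a", 60, 240),
    Span("Payment", "d", "c", 110, 230),
]

if __name__ == "__main__":
    times = self_times(TRACE)
    bottleneck = max(times, key=times.get)
    print(f"Total: {TRACE[0].duration_ms}ms | "
          f"Bottleneck: {bottleneck}: {times[bottleneck]}ms")
```

Using self time rather than raw duration keeps the root span from always looking like the slowest one, since its duration includes everything beneath it.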

Trace Concepts

  • Trace: The entire journey of a request through the system
  • Span: A single unit of work within a trace (e.g., a database query)
  • Context Propagation: Passing trace IDs across service boundaries
  • Sampling: Collecting a subset of traces to reduce overhead
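
Context propagation and sampling can be sketched with the W3C Trace Context `traceparent` header, which carries a version, a 32-hex-character trace ID, a 16-hex-character parent span ID, and a flags byte whose lowest bit marks the trace as sampled. The function names below are hypothetical, and the ratio-based sampling check is only one simple way to make the decision; real tracing SDKs ship their own samplers.

```python
import secrets


def new_traceparent(sampled: bool) -> str:
    """Start a new trace: fresh trace ID and span ID, W3C version 00."""
    trace_id = secrets.token_hex(16)  # 16 random bytes -> 32 hex chars
    span_id = secrets.token_hex(8)    # 8 random bytes -> 16 hex chars
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def child_traceparent(incoming: str) -> str:
    """Build the header for an outgoing call made while handling `incoming`.

    The trace ID and flags are kept so every service joins the same trace;
    only the span ID changes, identifying the caller's own span.
    """
    version, trace_id, _parent_span_id, flags = incoming.split("-")
    return f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"


def should_sample(trace_id: str, ratio: float) -> bool:
    """Keep roughly `ratio` of all traces, deterministically per trace ID.

    The low 8 bytes of the trace ID are read as a number in [0, 2**64),
    so any service given the same ID reaches the same decision.
    """
    return int(trace_id[-16:], 16) < ratio * 2**64


if __name__ == "__main__":
    header = new_traceparent(sampled=True)
    print("incoming:", header)
    print("outgoing:", child_traceparent(header))
```

In practice the decision is usually made once, at the first service, and then carried downstream in the flags field so that a trace is either recorded everywhere or nowhere.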

Tracing Tools


  Jaeger (by Uber):
  ┌──────────────────────────────────────┐
  │  Jaeger Agent     → Collects spans   │
  │  Jaeger Collector → Processes spans  │
  │  Jaeger Query     → Search & query   │
  │  Jaeger UI        → Visualization    │
  └──────────────────────────────────────┘

  Zipkin (by Twitter):
  ┌──────────────────────────────────────┐
  │  Collector → Receives trace data     │
  │  Storage   → MySQL, ES, Cassandra    │
  │  API       → Query interface         │
  │  UI        → Dependency graph        │
  └──────────────────────────────────────┘

  OpenTelemetry (CNCF Standard):
  ┌──────────────────────────────────────┐
  │  Unified API for traces, metrics,    │
  │  and logs. Exports to any backend.   │
  └──────────────────────────────────────┘
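
To make "receives trace data" more concrete, here is a rough sketch of the JSON that a Zipkin collector accepts in its v2 span format. The helper name `to_zipkin_span`, the service names, and all IDs and timings are hypothetical. The sketch only builds the payload; in a typical setup it would be sent as an HTTP POST to the collector's `/api/v2/spans` endpoint, and most applications would let a tracing library do this rather than hand-build spans.

```python
import json
from typing import Optional


def to_zipkin_span(
    trace_id: str,
    span_id: str,
    parent_id: Optional[str],
    name: str,
    service: str,
    start_us: int,
    duration_us: int,
) -> dict:
    """Build one span as a dict in Zipkin's v2 JSON shape.

    Timestamps and durations are expressed in microseconds.
    """
    span = {
        "traceId": trace_id,
        "id": span_id,
        "name": name,
        "timestamp": start_us,
        "duration": duration_us,
        "localEndpoint": {"serviceName": service},
    }
    if parent_id is not None:  # the root span has no parentId field
        span["parentId"] = parent_id
    return span


if __name__ == "__main__":
    # Made-up IDs and timings, for illustration only.
    trace_id = "4bf92f3577b34da6a3ce929d0e0e4736"
    start = 1_700_000_000_000_000  # epoch microseconds
    spans = [
        to_zipkin_span(trace_id, "00f067aa0ba902b7", None,
                       "get /checkout", "gateway", start, 250_000),
        to_zipkin_span(trace_id, "53995c3f42cd8ad8", "00f067aa0ba902b7",
                       "charge card", "payment", start + 110_000, 120_000),
    ]
    # The request body is a JSON array of spans.
    print(json.dumps(spans, indent=2))
```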

🧪 Quick Quiz

What is distributed tracing used for?