Span

Question: What is a span in relation to traces, logs and metrics?

A span is the foundational unit of distributed tracing: it represents a single logical operation (e.g., an HTTP request, a database call, or a function invocation) together with its timing and contextual metadata [7]. Understanding how spans relate to traces, logs, and metrics means looking at their structural, semantic, and operational relationships.

1. Span as the Atomic Unit of a Trace

A trace is a directed acyclic graph (DAG) of spans that captures the end-to-end journey of a request across services [1]. Each span:

- Has a start/end timestamp and duration,
- Contains attributes (key-value metadata, e.g., HTTP status, user ID),
- May include events (timestamped annotations like “query started”),
- Has a parent-child relationship with other spans (e.g., a gateway span may have child spans for auth and DB calls) [[3], [8]].
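
The fields listed above can be sketched as a small data structure. This is a minimal illustration in plain Python, not the OpenTelemetry SDK's own span class; the names Span, SpanEvent, start_ns, end_ns, and duration_ms are assumptions chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SpanEvent:
    """A timestamped annotation recorded during a span (hypothetical type)."""
    name: str
    timestamp_ns: int


@dataclass
class Span:
    """Illustrative span record; field names are assumptions, not a real API."""
    trace_id: str
    span_id: str
    name: str
    start_ns: int
    end_ns: int
    parent_span_id: Optional[str] = None  # None marks a root span
    attributes: dict = field(default_factory=dict)
    events: list = field(default_factory=list)

    @property
    def duration_ms(self) -> float:
        # Duration is derived from the start and end timestamps.
        return (self.end_ns - self.start_ns) / 1_000_000


gateway = Span("t1", "s1", "API Gateway", start_ns=0, end_ns=120_000_000,
               attributes={"http.response.status_code": 200})
db_query = Span("t1", "s2", "Database Query", start_ns=40_000_000,
                end_ns=90_000_000, parent_span_id="s1")
db_query.events.append(SpanEvent("query started", timestamp_ns=41_000_000))

print(gateway.duration_ms, db_query.duration_ms)
```

The child span points at its parent through parent_span_id, which is all a backend needs to rebuild the nesting later.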

Example trace structure [8]:

Trace
├── Span (API Gateway)
│   ├── Span (Auth Service)
│   └── Span (User Service)
│       └── Span (Database Query)
└── Span (Response Formatting)

2. Relationship to Logs

Logs are discrete, timestamped records of events (e.g., “error: connection timeout”), often unstructured or semi-structured.
Spans can embed logs: When instrumentation libraries (e.g., OpenTelemetry) integrate with logging frameworks, log statements can be attached to spans as structured events or log records, enriching them with trace context (trace ID, span ID) [10].
This enables correlation: You can view logs within the context of a specific span—e.g., see all logs from a database query span during a failed request [10].

“When adding OpenTelemetry instrumentation on top of your existing log libraries, the log becomes a dot on a trace span” [10].
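
A rough sketch of how a log line picks up trace context, using only the Python standard library. It does not use the OpenTelemetry logging integration; the current_span context variable and the TraceContextFilter class are hypothetical stand-ins for whatever the instrumentation library provides, and the IDs are shortened sample values.

```python
import contextvars
import io
import logging

# Hypothetical holder for the active span's (trace_id, span_id) pair.
current_span = contextvars.ContextVar("current_span", default=None)


class TraceContextFilter(logging.Filter):
    """Copy the active trace and span IDs onto every log record."""

    def filter(self, record):
        trace_id, span_id = current_span.get() or ("-", "-")
        record.trace_id = trace_id
        record.span_id = span_id
        return True


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter(
    "%(levelname)s trace_id=%(trace_id)s span_id=%(span_id)s %(message)s"))
handler.addFilter(TraceContextFilter())

logger = logging.getLogger("db")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False

# Inside the "span": logs are stamped with its identifiers.
token = current_span.set(("4bf92f35", "00f067aa"))
logger.error("connection timeout")
current_span.reset(token)

# Outside any span: the placeholders are used instead.
logger.info("idle")

log_output = stream.getvalue()
print(log_output, end="")
```

Once every record carries trace_id and span_id, a log backend can filter by either field to show the logs that belong to one span.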

3. Relationship to Metrics

Metrics are aggregated numerical measurements over time (e.g., request rate, latency percentiles, error counts).
Spans feed into metrics indirectly:
- Span durations can be used to compute latency histograms (e.g., http.server.request.duration).
- Span attributes (e.g., http.response.status_code, named http.status_code in older OpenTelemetry semantic conventions) can be aggregated into counters (e.g., http_requests_total{status="500"}).
Spans are individual, context-rich records, whereas metrics are aggregated summaries; both can supply the Rate, Errors, and Duration signals of the RED method [7].
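
As a sketch of that derivation, the snippet below aggregates a handful of finished spans into RED-style numbers. The sample data, the 60-second window, and the bucket boundaries are made up for illustration, and the histogram uses non-cumulative buckets for simplicity (Prometheus-style histograms are cumulative).

```python
from collections import Counter

# (span name, duration in ms, HTTP status code); made-up sample data
finished_spans = [
    ("GET /users", 12.0, 200),
    ("GET /users", 48.0, 200),
    ("GET /users", 230.0, 500),
    ("GET /users", 75.0, 200),
]

window_s = 60                      # hypothetical aggregation window
bucket_bounds_ms = [25, 100, 250]  # hypothetical histogram upper bounds

requests_by_status = Counter()
duration_buckets = Counter()
for _name, duration_ms, status in finished_spans:
    # Errors: count requests per status code.
    requests_by_status[str(status)] += 1
    # Duration: place the span in the first bucket whose bound covers it.
    bound = next((b for b in bucket_bounds_ms if duration_ms <= b), float("inf"))
    duration_buckets[bound] += 1

total = sum(requests_by_status.values())
rate_per_s = total / window_s                     # Rate
error_ratio = requests_by_status["500"] / total   # Errors

print(dict(requests_by_status), dict(duration_buckets), error_ratio)
```

The individual spans are discarded after aggregation, which is why metrics are cheap to store but cannot explain a single slow request on their own.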

“Developers can acquire a comprehensive perspective of their software environment by combining distributed traces, metrics, events, and logs” [7].

4. Relationship to Traces (Recap & Nuance)

A trace is a collection of spans that together represent a single request’s path through a distributed system [3].
Spans in a trace are linked via:
- Trace ID (identifies the full trace),
- Span ID (identifies the span),
- Parent Span ID (enables tree-like nesting) [[1], [8]].
Spans may also have links to spans in other traces (e.g., for batch processing or async workflows) [1].
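
To make the linkage concrete, here is a small sketch that rebuilds the nesting of one trace from flat span records using only the three identifiers above. The record layout and the render_trace helper are assumptions for this example; span links to other traces are not modeled.

```python
from collections import defaultdict

# (trace_id, span_id, parent_span_id, name); illustrative records only
spans = [
    ("t1", "a", None, "API Gateway"),
    ("t1", "b", "a", "Auth Service"),
    ("t1", "c", "a", "User Service"),
    ("t1", "d", "c", "Database Query"),
    ("t2", "x", None, "Unrelated request"),
]


def render_trace(records, trace_id):
    """Return indented span names for one trace, parents before children."""
    children = defaultdict(list)
    for tid, span_id, parent_id, name in records:
        if tid == trace_id:  # the trace ID selects the spans of one request
            children[parent_id].append((span_id, name))

    lines = []

    def walk(parent_id, depth):
        # The parent span ID gives the nesting; None marks the root.
        for span_id, name in children[parent_id]:
            lines.append("  " * depth + name)
            walk(span_id, depth + 1)

    walk(None, 0)
    return lines


tree = render_trace(spans, "t1")
print("\n".join(tree))
```

Spans can arrive at the backend in any order and from different services; the shared trace ID and the parent pointers are enough to reassemble them.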

5. Practical Implications

Troubleshooting: A trace gives you a map; logs give you narrative detail; metrics give you signal-level trends. For example:
- A metric alert (e.g., high error rate) → drill into traces to find failing spans → inspect embedded logs for root cause [14].
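
The drill-down step can be sketched as a join on span ID. The dictionaries below are made-up sample records, and logs_for_failing_spans is a hypothetical helper; a real backend would run an equivalent query against its trace and log stores.

```python
spans = [
    {"trace_id": "t1", "span_id": "a", "name": "API Gateway", "error": True},
    {"trace_id": "t1", "span_id": "d", "name": "Database Query", "error": True},
    {"trace_id": "t2", "span_id": "x", "name": "API Gateway", "error": False},
]
logs = [
    {"span_id": "d", "message": "error: connection timeout"},
    {"span_id": "x", "message": "request served"},
]


def logs_for_failing_spans(spans, logs):
    """Pair each log line with the failing span it was recorded under."""
    failing = {s["span_id"]: s["name"] for s in spans if s["error"]}
    return [(failing[entry["span_id"]], entry["message"])
            for entry in logs if entry["span_id"] in failing]


suspects = logs_for_failing_spans(spans, logs)
print(suspects)
```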
Context propagation: Spans carry trace context (trace ID, span ID, sampling flags) across service boundaries, enabling distributed correlation [9].
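
One common carrier for this context is the W3C Trace Context traceparent HTTP header, which OpenTelemetry uses by default; the text above does not name a format, so treating it as the carrier is an assumption here. The sketch below formats and parses a version-00 header with basic validation only, and the function names are hypothetical.

```python
import re

# version "-" 32-hex trace ID "-" 16-hex parent span ID "-" 2-hex flags
TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def format_traceparent(trace_id, span_id, sampled):
    """Build the header value sent to the next service."""
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(header):
    """Return the propagated context, or None if the header is malformed."""
    match = TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    version, trace_id, parent_span_id, flags = match.groups()
    # Version ff and all-zero IDs are invalid.
    if version == "ff" or trace_id == "0" * 32 or parent_span_id == "0" * 16:
        return None
    return {
        "trace_id": trace_id,
        "parent_span_id": parent_span_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }


header = format_traceparent(
    "4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7", sampled=True)
context = parse_traceparent(header)
print(header)
print(context)
```

The receiving service starts its own span with the same trace ID and uses the incoming span ID as the parent, which is how the tree continues across a network hop.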

Summary

Concept	Role	Relationship to Span
Span	Smallest unit of work in a trace	—
Trace	Collection of spans forming a request path	Spans are its building blocks [3]
Logs	Event records with timestamps	Logs can be attached to spans as events or structured metadata [10]
Metrics	Aggregated numerical signals	Span data (duration, status) is used to derive metrics [7]

In essence, spans unify the three pillars of observability—they are the contextual glue that lets you correlate logs (what happened), metrics (how often/long), and traces (how it flows) into actionable insights [[4], [14]].

References

[1] Traces | OpenTelemetry
[2] OpenTelemetry - Understanding Traces vs. Spans | SigNoz
[3] Logs vs Metrics vs Traces - Engineering Fundamentals Playbook
[4] Observability primer | OpenTelemetry
[5] Unpacking Observability: Understanding Logs, Events, Spans, and Traces | Dzero Labs
[6] OpenTelemetry demystified: a deep dive into distributed tracing | CNCF
[7] What Are Spans in Distributed Tracing? - LogicMonitor
[8] Traces & Spans: Observability Basics You Should Know - Last9
[9] software-skills/skills/system-design/references/key-concepts …
[10] Tracing the Line: Understanding Logs vs. Traces - Honeycomb
[11] A Deep Dive into OpenTelemetry. Part 1 - AWS in Plain English
[12] Deep Dive into OpenTelemetry in Saleor
[13] Logging Observability - OpenClaw AI Agent Skill | LLMBase
[14] Learning Observability from Scratch: Logs, Metrics, and Traces | by Milind Nair | Mar, 2026 | Medium
[15] A Deep Dive Into OpenTelemetry Metrics | Tiger Data
[16] GitHub - tokio-rs/tracing: Application level tracing for Rust.
