DEVOPS

What Is Observability? The Three Pillars Explained

F. Çağrı Bilgehan · 24 January 2026 · 10 min read
Tags: observability, monitoring, devops, prometheus

When your system goes down, you ask "what happened?" Monitoring tells you that something broke. Observability helps you work out why it broke, where it broke, and what to change to fix it.

Monitoring vs Observability

| Feature | Monitoring | Observability |
|---------|-----------|--------------|
| Focus | Track known issues | Discover unknown issues |
| Approach | Dashboards + alerts | Querying + exploration |
| Question | "Is the system running?" | "Why isn't it running?" |
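To make the contrast in the table concrete, here is a minimal Python sketch. The event fields, values, and function names are all invented for illustration. `is_healthy` asks a question that was decided in advance, while `errors_by` slices the same data by whatever field turns out to matter during an investigation.

```python
from collections import Counter

# Hypothetical request events; field names and values are made up for illustration.
EVENTS = [
    {"service": "payment-service", "region": "eu", "status": 500},
    {"service": "payment-service", "region": "eu", "status": 500},
    {"service": "payment-service", "region": "us", "status": 200},
    {"service": "order-service", "region": "eu", "status": 200},
    {"service": "order-service", "region": "us", "status": 200},
]


def is_healthy(events, max_error_ratio=0.1):
    """Monitoring-style check: a predefined question with a yes/no answer."""
    if not events:
        return True
    errors = sum(1 for e in events if e["status"] >= 500)
    return errors / len(events) <= max_error_ratio


def errors_by(events, field):
    """Exploration-style query: break errors down by any field, chosen after the fact."""
    return Counter(e[field] for e in events if e["status"] >= 500)


print(is_healthy(EVENTS))            # False: 2 of 5 requests failed
print(errors_by(EVENTS, "service"))  # errors grouped by service
print(errors_by(EVENTS, "region"))   # same data, different question
```

The health check only says that something is wrong. Grouping by `service` and then by `region` narrows the failure down to one service in one region, which is the kind of question nobody had to anticipate when the data was recorded.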

Three Pillars

1. Metrics

Numerical measurements stored as time-series data:

http_requests_total{method="GET", status="200"} 15234
http_request_duration_seconds{quantile="0.99"} 0.250
cpu_usage_percent 45.2

RED Method (request-oriented): Rate, Errors, Duration

USE Method (resource-oriented): Utilization, Saturation, Errors

Tools: Prometheus, Grafana, Datadog, CloudWatch
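As a rough illustration of the RED method, the sketch below derives Rate, Errors, and Duration from a list of request records. The `Request` type, the `red_summary` function, and the sample numbers are assumptions made for this example. In practice a metrics system such as Prometheus would compute these from counters and histograms rather than from raw records.

```python
from dataclasses import dataclass


@dataclass
class Request:
    status: int        # HTTP status code
    duration_s: float  # time taken to serve the request, in seconds


def red_summary(requests, window_s):
    """Return Rate (req/s), Errors (ratio of 5xx), and Duration (p99, seconds)."""
    if not requests or window_s <= 0:
        return {"rate": 0.0, "error_ratio": 0.0, "p99_s": 0.0}
    durations = sorted(r.duration_s for r in requests)
    # Nearest-rank p99: ceil(0.99 * n), computed with integers to avoid float rounding.
    rank = (99 * len(durations) + 99) // 100
    errors = sum(1 for r in requests if r.status >= 500)
    return {
        "rate": len(requests) / window_s,
        "error_ratio": errors / len(requests),
        "p99_s": durations[rank - 1],
    }


# Made-up sample: 100 requests observed over a 10-second window.
SAMPLE = (
    [Request(200, 0.05)] * 97
    + [Request(500, 0.25)] * 2
    + [Request(200, 1.0)]
)

print(red_summary(SAMPLE, window_s=10))
```

For this sample the rate is 10 requests per second, 2% of requests are errors, and the 99th percentile duration is 0.25 seconds. The single 1.0-second outlier sits above p99, which is why percentiles are usually reported alongside, not instead of, the maximum.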

2. Logs

Text-based records of discrete events. Structured (JSON) logs are preferred because they can be filtered and aggregated by field:

{
  "timestamp": "2026-02-14T21:30:00Z",
  "level": "error",
  "service": "payment-service",
  "traceId": "abc-123",
  "message": "Payment failed",
  "userId": 42,
  "error": "Insufficient funds"
}

Tools: ELK Stack, Loki, Fluentd
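One way to produce a log line shaped like the JSON above is a custom formatter on Python's standard `logging` module. This is a minimal sketch, not a recommended production setup: the `JsonFormatter` class, the `build_logger` helper, and the convention of passing fields under a `context` key are all choices made for this example.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def __init__(self, service):
        super().__init__()
        self.service = service

    def format(self, record):
        created = datetime.fromtimestamp(record.created, tz=timezone.utc)
        entry = {
            "timestamp": created.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "level": record.levelname.lower(),
            "service": self.service,
            "message": record.getMessage(),
        }
        # Anything passed as extra={"context": {...}} ends up on the record.
        entry.update(getattr(record, "context", {}))
        return json.dumps(entry)


def build_logger(service="payment-service"):
    logger = logging.getLogger(service)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter(service))
        logger.addHandler(handler)
    return logger


logger = build_logger()
logger.error(
    "Payment failed",
    extra={"context": {"traceId": "abc-123", "userId": 42, "error": "Insufficient funds"}},
)
```

Because every field is a separate JSON key, a log backend can answer queries such as "all errors for `traceId` abc-123" without parsing free text.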

3. Traces

Track a request's journey through the system end-to-end. Critical in distributed systems:

[Client] → [API Gateway] → [Order Service] → [Payment Service]
   0ms        5ms              15ms              45ms

Tools: Jaeger, Zipkin, OpenTelemetry, Datadog APM
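The diagram above can be modelled as a list of spans that share one trace ID and point at their parent. The sketch below assumes the millisecond values in the diagram are the offsets at which the request reaches each hop. The `Span` type, the span IDs, and `hop_latencies` are hypothetical names for this example and not the data model of any particular tracing tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    trace_id: str             # shared by every span of one request
    span_id: str
    parent_id: Optional[str]  # None for the root span
    name: str
    start_ms: int             # offset from the start of the trace


def hop_latencies(spans):
    """Return (parent name, child name, delay in ms) for every parent/child pair."""
    by_id = {s.span_id: s for s in spans}
    hops = []
    for span in spans:
        parent = by_id.get(span.parent_id)
        if parent is not None:
            hops.append((parent.name, span.name, span.start_ms - parent.start_ms))
    return hops


TRACE = [
    Span("abc-123", "s1", None, "Client", 0),
    Span("abc-123", "s2", "s1", "API Gateway", 5),
    Span("abc-123", "s3", "s2", "Order Service", 15),
    Span("abc-123", "s4", "s3", "Payment Service", 45),
]

hops = hop_latencies(TRACE)
slowest = max(hops, key=lambda hop: hop[2])
print(slowest)  # ('Order Service', 'Payment Service', 30)
```

Under that reading of the diagram, 30 of the 45 milliseconds pass between the Order Service and the Payment Service, so that hop is where to look first for a bottleneck.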

OpenTelemetry

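OpenTelemetry is a vendor-neutral open standard for collecting observability data (metrics, logs, traces). Application code is instrumented once against a common set of APIs, and the backend that receives the data can be swapped without re-instrumenting, which reduces vendor lock-in.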
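The idea of instrumenting once and swapping backends can be sketched in plain Python. This is not the OpenTelemetry API: `Telemetry`, `Exporter`, `ConsoleExporter`, `InMemoryExporter`, and `handle_payment` are hypothetical names used only to show the shape of the design.

```python
from typing import Protocol


class Exporter(Protocol):
    """Anything that can receive a signal; each backend supplies its own."""

    def export(self, signal: str, payload: dict) -> None: ...


class ConsoleExporter:
    def export(self, signal, payload):
        print(f"[{signal}] {payload}")


class InMemoryExporter:
    """Stand-in backend that just collects what it receives."""

    def __init__(self):
        self.received = []

    def export(self, signal, payload):
        self.received.append((signal, payload))


class Telemetry:
    """Single entry point for application code; the backend is injected."""

    def __init__(self, exporter: Exporter):
        self._exporter = exporter

    def metric(self, name, value):
        self._exporter.export("metric", {"name": name, "value": value})

    def log(self, message, **fields):
        self._exporter.export("log", {"message": message, **fields})

    def span(self, name, duration_ms):
        self._exporter.export("span", {"name": name, "duration_ms": duration_ms})


def handle_payment(telemetry):
    # Application code depends only on Telemetry, never on a concrete backend.
    telemetry.metric("payments_total", 1)
    telemetry.log("Payment failed", traceId="abc-123")
    telemetry.span("charge-card", duration_ms=30)


handle_payment(Telemetry(ConsoleExporter()))
memory = InMemoryExporter()
handle_payment(Telemetry(memory))
print(len(memory.received))  # 3
```

`handle_payment` runs unchanged against both exporters. Switching vendors then means writing or configuring a different exporter rather than touching every call site.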

Observability Tools

| Tool | Type | Strength |
|------|------|----------|
| Prometheus | Metrics | Open source, powerful queries |
| Grafana | Visualization | Multi-source dashboards |
| Jaeger | Tracing | Distributed tracing |
| ELK Stack | Logging | Full-text search |
| Datadog | All-in-one | Integrated solution |

Alerting Best Practices

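  • Only create actionable alerts: every alert should require a human response
  • Avoid alert fatigue by removing noisy or duplicate alerts
  • Define severity levels (P1-P4)
  • Prepare a runbook for each alert
  • Implement an on-call rotation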
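Several of these practices can be enforced in code. The sketch below is one possible shape, with every name (`AlertRule`, `for_windows`, the example URL) invented for illustration. A rule is rejected unless it has a known severity and a runbook, and it fires only after the threshold has been exceeded for several consecutive evaluation windows, which filters out brief spikes that nobody could act on.

```python
from dataclasses import dataclass

SEVERITIES = ("P1", "P2", "P3", "P4")


@dataclass
class AlertRule:
    name: str
    threshold: float   # fire when the observed value exceeds this
    for_windows: int   # consecutive breaching windows required before firing
    severity: str
    runbook_url: str   # hypothetical link to the response procedure

    def validate(self):
        """Return a list of problems; an empty list means the rule is acceptable."""
        problems = []
        if self.severity not in SEVERITIES:
            problems.append(f"unknown severity {self.severity!r}")
        if not self.runbook_url:
            problems.append("missing runbook")
        if self.for_windows < 1:
            problems.append("for_windows must be at least 1")
        return problems

    def should_fire(self, values):
        """True when the most recent `for_windows` values all exceed the threshold."""
        if self.for_windows < 1:
            return False
        recent = values[-self.for_windows:]
        return len(recent) == self.for_windows and all(v > self.threshold for v in recent)


rule = AlertRule(
    name="HighErrorRatio",
    threshold=0.05,
    for_windows=3,
    severity="P2",
    runbook_url="https://runbooks.example.com/high-error-ratio",  # placeholder URL
)

print(rule.validate())                              # []
print(rule.should_fire([0.01, 0.20, 0.01, 0.02]))   # False: a single spike
print(rule.should_fire([0.01, 0.06, 0.08, 0.09]))   # True: sustained breach
```

Running `validate` in CI for every rule is one way to keep alerts without a runbook or severity from reaching the on-call engineer in the first place.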

Conclusion

Observability is key to debugging and performance optimization in modern distributed systems: track system health with metrics, understand individual events with logs, and find bottlenecks with traces.

