Training Agenda

Prometheus & Alertmanager

Prometheus is the de facto standard for metrics collection in cloud-native environments: a pull-based monitoring system with a built-in time-series database, a powerful query language (PromQL), and a rich ecosystem of exporters. Alertmanager handles routing, grouping, silencing, and notification for the alerts that Prometheus alerting rules generate. Together they form the monitoring backbone of most Kubernetes deployments. This training covers Prometheus from scrape configuration through PromQL to alert routing.

1 day · On-site, remote, or hybrid · Up to 20 participants · German or English
What We Cover
Metrics collection, querying, and alerting for cloud-native systems
Module 1

Prometheus Setup & PromQL

  • Prometheus architecture: scraping, TSDB, rules, Alertmanager
  • Scrape configuration: static configs, service discovery (Kubernetes, file-based)
  • Exporters: node_exporter, blackbox_exporter, JMX exporter for JVM, custom exporters
  • Pushgateway: when it's the right answer and when it isn't
  • Labels: the data model — cardinality trade-offs and best practices
  • PromQL fundamentals: instant vectors, range vectors, selectors, offset
  • Aggregation operators: sum, avg, max, count, grouped with by and without
  • Rate and irate: calculating per-second rates from counters
  • Histogram quantiles: histogram_quantile for p99 latency
  • Recording rules: pre-computing expensive queries
  • Metric types: Counter, Gauge, Histogram, Summary — choosing correctly
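
To give a flavour of the histogram quantile topic in this module, here is a minimal Python sketch of the linear interpolation that histogram_quantile performs over classic cumulative buckets. It is a simplified illustration, not Prometheus's actual implementation: the function signature, the bucket representation as (upper_bound, cumulative_count) pairs, and the sample numbers are assumptions made for this example, and native histograms and negative bucket bounds are ignored.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count) pairs sorted by
    upper_bound and ending with the +Inf bucket (hypothetical layout,
    mirroring the `le` label of a classic histogram).
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    if len(buckets) < 2 or not math.isinf(buckets[-1][0]):
        return math.nan  # need at least one finite bucket plus +Inf
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations recorded
    rank = q * total
    # First bucket whose cumulative count reaches the target rank.
    idx = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    upper, count = buckets[idx]
    if math.isinf(upper):
        # Quantile falls into +Inf: report the highest finite bound.
        return buckets[-2][0]
    # The lowest bucket is assumed to start at 0.
    lower, prev_count = buckets[idx - 1] if idx > 0 else (0.0, 0)
    if count == prev_count:
        return lower  # guard against division by zero (q == 0 edge case)
    # Assume observations are spread evenly inside the bucket.
    return lower + (upper - lower) * (rank - prev_count) / (count - prev_count)


# Example: request durations in seconds, 100 observations in total.
buckets = [(0.1, 50), (0.5, 90), (1.0, 99), (math.inf, 100)]
p95 = histogram_quantile(0.95, buckets)
```

With these sample buckets the p95 estimate lands at roughly 0.78 seconds, inside the 0.5 to 1.0 bucket. The result can only ever be as precise as the bucket boundaries, which is why bucket layout matters when choosing between Histogram and Summary.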
Module 2

Alert Rules & Alertmanager

  • Alerting rules: expr, for, labels, annotations
  • SLO-based alerting: burn rate alerts for error budgets
  • Alertmanager configuration: routes, receivers, grouping, inhibition rules
  • Receivers: Slack, PagerDuty, email, webhook — practical configuration
  • Silences and maintenance windows
  • Federation: scraping metrics from remote Prometheus instances
  • Prometheus Operator (deployed via kube-prometheus-stack): ServiceMonitor, PodMonitor, PrometheusRule CRDs
  • Thanos and Mimir: long-term storage options for Prometheus data
  • Remote write: sending metrics to Grafana Cloud, Cortex, VictoriaMetrics
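
As an illustration of the SLO-based alerting topic in this module, the arithmetic behind a burn rate alert can be sketched in a few lines of Python. This is a sketch under stated assumptions, not a production rule: the function names are hypothetical, the error ratios would in practice come from PromQL queries over a long and a short window, and the 14.4 threshold is the commonly cited value for spending 2% of a 30-day error budget within one hour.

```python
def burn_rate(error_ratio, slo_target):
    """How fast the error budget is being consumed.

    A burn rate of 1.0 means the budget lasts exactly the SLO period;
    higher values mean it runs out proportionally sooner.
    """
    error_budget = 1.0 - slo_target  # e.g. 0.001 for a 99.9% SLO
    return error_ratio / error_budget


def should_page(long_window_ratio, short_window_ratio, slo_target,
                threshold=14.4):
    """Multi-window check: page only if both windows burn too fast.

    The long window shows the problem is significant, the short window
    shows it is still happening right now.
    """
    return (burn_rate(long_window_ratio, slo_target) > threshold
            and burn_rate(short_window_ratio, slo_target) > threshold)


# Example: 99.9% SLO, 2% errors over the last hour, 3% over the last 5 minutes.
page_now = should_page(0.02, 0.03, slo_target=0.999)
# Same long window, but the short window has already recovered to 0.1% errors.
page_recovered = should_page(0.02, 0.001, slo_target=0.999)
```

In the first case both windows burn the budget at about 20 to 30 times the sustainable rate, so the alert would fire. In the second case the short window is back near a burn rate of 1, so the alert stays quiet even though the long window is still elevated.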
Learning Outcomes
What your team walks away with

Platform engineers who can set up Prometheus, write PromQL queries that surface real problems, and configure Alertmanager to route alerts to the right people at the right time.

Book the Prometheus & Alertmanager training

A focused one-day course. It pairs naturally with the Grafana training to cover a complete observability stack.

Get in touch