Prometheus Setup & PromQL
- Prometheus architecture: scraping, TSDB, rules, Alertmanager
- Scrape configuration: static configs, service discovery (Kubernetes, file-based)
- Exporters: node_exporter, blackbox_exporter, JMX exporter for JVM, custom exporters
- Pushgateway: when it's the right answer and when it isn't
- Labels: the data model — cardinality trade-offs and best practices
- PromQL fundamentals: instant vectors, range vectors, selectors, offset
- Aggregation operators: sum, avg, max, count by label
- Rate and irate: calculating per-second rates from counters
- Histogram quantiles: histogram_quantile for p99 latency
- Recording rules: pre-computing expensive queries
- Metric types: Counter, Gauge, Histogram, Summary — choosing correctly
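As a taste of the "Rate and irate" topic above, here is a minimal Python sketch of the idea behind per-second rates on counters. It is an illustration only, not Prometheus's implementation: the real `rate()` also extrapolates to the window boundaries, which this sketch leaves out. The names `simple_rate` and `simple_irate` are hypothetical.

```python
from typing import Sequence, Tuple

# One scraped sample: (unix timestamp in seconds, counter value).
Sample = Tuple[float, float]


def simple_rate(samples: Sequence[Sample]) -> float:
    """Average per-second increase over the window, tolerating counter resets.

    Simplified on purpose: no extrapolation to the window edges.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        if curr >= prev:
            increase += curr - prev
        else:
            # The counter dropped, so assume it restarted from zero
            # and count the whole current value as new increase.
            increase += curr
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("timestamps must be increasing")
    return increase / elapsed


def simple_irate(samples: Sequence[Sample]) -> float:
    """Per-second rate based on the last two samples only."""
    return simple_rate(samples[-2:])


# Four scrapes 15 s apart, with a process restart before the third scrape.
window = [(0.0, 100.0), (15.0, 130.0), (30.0, 10.0), (45.0, 40.0)]
print(simple_rate(window))
print(simple_irate(window))
```

The averaged version smooths over the whole window, which suits alerting; the last-two-samples version reacts faster and is better kept for zoomed-in graphs.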
Alert Rules & Alertmanager
- Alerting rules: expr, for, labels, annotations
- SLO-based alerting: burn rate alerts for error budgets
- Alertmanager configuration: routes, receivers, grouping, inhibition rules
- Receivers: Slack, PagerDuty, email, webhook — practical configuration
- Silences and maintenance windows
- Federation: scraping metrics from remote Prometheus instances
- Prometheus Operator (kube-prometheus-stack): ServiceMonitor, PodMonitor, PrometheusRule CRDs
- Thanos and Mimir: long-term storage options for Prometheus data
- Remote write: sending metrics to Grafana Cloud, Cortex, VictoriaMetrics
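The arithmetic behind the "SLO-based alerting" item above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not a rule set from the course: the function names are hypothetical, and the 14.4 threshold is the commonly cited value for a 1-hour window that spends 2% of a 30-day error budget.

```python
def burn_rate(error_ratio: float, slo_target: float) -> float:
    """How many times faster than 'exactly on budget' errors are occurring."""
    budget = 1.0 - slo_target  # e.g. 0.001 for a 99.9% SLO
    if budget <= 0:
        raise ValueError("slo_target must be below 1.0")
    return error_ratio / budget


def budget_consumed(rate: float, window_hours: float, period_hours: float) -> float:
    """Fraction of the error budget spent if `rate` holds for the whole window."""
    return rate * window_hours / period_hours


def should_page(long_ratio: float, short_ratio: float,
                slo_target: float, threshold: float = 14.4) -> bool:
    """Multi-window check: both the long and the short window must be burning fast.

    The short window lets the alert clear soon after the problem stops.
    """
    return (burn_rate(long_ratio, slo_target) > threshold
            and burn_rate(short_ratio, slo_target) > threshold)


# 99.9% SLO, 2% of requests failing in both windows: burn rate is about 20.
print(should_page(long_ratio=0.02, short_ratio=0.02, slo_target=0.999))
# Same long window, but the short window has recovered: no page.
print(should_page(long_ratio=0.02, short_ratio=0.001, slo_target=0.999))
```

In a real setup the two error ratios would come from PromQL expressions, typically recording rules, and the comparison would live in an alerting rule rather than in Python.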
By the end of the course, platform engineers can set up Prometheus, write PromQL queries that surface real problems, and configure Alertmanager to route alerts to the right people at the right time.
- Configure Prometheus scraping with Kubernetes service discovery and install relevant exporters
- Write PromQL queries for rate, aggregations, and histogram quantiles
- Define alerting rules based on SLOs and configure Alertmanager routing
- Deploy kube-prometheus-stack and manage monitoring config via CRDs
Book the Prometheus & Alertmanager training
A focused one-day course that pairs naturally with the Grafana training to cover a complete observability stack.
Get in touch