Training Agenda

Apache Flink

Apache Flink is a stream processing framework for stateful, event-time computations: it processes unbounded data streams with exactly-once state consistency, low latency (often in the millisecond range), and fault tolerance through distributed snapshots. Where Spark Structured Streaming processes data in micro-batches by default, Flink processes records continuously as they arrive. Flink is a common production choice for real-time fraud detection, event-driven pipelines, and streaming ETL at scale. This training covers the Flink DataStream API, the Table API and SQL, stateful operations, and deployment on Kubernetes.

2 days | On-site, remote, or hybrid | Up to 20 participants | German or English
What We Cover
Stateful stream processing with exactly-once guarantees
Day 1

Flink Architecture & DataStream API

  • Flink architecture: JobManager, TaskManagers, parallelism, slots
  • Flink execution model: dataflow graphs, operators, task chaining
  • DataStream API: map, filter, flatMap, keyBy, window
  • Event time vs processing time vs ingestion time
  • Watermarks: generating watermarks, handling late elements
  • Windowing: tumbling, sliding, session windows
  • Stateful functions: ValueState, ListState, MapState, BroadcastState
  • Checkpointing: RocksDB backend, checkpoint interval, exactly-once
  • Kafka source and sink: KafkaSource (FLIP-27) and KafkaSink, plus the legacy FlinkKafkaConsumer/FlinkKafkaProducer they replace
  • Side outputs: routing late data and error records
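
To give a feel for how the Day 1 topics of event time, watermarks, tumbling windows, and side outputs fit together, here is a minimal sketch in plain Python. It is a standard-library simulation of the idea, not the Flink API: the function name `tumbling_windows`, its parameters, and the sample events are all hypothetical, and the watermark rule (highest timestamp seen minus a fixed out-of-orderness bound) is one common strategy assumed for illustration.

```python
from collections import defaultdict


def tumbling_windows(events, window_size, max_out_of_orderness):
    """Sum values per key in event-time tumbling windows.

    events: iterable of (key, event_time, value) in ARRIVAL order.
    Returns (results, late), where results holds (key, window_start, sum)
    for fired windows and late holds the events routed to the side output.
    Hypothetical helper for illustration; not part of Flink.
    """
    open_windows = defaultdict(int)  # (key, window_start) -> running sum
    results = []                     # windows that have fired
    late = []                        # "side output" for late elements
    max_ts = None
    watermark = float("-inf")

    for key, ts, value in events:
        window_start = ts - ts % window_size
        window_end = window_start + window_size

        if window_end <= watermark:
            # The window this event belongs to has already fired.
            late.append((key, ts, value))
        else:
            open_windows[(key, window_start)] += value

        # Assumed watermark strategy: bounded out-of-orderness.
        max_ts = ts if max_ts is None else max(max_ts, ts)
        watermark = max_ts - max_out_of_orderness

        # Fire every open window whose end the watermark has passed.
        ready = sorted(w for w in open_windows if w[1] + window_size <= watermark)
        for k, start in ready:
            results.append((k, start, open_windows.pop((k, start))))

    # End of input: flush whatever is still open.
    for k, start in sorted(open_windows):
        results.append((k, start, open_windows.pop((k, start))))
    return results, late


sample = [
    ("a", 1, 1),
    ("a", 4, 2),
    ("b", 3, 5),     # out of order, but still ahead of the watermark
    ("a", 12, 10),   # pushes the watermark to 10, firing the [0, 10) windows
    ("a", 8, 100),   # arrives after its window fired: goes to the side output
    ("a", 11, 1),
]
results, late = tumbling_windows(sample, window_size=10, max_out_of_orderness=2)
print(results)
print(late)
```

In this sketch the event with timestamp 3 is accepted even though it arrives after timestamp 4, while the event with timestamp 8 is routed to the late list because its window had already been emitted. In Flink itself the same roles are played by a watermark strategy, a window assigner, and a side output for late data.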
Day 2

Table API, SQL & Production Deployment

  • Flink SQL and Table API: CREATE TABLE, SELECT, JOIN, aggregations
  • Temporal joins: joining streams with slowly-changing dimension tables
  • CDC ingestion: Debezium-based Flink CDC connectors for MySQL and PostgreSQL
  • Iceberg sink: writing to Iceberg tables from Flink
  • Savepoints: stateful upgrades, rescaling, migration
  • Flink on Kubernetes: Kubernetes Operator, application mode, session mode
  • Backpressure analysis: Flink UI metrics, identifying bottlenecks
  • Metrics and monitoring: Prometheus reporter, Grafana dashboards for Flink
  • Exactly-once end-to-end: Kafka transactions + Flink checkpoints
  • Flink vs Spark Structured Streaming: when to choose each
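
The end-to-end exactly-once topic above combines two ideas: operator state is restored from the last checkpoint after a failure, and the sink only makes output visible once the checkpoint covering it has completed. The following plain-Python sketch illustrates that interplay under simplifying assumptions. It is not Flink or Kafka code: `run_pipeline`, `checkpoint_every`, and `fail_at` are hypothetical names, and the real protocol's separate pre-commit and commit phases are collapsed into a single step.

```python
def run_pipeline(records, checkpoint_every, fail_at=None):
    """Running sum over records with checkpoints and a transactional sink.

    Returns the list of (offset, running_total) rows visible to downstream
    readers. If fail_at is set, the job crashes once just before processing
    that offset and recovers from the last checkpoint.
    Hypothetical simulation for illustration; not the Flink API.
    """
    committed = []                           # output visible downstream
    checkpoint = {"offset": 0, "total": 0}   # last completed checkpoint
    offset, total = 0, 0                     # live source position and state
    pending = []                             # rows in the open transaction
    failed = False

    while offset < len(records):
        if fail_at is not None and offset == fail_at and not failed:
            failed = True
            pending = []                     # abort the open transaction
            offset = checkpoint["offset"]    # rewind the source
            total = checkpoint["total"]      # restore operator state
            continue

        total += records[offset]
        pending.append((offset, total))
        offset += 1

        if offset % checkpoint_every == 0:
            # Checkpoint completes: snapshot state, then commit the rows
            # produced since the previous checkpoint.
            checkpoint = {"offset": offset, "total": total}
            committed.extend(pending)
            pending = []

    committed.extend(pending)                # final commit at end of input
    return committed


records = [1, 2, 3, 4, 5]
clean = run_pipeline(records, checkpoint_every=2)
recovered = run_pipeline(records, checkpoint_every=2, fail_at=3)
print(clean == recovered)
```

Because uncommitted rows are discarded on failure and the source replays from the checkpointed offset, the run with a crash produces the same committed output as the run without one. The trade-off, which also applies to transactional sinks in practice, is that output only becomes visible at checkpoint boundaries, so the checkpoint interval puts a floor on end-to-end latency.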
Learning Outcomes
What your team walks away with

Data and platform engineers who can build, tune, and operate stateful Flink streaming pipelines — from first source to exactly-once sinks in production.

Book the Apache Flink training

Available as a standalone 2-day course or combined with Apache Kafka and Iceberg for a complete streaming data platform week.
