Till Rohrmann – Fault Tolerance and Job Recovery in Apache Flink

Flink Forward

Flink Forward 2015

Fault Tolerance and Job
Recovery in Apache Flink™
Till Rohrmann
trohrmann@apache.org
@stsffap

Better be safe than sorry
• Failures will happen
• EMC estimated $1.7 billion in costs due to data loss and system downtime
• Recovery will save you time and costs
• Switch between algorithms
• Live upgrade of your system

Fault tolerance guarantees
• At most once
  ◦ No guarantees at all
• At least once
  ◦ Sufficient for many applications
• Exactly once
• Flink provides all guarantees
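
The difference between the three guarantees can be illustrated with a toy counting operator that fails partway through a stream. This is a minimal sketch and not Flink code: the class `GuaranteeSketch`, the method `countAfterRecovery`, and the concrete numbers (10 records, failure after 7, snapshot around record 4) are assumptions chosen for illustration.

```java
/** Toy model of a counting operator that fails and recovers; not Flink code. */
final class GuaranteeSketch {

    /**
     * Returns the count held by the operator once the whole stream is consumed.
     *
     * @param total         number of records in the stream
     * @param restoredCount count held by the state that is used after the failure
     * @param resumeOffset  position from which the source continues reading
     */
    static int countAfterRecovery(int total, int restoredCount, int resumeOffset) {
        int count = restoredCount;
        for (int offset = resumeOffset; offset < total; offset++) {
            count++; // every record read after recovery is counted once
        }
        return count;
    }

    public static void main(String[] args) {
        int total = 10;
        // Failure after 7 records; assume no state restore and no replay.
        System.out.println("at most once:  " + countAfterRecovery(total, 0, 7));  // 3
        // Assume the restored state already reflects 6 records, but the source rewinds to 4.
        System.out.println("at least once: " + countAfterRecovery(total, 6, 4));  // 12
        // State and source offset were captured at the same point (after 4 records).
        System.out.println("exactly once:  " + countAfterRecovery(total, 4, 4));  // 10
    }
}
```

Only the last case yields the true count of 10: records are neither lost (as in the first case) nor counted twice (as in the second), because the restored state and the replay position describe the same point in the stream.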

Checkpoints
• Consistent snapshots of distributed data stream and operator state

Barriers
• Markers for checkpoints
• Injected in the data flow
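
How a marker in the data flow can trigger a snapshot is sketched below for a single-input summing operator. This is a simplified model under stated assumptions, not Flink's implementation: the `Element` type, `injectBarriers`, and `runSummingOperator` are hypothetical names, and the barrier interval is counted in records purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class BarrierSketch {

    /** Stream element: a data record, or a checkpoint barrier (hypothetical type). */
    static final class Element {
        final boolean isBarrier;
        final long value; // record payload, or the checkpoint id for a barrier

        private Element(boolean isBarrier, long value) {
            this.isBarrier = isBarrier;
            this.value = value;
        }

        static Element data(long v) { return new Element(false, v); }
        static Element barrier(long checkpointId) { return new Element(true, checkpointId); }
    }

    /** Source side: inject a barrier after every `interval` records. */
    static List<Element> injectBarriers(List<Long> records, int interval) {
        List<Element> out = new ArrayList<>();
        long checkpointId = 0;
        for (int i = 0; i < records.size(); i++) {
            out.add(Element.data(records.get(i)));
            if ((i + 1) % interval == 0) {
                out.add(Element.barrier(++checkpointId));
            }
        }
        return out;
    }

    /** Operator side: keep a running sum and snapshot it whenever a barrier arrives. */
    static Map<Long, Long> runSummingOperator(List<Element> stream) {
        Map<Long, Long> snapshots = new LinkedHashMap<>();
        long sum = 0;
        for (Element e : stream) {
            if (e.isBarrier) {
                snapshots.put(e.value, sum); // state covering all records before the barrier
            } else {
                sum += e.value;
            }
        }
        return snapshots;
    }

    public static void main(String[] args) {
        List<Element> stream = injectBarriers(Arrays.asList(1L, 2L, 3L, 4L, 5L), 2);
        System.out.println(runSummingOperator(stream)); // {1=3, 2=10}
    }
}
```

Each snapshot reflects exactly the records that preceded its barrier, which is what makes the per-operator snapshots line up into one consistent checkpoint.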

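Operator State
• Stateless operators

    ds.filter(_ != 0)

• System state

    ds.keyBy(0).window(TumblingTimeWindows.of(5, TimeUnit.SECONDS))

• User defined state

    import org.apache.flink.api.common.functions.RichReduceFunction;
    import org.apache.flink.api.common.state.OperatorState;
    import org.apache.flink.configuration.Configuration;

    // Sums the values and counts the reduce() calls in checkpointed operator state.
    public class CounterSum extends RichReduceFunction<Long> {
        private OperatorState<Long> counter;

        @Override
        public Long reduce(Long v1, Long v2) throws Exception {
            counter.update(counter.value() + 1);
            return v1 + v2;
        }

        @Override
        public void open(Configuration config) throws Exception {
            // Arguments: state name, default value, whether the state is partitioned by key.
            counter = getRuntimeContext().getOperatorState("counter", 0L, false);
        }
    }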

Advantages
• Separation of app logic from recovery
  ◦ Checkpointing interval is just a config parameter
• High throughput
  ◦ Controllable checkpointing overhead
• Low impact on latency

Without high availability
[Diagram: a JobManager and a TaskManager]

With high availability
[Diagram: JobManager, stand-by JobManager, TaskManager, and Apache ZooKeeper™; caption: KEEP GOING]

Persisting jobs
[Diagram: Client, JobManager, TaskManagers, Apache ZooKeeper™]
1. Submit job
2. Persist execution graph
3. Write handle to ZooKeeper
4. Deploy tasks
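
The four steps above can be sketched as a small in-memory model. This is an illustration under assumptions, not Flink's code: plain maps stand in for durable storage and for ZooKeeper, the handle format and the `/jobs/` path are hypothetical, and `recoverJobs` shows one plausible way a stand-by JobManager could use the handles after taking over.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class JobPersistenceSketch {

    /** Stand-ins (assumptions): maps play the roles of durable storage and ZooKeeper. */
    static final class Cluster {
        final Map<String, String> durableStorage = new HashMap<>();
        final Map<String, String> zooKeeper = new HashMap<>();
        final List<String> deployedJobs = new ArrayList<>();
    }

    /** Steps 1 to 4 as seen by the leading JobManager. */
    static void submitJob(Cluster cluster, String jobId, String executionGraph) {
        // 1. Submit job: the client hands the job over (modelled here as a string).
        String handle = "storage://graphs/" + jobId;         // hypothetical handle format
        cluster.durableStorage.put(handle, executionGraph);  // 2. Persist execution graph
        cluster.zooKeeper.put("/jobs/" + jobId, handle);     // 3. Write handle to ZooKeeper
        cluster.deployedJobs.add(jobId);                     // 4. Deploy tasks
    }

    /** A stand-by JobManager follows the handles to reload every persisted job. */
    static Map<String, String> recoverJobs(Cluster cluster) {
        Map<String, String> recovered = new HashMap<>();
        for (Map.Entry<String, String> znode : cluster.zooKeeper.entrySet()) {
            String jobId = znode.getKey().substring("/jobs/".length());
            recovered.put(jobId, cluster.durableStorage.get(znode.getValue()));
        }
        return recovered;
    }

    public static void main(String[] args) {
        Cluster cluster = new Cluster();
        submitJob(cluster, "job-1", "source -> map -> sink");
        System.out.println(recoverJobs(cluster)); // {job-1=source -> map -> sink}
    }
}
```

Only the small handle goes into the coordination service, while the larger execution graph lives in the storage the handle points to.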

Handling checkpoints
[Diagram: Client, JobManager, TaskManagers, Apache ZooKeeper™]
1. Take snapshots
2. Persist snapshots
3. Send handles to JM
4. Create global checkpoint
5. Persist global checkpoint
6. Write handle to ZooKeeper
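
The six checkpoint steps can be modelled in the same spirit. The sketch below is an assumption-laden illustration rather than Flink's checkpoint coordinator: the class `CheckpointSketch`, the handle format, and the `/checkpoints/latest` path are hypothetical, and maps again stand in for durable storage and ZooKeeper.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class CheckpointSketch {

    // Stand-ins (assumptions): maps play the roles of durable storage and ZooKeeper.
    final Map<String, String> durableStorage = new HashMap<>();
    final Map<String, String> zooKeeper = new HashMap<>();

    private final Set<String> tasks;
    // checkpoint id -> (task name -> snapshot handle), filled as acknowledgements arrive
    private final Map<Long, Map<String, String>> pending = new HashMap<>();

    CheckpointSketch(Set<String> tasks) {
        this.tasks = tasks;
    }

    /** TaskManager side, steps 1 to 3 for a single task. */
    void snapshotTask(long checkpointId, String task, String state) {
        String handle = "chk-" + checkpointId + "/" + task; // hypothetical handle format
        durableStorage.put(handle, state);                  // 1 + 2: take and persist snapshot
        acknowledge(checkpointId, task, handle);            // 3: send handle to the JobManager
    }

    /** JobManager side, steps 4 to 6, executed once all tasks have acknowledged. */
    private void acknowledge(long checkpointId, String task, String handle) {
        Map<String, String> handles = pending.computeIfAbsent(checkpointId, id -> new TreeMap<>());
        handles.put(task, handle);
        if (!handles.keySet().containsAll(tasks)) {
            return; // some task snapshots are still missing
        }
        pending.remove(checkpointId);
        String globalHandle = "chk-" + checkpointId + "/global";
        durableStorage.put(globalHandle, String.join(",", handles.values())); // 4 + 5
        zooKeeper.put("/checkpoints/latest", globalHandle);                   // 6
    }

    public static void main(String[] args) {
        CheckpointSketch jm = new CheckpointSketch(new HashSet<>(Arrays.asList("source", "sum")));
        jm.snapshotTask(1, "source", "offset=42");
        System.out.println(jm.zooKeeper); // {} because "sum" has not acknowledged yet
        jm.snapshotTask(1, "sum", "total=7");
        System.out.println(jm.zooKeeper); // {/checkpoints/latest=chk-1/global}
    }
}
```

The global checkpoint becomes visible only after every task has reported its handle, so a recovering JobManager never sees a partially completed checkpoint.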

TL;DL
• Job recovery mechanism with low latency and high throughput
• Exactly-once processing semantics
• No single point of failure
• Flink will always keep processing your data

4. Fault Tolerance

8. Alignment for multi-input operators

16. Cluster High Availability

28. Conclusion

32. flink.apache.org @ApacheFlink

Editor's notes

30 nodes, 4 cores, 15 GB
• Flink: 720,000 events per second per core; 690,000 with checkpointing activated
• Storm with at-least-once: 2,600 events per second per core

GCE, 30 instances with 4 cores and 15 GB of memory each. Flink master from July 24th, Storm 0.9.3. All the code used for the evaluation can be found here.
• Flink: 1.5 million elements per second per core; aggregate throughput in cluster 182 million elements per second
• Storm: 82,000 elements per second per core; aggregate 0.57 million elements per second
• Storm with acknowledge: 4,700 elements per second per core; latency 30-120 milliseconds
• Trident: 75,000 elements per second per core

Flink, buffer timeout 0: latency median 0 msec, 99th percentile 20 msec; 24,500 events per second per core
