Event streaming (Symfony World 2020)

Event Streaming: An alternative to CRUD & Batch processing. Some of the things that will go wrong with distributed systems.

Published in: Engineering

  1. @samuelroze Event Streaming: Some things you want to know about.
  2. Introduction
     • My name is Samuel Rozé. I am VPoE (VP of Engineering) at Birdie Care and a Symfony Core Team member, for my work on Messenger.
     • This is an architecture talk.
     • We will briefly discuss the value of using event streaming.
     • We will see the consequences of living the dream of managing distributed systems. TL;DR: plenty of things will go wrong.
  3. Part 1: Why… is event streaming even interesting?
  4. Your product works… you split your services.
  5. Your services are talking to each other.
  6. Now you need to introduce targeted discounts…
  7. Now you need to introduce targeted discounts…
  8. How will this service get its data?
     1. Pull via an API
        • A lot of data is moved each time the “discount” service computes discounts for a customer.
        • “Discounts” can only work when the 3 other services are available (cascading failures).
        • “Discounts” needs to know where the other services are and how to talk to them.
     2. Using “batch”
        • Potentially contains loads of duplicated information (a full load each time, or a period of “over X days”).
        • Not real-time: “Wait a few days for your marketing preferences to be propagated.”
        • A lot can go wrong when every service has to create its export correctly every night.
  9. Event streaming
     • Events flow in real time, from and to multiple services.
     • To receive a specific event, services don’t have to know who is sending it, just that they can expect these messages.
     • Much higher availability, because data reaches the services that require it whenever they are online.
     • (When the bus persists messages) New consumers build their context by going through all the events that have happened in the system.
     • Writing code that works well with the nature of a distributed system is hard.
     • You need real governance over how the message bus is used: messages are your new API contracts.
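
The point about new consumers building their context from past events can be illustrated with a minimal sketch. `EventLog`, its methods, and the event shapes are hypothetical names made up for this example; it is an in-memory stand-in for a persistent bus, not any real broker's API.

```python
from collections import defaultdict


class EventLog:
    """Hypothetical in-memory stand-in for a bus that persists its messages."""

    def __init__(self):
        self._events = []        # every event ever published, in order
        self._subscribers = []   # handlers called for each new event

    def publish(self, event):
        self._events.append(event)
        for handler in list(self._subscribers):
            handler(event)

    def subscribe(self, handler, replay=True):
        # A late subscriber first replays the history, then receives live events.
        if replay:
            for event in list(self._events):
                handler(event)
        self._subscribers.append(handler)


log = EventLog()

# Published before the "discounts" service exists. The publishers know nothing
# about who will consume these events.
log.publish({"type": "ProductAddedToBasket", "customer": "alice", "product": "tea"})
log.publish({"type": "MarketingPreferenceChanged", "customer": "alice", "opt_in": True})

# The "discounts" service starts later and builds its own view of the baskets.
baskets = defaultdict(list)


def discounts_handler(event):
    if event["type"] == "ProductAddedToBasket":
        baskets[event["customer"]].append(event["product"])


log.subscribe(discounts_handler)
log.publish({"type": "ProductAddedToBasket", "customer": "alice", "product": "mug"})
```

After this runs, the late subscriber has seen both the replayed event and the live one, without ever calling the service that produced them.
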
  10. Event streaming, as a diagram
  11. Everything we are going to talk about is true for…
  12. Part 2: What will go wrong? It’s not “if”.
  13. Let’s start with a simple use case. Here, for example, we write to the `Basket` entity.
  14. Your message is sent to a queue
  15. Problem A: Are you sure that the message was sent to the queue?
  16. A. Are you sure that your message has been sent?
  17. A. Are you sure that your message has been sent?
  18. A. Distributed transactions are not really a thing.
  19. A. What might happen if we don’t care about that?
      • Your local “basket” table might have the new product, but no worker receives the “ProductAddedToBasket” event.
      • Your local “basket” table might NOT have the new product, but workers have received the “ProductAddedToBasket” event (most likely if you use database transactions for the entire request).
      • Imagine the event being about “payment successful”, or even something potentially life-changing like “fall in the home has been detected”… 😬
  20. A. The outbox pattern
  21. A. Publishing messages to the bus consistently
      • In a nutshell: write your message and your side effects to your database as part of one transaction, then have a separate process pull the message from the database and send it to your queue.
      • With Symfony, the simplest option is to use the Doctrine transport for Symfony Messenger, together with the Doctrine transaction middleware.
      • Alternatively, you can use a dedicated library for this:
        • EventSaucePHP/DoctrineOutboxMessageDispatcher
        • italolelis/outboxer
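
The outbox pattern described above can be sketched roughly as follows, using Python's built-in `sqlite3`. This illustrates the idea only; it is not the Symfony Messenger or Doctrine implementation. The table names, `add_product_to_basket`, and `relay_outbox` are assumptions made up for this example.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE basket (customer TEXT, product TEXT)")
db.execute(
    "CREATE TABLE outbox ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT, sent INTEGER DEFAULT 0)"
)
db.commit()


def add_product_to_basket(customer, product):
    # One local transaction: the state change and the message are committed
    # together, or rolled back together if anything raises.
    with db:
        db.execute("INSERT INTO basket VALUES (?, ?)", (customer, product))
        event = {"type": "ProductAddedToBasket", "customer": customer, "product": product}
        db.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(event),))


def relay_outbox(send):
    """Hypothetical relay, run by a separate process: push unsent rows to the queue."""
    rows = db.execute(
        "SELECT id, payload FROM outbox WHERE sent = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        send(json.loads(payload))  # if this raises, the row stays unsent and is retried
        with db:
            db.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
    return len(rows)


queue = []  # stand-in for the real message bus
add_product_to_basket("alice", "tea")
relayed = relay_outbox(queue.append)
```

If the relay crashes after `send` but before marking the row as sent, the message goes out again on the next run. The outbox gives at-least-once delivery, so consumers still have to cope with duplicates.
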
  22. A. Using the Doctrine transport
  23. Problem B: You will receive duplicated messages.
  24. B. “At least once delivery”
  25. B. What might happen if we don’t care about that?
      • You consume “ProductAddedToBasket” twice: the product is added twice instead of just once (as the user requested).
      • Depending on your business logic, this can matter a great deal. For example, what if the message is about “Money added to bank account” or “Medication dose taken”?
  26. B. You need some idempotence.
      • You will receive the same message multiple times; it’s just a matter of time.
      • There isn’t much a framework can do. You own the business logic, so you need to handle it yourself.
      • Use an idempotency key: a key that represents a single message and lets you know whether or not it has already been processed.
      • (By the way, this also applies to HTTP requests. Stripe’s API is a good example.)
  27. B. Using the idempotency key in the handler
      • One option is to make the idempotency key part of your message. Your team needs to know why it is useful and how to use it.
  28. B. How to use your “idempotency key”
  29. B. How to use your “idempotency key”
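
One way a handler might use an idempotency key is sketched below. It assumes the key travels inside the message and that the handler's side effects live in the same database as the list of processed keys. The table names and `handle_product_added` are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE processed_messages (idempotency_key TEXT PRIMARY KEY)")
db.execute("CREATE TABLE basket (customer TEXT, product TEXT)")
db.commit()


def handle_product_added(message):
    """Apply the message once; return False when it is a duplicate delivery."""
    try:
        # Recording the key and applying the side effect share one transaction,
        # so a crash cannot leave one without the other.
        with db:
            db.execute(
                "INSERT INTO processed_messages VALUES (?)",
                (message["idempotency_key"],),
            )
            db.execute(
                "INSERT INTO basket VALUES (?, ?)",
                (message["customer"], message["product"]),
            )
    except sqlite3.IntegrityError:
        # The primary key already exists: this message was processed before.
        return False
    return True


message = {"idempotency_key": "msg-001", "customer": "alice", "product": "tea"}
first = handle_product_added(message)
second = handle_product_added(message)  # the bus delivers the same message again
```

The unique constraint does the deduplication, so two workers receiving the same message at the same time cannot both apply it.
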
  30. Problem C: Processing messages in parallel.
  31. C. Concurrently processing messages
  32. C. What might happen if we don’t care about that?
      • At some point, you will lose some of the state that you update based on the events.
      • The easiest solution: don’t process things concurrently. But that is not really practical once things start to scale.
  33. C. Locking! Optimistic vs Pessimistic
      The optimist…
      • Assumes that everything will go right most of the time.
      • Validates that everything happened as expected when writing its state to a consistent storage.
      • e.g. HTTP’s If-Match, …
      The pessimist…
      • Believes that, in most cases, this won’t work.
      • Before doing any work, ensures nobody else is doing it.
      • a.k.a. “mutex”, “advisory locks”, etc.
  34. C. Pessimistic locking with Symfony Lock
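
The pessimistic approach can be sketched with an in-process lock per basket. This is not Symfony Lock: `threading.Lock` only protects threads of one process, whereas workers on several machines need a lock held in a store they all share. `lock_for` and `handle_product_added` are hypothetical names.

```python
import threading
from collections import defaultdict

_locks = defaultdict(threading.Lock)  # one lock per basket id
_locks_guard = threading.Lock()       # protects the lock registry itself
baskets = defaultdict(list)


def lock_for(key):
    with _locks_guard:
        return _locks[key]


def handle_product_added(basket_id, product):
    # Pessimistic: take the lock before doing any work on this basket.
    with lock_for(basket_id):
        current = list(baskets[basket_id])  # read
        current.append(product)             # modify
        baskets[basket_id] = current        # write back


workers = [
    threading.Thread(target=handle_product_added, args=("basket-1", f"product-{i}"))
    for i in range(50)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
```

Without the lock, two workers could read the same list and one write would overwrite the other. With it, all 50 products end up in the basket, at the cost of handling each basket serially.
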
  35. C. Optimistic locking with Doctrine’s “versions”
  36. C. Optimistic locking with Doctrine’s “versions”: what’s happening behind the scenes with optimistic locking
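
The general idea behind version-based optimistic locking can be sketched as a conditional `UPDATE`. This is an assumption about the technique in general, not a description of Doctrine's actual SQL. The schema, `StaleVersion`, `read_basket`, and `save_items` are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE basket (id TEXT PRIMARY KEY, items INTEGER, version INTEGER)")
db.execute("INSERT INTO basket VALUES ('basket-1', 0, 1)")
db.commit()


class StaleVersion(Exception):
    """Raised when someone else updated the row since we read it."""


def read_basket(basket_id):
    return db.execute(
        "SELECT items, version FROM basket WHERE id = ?", (basket_id,)
    ).fetchone()


def save_items(basket_id, items, expected_version):
    # The write only applies if the version is still the one we read.
    with db:
        cursor = db.execute(
            "UPDATE basket SET items = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (items, basket_id, expected_version),
        )
    if cursor.rowcount == 0:
        raise StaleVersion(basket_id)


# Two workers read the same state, at version 1.
items_a, version_a = read_basket("basket-1")
items_b, version_b = read_basket("basket-1")

save_items("basket-1", items_a + 1, version_a)  # succeeds; version becomes 2

try:
    save_items("basket-1", items_b + 1, version_b)  # still expects version 1
    conflict_detected = False
except StaleVersion:
    conflict_detected = True
    # Typical recovery: re-read the fresh state and retry.
    items_b, version_b = read_basket("basket-1")
    save_items("basket-1", items_b + 1, version_b)
```

No lock is held while the work is done. The conflict only shows up at write time, so the handler must be ready to retry or reject.
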
  37. Problem D: Message ordering
  38. D. Know when there is no ordering guarantee
  39. D. What might happen if we don’t care about that?
      • Hopefully your business logic doesn’t rely too much on the events being ordered… make sure this is true.
      • For example, we rely on “access_granted” and “access_revoked” events to configure some permission rules. If they are consumed in the wrong order… the outcome means something different 💥
  40. D. There are buses that guarantee order
  41. D. They scale using partitions
  42. D. Order guaranteed means blocking messages.
  43. D. For the infrastructure to guarantee ordering…
      • You need a message bus that supports it (Kafka, SQS FIFO, Kinesis, etc.).
      • You need to carefully design your partitions (or “shards”) so that all messages of a specific aggregate always go to the same partition (a.k.a. routing keys).
      • You need to carefully manage all the errors. You can’t afford a bad message blocking an entire topic, but you can’t really postpone a single message either…
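
The routing-key idea can be sketched with a stable hash of the aggregate identifier. Real buses ship their own partitioners, so `partition_for`, the choice of SHA-256, and the partition count below are assumptions for illustration only.

```python
import hashlib


def partition_for(routing_key, partition_count):
    # A stable hash: the same key maps to the same partition on every machine.
    # Python's built-in hash() is salted per process for strings, so it is
    # not suitable here.
    digest = hashlib.sha256(routing_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


PARTITION_COUNT = 4
partitions = {index: [] for index in range(PARTITION_COUNT)}

events = [
    ("user-42", "access_granted"),
    ("user-7", "access_granted"),
    ("user-42", "access_revoked"),
]

# Route by user id, the aggregate whose events must stay in order.
for user_id, event_type in events:
    partitions[partition_for(user_id, PARTITION_COUNT)].append((user_id, event_type))

user_42_events = [
    event
    for event in partitions[partition_for("user-42", PARTITION_COUNT)]
    if event[0] == "user-42"
]
```

Every event for `user-42` lands in one partition in publication order, so a consumer reading that partition sequentially sees “access_granted” before “access_revoked”. Events of different users may still interleave freely across partitions.
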
  44. To wrap up… A few learnings (hopefully).
  45. We’ve seen a few ways it can go wrong.
      • When publishing a message to a bus. Outbox pattern FTW.
      • When receiving the same message multiple times. Idempotence FTW.
      • When concurrently consuming messages. You need to use optimistic or pessimistic locking.
      • You can request ordering from your infrastructure, but it needs careful partition design and error management.
  46. Thank you! @samuelroze
  47. Want to read more?
      • Martin Kleppmann’s book.
      • idempotency-key/