You want to use Kafka? Or do you really need a Queue?

  • Published: 31 May 2024
  • Do you want to use Kafka? Or do you need a message broker and queues? While they can seem similar, they have different purposes. I'm going to explain the differences, so you don't try to brute force patterns and concepts in Kafka that are better used for a message broker.
    🔗 EventStoreDB
    eventsto.re/codeopinion
    🔔 Subscribe: / @codeopinion
    💥 Join this channel to get access to source code & demos!
    / @codeopinion
    🔥 Don't have the JOIN button? Support me on Patreon!
    / codeopinion
    📝 Blog: codeopinion.com
    👋 Twitter: / codeopinion
    ✨ LinkedIn: / dcomartin
    📧 Weekly Updates: mailchi.mp/63c7a0b3ff38/codeo...
    0:00 Intro
    0:43 Log
    2:48 Messages
    5:19 Broker
    7:29 Partitions
    #eventdrivenarchitecture #softwarearchitecture #softwaredesign

Comments • 53

  • @andreipacurariu2013
    @andreipacurariu2013 A year ago +35

    Another great video Derek.
    For what it's worth, the metamodel that I always use to explain messaging to people is this: Messages break down into two types - Requests and Events. Requests further break down into two types - Commands and Queries. So, from a modelling perspective, Messages and Requests are abstract concepts; Queries, Commands and Events are concrete. You can provide a nice UML diagram that shows this.
    Requests are owned by the consumer, therefore you can have multiple logical producers but a single logical consumer. Events are owned by the publisher, therefore you can have a single logical producer but zero or many logical consumers. Requests are unidirectional - they require there to be a consumer, a specific consumer, and exactly one consumer on the end of the line to process the request. The producer is aware of the consumer and is logically coupled with it. The producer needs the consumer. Due to this coupling, requests are blocking for the producer, because the producer requests something that it needs in order to continue the business process it was executing. Whether it requested something to happen, via a command, or requested some information, via a query, the producer needs a response before it can continue. Events, on the other hand, are broadcast - they do not require there to be any consumer, or there can be multiple consumers. The producer is neither aware of nor interested in these consumers. The producer has no expectations of these consumers. Events are fire-and-forget. As part of the business process it was executing, the producer can publish an event and continue with what it was doing without expecting anything from anyone.
    So indeed, the idea that Events break down into Messages and Commands is completely wrong. Messages break down into Requests (Commands and Queries) and Events. Requests ask someone specific for something specific - whether to do something or to provide some information. Requests imply coupling and are blocking. Events notify the world, and whoever may or may not be interested, that something has happened, without expecting anything in return. Events do not imply coupling; they imply that the producer is completely decoupled from and unaware of the consumers. Events are non-blocking.
    The whole idea of a Command Event is idiotic, and whoever came up with it needs to learn more about messaging and distributed systems in particular and about architecture in general. What they should not do is publish articles that confuse people when the general awareness and knowledge of messaging and distributed systems is already very low.
    People also need to learn about logical vs. system vs. physical boundaries. This is really key to actually understanding and defining architectures.
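The taxonomy in the comment above can be sketched as a small class hierarchy. A minimal, illustrative sketch in plain Python (the concrete class and field names are invented for the example):

```python
from abc import ABC
from dataclasses import dataclass

# Abstract concepts: only the leaves are meant to be instantiated.
class Message(ABC): ...
class Request(Message, ABC): ...   # owned by the consumer; exactly one consumer
class Event(Message): ...          # owned by the producer; zero or many consumers

@dataclass
class Command(Request):   # "do something" - the producer expects an outcome
    name: str

@dataclass
class Query(Request):     # "give me information" - the producer expects a response
    name: str

@dataclass
class OrderPlaced(Event): # past tense: a fact that already happened
    order_id: str

# Requests (Commands, Queries) and Events are siblings under Message:
assert issubclass(Command, Request) and issubclass(Query, Request)
assert issubclass(OrderPlaced, Event) and not issubclass(OrderPlaced, Request)
```

The point of the hierarchy is exactly the commenter's correction: Events are not a parent of Commands; both hang off Message.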

    • @andreyklimenko7588
      @andreyklimenko7588 A year ago +3

      One of the best explanations that I've seen on this topic. There is so much confusion around messaging; especially after Kafka was widely adopted, devs often see event streaming just as a high-throughput queue.

    • @erickhernandezdarias7042
      @erickhernandezdarias7042 11 months ago

      The request model (command or query) does not necessarily have to block the producer; it all depends on the logical communication design adopted to solve a problem, both from the perspective of a producer hosted in the frontend and of one hosted in the backend. These communication models can impose waiting for a response either synchronously (perhaps using web services) or asynchronously using websockets or another similar technology. In any case, it is not mandatory to assume that using requests must be blocking for the producer.

  • @sodrechavessodre6199
    @sodrechavessodre6199 A year ago +5

    Kafka has many configuration options. It can be configured to act as a message broker as you described. Now the question is "should it be done?" I don't know, but it sure can be done.

  • @MrHuno92
    @MrHuno92 A year ago +6

    8:00 I think that there can be multiple Consumers for a partition in Kafka. They just have to be assigned to different Consumer Groups. You can't have two Consumers assigned to the same Partition within the same Consumer Group. That actually helps process messages only once if you have only one Consumer Group. Kafka actually stores how far you've read into a partition. It's for a consumer to decide when to "ACK" how far it has read. There is also a built-in hashing algorithm called murmur that helps spread messages across partitions efficiently.
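The two rules in this comment - a key hashes to one partition, and a partition is owned by exactly one consumer per group - can be modelled in a few lines. A toy sketch in plain Python, not the Kafka client API, using crc32 as a stand-in for Kafka's real murmur2 partitioner:

```python
import zlib

NUM_PARTITIONS = 3

def partition_for(key: bytes) -> int:
    # Stand-in for Kafka's default partitioner (which uses murmur2):
    # the same key always lands on the same partition.
    return zlib.crc32(key) % NUM_PARTITIONS

def assign(partitions: list[int], consumers: list[str]) -> dict[str, list[int]]:
    """Round-robin assignment within ONE consumer group:
    each partition goes to exactly one consumer of that group."""
    out: dict[str, list[int]] = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        out[consumers[i % len(consumers)]].append(p)
    return out

group_a = assign([0, 1, 2], ["a1", "a2"])  # two consumers share three partitions
group_b = assign([0, 1, 2], ["b1"])        # a second group gets its own full view

# Within a group, no partition is owned by two consumers:
owned = sorted(p for ps in group_a.values() for p in ps)
assert owned == [0, 1, 2]
assert group_b == {"b1": [0, 1, 2]}
```

Multiple groups reading the same partitions is what gives every group its own independent pass over the log.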

    • @CodeOpinion
      @CodeOpinion  A year ago +1

      Correct. Single consumer (per consumer group) can be assigned to a partition.

  • @baseman00
    @baseman00 A year ago +4

    Really glad you got a chance to cover Kafka, and I really like that you're suggesting it's not for everything. But there are some things that, as explained, seem confusing. My concern is the consumer and its ability to ack that state has been persisted correctly (not still attempting to be persisted, as in competing consumer scenarios).
    Kafka is first designed (and most well known) to handle at-least-once message delivery. In this, a partition acts as a means for consensus, so messages can be handled in the correct order (tracked by the Kafka offset).
    With competing consumers you have multiple opportunities for downstream state to be persisted. This leads people to start writing logic in their consumers to handle all sorts of async blocking scenarios (frequently leading to messy developer assumptions). Instead, Kafka users simply say to the Kafka partition they're acking: "you're ready to pull the next message". This is why people use Kafka for communicating in transactional scenarios like finance and across data centers (where the arrival of messages can pollute data communicated to another boundary that must reliably persist it). It's also why tools like connector solutions are heavily built around the Kafka ecosystem (because they reliably replicate data between multiple locations). It doesn't mean Kafka is blocked like what was encountered in the service bus days (blocking the world)... We can still have multiple consumers by providing multiple partitions fed by a unique key (... where a good topic to cover is aggregate root IDs).
    With at-most-once delivery semantics, where a queue pops / fires-and-forgets, you'll find competing consumers more often, because the focus is not guarantees on data arriving and persisting in order... it's notification (like broadcast messages). Think communication bridges for notification (MQTT and otherwise), or telemetry data, where data that's lost or polluted by a dirty read/write/retry on restart isn't a big deal.

    • @CodeOpinion
      @CodeOpinion  A year ago

      All on board with what you're saying. I did mention partitions, and that a single consumer (within a consumer group) will process messages in order, in this video. I also talked about this in a video related to message ordering. ruclips.net/video/ILEb5LsSf5w/видео.html
      I generally try to avoid relying on message ordering when consuming messages as part of a workflow. Generally a workflow can be kicked off by many different messages, and you can use them as "policy". I illustrated this in this video: ruclips.net/video/rO9BXsl4AMQ/видео.html

    • @baseman00
      @baseman00 A year ago

      @@CodeOpinion Saw this video and it's a really good one. But in data warehousing, when keeping consensus between different services, you cannot ensure that with at-most-once message delivery. That's what I'm pointing out here... the "resilience" factor needed for at-least-once message delivery is more than just notification... it forms the basis for ACID 2.0 in eventually consistent setups. It's hard to beat Kafka for resilience.

  • @mathiasdemestral9411
    @mathiasdemestral9411 A year ago +1

    I really liked the video, since I recently read on Twitter that Kafka is not a proper message broker, but didn't really know the difference. Just to point out, Kafka has an ack feature as well: consumers can commit the offset of the last message they consumed and Kafka stores it in a special topic, preventing a well-configured consumer of the same consumer group from processing old messages from the topic; only consumers from a new consumer group would read old messages.

  • @scottspitlerII
    @scottspitlerII A year ago +1

    Very well spoken examinations, your knowledge and channel are very well refined :) good job ❤️

  • @frozencanuck3521
    @frozencanuck3521 A year ago +1

    Very well explained. The distinction between a message being a command and being an event is critical to making key architectural decisions about how they are produced, stored, and consumed. "Command event" is an idea that only adds needless confusion. It more often than not leads to compromised solutions.

    • @MrGarkin
      @MrGarkin A year ago

      So Kafka just calls "event" what you call "message", and calls "message" what you call "event".
      And calls "command event" what you call "command".
      It's just a different ontology.

  • @FahmiNoorFiqri
    @FahmiNoorFiqri A year ago +8

    My workplace treats Kafka as "the correct broker for microservices" and ignores alternatives like RabbitMQ or other AMQP brokers. I don't even think about commands vs. events; I usually treat them as the same. Thanks for the video!

    • @dayanshuwang2508
      @dayanshuwang2508 A year ago

      What you described might be simple, but it's concerning though, ha.

  • @krozaine
    @krozaine A year ago +1

    Thank you once again for your clear thoughts on commands and events! The practical problem in industry, I think, is this: once a central Kafka infrastructure is set up, it is very easy to add new topics and start using it, which keeps the friction to adopt very low.
    On the low-level development side, tried and tested integrations with established Kafka infra for messages in an already existing microservice give reliability & confidence and near-zero code changes to the producer/consumer module. Any other broker integration (like ActiveMQ, for example) would otherwise require development effort.
    In such situations, people generally convince themselves with technical jugglery of words - calling them Command Events, or treating the two as the same, or ceasing to care about the implementation while keeping this logical distinction (of commands and events) in their minds.

    • @CodeOpinion
      @CodeOpinion  A year ago

      Yes, agreed. Shoehorning concepts into a situation that doesn't support them exactly, but getting as close as you can.

  • @Boss-gr4jw
    @Boss-gr4jw A year ago

    So true. I have been telling others the same things when someone mentions Kafka or is planning to use it.

    • @CodeOpinion
      @CodeOpinion  A year ago

      Yup. Treat it as what it is: a partitioned log. If you have that need, then use it. The problem I have, which I'm explaining (or trying to) in this video, is that it's not a queue. So don't treat it as one, or force it to try to be one.

  • @digitalhome6575
    @digitalhome6575 A year ago +1

    Good explanation, as always, but since you namedropped Kafka, it'd make sense to namedrop what you consider to be some good message brokers too. Azure Service Bus / Queues? RabbitMQ? Perhaps a good follow-up video to this :)

  • @YazanAlaboudi
    @YazanAlaboudi A year ago

    I remember going on that site and seeing the term "Command Event" and immediately closing the website. I agree with you. It's absolutely garbage

  • @MiningForPies
    @MiningForPies A year ago

    Using the outbox pattern with RabbitMQ you can reproduce the events for a new consumer as well. That's my plan. Can't use Kafka, as there's no support for it where I work. It's taken a long time to get them to move away from massive monolithic transactions.
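The outbox pattern mentioned here can be sketched with an in-memory SQLite database: the event row is written in the same transaction as the state change, and a relay later hands unpublished rows to the broker in insertion order. Illustrative only - the table and function names are invented, and `publish` stands in for any real broker client:

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id TEXT PRIMARY KEY)")
db.execute("CREATE TABLE outbox (seq INTEGER PRIMARY KEY AUTOINCREMENT,"
           " payload TEXT, published INTEGER DEFAULT 0)")

def place_order(order_id: str) -> None:
    # One atomic transaction: the state change AND the outbox row,
    # so an event is never lost (or published without its state change).
    with db:
        db.execute("INSERT INTO orders VALUES (?)", (order_id,))
        db.execute("INSERT INTO outbox (payload) VALUES (?)",
                   (json.dumps({"type": "OrderPlaced", "id": order_id}),))

def relay(publish) -> None:
    """Poll unpublished rows in order and hand them to the broker."""
    rows = db.execute("SELECT seq, payload FROM outbox"
                      " WHERE published = 0 ORDER BY seq").fetchall()
    for seq, payload in rows:
        publish(json.loads(payload))
        db.execute("UPDATE outbox SET published = 1 WHERE seq = ?", (seq,))
    db.commit()

sent = []
place_order("o-1")
place_order("o-2")
relay(sent.append)
assert [e["id"] for e in sent] == ["o-1", "o-2"]
```

Because the outbox table keeps the rows, a new consumer can be fed by re-reading it from the start - which is the replay property the commenter wants without Kafka.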

  • @Fred-yq3fs
    @Fred-yq3fs A year ago +1

    So Kafka is good for letting multiple microservices react to the same event, but it's not great if an event needs to be consumed exactly once. Correct? In that case, what would your recommendation be? Use a message queue for such "event-commands"? Or reconsider the design (i.e. ask why a command is required rather than an event)?

  • @essamal-mansouri2689
    @essamal-mansouri2689 A year ago +1

    I feel almost personally responsible for inspiring this video, because it was exactly related to my question on one of your previous videos. That said, I think I still disagree that I "need" a message broker, even if a message broker is technically the more appropriate model. At my current company we have this problem of "technology proliferation", where we find ourselves paying for all kinds of software that is rarely used, or used incorrectly anyway, so I feel like just "making it work" with what we have, which is Kafka, seems like an OK compromise to avoid adding complexity. Plus, in our case, users require that those commands be audited later, so a log also seems appropriate.

    • @CodeOpinion
      @CodeOpinion  A year ago

      Thanks for the comment! I can absolutely understand the problem of "technology proliferation" and using existing infrastructure. However, I just don't think trying to shove a round peg into a square hole leads to a good outcome. Meaning, don't try to do workflow orchestration and come up with the concept of a "command event". Don't try to use concepts and patterns that work well with a queue on something that just isn't a queue.

  • @Unleash132
    @Unleash132 A year ago +2

    Thanks for the video! Very Informative.
    I was wondering, have you ever tried the project "MassTransit" and what do you think about it?

    • @CodeOpinion
      @CodeOpinion  A year ago +2

      Yes, MassTransit is a good messaging library that supports many of the different patterns/concepts you'll find in a message and event driven architecture.

  • @MrDomenic123
    @MrDomenic123 A year ago

    Thanks for this insightful video 🙂 I highly appreciate your videos as they cover so much useful information about topics I am also really interested in.
    I got one question though: if you send a command using a message broker, how does the service sending that command receive feedback from the service processing it? I heard about the request-reply pattern (a pseudo-synchronous approach). Furthermore, I also read about approaches where "commands never fail" (from a business perspective), in which case you can always assume that your command will be successfully processed.

    • @CodeOpinion
      @CodeOpinion  A year ago +1

      Yes, check out request-reply here: ruclips.net/video/6UC6btG3wVI/видео.html
      Trying to achieve the concept of commands not failing can be pretty helpful. It's not about trivial validation, but rather, oftentimes, about business rules that need to hold in concurrent environments.
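The request-reply pattern referenced here can be sketched with in-process queues: the command carries a correlation id and a reply address, and the sender matches the reply back to its request. A hedged sketch in plain Python - not any specific broker's API, and the message field names are the common convention rather than a standard:

```python
import queue
import uuid

commands: queue.Queue = queue.Queue()  # stands in for the command queue
replies: queue.Queue = queue.Queue()   # stands in for the reply queue

def send_command(body: dict) -> str:
    """Sender side: attach a correlation id and where to reply."""
    corr_id = str(uuid.uuid4())
    commands.put({"correlation_id": corr_id, "reply_to": "replies", **body})
    return corr_id

def handle_one() -> None:
    """Consumer side: process one command and echo the correlation id back."""
    cmd = commands.get()
    replies.put({"correlation_id": cmd["correlation_id"], "status": "accepted"})

corr = send_command({"type": "PlaceOrder", "order_id": "o-42"})
handle_one()
reply = replies.get()
assert reply["correlation_id"] == corr  # the reply is matched to its request
```

The correlation id is what makes the pattern pseudo-synchronous: the sender can continue doing other work and pair up replies whenever they arrive.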

  • @everydreamai
    @everydreamai A year ago +1

    Well, RUclips decided to eat my comment and I don't have the will to rewrite it, but briefly: Kafka gives you a lot of power to change how you produce and consume events at runtime, in terms of the broker and without changing its infrastructure - adding consumer groups, reading from the start of the stream, etc. That can be extremely useful, especially when you want to add services later without having to write special code to load historic data from another store. You can write the new consumer service, have it read from the start of the stream, and let it catch up.

    • @CodeOpinion
      @CodeOpinion  A year ago

      Agreed, being able to start at the beginning of a stream is awesome. However, trying to shove everything into a partitioned log and view everything as an event is getting a lot of people into a pile of trouble.

  • @TL-zy8pe
    @TL-zy8pe A year ago +5

    Thanks Derek for another video!:)
    Great explanation of the difference between commands and events, but when it comes to the technologies behind them I think it misses some points.
    Kafka also has a way to acknowledge messages - consumers commit processed messages, which makes the commit offset progress forward - but the key thing is that this only happens within a consumer GROUP. It's true that you can add a new consumer group which could start consuming messages from the beginning, but as long as you're using the same consumer group you can keep track of which messages were already consumed, so you won't process the same message multiple times.
    That's an assumption you need to make, but you make a similar assumption in the case of events - that there should be only one publisher. Yet you cannot technically enforce in Kafka that only one publisher instance sends messages to a given topic. Does that make Kafka bad for handling events?
    I'm not saying that Kafka is the best way to handle commands, but I wouldn't avoid it at all costs just because you may misconfigure consumer groups and read commands again (and reading commands again doesn't even imply they'll be processed again). Maybe in some cases, in smaller projects, it would even be beneficial to keep the tech stack simpler and handle both commands and events with a single mechanism.
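The per-consumer-group offset behaviour described above can be illustrated with a toy log (plain Python, not the Kafka API): reading never removes messages; each group only advances its own committed offset, and a brand-new group replays from the beginning.

```python
# A log with per-consumer-group committed offsets.
log = ["m0", "m1", "m2", "m3"]
committed: dict[str, int] = {}  # group id -> next offset to read

def poll(group: str, max_records: int = 10) -> list[str]:
    """Return the next batch for this group and commit its new offset.
    Nothing is ever removed from the log itself."""
    start = committed.get(group, 0)
    batch = log[start:start + max_records]
    committed[group] = start + len(batch)  # commit after processing
    return batch

assert poll("billing", 2) == ["m0", "m1"]
assert poll("billing", 2) == ["m2", "m3"]  # same group resumes at its offset
assert poll("billing", 2) == []            # nothing new for this group
assert poll("analytics") == ["m0", "m1", "m2", "m3"]  # a NEW group replays all
```

This is the commenter's point in miniature: within one group you get queue-like "consume once" behaviour, while a misconfigured (new) group id silently re-reads everything.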

    • @CodeOpinion
      @CodeOpinion  A year ago

      Totally get that it may be "simpler" to leverage a single piece of infrastructure rather than two. Totally makes sense. However, my issue is when you start forcing messaging patterns that are typical of queues on top of a partitioned event log. Then you end up with even more complexity.
      I did mention consumer groups and partitions. I realize I possibly didn't emphasize them enough. I did cover their benefits more in another video about message ordering.
      Also, it's about semantics for me. I want a single consumer of a command. And only a single consumer. That consumer will be in the logical boundary that owns the command definition/schema. When a command is published, I know there is a single consumer for it, because that's the point of sending the command: to invoke behavior. Now, if you can enforce that from a dev experience, process, etc., point of view, then all good by me.

  • @alexanderbikk8055
    @alexanderbikk8055 A year ago

    Hmm, interesting. But I can still use events with a message broker using topics.
    So in general we can use it both for commands and events. The difference is that we can't store data as in log-based brokers like Kafka, and we can't process streams.
    But I agree that when we need a command message for things like request-reply, it's more natural to use Service Bus or RabbitMQ, or even simple queues, rather than Kafka.
    But if you already have Kafka for events, it's difficult to say whether we should add a broker for command-like messages or try to use Kafka as mentioned in one of the comments below.

  • @gertrude1310
    @gertrude1310 A year ago

    You can actually have multiple consumer instances process event logs from the same topic if they are part of a common consumer group.

    • @gertrude1310
      @gertrude1310 A year ago

      I guess you addressed that later on and mentioned how it's bound to the number of partitions. 🤙

  • @rezcan
    @rezcan A year ago

    Thanks!

  • @LawZist
    @LawZist 6 months ago

    Hi Derek, as always - love your videos!
    Q: I have a command workflow where the consumer sends an event to another service and gets the result back later on a reply queue. Once the result comes back I need some data from the original request, so my question is: where do I store the request data? Should I save it in memory? What can happen if I have multiple consumer instances?

    • @CodeOpinion
      @CodeOpinion  6 months ago

      You'd have to persist it somewhere durable likely. Most orchestration type libraries or services have the ability to persist state for long running workflows. For example, in the .NET Space, NServiceBus, Mass Transit, and Wolverine all provide this type of functionality to maintain state.

    • @LawZist
      @LawZist 6 months ago

      Thanks for the quick reply! Would it be reasonable to use an RPC call instead (with RabbitMQ direct reply) and wait for the response (let's assume it will take 1 minute) in some async fashion (a Node.js promise, a goroutine, an async C# Task that awaits the response but doesn't block)?

  • @colonelsanders2038
    @colonelsanders2038 10 months ago

    Question: You seem to be implying that you should use a queue for commands, but it seems to me that the order of processing is usually very important for commands. If you have a topic like "orders", you want to make sure that PlaceOrder and CancelOrder are processed in order. A queue with competing consumers doesn't guarantee this. A Kafka topic using the order id as a key would.
    Don't we usually want strict ordering of processing of an aggregate? How does that fit into your advice to use a queue for commands?
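The keyed-topic argument in this question can be demonstrated in a few lines: hashing the order id picks the partition, so all commands for one order land on one partition and are seen in order by its single consumer. A toy model in plain Python, not the Kafka API:

```python
from collections import defaultdict

NUM_PARTITIONS = 4
partitions: dict[int, list] = defaultdict(list)

def produce(key: str, value: str) -> None:
    # Same key -> same partition, so per-key order is preserved,
    # even though different keys may interleave across partitions.
    partitions[hash(key) % NUM_PARTITIONS].append((key, value))

produce("order-7", "PlaceOrder")
produce("order-9", "PlaceOrder")   # a different aggregate, may go elsewhere
produce("order-7", "CancelOrder")

p = hash("order-7") % NUM_PARTITIONS
seen_for_order_7 = [v for k, v in partitions[p] if k == "order-7"]
assert seen_for_order_7 == ["PlaceOrder", "CancelOrder"]
```

A queue with competing consumers gives no such per-key guarantee: two workers can pick up PlaceOrder and CancelOrder concurrently and finish in either order - which is exactly the trade-off the question raises.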

  • @suikast420
    @suikast420 A year ago

    For most cases, NATS can be a suitable choice.

  • @essamal-mansouri2689
    @essamal-mansouri2689 A year ago

    Hey, I know this video is a bit old, but I'm hoping you're still checking the comments... Do you have an opinion on Apache Pulsar? It supports both "topologies": you can use it similarly to Kafka or similarly to RabbitMQ. Consumers can also go back and replay events, similar to Kafka, but generally you can configure a topic to be "exclusive", with only one consumer allowed, and it makes additional ordering guarantees there. Last but not least, there are "proxies" that can be used as a compatibility layer for applications that are designed to work with Kafka, RabbitMQ, etc.
    I really think this is the best of both worlds and would allow both events and commands on the same broker. What do you think?

    • @CodeOpinion
      @CodeOpinion  A year ago

      Thanks for the comment! I'm aware of pulsar but haven't used it in any meaningful way. I'll have to dig into it more, especially around competing consumers.

  • @basilthomas790
    @basilthomas790 A year ago

    Once again an excellent video, but I have two concerns:
    1) Publishing commands: in CQRS, commands are executed by a command processor, and upon a successful state change an event (or events) is published. Therefore, to me, commands should be executed immediately by a command processor, or queued to the command processor for asynchronous execution. I only publish events to be processed asynchronously, and am only concerned if the message does not get accepted by the message broker, which I can handle as an exception. I only use RPC calls like REST or gRPC to process my commands or queries.
    2) The basic architecture of Kafka is totally geared towards fully distributed asynchronous streaming of events. What Kafka is extremely good at is pull-based consumers that subscribe to events and only process an event when the consumer is ready to accept another one. Most message brokers, on the other hand, are centrally push-based, with more complicated message-handling patterns, which Kafka clearly minimized in its design for extremely high performance. While Kafka can work as a message broker, I would use it purely for high-performance asynchronous event stream processing.

    • @CodeOpinion
      @CodeOpinion  A year ago

      Thanks for the comment. Here are a few comments on your comment 😂.
      "commands are executed by a command processor and upon a successful state change, an event(s) is published". Yes, a command is changing state, but that does not mean it's publishing an event. It can, but that has nothing to do with CQRS. In terms of async or non-async events, I generally always try to make them async, as there are more options for how you handle failures of an individual consumer. Event consumers all executing within the same process/thread as the publisher can be problematic on failures, as the consumers aren't executing independently. As for Kafka, yes, agreed, the differences in push/pull (cursor) and retention are a big factor.

    • @basilthomas790
      @basilthomas790 A year ago +1

      @@CodeOpinion "a command is changing state, but it does not mean it's publishing an event. It can, but that has nothing to do with CQRS"
      Correct, but why on earth would you use CQRS without raising an event on a state change, if the whole purpose of CQRS is reading and writing state as two separate behaviors?
      Once you raise an event on a state change, you can use CQRS in a real event-driven architecture, which is way more important than a single-minded view of CQRS as primarily a design pattern separating your reads from your writes. The CQRS design pattern is a building block, just like Event Sourcing, which to me means: use as required, and with caution.
      Putting that aside, I have a really strong opinion that commands/queries should be executed and events should be published by the caller, and while semantics may not be an issue for you, in an event-driven architecture publishing an event only matters insofar as it gets received/acknowledged by the message broker, or else a system exception should be thrown. From a coupling point of view, the event publisher does not care whether the event gets consumed, who consumes it, or how many times it is consumed. Commands, on the other hand, only really matter in terms of whether they are executed sync/async and state is changed as expected, or a business error is raised if your command processor/aggregate is not in an expected state. Queries are obviously executed synchronously, but there are ways they too can be executed async.
      These are completely different concerns/messaging patterns and should be modelled separately and explicitly, instead of confusingly as 'publishing commands'.
      I use RPC messages to execute commands/queries, and publish events using Kafka to uncoupled event consumers, for ridiculous scalability, using separate containers for processing commands, queries and events. Yes, I only use 2 microservices for all my hundreds of commands/queries, as I treat them as messages.
      Implementing architecture is always full of tradeoffs depending on X, Y and Z, which you should have a good handle on before you start your implementation!!

  • @allinvanguard
    @allinvanguard A year ago

    I tend to find AMQP a lot more useful for business processes which require reliability and clear configurability & transparency about who gets what under which circumstances. Kafka is great for scale, but it can't possibly make the same guarantees AMQP-based brokers can make.

    • @CodeOpinion
      @CodeOpinion  A year ago

      If you were doing pure event choreography, then I can see it being used, but if you want it to step in as a process manager for workflow orchestration, which is going to involve commands, then not so much.

    • @georgehelyar
      @georgehelyar A year ago

      AMQP is just a protocol. Comparing Kafka and AMQP is like comparing apples and oranges.
      For example, Azure Event Hubs is similar to Kafka, and Azure Service Bus is similar to RabbitMQ, but both of them use AMQP. The points made in this video also apply when choosing between AEH and ASB.

  • @allmonty
    @allmonty A month ago +1

    Messy explanation. Kafka is a message broker.