I’ve been testing DynamoDB with a single-table design and DynamoDB Streams for CDC. It’s a small project, but so far it's performing really well. The built-in retry features are fantastic and also help control back pressure.
Always looking out for alternatives.
We use a very simple variant of the workflow pattern that we call “mutation queue”. This queue is used internally by the service to send itself commands. During command execution, we first update the database, then publish the event, then finally commit/ack the message.
If something fails, the queuing system retries the whole thing. Of course processing needs to be idempotent.
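For anyone who wants to see the shape of it, here is a minimal sketch of that "mutation queue" loop. It is not the commenter's actual implementation: the in-memory deque, the dict standing in for the database, the `published` list standing in for the broker, and the `fail_publish_once` switch are all assumptions made up for illustration.

```python
from collections import deque


def process_queue(queue, db, published, fail_publish_once=False):
    """Drain the queue; a command is only acked after every step succeeds.

    All names here are hypothetical stand-ins: `db` is a dict, `published`
    is a list playing the role of the broker.
    """
    failures_left = 1 if fail_publish_once else 0
    attempts = 0
    while queue:
        command = queue[0]  # peek only, the command is not acked yet
        attempts += 1
        try:
            # Step 1: update the database. Setting a value is naturally
            # idempotent, so a retry leaves the same state behind.
            db[command["order_id"]] = command["status"]
            # Step 2: publish the event (this is where we simulate a failure).
            if failures_left:
                failures_left -= 1
                raise ConnectionError("simulated broker outage")
            published.append((command["id"], command["status"]))
            # Step 3: ack, i.e. remove the command from the queue.
            queue.popleft()
        except ConnectionError:
            continue  # command stays at the head of the queue and is retried
    return attempts
```

If the failure happened after publishing but before the ack, the retry would publish the event a second time, which is why the processing (and the consumers downstream) need to be idempotent.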
I'm a fan of outbox where I can use it because it greatly simplifies things. It's my default go-to for greenfield projects due to that - having to deal with the mess of all that other stuff when I don't know if it will be strictly necessary just never feels fun.
That said, depending upon the business model of the project, I might resort to something else upfront. But for example, when I worked on an ecommerce integration with a third party, I wasn't going to sweat the performance impact of outbox when there were only a couple thousand orders per day.
Exactly. Context specific.
That ouroboros logo of your sponsor is so cool! But you and your content are even cooler! Thank you!
I appreciate that!
Very nice video. Thanks to your videos I discovered DotNetCore.CAP and have been using it in my office work ever since. Also the Temporal tool seems very cool (looks like a lighter version of Camunda). Would love to see a video where you explain a workflow with DotNetCore.CAP and Temporal. Keep up the amazing work.
I'd also like to see a video about the last topic mentioned in this one (Temporal and workflows).
The bigger issue with the outbox pattern is the probability of sending messages multiple times. For instance, you want to delete the messages that have been sent from the original database to indicate that they were sent. In general, you'll have to:
BEGIN TRANSACTION;
SELECT messages
publish messages
DELETE FROM messages
COMMIT TRANSACTION;
The process may crash or be killed before deleting or before the transaction commits.
Or, if you are lucky enough to have DELETE ... RETURNING, you can delete instead of select, but the process can still crash before the transaction commits.
Yes, hence why you want all consumers always to be idempotent. Generally you'll be dealing with at least once delivery and you can go the inbox route but regardless, you want consumers to be idempotent.
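To make the failure mode above concrete, here is a small sketch using SQLite from the Python standard library. It is only an illustration under assumptions: the `orders` and `outbox` tables, the `order-placed-<id>` message IDs, and the `crash_before_delete` flag are invented for the demo, and the "broker" is just a function call.

```python
import sqlite3


def save_order_with_event(conn, order_id, payload):
    """Write the business state and the outbox row in ONE transaction."""
    with conn:  # commits on success, rolls back on exception
        conn.execute("INSERT INTO orders (id) VALUES (?)", (order_id,))
        conn.execute(
            "INSERT INTO outbox (message_id, payload) VALUES (?, ?)",
            (f"order-placed-{order_id}", payload),
        )


def relay_outbox(conn, publish, crash_before_delete=False):
    """Publish pending outbox rows, then delete them (at-least-once)."""
    rows = conn.execute(
        "SELECT message_id, payload FROM outbox ORDER BY rowid"
    ).fetchall()
    for message_id, payload in rows:
        publish(message_id, payload)
        if crash_before_delete:
            raise RuntimeError("simulated crash after publish, before delete")
        with conn:
            conn.execute("DELETE FROM outbox WHERE message_id = ?", (message_id,))


def demo_outbox():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE outbox (message_id TEXT PRIMARY KEY, payload TEXT)")

    deliveries, seen, handled = [], set(), []

    def publish(message_id, payload):
        deliveries.append(message_id)   # what the broker would deliver
        if message_id not in seen:      # idempotent consumer: skip duplicates
            seen.add(message_id)
            handled.append(payload)

    save_order_with_event(conn, 1, "order 1 placed")
    try:
        relay_outbox(conn, publish, crash_before_delete=True)
    except RuntimeError:
        pass                            # the relay "crashed"; the row is still there
    relay_outbox(conn, publish)         # restart: the same message goes out again
    remaining = conn.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]
    return deliveries, handled, remaining
```

The event is never lost, because it was committed together with the order. It is delivered twice, though, and the consumer ends up handling it once only because it tracks message IDs.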
Great stuff! Maybe another approach is to implement a "mini-saga" that first persists to the DB, then tries to publish the message. If message publication fails, then roll back the DB change. Fail fast.
What if the rollback fails? 😄
@@aleksg2925 😂😂
@@samuelnettey Hehe funny yeah, but I might have misunderstood @Mandolinean. If he meant "persist both business data AND message to db, then try to send message immediately" then that's a fine approach. 1) you get to send message asap, and 2) it's persisted and can be retried in case of failure. There would be greater risk of sending a duplicate of that message though... But an inbox at the receiving end should be used to handle duplicates.
Wow, I took a lot from this video!! Thank you friend, you are the best 👌🏽
No problem 😊
Outbox has the same problems. You get the message from the DB and then the same situation happens: you need to publish the event and update the database. As I understand it, you now have 1001 ways to shoot yourself in the foot.
No, it's not the same. Persisting business state changes and your event is about consistency. Publishing the same event more than once from the outbox isn't the same. Consumers generally need to be idempotent, which is now the issue. You'd rather be consistent than not, generally (depending on the context)
Great content as always. Could you please do a tutorial on Debezium CDC with Postgres? Any end-to-end code sample for .NET/C#?
I'll add it to my topic suggestion list! Thanks!
This is great!
What about using something like EventStoreDB? Wouldn’t that make the database and the messaging mechanism one and the same, allowing transactional updates without the overhead of the outbox? (Which imo is a solid pattern - an architect introduced me to it 15 years ago, calling it a ‘buffer table’)
Within a logical service boundary, yes!
Why not use EventStoreDB for integration events outside the bounded context as well?
A consumer subscribed to streams can retrieve events and write them again to a special "integration" stream that other external bounded contexts can subscribe to.
Is there any downside?
@@CodeOpinion In an event-sourced system, the writing of events is the transaction, so using a generic subscription to these events to push to a broker solves the above issues easily, inside or outside the service boundary. I typically have a subscriber inside the bounded context which both writes read models for that context and pushes a message to, say, RabbitMQ or whatever.
There are 2 things that bother me when events come into play.
First - in the end an assumption has to be made (as in the McDonald's case) that on failure to publish an event to the broker, the event will be successfully persisted to S3 or DynamoDB. But there's no guarantee of that, so there should be a compensating mechanism to cover this failure case in the form of a saga or fallback operation, and boom, we have distributed transactions just to send an email notification. Complexity that has to be introduced every time an event is used to communicate, either between processes or across logical boundaries, smells accidental.
Second - proper integration testing of systems that use event-based communication seems to be another problem, as there's no way to know that event processing is over and the email was sent. To make such a test bulletproof, there should be an event store the test can check to confirm that some event has happened.
So ultimately it feels like - "Once you've stepped over the line and introduced at least 1 event to your system, you're in outer space and it's up to you to design and build another space station that will let you survive and give you consistency and atomicity of operations."
I hope I'm wrong, as this is what stops me from going down that lane right now.
You could even just have a read replica. You could write to the database, but the read replicas might not return it in a select query yet.
Hey Derek, thanks for the video. What do you think about MassTransit's In-Memory Outbox? They have released a traditional outbox, but I wanted to get your thoughts on their initial implementation. Also, I came across MT's Courier, which seems to be exactly what you were describing at the end (as opposed to traditional sagas), or are they different? Thanks for the vids.
The in-memory outbox obviously means it's not durable, so if your process crashes, you lose those messages and they will fail to publish. If you *need* a guarantee that you will publish, you won't get it with this approach. I'm not familiar enough with its Courier to really comment. Just reading the docs now, seems interesting.
The current version of MT also supports a transactional outbox.
@@arnonoordover4054 Transactional outbox is powerful.
Great video on the subject; out of interest what tooling do you use to create the animations? Example at 7:33 of this video you show a message (envelope) transition.
Just PowerPoint. Animations.
@@CodeOpinion you know I was just joking that PowerPoint is probably the easiest solution 😎
How does the outbox pattern solve the issue? It is just moving the problem from one place to another: it could still publish the message and then fail to update/remove it in the database. The only way to solve it is the outbox pattern plus idempotency of the message, so that duplicate messages are handled by the consumer.
You're not moving the problem. The problem is saving state and not publishing an event. The outbox solves that problem. It does introduce the issue of consumers needing to be idempotent, but that's generally always the case even without the outbox pattern if your broker has at-least-once delivery semantics.
Can transactions and confirmation of message delivery solve this problem?
Hi Derek! Great explanation. What’s your opinion on using Hangfire as message publisher with outbox pattern? Do you think it’s stable enough to rely on it for that task?
Not sure how you would use Hangfire with an outbox. You need to be using the same underlying DB connection and transaction.
@@CodeOpinion Exactly. I’m mostly using EF and Hangfire with Postgres, and all context changes, as well as job enqueuing/scheduling, are within one TransactionScope.
Hi Derek, thanks for the video. The Transactional Outbox pattern is a great way to ensure the atomicity of data, but I have some questions. If a service needs to publish messages, it will implement the transactional outbox pattern. So if I have multiple services, will each service have its own outbox table and its own worker service? Is it possible to keep the outbox table in a separate database from these services? That is, the services would all access the same additional database that is responsible for the outbox table.
I hope to receive your advice. Tks
Anything is possible, as long as you maintain atomicity.
Did you do a video on the Listen to Yourself design pattern?
No, but it's on my list of topic ideas
@@CodeOpinion I can't wait to see it 🙂
If I understand correctly, in the outbox pattern we still don't have a transaction that would span (a) publishing the message to the broker, and (b) removing the serialized event from the DB. As a result, each event will be published "at least once". How much of a problem is that, and how do you deal with it?
Exactly, the scheduler/publisher could fail removing the message from the outbox which would result in a duplicate. Implications are requiring your consumers to be idempotent.
@@CodeOpinion Can one use the inbox pattern together with the outbox to help with the idempotency part?
What do you think about using Temporal with a monolith?
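As a rough idea of what the inbox side could look like, here is a sketch with SQLite from the Python standard library. The `inbox` and `account` tables and the `handle_once` function are hypothetical names picked for the example, not something from the video. The point is that the message ID and the business change are committed in the same transaction, so a duplicate delivery hits the primary key and changes nothing.

```python
import sqlite3


def handle_once(conn, message_id, amount):
    """Apply a message at most once by recording its ID in an inbox table."""
    try:
        with conn:  # the inbox insert and the business update commit together
            conn.execute("INSERT INTO inbox (message_id) VALUES (?)", (message_id,))
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = 1", (amount,)
            )
        return True
    except sqlite3.IntegrityError:
        # Duplicate delivery: the message ID is already in the inbox, the
        # transaction was rolled back, and the balance is left untouched.
        return False


def demo_inbox():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE inbox (message_id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    with conn:
        conn.execute("INSERT INTO account (id, balance) VALUES (1, 0)")
    # The same message is delivered twice, as an at-least-once broker might do.
    results = [handle_once(conn, "msg-1", 50), handle_once(conn, "msg-1", 50)]
    balance = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
    return results, balance
```

This only works if the inbox table lives in the same database as the state the handler changes, so both writes can share one transaction.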
Ya, I have talked about monoliths and being event driven and/or workflow. Check out: ruclips.net/video/bxGkavGaEiM/видео.html
Queue before DB storing and message broker?
Queue what before?
@@CodeOpinion you can just send command directly to queue. And process that listens to queue can manage both operations?
@@coolY2k you will need 2 different consumers: one for saving and another for publishing. We can't have a single consumer, since that leads us back to the same problem.
@@tharun8164 No, you can. Put a unique ID in the message. Start the process: get the message from the queue but do not remove it, check the database to see whether a message with that ID has already been processed, and if not, process it. Then send it to the message broker with the same ID (each process will have to check for duplicate messages) and remove it from the queue.
@@coolY2k this is like the outbox pattern but in reverse. Write the event atomically to a queue and then have a separate process read from the queue and atomically update the DB.
This works, but now your DB is only eventually consistent and that will come with a whole new set of problems.
Thank you for supporting Ukraine!
nobody knows what "first class" means
Nobody? It means they are a primary concept