Get the source code for this video for FREE → the-dotnet-weekly.kit.com/outbox-scaling
P.S. In the "Enabling RabbitMQ batch publish" section: ConfigureBatchPublish was deprecated with MassTransit.RabbitMQ v8.3.2. This version uses the new RabbitMQ.Client v7, which was rewritten to use the TPL and async/await. You will see a similar performance improvement just from upgrading to this version.
Currently, you are the best content creator for .NET. Your content perfectly addresses our daily needs and the challenges we face in the projects we're working on. Thank you so much for your dedication to delivering extremely high-quality content. I hope you never stop teaching us.
100% agree. Instead of doing clickbait shit, Milan delivers solutions and tools for the wide range of problems .NET devs face.
Wow, thank you! And challenge accepted for future content 😁💪
High tier content, much appreciated!
Thanks a lot!
Thank you so much, Milan! This is fantastic content. I'm currently working on a similar implementation, so this has been incredibly helpful
Glad it was helpful!
U r awesome as always, keep it up Milan !!
Thank you! Will do!
Milan, I find it hard to keep up. Is there any way you could slow down a little bit? It would improve the overall experience, I am sure. It's a really great video btw! I would like to see more videos that show complicated stuff using best enterprise practices, like this one.
Just my style of explaining 🤷♂️
Awesome video. We would definitely like to see more examples of such brilliant architectural decisions. Especially the ones you've implemented on real projects where that especially matters. Thank you!)
By the way, how far can we increase the number of messages queried and the degree of parallelism? Should we go by CPU load?
Yep, always refer to the available resources. The optimized outbox on a single worker may be enough for most applications.
The set of messages to read and update is fixed, so the DB can just load the index into memory and serve the required data really fast. But in a real system the DB will be overwhelmed updating the table and the index, and retrieval will be quite a bit slower than in the "static" example.
Indexes are really good when the number of reads is far greater than the number of writes. When it is about 50/50... you can get unexpected results.
Well it's not so static - 5 workers query and update the table at the same time (each update also updates the index).
@@MilanJovanovicTech Perhaps he means that no new records are added to the table while you are reading and updating it.
@@daveanderson8348 @MilanJovanovicTech Yes, the index is built over occurred_on_utc, processed_on_utc. The update phase changes only the first field. In a real system, where multiple inserts into the table happen concurrently with selects and updates, the index will become the bottleneck because of fragmentation and/or because it will become unbalanced.
I'd like to see the same benchmark executed while another background process fills the table with tons of rows.
Awesome! I definitely want more videos like this.
I have a question regarding the index created at 10:52. Perhaps it is a micro-optimization, but I think you could exclude the processed_on_utc field: given the WHERE clause, there is no need for this field to be included. Thanks again for these great tips.
I think you are correct there
This was very useful and insightful, thank you :)
Thanks a ton!
Thanks for sharing this.
I have a question about 1:02. If we get an exception, for example while publishing the outbox message to RabbitMQ, I believe we should retry it again later. However, you said we won't be processing it anymore. Why is that?
In that request it failed. Yes, we can retry publishing it there directly but what should we do when retries fail?
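One possible answer to the question in the reply above, sketched under assumptions: retry the publish a bounded number of times, and if every attempt fails, return the error text so the caller can store it on the outbox row instead of retrying forever. `publish_with_retry` and `max_attempts` are hypothetical names, not taken from the video, and a real implementation would probably wait between attempts.

```python
def publish_with_retry(publish, message, max_attempts=3):
    """Try to publish a message up to max_attempts times.

    Returns None on success, or the last error text when all attempts fail,
    so the caller can record it on the outbox row (hypothetical design).
    """
    last_error = None
    for _ in range(max_attempts):
        try:
            publish(message)
            return None
        except Exception as exc:  # broad on purpose: any publish failure counts
            last_error = exc
    return str(last_error)
```

With this shape, a message that exhausts its retries still gets marked as processed, only with an error attached, which keeps a permanently failing message from blocking the rest of the batch.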
Awesome content. How does the custom outbox in Dapper compare to the Entity Framework one (built into MassTransit)? I often get database problems with it.
Maybe some performance tips on Outbox using EF in next video? 😊
Will have to spend some time exploring the MT Outbox first
I was applying some of these already, but I picked up a few tips to enhance my process even further 🎉
Nice work!
Hey, Milan, thank you for creating this type of content! I have a question regarding batch size in an outbox pattern. I'm concerned about concurrency issues. If I have multiple workers running simultaneously, could this cause duplicate items in the database when they process the same batch of outbox messages? How would you recommend handling this scenario?
No - FOR UPDATE SKIP LOCKED solves that
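To make the reply above concrete, here is a small in-memory simulation of the idea behind FOR UPDATE SKIP LOCKED. It is not database code. A `threading.Lock` per row stands in for the row lock, and `OutboxRow`, `claim_batch`, and `run_simulation` are made-up names for illustration. Each worker only takes rows nobody else holds, so no message is picked up twice.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class OutboxRow:
    id: int
    processed: bool = False
    # Stands in for the row lock that SELECT ... FOR UPDATE takes in the database.
    lock: threading.Lock = field(default_factory=threading.Lock)


def claim_batch(rows, batch_size):
    """Mimic selecting unprocessed rows with FOR UPDATE SKIP LOCKED."""
    claimed = []
    for row in rows:
        if len(claimed) == batch_size:
            break
        if row.processed:
            continue
        # SKIP LOCKED: never wait on a row another worker holds, just move on.
        if not row.lock.acquire(blocking=False):
            continue
        if row.processed:  # re-check now that we hold the lock
            row.lock.release()
            continue
        claimed.append(row)
    return claimed


def worker(rows, batch_size, published, guard):
    while True:
        batch = claim_batch(rows, batch_size)
        if not batch:
            return  # everything left is processed or held by another worker
        for row in batch:
            with guard:
                published.append(row.id)  # stand-in for publishing to the broker
            row.processed = True          # stand-in for setting processed_on_utc
        for row in batch:
            row.lock.release()            # stand-in for COMMIT releasing the locks


def run_simulation(message_count=1000, workers=5, batch_size=50):
    rows = [OutboxRow(i) for i in range(message_count)]
    published, guard = [], threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(rows, batch_size, published, guard))
        for _ in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return published
```

Running `run_simulation()` with 5 workers publishes each of the 1000 ids exactly once, which is the property the question was asking about.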
And if you turn off publishing confirmation, then you can probably publish and update the database in parallel.
Too risky
Can you please create a book with all these topics, i have gone through a lot of material but find your videos and way of explanation filled with valuable and concise information. Thanks for all you do 🙏
I'm not planning on writing a book any time soon
Great set of tips Milan. Thank you for sharing. Please come over to Bluesky. Very vibrant and growing tech community there 😉
I doubt I can handle one more social media
@@MilanJovanovicTech I can relate & sorry for injecting this comment unrelated to your video. I just keep on following you here and remain a Patreon supporter. Thanks again for your great content
Thanks, Milan. Do you think TPL Dataflow might have helped increase performance even more?
No idea. How would you use it here?
Very helpful
Glad you think so!
Is there a good reason not to use IAsyncEnumerable to process the outbox query?
Which query exactly? The one returned by Dapper?
Good video. Congrats.
What about showing processing those millions and billions of messages on the other side of consumption? 😂
My machine would crash 😅 Typically, the consumers would be a separate set of servers
@@MilanJovanovicTech fair enough 😂😂😂
Would it be possible to show the in-memory outbox that MassTransit uses?
Yes
You could optimize the DB update even further by using some clever UNNEST tricks with array parameters, so that the DB doesn't have to parse the huge string.
My SQL skills only go so far 😅
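For anyone curious, the UNNEST idea from the comment above could look roughly like this. It is a sketch under assumptions: PostgreSQL, a driver that accepts psycopg-style named placeholders and list parameters, and a table called `outbox_messages` with `id` and `error` columns (those names are guesses; only `processed_on_utc` comes from the thread). The SQL text stays the same for any batch size, and only the three array parameters grow.

```python
from datetime import datetime, timezone

# One fixed statement: the database parses the same short SQL for every batch.
UPDATE_SQL = """
UPDATE outbox_messages AS o
SET processed_on_utc = v.processed_on_utc,
    error = v.error
FROM unnest(
    %(ids)s::uuid[],
    %(processed)s::timestamptz[],
    %(errors)s::text[]
) AS v(id, processed_on_utc, error)
WHERE o.id = v.id
"""


def build_update(results):
    """Turn (id, processed_on_utc, error_or_None) tuples into SQL + parameters."""
    ids, processed, errors = [], [], []
    for message_id, processed_on_utc, error in results:
        ids.append(message_id)
        processed.append(processed_on_utc)
        errors.append(error)
    # Three parallel arrays; unnest() zips them back into rows on the server.
    return UPDATE_SQL, {"ids": ids, "processed": processed, "errors": errors}
```

The returned pair would be passed to the driver's execute call, for example `cursor.execute(sql, params)` with psycopg.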
Would using a stored procedure save some time by caching execution plan?
Doesn't the order of the messages matter? I'm surprised we can afford to process them in parallel.
Message ordering is never guaranteed anyhow - with most brokers. There are some that support FIFO queues.
You can solve this on the consumer by buffering incoming messages and then processing them in order. And even this is susceptible to problems. One retry on the producer, and you may get out-of-order messages.
That's why it's best not to depend on in-order processing. And if you do need in-order processing, then you'll probably want to model that as a Saga or similar.
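The consumer-side buffering mentioned above can be sketched like this, assuming the producer stamps every message with a gap-free sequence number (that numbering and the `Resequencer` class are assumptions for illustration, not something shown in the video). Messages that arrive early wait in a buffer until the missing ones show up.

```python
class Resequencer:
    """Buffer out-of-order messages and hand them to a handler in sequence order."""

    def __init__(self, handler, first_sequence=1):
        self._handler = handler
        self._next = first_sequence  # sequence number we are waiting for
        self._pending = {}           # early arrivals, keyed by sequence number

    def receive(self, sequence, message):
        # Drop duplicates: already handled, or already waiting in the buffer.
        if sequence < self._next or sequence in self._pending:
            return
        self._pending[sequence] = message
        # Release every message that is now contiguous with what was handled.
        while self._next in self._pending:
            self._handler(self._pending.pop(self._next))
            self._next += 1
```

Note the weakness the reply points out: if one message never arrives, everything after it sits in the buffer indefinitely, so a real consumer would also need a timeout or a gap policy.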
The average consumer probably can't process this much data. How would you build a set of consumers to process it, what are the database recommendations, etc.?
Process in terms of # of messages?
Murderer!! You are killing it (the outbox)!
Sorry!