I Scaled My Transactional Outbox to 2B+ messages/day. Here's how

  • Published: Dec 3, 2024

Comments • 57

  • @MilanJovanovicTech
    @MilanJovanovicTech 4 days ago +4

    Get the source code for this video for FREE → the-dotnet-weekly.kit.com/outbox-scaling
    P.S. In the "Enabling RabbitMQ batch publish" section: ConfigureBatchPublish was deprecated with MassTransit.RabbitMQ v8.3.2. This version uses the new RabbitMQ.Client v7, which was rewritten to use the TPL and async/await. You will see a similar performance improvement just from upgrading to this version.

  • @GabrielRibeiro-of5mn
    @GabrielRibeiro-of5mn 4 days ago +33

    Currently, you are the best content creator for .NET. Your content perfectly addresses our daily needs and the challenges we face in the projects we're working on. Thank you so much for your dedication to delivering extremely high-quality content. I hope you never stop teaching us.

    • @dy0mber847
      @dy0mber847 4 days ago +3

      100% agree. Instead of doing clickbait shit, Milan is delivering solutions and tools for a wide range of problems .NET devs face.

    • @MilanJovanovicTech
      @MilanJovanovicTech 3 days ago +2

      Wow, thank you! And challenge accepted for future content 😁💪

  • @eddypartey1075
    @eddypartey1075 4 days ago +7

    High tier content, much appreciated!

  • @chanakachathuarngaathapath7282
    @chanakachathuarngaathapath7282 4 days ago +2

    Thank you so much, Milan! This is fantastic content. I'm currently working on a similar implementation, so this has been incredibly helpful

  • @namtrg
    @namtrg 3 days ago

    U r awesome as always, keep it up Milan !!

  • @margosdesarian
    @margosdesarian 13 hours ago +1

    Milan, I find it hard to keep up. Is there any way you could slow down a little bit? It would improve the overall experience, I am sure. It's a really great video btw! I would like to see more videos that show complicated stuff using best enterprise practices, like this one.

  • @Great_Critic
    @Great_Critic 3 days ago

    Awesome video. We would definitely like to see more examples of such brilliant architectural decisions, especially the ones you've implemented on real projects where it really matters. Thank you!)
    By the way, how far can we increase the number of messages queried and the degree of parallelism? Should we go by CPU load?

    • @MilanJovanovicTech
      @MilanJovanovicTech 2 days ago

      Yep, always refer to the available resources. The optimized outbox on a single worker may be enough for most applications.

  • @AlwaysHCYT2
    @AlwaysHCYT2 2 days ago +1

    The set of messages to read and update is fixed, so the DB can just load the index into memory and serve the required data really fast. But in a real system the DB will be overwhelmed updating the table and the index, and retrieval will be quite a bit slower than in the "static" example.
    Indexes are really good when the number of reads is far greater than the number of writes. When it is about 50/50, you can get unexpected results.

    • @MilanJovanovicTech
      @MilanJovanovicTech 2 days ago

      Well it's not so static - 5 workers query and update the table at the same time (each update also updates the index).

    • @daveanderson8348
      @daveanderson8348 1 day ago

      @MilanJovanovicTech Perhaps he means that no new records are added to the table while you are reading and updating it.

    • @AlwaysHCYT2
      @AlwaysHCYT2 1 day ago

      @daveanderson8348 @MilanJovanovicTech Yes, the index is built over occurred_on_utc, processed_on_utc. The update phase changes only the second field (processed_on_utc). In a real system where multiple inserts into the table happen concurrently with selects and updates, the index will become the bottleneck because of fragmentation and/or because it will become unbalanced.
      I'd like to see the same benchmark executed while another background process fills the table with tons of rows.

  • @robertogonzalorodriguez845
    @robertogonzalorodriguez845 4 days ago

    Awesome! I definitely want more videos like this.
    I have a question regarding the index created at 10:52. Perhaps it is a micro-optimization, but I think you could exclude the processed_on_utc field from the index: given the WHERE clause, there is no need for this field to be included. Thanks again for these great tips.
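
A minimal sketch of the suggestion above, under stated assumptions. The index definition shown at 10:52 is not reproduced in this thread, so the table name `outbox_messages`, the index name, and the helper functions here are hypothetical. SQLite is swapped in for PostgreSQL only so the example runs without a server; both support partial indexes, but SQLite has no `INCLUDE` clause. The point it illustrates: when the index predicate is `processed_on_utc IS NULL`, every indexed row has the same (NULL) value in that column, so keying on `occurred_on_utc` alone is enough to serve the "oldest unprocessed first" query.

```python
import sqlite3


def create_outbox(conn):
    """Create a toy outbox table plus a partial index (names are hypothetical)."""
    conn.execute(
        """
        CREATE TABLE outbox_messages (
            id INTEGER PRIMARY KEY,
            occurred_on_utc TEXT NOT NULL,
            processed_on_utc TEXT
        )
        """
    )
    # Partial index: only unprocessed rows are indexed, so the key needs
    # occurred_on_utc alone; processed_on_utc is always NULL inside it.
    conn.execute(
        """
        CREATE INDEX idx_outbox_unprocessed
        ON outbox_messages (occurred_on_utc)
        WHERE processed_on_utc IS NULL
        """
    )


def fetch_unprocessed(conn, limit):
    """Return ids of the oldest unprocessed messages, oldest first."""
    rows = conn.execute(
        """
        SELECT id FROM outbox_messages
        WHERE processed_on_utc IS NULL
        ORDER BY occurred_on_utc
        LIMIT ?
        """,
        (limit,),
    ).fetchall()
    return [row[0] for row in rows]
```

Whether dropping the column makes a measurable difference depends on the database and workload, so it is worth benchmarking rather than assuming.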

  • @TheBekker_
    @TheBekker_ 4 days ago

    This was very useful and insightful, thank you :)

  • @abomalek8
    @abomalek8 4 days ago

    Thanks for sharing this.
    I have a question about 1:02. If an exception occurs, for example while publishing the outbox message to RabbitMQ, I believe we should retry it again later. However, you said we won't be processing it any more. Why is that?

    • @MilanJovanovicTech
      @MilanJovanovicTech 3 days ago

      In that request it failed. Yes, we can retry publishing it there directly but what should we do when retries fail?

  • @tjagusz
    @tjagusz 2 hours ago

    Awesome content. How does a custom outbox in Dapper compare to the Entity Framework one (built into MassTransit)? I often get database problems with it.
    Maybe some performance tips on the outbox using EF in the next video? 😊

  • @nove1398
    @nove1398 4 days ago

    I was applying some of these already, but I picked up a few tips to enhance my process even further 🎉

  • @guilhermeloyola
    @guilhermeloyola 10 hours ago

    Hey, Milan, thank you for creating this type of content! I have a question regarding batch size in an outbox pattern. I'm concerned about concurrency issues. If I have multiple workers running simultaneously, could this cause duplicate items in the database when they process the same batch of outbox messages? How would you recommend handling this scenario?
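
One common way to address the concern above in PostgreSQL is to have each worker claim its batch with `SELECT ... FOR UPDATE SKIP LOCKED` inside a transaction, so rows locked by one worker are skipped by the others. This thread does not confirm which query the video uses, so treat the following as a sketch under assumptions: the table name `outbox_messages` is hypothetical, the SQL string is shown but not executed, and `InMemoryOutbox` is a toy stand-in that only mimics the "skip what someone else holds" semantics with a mutex.

```python
import threading

# PostgreSQL form of the idea (illustrative only, not executed here).
CLAIM_SQL = """
SELECT id
FROM outbox_messages
WHERE processed_on_utc IS NULL
ORDER BY occurred_on_utc
LIMIT %(batch_size)s
FOR UPDATE SKIP LOCKED
"""


class InMemoryOutbox:
    """Toy stand-in for the table; claim_batch mimics SKIP LOCKED semantics."""

    def __init__(self, message_ids):
        self._pending = list(message_ids)
        self._locked = set()
        self._mutex = threading.Lock()

    def claim_batch(self, batch_size):
        # Atomically take the oldest messages that no other worker holds.
        with self._mutex:
            batch = [m for m in self._pending if m not in self._locked][:batch_size]
            self._locked.update(batch)
            return batch

    def mark_processed(self, batch):
        # Equivalent of the UPDATE + COMMIT that releases the row locks.
        with self._mutex:
            done = set(batch)
            self._pending = [m for m in self._pending if m not in done]
            self._locked -= done


def run_workers(outbox, worker_count, batch_size):
    """Run several workers concurrently and return everything they published."""
    published = []
    published_lock = threading.Lock()

    def work():
        while True:
            batch = outbox.claim_batch(batch_size)
            if not batch:
                return  # nothing left that this worker can claim
            with published_lock:
                published.extend(batch)  # stands in for publishing to the broker
            outbox.mark_processed(batch)

    threads = [threading.Thread(target=work) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return published
```

Even with row locking, a crash between publishing and marking a batch as processed can still cause a message to be published twice, which is why consumers are usually made idempotent.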

  • @Maxim.Shiryaev
    @Maxim.Shiryaev 4 days ago +1

    And if you turn off publishing confirmation, then you can probably publish and update the database in parallel.

  • @UmerFarooq-w1x
    @UmerFarooq-w1x 3 days ago

    Can you please create a book with all these topics? I have gone through a lot of material, but I find your videos and way of explaining full of valuable and concise information. Thanks for all you do 🙏

  • @hjalmarhengstmann7686
    @hjalmarhengstmann7686 4 days ago

    Great set of tips Milan. Thank you for sharing. Please come over to Bluesky. Very vibrant and growing tech community there 😉

    • @MilanJovanovicTech
      @MilanJovanovicTech 3 days ago +1

      I doubt I can handle one more social media

    • @hjalmarhengstmann7686
      @hjalmarhengstmann7686 3 days ago

      @MilanJovanovicTech I can relate, and sorry for injecting a comment unrelated to your video. I'll just keep following you here and remain a Patreon supporter. Thanks again for your great content.

  • @mymemoryleaks
    @mymemoryleaks 4 days ago

    Thanks, Milan. Do you think TPL Dataflow might have helped increase performance even more?

  • @deceptionsinner2875
    @deceptionsinner2875 2 days ago

    Very helpful

  • @peterk4694
    @peterk4694 2 days ago

    Is there a good reason not to use IAsyncEnumerable to process the outbox query?

  • @haraheiquedossantos4283
    @haraheiquedossantos4283 3 days ago +1

    Good video. Congrats.
    What about showing how to process those millions and billions of messages on the consuming side? 😂

    • @MilanJovanovicTech
      @MilanJovanovicTech 2 days ago +1

      My machine would crash 😅 Typically, the consumers would be a separate set of servers

    • @haraheiquedossantos4283
      @haraheiquedossantos4283 2 days ago

      @MilanJovanovicTech fair enough 😂😂😂

  • @sunnypatel1045
    @sunnypatel1045 2 days ago

    Would it be possible to show the in-memory outbox that MassTransit uses?

  • @MegaMage79
    @MegaMage79 4 days ago

    You could optimize the DB update even further by using some clever UNNEST tricks with array parameters, so that the DB doesn't have to parse the huge string.

    • @MilanJovanovicTech
      @MilanJovanovicTech 3 days ago

      My SQL skills only go so far 😅

    • @janjoska2549
      @janjoska2549 2 days ago

      Would using a stored procedure save some time by caching the execution plan?
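
The UNNEST suggestion at the top of this thread can be sketched roughly as follows. This is an illustration under assumptions, not the video's code: the table name `outbox_messages`, the `uuid` and `timestamptz` column types, and the helper `build_unnest_update` are hypothetical, and the placeholders use the named-parameter style of a driver such as psycopg, which adapts Python lists to PostgreSQL arrays. The idea is that the statement text stays the same for any batch size, with the batch passed as parallel array parameters instead of being concatenated into one long `VALUES` string.

```python
from datetime import datetime, timezone

# The statement text never changes with the batch size; only the arrays do.
UNNEST_UPDATE_SQL = """
UPDATE outbox_messages AS o
SET processed_on_utc = v.processed_on_utc
FROM unnest(%(ids)s::uuid[], %(processed)s::timestamptz[])
     AS v(id, processed_on_utc)
WHERE o.id = v.id
"""


def build_unnest_update(processed):
    """Turn (message_id, processed_on_utc) pairs into SQL plus array parameters."""
    ids = []
    timestamps = []
    for message_id, processed_on_utc in processed:
        ids.append(str(message_id))
        timestamps.append(processed_on_utc)
    return UNNEST_UPDATE_SQL, {"ids": ids, "processed": timestamps}


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    sql, params = build_unnest_update([("id-1", now), ("id-2", now)])
    print(sql)
    print(len(params["ids"]), "rows in this batch")
```

With a live connection, the pair would be passed to something like `cursor.execute(sql, params)`. Because the text is constant, the same statement can also be prepared once and reused, which touches on the execution plan question above without requiring a stored procedure.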

  • @iliyan-kulishev
    @iliyan-kulishev 3 days ago +1

    Doesn't the order of the messages matter? I'm surprised we can afford to process them in parallel.

    • @MilanJovanovicTech
      @MilanJovanovicTech 3 days ago

      Message ordering is never guaranteed anyhow with most brokers. There are some that support FIFO queues.
      You can solve this on the consumer by buffering incoming messages and then processing them in order. And even this is susceptible to problems. One retry on the producer, and you may get out-of-order messages.
      That's why it's best not to depend on in-order processing. And if you do need in-order processing, then you'll probably want to model that as a Saga or similar.
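
The consumer-side buffering mentioned in the reply above can be sketched as a small resequencer. This is a minimal illustration under assumptions that the thread does not state: every message carries a gap-free, increasing sequence number assigned by the producer, and the `Resequencer` class and its method names are hypothetical.

```python
class Resequencer:
    """Buffers out-of-order messages and releases them in sequence order."""

    def __init__(self, first_sequence=1):
        self._next = first_sequence  # sequence number we are waiting for
        self._buffer = {}            # early arrivals, keyed by sequence number

    def accept(self, sequence, payload):
        """Store one message and return the payloads that are now ready, in order."""
        if sequence < self._next or sequence in self._buffer:
            return []  # duplicate delivery (for example after a producer retry)
        self._buffer[sequence] = payload
        ready = []
        # Drain every consecutive message starting at the expected sequence.
        while self._next in self._buffer:
            ready.append(self._buffer.pop(self._next))
            self._next += 1
        return ready


if __name__ == "__main__":
    resequencer = Resequencer()
    for sequence, payload in [(2, "b"), (3, "c"), (1, "a"), (4, "d")]:
        print(sequence, "->", resequencer.accept(sequence, payload))
```

The weakness pointed out in the reply shows up here too: if one message never arrives, everything behind it stays in the buffer, so a real consumer would also need a timeout or a gap-handling policy.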

  • @sunzhang-d9v
    @sunzhang-d9v 2 days ago

    The average consumer probably can't keep up with this much data. How would you build out a set of consumers to process it, and what are the recommendations for the database, etc.?

  • @phw1009
    @phw1009 1 day ago

    Murderer!! You are killing it (outbox)!