High Performance Kafka Producers in Python

  • Published: 14 Nov 2024

Comments • 10

  • @putopavel
    @putopavel 3 months ago +2

    The performance analysis and optimisation breakdown in this video is probably the perfect balance between the most common and powerful tunings and brevity. Great work!
    Maybe you could comment on client (or even topic) optimisation in the upcoming videos? 🤔

    • @QuixStreams
      @QuixStreams  3 months ago

      Thanks! Great idea, I'll add it to the list.

  • @timelschner8451
    @timelschner8451 1 month ago +1

    Very good tutorial, many thanks. Subscriber +1

  • @CrypticConsole
    @CrypticConsole 4 months ago +8

    When you set the compression at 21:50, do you need to make sure that all the consumers have the same compression type specified as the producers, or is it embedded in the message and handled automatically?

    • @QuixStreams
      @QuixStreams  4 months ago +2

      It should just work automatically. The consumer can detect the compression type when it reads each batch and decompress it on the fly.
      The only thing you need to ensure is support. So if your producers compress with, say, `lz4`, you need to make sure all your consumer libraries support reading `lz4`. With a very common algorithm like `gzip`, that's pretty much guaranteed. For the more exotic ones, I'd double check before you ship.
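A side note on why batch-level compression works so well: the producer compresses the whole record batch as one blob rather than each message individually, so the redundancy across similar messages gets squeezed out. This stdlib-only sketch (illustrative event data, with `gzip` standing in for any supported codec) shows the difference:

```python
import gzip
import json

# Hypothetical batch of similar event records, like a Kafka producer
# would accumulate before sending (field names are illustrative).
messages = [
    json.dumps({"event": "page_view", "user_id": i, "path": "/home"}).encode()
    for i in range(100)
]

raw_size = sum(len(m) for m in messages)

# Compressing each message on its own: gzip's per-stream overhead
# dominates for small payloads.
per_message_size = sum(len(gzip.compress(m)) for m in messages)

# Compressing the whole batch at once, as the producer does when a
# compression type is configured: one compressed blob per record batch.
whole_batch_size = len(gzip.compress(b"".join(messages)))

print(f"raw: {raw_size}, per-message: {per_message_size}, batch: {whole_batch_size}")
```

The batch-compressed blob comes out far smaller than either the raw bytes or the per-message totals, which is exactly the redundancy the producer-side setting exploits.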

    • @CrypticConsole
      @CrypticConsole 4 months ago +1

      @@QuixStreams Thanks for the response. That is a good design choice from Kafka, especially if all your components are using the same quixstreams library, as the compression support should then be the same.

    • @QuixStreams
      @QuixStreams  4 months ago

      Absolutely. 👍

  • @CrypticConsole
    @CrypticConsole 4 months ago +5

    I feel like if you need super high performance IO routing, synchronous Python might be the wrong choice. Apache Camel has an integration for the GitHub event source, so it might work better.

    • @QuixStreams
      @QuixStreams  4 months ago +1

      Yup, that's absolutely true. However, if you were parallelising, you might decide that it's better to have 3 Python producers sharing a workload instead of introducing an extra component like Camel. And with these tips, you might be looking at 3 producers instead of 10. 🙂
      (That's the shape of it at least. I'm not actually sure how you'd load-balance this specific data-source, but I thought it was an interesting one to explore the concepts with. 😉)