Webinar: 99 Ways to Enrich Streaming Data with Apache Flink - Konstantin Knauf

  • Published: 26 Aug 2024

Comments • 5

  • @blumki • 2 years ago

    Nice presentation, thanks for sharing this.

  • @GauravKumar-dq5mv • 4 years ago

    With a simple event-time join, how do you handle the cold-start problem?

    • @konstantinknauf8792 • 4 years ago

      Hi Gaurav, as always it depends a bit on your use case. A simple solution would be to buffer sensor events initially, until a first event from the metadata stream for this key has arrived. Alternatively, you can use a temporal table join, in which case your events would be buffered until both streams have reached the same timestamp (thereby handling the cold start). A third option would be to use the newly released State Processor API (ci.apache.org/projects/flink/flink-docs-release-1.9/dev/libs/state_processor_api.html) to create a savepoint to start the job from initially. Hope this helps, Konstantin
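
      The first option above (buffer events per key until the first metadata record arrives) can be sketched with a `KeyedCoProcessFunction` roughly as follows. This is a hypothetical sketch, not code from the webinar; the `SensorEvent`, `Metadata`, and `Enriched` types are assumed placeholders.

      ```java
      import org.apache.flink.api.common.state.ListState;
      import org.apache.flink.api.common.state.ListStateDescriptor;
      import org.apache.flink.api.common.state.ValueState;
      import org.apache.flink.api.common.state.ValueStateDescriptor;
      import org.apache.flink.configuration.Configuration;
      import org.apache.flink.streaming.api.functions.co.KeyedCoProcessFunction;
      import org.apache.flink.util.Collector;

      // Hypothetical sketch: SensorEvent, Metadata and Enriched are assumed types.
      public class BufferingEnrichment
              extends KeyedCoProcessFunction<String, SensorEvent, Metadata, Enriched> {

          private transient ValueState<Metadata> metadata;
          private transient ListState<SensorEvent> buffer;

          @Override
          public void open(Configuration parameters) {
              metadata = getRuntimeContext().getState(
                  new ValueStateDescriptor<>("metadata", Metadata.class));
              buffer = getRuntimeContext().getListState(
                  new ListStateDescriptor<>("buffer", SensorEvent.class));
          }

          @Override
          public void processElement1(SensorEvent event, Context ctx,
                                      Collector<Enriched> out) throws Exception {
              Metadata meta = metadata.value();
              if (meta == null) {
                  buffer.add(event);  // cold start: no metadata for this key yet
              } else {
                  out.collect(new Enriched(event, meta));
              }
          }

          @Override
          public void processElement2(Metadata meta, Context ctx,
                                      Collector<Enriched> out) throws Exception {
              metadata.update(meta);
              // Flush everything that was buffered while waiting for metadata.
              for (SensorEvent event : buffer.get()) {
                  out.collect(new Enriched(event, meta));
              }
              buffer.clear();
          }
      }
      ```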

    • @mathieudespriee6646 • 3 years ago

      I managed to do it by preloading the metadata table in the init (open()) method. This duplicates the data on all task managers, though, and requires manually coding the data fetching (I used a custom KafkaConsumer).
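
      That preloading approach might look roughly like this `RichMapFunction` sketch. This is an assumed illustration, not the commenter's actual code; `loadMetadataSnapshot()` stands in for the custom KafkaConsumer logic they mention, and `SensorEvent`/`Metadata`/`Enriched` are placeholder types.

      ```java
      import java.util.Map;
      import org.apache.flink.api.common.functions.RichMapFunction;
      import org.apache.flink.configuration.Configuration;

      // Hypothetical sketch: load the full metadata table once, before any
      // events are processed. Each parallel subtask holds its own copy,
      // which is the duplication the comment points out.
      public class PreloadedEnrichment extends RichMapFunction<SensorEvent, Enriched> {

          private transient Map<String, Metadata> metadataByKey;

          @Override
          public void open(Configuration parameters) {
              // Blocking bulk load, e.g. reading a compacted Kafka topic to its end.
              metadataByKey = loadMetadataSnapshot();
          }

          @Override
          public Enriched map(SensorEvent event) {
              return new Enriched(event, metadataByKey.get(event.key));
          }

          private Map<String, Metadata> loadMetadataSnapshot() {
              // The custom KafkaConsumer fetching code would go here.
              throw new UnsupportedOperationException("replace with real loading code");
          }
      }
      ```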

  • @johnjosephlonergan • 5 years ago

    If I need to do multi-dimensional enrichment, what's the recommendation? If I have an order carrying customer and product info, I need to enrich both dimensions.
    See ruclips.net/video/cJS18iKLUIY/видео.html#t=38m56s
    ... do I first key orders by customer to do the customer enrichment, then re-key by product to do the product enrichment?
    What's been seen to work well?
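
    The sequential re-keying the question describes can be sketched as two chained enrichment steps. This is a hypothetical outline, not a recommendation from the webinar; all stream and class names (`Order`, `Customer`, `Product`, `CustomerEnrichment`, `ProductEnrichment`) are assumptions, and the `...` sources are left unspecified.

    ```java
    // Step 1: key orders and customer metadata by customer id and enrich.
    // Step 2: re-key the enriched result and product metadata by product id.
    DataStream<Order> orders = ...;
    DataStream<Customer> customers = ...;
    DataStream<Product> products = ...;

    DataStream<OrderWithCustomer> withCustomer = orders
        .connect(customers)
        .keyBy(o -> o.customerId, c -> c.customerId)
        .process(new CustomerEnrichment());   // e.g. a KeyedCoProcessFunction

    DataStream<FullyEnrichedOrder> fullyEnriched = withCustomer
        .connect(products)
        .keyBy(o -> o.productId, p -> p.productId)
        .process(new ProductEnrichment());
    ```

    Each step buffers independently for its own key, so the cold-start handling discussed above applies per dimension.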