From Query Plan to Performance: Supercharging your Apache Spark Queries using the Spark UI SQL Tab

  • Published: 26 Nov 2024

Comments • 8

  • @carrieliu6969 19 days ago

    This is very helpful, thank you!

  • @viswanathana3759 10 months ago

    Awesome presentation. Really useful

  • @Learn2Share786 1 year ago

    Is there a repository with real-world examples of badly written vs. well-written Spark SQL?

  • @anirvansen2941 3 years ago +1

    Awesome presentation :)

  • @Sathishkumar-rl7gj 2 years ago +1

    Thanks much !!! Very useful

  • @aviyehuda 3 years ago

    Why is HashMergeJoin not mentioned in the presentation?

  • @aviyehuda 3 years ago

    Why is a Spark query translated into multiple Spark jobs?

    • @ЭдуардСухарев-ш9ч 2 years ago

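      Every task is a piece of work executed by an executor on the cluster. A query is analyzed and planned, and each resulting job is split into stages at shuffle boundaries, according to the transformations in the query itself. Every stage is then split into multiple tasks, one per partition, which can be parallelized and pipelined for best efficiency. A single query can map to several jobs because some parts of the plan, such as building the broadcast side of a broadcast join, are submitted as jobs of their own.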
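
As a rough illustration of the job, stage, and task hierarchy discussed in this thread, here is a toy model in plain Python. It is only a sketch and does not use Spark: the operator names, the `WIDE_OPS` set, and the `split_into_stages` helper are all hypothetical, and the model assumes a simple linear plan in which every wide (shuffling) operator starts a new stage. The value 200 matches the default of `spark.sql.shuffle.partitions`; the 8 input partitions are an arbitrary assumption.

```python
from dataclasses import dataclass

# Hypothetical operator names; "wide" operators are the ones that need a shuffle.
WIDE_OPS = {"groupBy", "join", "repartition", "orderBy"}


@dataclass
class Stage:
    ops: list        # operators pipelined together inside this stage
    num_tasks: int   # one task per partition processed by the stage


def split_into_stages(ops, input_partitions, shuffle_partitions):
    """Cut a linear list of operators into stages at each wide operator."""
    stages = []
    current = []
    partitions = input_partitions
    for op in ops:
        if op in WIDE_OPS and current:
            # A shuffle closes the current stage; the wide operator opens the next.
            stages.append(Stage(current, partitions))
            current = []
            partitions = shuffle_partitions
        current.append(op)
    if current:
        stages.append(Stage(current, partitions))
    return stages


# One job in this toy model: a scan, a filter, an aggregation, and a sort.
job = split_into_stages(
    ["scan", "filter", "groupBy", "agg", "orderBy", "limit"],
    input_partitions=8,       # assumed number of input splits
    shuffle_partitions=200,   # default of spark.sql.shuffle.partitions
)

for i, stage in enumerate(job):
    print(f"stage {i}: ops={stage.ops} tasks={stage.num_tasks}")
```

In this model the narrow operators (`scan`, `filter`) are pipelined inside one stage, while each shuffle boundary starts a new stage whose task count follows the shuffle partition setting. Real Spark plans are trees rather than lists and the optimizer may reorder or fuse operators, so the Spark UI remains the authoritative view of what actually ran.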