From Query Plan to Performance: Supercharging your Apache Spark Queries using the Spark UI SQL Tab

  • Published: Dec 28, 2024

Comments • 8

  • @carrieliu6969 1 month ago +1

    This is very helpful, thank you!

  • @viswanathana3759 11 months ago +1

    Awesome presentation. Really useful.

  • @Learn2Share786 1 year ago

    Is there a repository with real-world examples of badly written versus well-written Spark SQL?

  • @anirvansen2941 3 years ago +1

    Awesome presentation :)

  • @Sathishkumar-rl7gj 2 years ago +1

    Thanks so much! Very useful.

  • @aviyehuda 3 years ago

    Why isn't HashMergeJoin mentioned in the presentation?

  • @aviyehuda 3 years ago

    Why is a Spark query translated into multiple Spark jobs?

    • @ЭдуардСухарев-ш9ч 2 years ago +1

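      A job is the unit of work Spark submits to the cluster whenever it has to materialize a result. A single query can trigger several jobs, for example when a broadcast join or a subquery has to be computed first, or when adaptive query execution submits parts of the plan separately. Each job is split into stages at shuffle boundaries, according to the transformations in the query itself. Each stage is then split into tasks, one per partition, which the executors run in parallel and pipeline for best efficiency.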
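
      To make the job / stage / task hierarchy concrete, here is a toy sketch in plain Python. It is not Spark code and does not reproduce Spark's scheduler: `WIDE_OPS`, `Stage`, and `split_into_stages` are hypothetical names, the plan is assumed to be a simple linear chain of operators, and the partition counts are made-up inputs. It only illustrates the idea that one job is cut into stages wherever a shuffle is needed, and that each stage runs one task per partition.

```python
from dataclasses import dataclass

# Toy model only: these names are hypothetical and are not Spark APIs.
# Operators assumed (for this sketch) to require a shuffle before they run.
WIDE_OPS = {"groupBy", "join", "orderBy", "repartition"}


@dataclass
class Stage:
    ops: list        # operators pipelined together inside this stage
    num_tasks: int   # one task per partition


def split_into_stages(ops, input_partitions, shuffle_partitions):
    """Cut a linear chain of operators into stages at each shuffle boundary."""
    stages = []
    current = []
    partitions = input_partitions
    for op in ops:
        if op in WIDE_OPS and current:
            # The shuffle closes the current stage; the wide operator opens the next one.
            stages.append(Stage(current, partitions))
            current = []
            partitions = shuffle_partitions
        current.append(op)
    if current:
        stages.append(Stage(current, partitions))
    return stages


# One job for one action ("collect") over a hypothetical plan.
plan = ["scan", "filter", "groupBy", "agg", "orderBy", "collect"]
stages = split_into_stages(plan, input_partitions=8, shuffle_partitions=200)

for i, stage in enumerate(stages):
    print(f"stage {i}: ops={stage.ops} tasks={stage.num_tasks}")
```

      In the real Spark UI, the SQL tab lists the jobs that belong to each query, and each job page shows its stages and their task counts, so the same hierarchy can be read there directly.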