Spark Performance Tuning | Performance Optimization | Interview Question

  • Published: 26 Jul 2024
  • #Performance #Optimization #Spark #Internal: In this video, we discuss in detail the different ways to handle performance tuning in Spark.
    Please join my channel as a member to get additional benefits such as materials on Big Data and Data Science, live streams for members, and much more.
    Click here to subscribe : / @techwithviresh
    About us:
    We are a technology consulting and training provider, specializing in technology areas such as Machine Learning, AI, Spark, Big Data, NoSQL, graph databases, Cassandra, and the Hadoop ecosystem.
    Mastering Spark : • Spark Scenario Based I...
    Mastering Hive : • Mastering Hive Tutoria...
    Spark Interview Questions : • Cache vs Persist | Spa...
    Mastering Hadoop : • Hadoop Tutorial | Map ...
    Visit us :
    Email: techwithviresh@gmail.com
    Facebook : / tech-greens
    Twitter :
    Thanks for watching
    Please Subscribe!!! Like, share and comment!!!!
  • Science

Comments • 10

  • @nikhilvalsaraj4860 · 3 years ago +1

    very useful info

  • @Bhawnasays · 4 years ago +6

    I personally liked your videos. Can you share your LinkedIn?

  • @pratikmokal7046 · 3 years ago

    Thanks

  • @deepakgupta-hk9ig · 2 years ago +1

    Hi, now we have Tungsten, which uses encoders for serialization. So should we still use Kryo for serialization, or will Tungsten take care of it?
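    For context, a minimal sketch of how Kryo is typically enabled (the Event class and app name below are illustrative, not from the video). Kryo applies to RDD-level serialization; Dataset/DataFrame rows go through Tungsten's encoders either way:

        import org.apache.spark.SparkConf
        import org.apache.spark.sql.SparkSession

        // Illustrative class whose instances are shuffled/cached as RDD records.
        case class Event(id: Long, payload: String)

        object KryoExample {
          def main(args: Array[String]): Unit = {
            val conf = new SparkConf()
              .setAppName("kryo-example")
              // Kryo speeds up serialization of RDD records during shuffles, caching
              // (serialized storage levels) and broadcasts; Dataset/DataFrame rows are
              // encoded by Tungsten regardless of this setting.
              .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
              // Registering classes lets Kryo write compact IDs instead of full class names.
              .registerKryoClasses(Array(classOf[Event]))

            val spark = SparkSession.builder().config(conf).getOrCreate()
            spark.stop()
          }
        }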

  • @shyamsundar.r7373 · 4 years ago

    I have one common doubt. Spark is a cluster computing framework, so a Spark job is split and sent across various nodes in the cluster, processed in parallel, and the output is returned. My doubt is: when the job is split and sent to the nodes, is only the data to be processed sent, or is the program code sent as well? Please clarify.

    • @TechWithViresh · 4 years ago

      So, distributed systems work on the architectural principle of sending code to the data, which is the backbone and the breakthrough concept for handling terabytes of data.

  • @vamshi878 · 4 years ago +1

    Hi, I have one doubt: do these performance tuning tips apply only when we use RDDs?

    • @TechWithViresh · 4 years ago +2

      Under the hood everything is an RDD, be it a Dataset or a DataFrame.

    • @onbootstrap · 4 years ago

      @@TechWithViresh I don't think DataFrames and Datasets are powered by RDDs under the hood. Can you please share a citation for that claim? Thanks.

    • @rahulpandit9082 · 3 years ago +4

      @@onbootstrap The RDD is the building block of Spark. No matter which abstraction we use, DataFrame or Dataset, the final computation is internally done on RDDs.
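      For reference, a minimal sketch showing that the same RDD layer is reachable from the DataFrame API (the sample data and names below are illustrative):

          import org.apache.spark.sql.SparkSession

          object DfToRddExample {
            def main(args: Array[String]): Unit = {
              val spark = SparkSession.builder()
                .appName("df-to-rdd")
                .master("local[*]")
                .getOrCreate()
              import spark.implicits._

              // A tiny illustrative DataFrame.
              val df = Seq((1, "a"), (2, "b")).toDF("id", "value")

              // Every DataFrame/Dataset exposes the RDD its physical plan executes on:
              val rowRdd = df.rdd                        // RDD[org.apache.spark.sql.Row]
              val internalRdd = df.queryExecution.toRdd  // RDD[InternalRow] run by the Tungsten engine

              println(rowRdd.getNumPartitions)
              spark.stop()
            }
          }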