Spark Performance Tuning | Performance Optimization | Interview Question
- Published: 26 Jul 2024
- #Performance #Optimization #Spark #Internal: In this video, we discuss in detail the different ways to handle performance tuning.
Please join my channel as a member to get additional benefits such as materials on Big Data and Data Science, live streams for members, and much more.
Click here to subscribe : / @techwithviresh
About us:
We are a technology consulting and training provider specializing in technology areas such as Machine Learning, AI, Spark, Big Data, NoSQL, graph DB, Cassandra and the Hadoop ecosystem.
Mastering Spark : • Spark Scenario Based I...
Mastering Hive : • Mastering Hive Tutoria...
Spark Interview Questions : • Cache vs Persist | Spa...
Mastering Hadoop : • Hadoop Tutorial | Map ...
Visit us :
Email: techwithviresh@gmail.com
Facebook : / tech-greens
Twitter :
Thanks for watching
Please Subscribe!!! Like, share and comment!!!!
Category: Science
very useful info
I personally liked your videos. Can you mention your LinkedIn?
Thanks
Hi, now we have Tungsten, which uses encoders for serialization. So should we still use Kryo for serialization, or will Tungsten take care of it?
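For context on the question above: Kryo is switched on through the `spark.serializer` property, which, as far as I understand, governs how Spark serializes JVM objects such as RDD records during shuffles and caching, while DataFrame and Dataset rows go through encoders instead. Below is a minimal sketch in plain Python that renders such settings as `spark-submit --conf` arguments. The helper `to_submit_flags` and the class `com.example.Event` are hypothetical, used only for illustration.

```python
def to_submit_flags(conf):
    """Render a dict of Spark settings as spark-submit --conf arguments."""
    flags = []
    for key, value in sorted(conf.items()):
        flags += ["--conf", f"{key}={value}"]
    return flags


kryo_conf = {
    # Replace the default Java serializer with Kryo.
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    # When "true", Kryo fails on classes that were not registered up front.
    "spark.kryo.registrationRequired": "false",
    # Hypothetical application class; registering classes keeps Kryo output compact.
    "spark.kryo.classesToRegister": "com.example.Event",
}

flags = to_submit_flags(kryo_conf)
print(" ".join(flags))
```

The resulting flags would be placed between `spark-submit` and the application jar or script.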
I have one common doubt. Spark is a cluster computing framework, so a Spark job is split up, sent across the various nodes in the cluster, processed in parallel, and the output is returned to us. My doubt is: when the job is split and sent to the nodes, are both the data to be processed and the program code sent? Please clarify.
So, distributed systems work on the architectural theme of sending code to the data, which is the backbone and the breakthrough concept for handling terabytes of data.
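The "send code to the data" idea can be sketched in plain Python. This is a toy simulation, not how Spark itself serializes tasks: the helpers `ship` and `run_on_worker`, the function `count_errors`, and the sample partitions are all hypothetical. The point is that only a small serialized function travels to each partition, and only small partial results travel back.

```python
import builtins
import marshal
import types


def ship(func):
    """Serialize a function's bytecode, standing in for the task the driver sends out."""
    return marshal.dumps(func.__code__)


def run_on_worker(payload, partition):
    """Rebuild the function on the 'worker' and run it on the partition stored there."""
    code = marshal.loads(payload)
    func = types.FunctionType(code, {"__builtins__": builtins})
    return func(partition)


def count_errors(lines):
    # Work done next to the data: count the lines that mention ERROR.
    return sum(1 for line in lines if "ERROR" in line)


# Each inner list stands in for a data partition that already lives on some node.
partitions = [
    ["INFO start", "ERROR disk full"],
    ["ERROR timeout", "ERROR retry failed", "INFO done"],
    ["INFO idle"],
]

payload = ship(count_errors)  # a few hundred bytes of code
partial_counts = [run_on_worker(payload, p) for p in partitions]
total = sum(partial_counts)  # the "driver" combines the small partial results
print(partial_counts, total)  # [1, 2, 0] 3
```

The partitions themselves never move; the same payload is reused for every partition, which is what makes the approach cheap when the data is large and the code is small.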
Hi, I have one doubt: do these performance tuning tips apply only when we use RDDs?
Under the hood everything is an RDD, be it a Dataset or a DataFrame.
@TechWithViresh I don't think DataFrames and Datasets are powered by RDDs under the hood. Can you please share a citation for the above claim? Thanks.
@onbootstrap The RDD is the building block of Spark. No matter which abstraction we use, DataFrame or Dataset, the final computation is internally done on RDDs.