Ma'am, I have a question for you. When you say an action has stages and tasks, what is really happening behind a transformation? Is it just computing and storing it as a DataFrame?
Great content and great delivery. Question: if RDDs are immutable and the next RDD is created on the basis of the previous one, what happens to the previous RDDs? How many such RDDs are kept until they are freed? I know I should only care about the latest one, but still.
The previous RDDs are by default discarded after the new RDD is successfully generated, unless we use the persist() method, in which case the RDD we want is kept in the cache.
Yes, that's true. If it fails at any step, Spark can go back to the previous step and recompute it from the lineage; after a successful run the previous RDDs are discarded.
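The lineage behaviour discussed above can be illustrated with a toy sketch. This is plain Python, not real Spark, and every name in it (ToyRDD, collect, etc.) is hypothetical: it only shows the idea that a transformation records its parent and its function rather than materializing data, so intermediate results need not be kept unless explicitly persisted, and can be recomputed from the lineage on failure.

```python
# Toy sketch of RDD lineage (NOT actual Spark): each "RDD" records only
# its parent and the function to apply; data is materialized on demand.

class ToyRDD:
    def __init__(self, parent=None, fn=None, data=None):
        self.parent = parent   # lineage: pointer to the previous RDD
        self.fn = fn           # transformation to apply to the parent
        self._data = data      # only the source RDD holds real data
        self._cached = None    # filled in by persist()

    def map(self, fn):
        # A transformation is lazy: it just extends the lineage graph.
        return ToyRDD(parent=self, fn=fn)

    def persist(self):
        # Materialize once and keep the result around.
        self._cached = self.collect()
        return self

    def collect(self):
        # An action: walk the lineage back and recompute as needed.
        if self._cached is not None:
            return self._cached
        if self.parent is None:
            return list(self._data)
        return [self.fn(x) for x in self.parent.collect()]

source = ToyRDD(data=[1, 2, 3])
doubled = source.map(lambda x: x * 2).persist()
tripled = doubled.map(lambda x: x + 1)
print(tripled.collect())  # recomputes only from the cached `doubled`
```

In real Spark the equivalent calls would be `rdd.map(...)`, `rdd.persist()` (or `rdd.cache()`), and `rdd.collect()`; unpersisted intermediate RDDs are simply recomputed from lineage when needed.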
Great work! Your explanation is clear and excellent. I feel your content is a hidden gem.
Very nice explanation, ma'am. Thank you.
map() - Narrow
mapPartitions() - Narrow
groupByKey() - Wide
reduceByKey() - Wide
join() - Narrow
distinct() - Wide
intersection() - Wide
flatMap() - Narrow
filter() - Narrow
union() - Narrow
Please correct me if I am wrong.
join() is a classic wide transformation in Spark; how come you mentioned it under Narrow?
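The narrow-vs-wide distinction the list above is getting at can be sketched in plain Python (not real Spark; the function names here are hypothetical): a narrow transformation works on each partition independently, while a wide one such as groupByKey() or join() must redistribute records across partitions by key, which is the shuffle.

```python
# Toy sketch (NOT actual Spark) of narrow vs wide transformations over
# a list of partitions, where each partition is a list of records.

def narrow_map(partitions, fn):
    # Narrow: each output partition depends on exactly ONE input
    # partition, so no data moves between partitions.
    return [[fn(x) for x in part] for part in partitions]

def wide_group_by_key(partitions, num_out=2):
    # Wide: every output partition may need records from EVERY input
    # partition -- this all-to-all movement is the shuffle.
    out = [dict() for _ in range(num_out)]
    for part in partitions:
        for key, value in part:
            bucket = out[hash(key) % num_out]
            bucket.setdefault(key, []).append(value)
    return out

parts = [[("a", 1), ("b", 2)], [("a", 3), ("c", 4)]]
print(narrow_map(parts, lambda kv: (kv[0], kv[1] * 10)))
grouped = wide_group_by_key(parts)
print(grouped)  # the "a" records from both partitions end up together
```

This is also why join() is generally wide: matching keys from both sides must be brought to the same partition (Spark can avoid the shuffle only in special cases, e.g. when both RDDs are already co-partitioned).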
Is this playlist focused mainly on Scala?
Hi Bhawna, very nice explanation. Could you please share the notebook used during this exercise?
Nice content, @Bhawna, but there is a bit of confusion in the explanation.
These are very easy to understand and great sessions. Could you please provide the notebook for future reference?
I like your content; it's very informative. Thank you.
Could you please share those PPTs if possible?
👍👌
👍
Ma'am, do you provide the PPTs for reference?