Apache Spark Core Concepts 01

  • Published: May 10, 2022
  • This video talks about (Architecture/DAG/Stages/Tasks/Cores/Executors/Driver/Memory Usage)
    If you need any guidance, you can book time here: topmate.io/bhawna_bedi5674
    Join to get channel membership for exclusive content - / @cloudfitness
    Follow me on LinkedIn
    / bhawna-bedi-540398102
    Instagram
    bedi_foreve...
    You can support my channel at: bhawnabedi15@okicici
    Databricks hands-on tutorials
    • Databricks hands on tu...
    Learn Databricks in 30 days
    • Learn Databricks in 30...
    Data Build tool [DBT]
    • DBT Data build tool
    Machine Learning
    • Machine Learning
    Duck DB
    • Duck DB
    Snowflake Data warehouse
    • Snowflake Datawarehouse
    Azure Event Hubs
    • Azure Event Hubs
    Azure Data Factory Interview Question
    • Azure Data Factory Int...
    SQL LeetCode Questions
    • SQL Interview Question...
    Azure Synapse tutorials
    • Azure Synapse Analytic...
    Azure Event Grid
    • Event Grid
    Azure Data factory CI-CD
    • CI-CD in Azure Data Fa...
    Azure Basics
    • Azure Basics
    Databricks interview questions
    • DataBricks Interview Q...
    Microsoft Fabric
    • Microsoft Fabric Tutor...
    Databricks Lakehouse Monitoring
    • Playlist
    Unity Catalog
    • Unity Catalog
    Spark Functions
    • Spark Functions
    Python For ALL
    • Python for ALL
    Snowflake Interview Questions
    • Snowflake Interview Qu...
  • Science

Comments • 30

  • @roshnigaddam7526
    @roshnigaddam7526 2 years ago

    Stumbled upon this channel while preparing for an interview. I am sure I am going to be very confident after watching this play series. Amazing content. Detailed explanation. Thank you!

  • @sid5201
    @sid5201 2 years ago

    Have gone thru similar videos explaining the apache spark architecture, but this has to be the best one. Very comprehensive and clear.

  • @ferrerolounge1910
    @ferrerolounge1910 1 year ago

    Never seen anyone explain things this easily! wonderful keep it coming! 👍

  • @mannykhan7752
    @mannykhan7752 2 months ago

    Your videos are so well-detailed and explained with great clarity. Databricks is a tricky skill to master but your videos make it very easy. Great job.

  • @shanmukhpriya
    @shanmukhpriya 2 years ago

    Seriously, it's very comprehensive, crisp and clear

  • @ankushpatil1114
    @ankushpatil1114 10 days ago

    Thank you so much

  • @karthickj8045
    @karthickj8045 2 years ago

    Amazing series of videos. Thank you

  • @raghavendrareddy4765
    @raghavendrareddy4765 2 years ago

    Awesome Series
    Nice explanation

  • @pokeshoot
    @pokeshoot 9 months ago

    Really nice content, Bhuvana. I appreciate all the hard work behind it.

  • @tatianecorrea4277
    @tatianecorrea4277 2 years ago

    Great explanation....You have a lot of didactics!

  • @sriharig9096
    @sriharig9096 2 years ago

    Excellent explanation.....Thank you

  • @sravankumar1767
    @sravankumar1767 2 years ago

    Nice explanation 👌 👍

  • @elevated_minds09
    @elevated_minds09 1 year ago +1

    Hi Bhawna, I just wanted to say thank you for creating such an amazing playlist. Your explanations are so clear and easy to understand, and I really appreciate the effort you put into breaking down these complex topics. I'm working on my new project involving Databricks for machine learning, and your videos have been a lifesaver. I have a question: How is the number of tasks assigned to each core determined at each stage? Is there a default value for this?
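
On the question above, a minimal sketch, assuming a simplified model: Spark does not assign a fixed number of tasks to each core. A stage gets one task per partition, and each task occupies one core while it runs (`spark.task.cpus` defaults to 1). The two defaults used below, `spark.sql.shuffle.partitions = 200` and `spark.sql.files.maxPartitionBytes = 128 MiB`, are Spark configuration defaults; the function names are hypothetical, and the file-scan estimate ignores file boundaries, open cost, and adaptive query execution.

```python
import math

# Spark configuration defaults (both can be overridden per session).
DEFAULT_SHUFFLE_PARTITIONS = 200                  # spark.sql.shuffle.partitions
DEFAULT_MAX_PARTITION_BYTES = 128 * 1024 * 1024   # spark.sql.files.maxPartitionBytes


def read_stage_tasks(total_input_bytes, max_partition_bytes=DEFAULT_MAX_PARTITION_BYTES):
    """Rough task count for a file-scan stage: one task per input split.

    Simplification: treats the input as one splittable stream of bytes.
    """
    return max(1, math.ceil(total_input_bytes / max_partition_bytes))


def shuffle_stage_tasks(shuffle_partitions=DEFAULT_SHUFFLE_PARTITIONS):
    """Task count for a stage that follows a shuffle: one task per shuffle partition.

    Simplification: adaptive query execution may coalesce partitions at runtime.
    """
    return shuffle_partitions


one_gib = 1024 ** 3
print(read_stage_tasks(one_gib))   # 1 GiB / 128 MiB = 8 tasks
print(shuffle_stage_tasks())       # 200 tasks
```

Under this model, the number of cores only decides how many of those tasks run at the same time, not how many tasks exist.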

  • @yolagatiudaykumarreddy6164
    @yolagatiudaykumarreddy6164 1 year ago

    Hey Bhawna, your videos really help me a lot. Keep going, and please create one more playlist for real-time concepts in ADB

  • @lifechamp007
    @lifechamp007 2 years ago

    Super helpful - Thank you so much !! #StayBlessednHappy

  • @lovishaghi4674
    @lovishaghi4674 2 years ago

    Amazing video. Kindly create videos on unit testing in Databricks using Python as well.

  • @nagamanickam6604
    @nagamanickam6604 1 year ago

    Thank you

  • @sravankumar1767
    @sravankumar1767 2 years ago +1

    In our current project we use Delta Lake with these layers: Raw, Trusted, Refined, Provisioned, and Provisioned to Extract. From Raw to Trusted we run data quality checks, and the good data goes to Refined; we then apply transformations in the Refined and Provisioned layers. From Provisioned to Extract we have simple SELECT statements. We have a Day 0 full load and a Day 1 incremental load, but I didn't get a chance to work on Day 1. We have created metadata script tables, and according to those we give the job names, elt-cfg, lkup db, and metadata lkup. We have created 3 scripts for each job. In our case we are not writing any MERGE statement for incremental loading. How can we tell the difference between a full load and an incremental load in Delta Lake? The metadata scripts are also the same for both. Are there any extra columns available for the Day 1 incremental load? Can you please clarify my doubts?

  • @caiyu538
    @caiyu538 10 months ago

    It would be great if you showed code for how tasks are split by the cluster manager; that would make it easier for us to understand. PySpark is more powerful than pandas, and one reason is that it can run tasks in parallel.

  • @krishnakumar-vi3tf
    @krishnakumar-vi3tf 1 year ago

    You are excellent. I have one doubt: I am partitioning on date, so the number of partitions grows huge as the days go by. How are they processed? Is each partition processed by one core? What mechanism does it follow when I have hundreds of partitions? I would greatly appreciate your reply!

  • @sivagssri
    @sivagssri 1 year ago

    Good video. Can you please do a video about how to handle SCD2 using Databricks? I got this question in an interview.

  • @user-vq5ju2uk7j
    @user-vq5ju2uk7j 2 years ago

    Amazing video, good detailed explanation. Could you please do a video with an in-depth explanation of RDDs?
    I also have one doubt about RDDs. If we create many RDDs and they are stored in memory, will they occupy most of the memory, and will we then get an out-of-memory exception? How many RDDs can be stored in memory? Please explain. Thanks
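
On the memory question above, a rough sketch under stated assumptions. An RDD is kept in memory only when it is persisted or cached, and with the `MEMORY_ONLY` storage level, partitions that do not fit are not stored and are recomputed when needed, so there is no fixed number of RDDs that fit. The arithmetic below follows Spark's unified memory model with its documented defaults (300 MiB reserved, `spark.memory.fraction = 0.6`, `spark.memory.storageFraction = 0.5`); the function name and the 4 GiB heap are illustrative assumptions.

```python
MIB = 1024 * 1024
RESERVED_BYTES = 300 * MIB   # fixed amount Spark sets aside on each executor heap


def unified_memory(executor_heap_bytes, memory_fraction=0.6, storage_fraction=0.5):
    """Estimate the unified (execution + storage) memory regions of one executor.

    memory_fraction  -> spark.memory.fraction
    storage_fraction -> spark.memory.storageFraction
    The storage/execution split is a soft boundary, not a hard cap.
    """
    if executor_heap_bytes <= RESERVED_BYTES:
        raise ValueError("heap must be larger than the reserved memory")
    usable = executor_heap_bytes - RESERVED_BYTES
    unified = usable * memory_fraction
    storage = unified * storage_fraction
    return {"unified": unified, "storage": storage, "execution": unified - storage}


regions = unified_memory(4 * 1024 * MIB)   # hypothetical 4 GiB executor heap
for name, size in regions.items():
    print(f"{name}: {size / MIB:.1f} MiB")
```

Cached blocks compete for the storage region and are evicted on a least-recently-used basis when space runs out, so caching many RDDs mainly costs recomputation time. Out-of-memory errors are still possible for other reasons, such as very large individual partitions.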

  • @neelbanerjee7875
    @neelbanerjee7875 1 year ago +1

    One query:
    No. of tasks = no. of cores in an executor?
    Or
    No. of tasks = no. of partitions defined?
    Can you please explain the relationship among tasks, partitions, and cores with an example?

    • @krishnakumar-vi3tf
      @krishnakumar-vi3tf 1 year ago

      Yes, I have the same doubt. Does each partition go to one core? I am partitioning the data based on date, so the number of partitions increases as the days go by. How is this distributed across the cores if I have hundreds or thousands of partitions? 🤔🤔
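
For the task, partition, and core question in this thread, a small sketch, assuming an idealized model: the number of tasks in a stage equals the number of partitions, while the total core count only limits how many tasks run concurrently. With more partitions than cores, the remaining tasks wait and run as cores free up. The function names and cluster sizes are hypothetical, and the "waves" assume every task takes the same time, whereas the real scheduler hands out a new task whenever a core becomes free.

```python
import math


def task_slots(num_executors, cores_per_executor, task_cpus=1):
    """Number of tasks that can run at the same time across the cluster."""
    slots = num_executors * (cores_per_executor // task_cpus)
    if slots <= 0:
        raise ValueError("cluster has no task slots")
    return slots


def plan_waves(num_partitions, slots):
    """Group partition ids into idealized waves of at most `slots` tasks each."""
    return [
        list(range(start, min(start + slots, num_partitions)))
        for start in range(0, num_partitions, slots)
    ]


# Hypothetical example: one partition per day for a year, 4 executors x 4 cores.
partitions = 365
slots = task_slots(num_executors=4, cores_per_executor=4)
waves = plan_waves(partitions, slots)

print(slots)                          # 16 tasks run concurrently
print(math.ceil(partitions / slots))  # 23 waves in total
print(len(waves[-1]))                 # 13 tasks in the last wave
```

So hundreds or thousands of date partitions are not a problem in themselves: they become the same number of tasks, processed a cluster's worth at a time.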

  • @jdisunil
    @jdisunil 2 years ago

    Thank you so much from the data engineering community for the great videos you are putting out. One question: are we saying that until a display(df) is done, the previous commands are not actually executed? I am a newbie, so please correct me.
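
On the question above: broadly yes. Spark transformations are lazy, and work starts only when an action (for example `collect()`, `count()`, or showing a DataFrame in a notebook) asks for a result. The toy class below is not Spark's API; it is a hypothetical pure-Python model of the same idea, where `map` and `filter` only record a plan and `collect` runs it.

```python
class LazyDataset:
    """Toy model of lazy evaluation: transformations build a plan, actions run it."""

    def __init__(self, source, plan=()):
        self._source = source
        self._plan = plan

    def map(self, fn):
        # Transformation: record the step, do no work yet.
        return LazyDataset(self._source, self._plan + (("map", fn),))

    def filter(self, predicate):
        # Transformation: record the step, do no work yet.
        return LazyDataset(self._source, self._plan + (("filter", predicate),))

    def collect(self):
        # Action: only now is the recorded plan applied to the data.
        rows = iter(self._source)
        for kind, fn in self._plan:
            rows = map(fn, rows) if kind == "map" else filter(fn, rows)
        return list(rows)


calls = []


def double(x):
    calls.append(x)   # record each time the function really runs
    return x * 2


ds = LazyDataset([1, 2, 3]).map(double).filter(lambda x: x > 2)
print(calls)          # [] because nothing has executed yet
print(ds.collect())   # [4, 6]
print(calls)          # [1, 2, 3] because the action triggered the work
```

Deferring the work is what lets Spark see the whole chain of transformations before running it and plan the stages and tasks accordingly.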

  • @harishkonakandla
    @harishkonakandla 1 year ago

    Can you give examples of when tasks come into the picture?

  • @jaydeeppatidar4189
    @jaydeeppatidar4189 1 year ago

    This video is not for beginners. It will be helpful only if you already know about jobs, stages, and tasks.

  • @shreyaspurankar9736
    @shreyaspurankar9736 9 months ago

    Can you share the PPT with me?

  • @jaydeeppatidar4189
    @jaydeeppatidar4189 1 year ago

    If you explain it like this in words only, no one will understand. Take EC2 instances and explain in detail how the storage is divided. This video could be split into 3 parts of 20 min each, but you have covered everything in only 24 min. This concept is very important and big, so it can't be explained in just 24 min.