Spark Architecture | Lec-5

  • Published: 4 Apr 2023
  • In this video I have talked about Spark architecture in great detail. Please watch the video fully and ask your doubts in the comment section below.
    Directly connect with me on:- topmate.io/manish_kumar25
    For more queries reach out to me on my below social media handle.
    Follow me on LinkedIn:- / manish-kumar-373b86176
    Follow Me On Instagram:- / competitive_gyan1
    Follow me on Facebook:- / manish12340
    My Second Channel -- / @competitivegyan1
    Interview series Playlist:- • Interview Questions an...
    My Gear:-
    Rode Mic:-- amzn.to/3RekC7a
    Boya M1 Mic-- amzn.to/3uW0nnn
    Wireless Mic:-- amzn.to/3TqLRhE
    Tripod1 -- amzn.to/4avjyF4
    Tripod2:-- amzn.to/46Y3QPu
    camera1:-- amzn.to/3GIQlsE
    camera2:-- amzn.to/46X190P
    Pentab (Medium size):-- amzn.to/3RgMszQ (Recommended)
    Pentab (Small size):-- amzn.to/3RpmIS0
    Mobile:-- amzn.to/47Y8oa4 (You really should not buy this one)
    Laptop -- amzn.to/3Ns5Okj
    Mouse+keyboard combo -- amzn.to/3Ro6GYl
    21 inch Monitor-- amzn.to/3TvCE7E
    27 inch Monitor-- amzn.to/47QzXlA
    iPad Pencil:-- amzn.to/4aiJxiG
    iPad 9th Generation:-- amzn.to/470I11X
    Boom Arm/Swing Arm:-- amzn.to/48eH2we
    My PC Components:-
    intel i7 Processor:-- amzn.to/47Svdfe
    G.Skill RAM:-- amzn.to/47VFffI
    Samsung SSD:-- amzn.to/3uVSE8W
    WD blue HDD:-- amzn.to/47Y91QY
    RTX 3060Ti Graphic card:- amzn.to/3tdLDjn
    Gigabyte Motherboard:-- amzn.to/3RFUTGl
    O11 Dynamic Cabinet:-- amzn.to/4avkgSK
    Liquid cooler:-- amzn.to/472S8mS
    Antec Prizm FAN:-- amzn.to/48ey4Pj

Comments • 113

  • @Shradha_tech
    @Shradha_tech 5 months ago +9

    There is a saying, "If you can't explain it simply, you don't understand it well enough," and it fits here so accurately. You have understood it so well that you made it even easier for others. Thank you for all the hard work.

  • @sabarnaghosh1658
    @sabarnaghosh1658 1 year ago +26

    Please be consistent, don't leave midway. I have 5 years of SQL development experience and will switch to the big data/Spark domain within 3 months. Please don't stop midway; you are making wonderful videos.

  • @boseashish
    @boseashish 2 months ago +3

    Sir, you have poured your heart into making these videos... they are very genuine videos... I pray that God grants you great success.

  • @sachinbhoi5727
    @sachinbhoi5727 9 months ago +3

    A very detailed, layman-friendly explanation that no one else gives. Keep it up.

  • @lucky_raiser
    @lucky_raiser 1 year ago +3

    Bro, you can be the CodeWithHarry of the data engineering world. Keep this going, and thanks for sharing this knowledge.

  • @satyamrai2577
    @satyamrai2577 11 months ago +2

    Beautifully explained. Concepts are so much easier to understand with the help of diagrams.

  • @sahillohiya7658
    @sahillohiya7658 9 months ago +2

    You are one of THE BEST TEACHERS I have ever known.

  • @kumarankit2302
    @kumarankit2302 2 months ago +3

    What fantastic teaching, Manish bhai. In the future, if anyone comes to me for guidance on where to study from, I will refer them to your channel without any doubt.

  • @laboni8359
    @laboni8359 10 months ago +1

    Literally mind-blown by your teaching! Awesome content.

  • @gchanakya2979
    @gchanakya2979 1 year ago +1

    You are building my confidence in the subject. Thank you bhaiya.

  • @arju1010
    @arju1010 9 months ago

    I have watched many tutorials on Spark, but you are the best. The way you teach is amazing. Sir, please don't stop uploading tutorials like this. You are great, sir. Thank you. From Bangladesh.

  • @nayanikamula7109
    @nayanikamula7109 1 month ago

    You probably won’t see this. But I watched your videos 2 days before my DE interview and I cracked it with confidence. Like you said, the fundamentals make all the difference. My understanding was so clear that they offered me the position on the spot

  • @AprajitaPandey-of2kf
    @AprajitaPandey-of2kf 1 month ago

    explained wonderfully.

  • @phulers
    @phulers 6 months ago +2

    I think there is slight confusion between the AM (Application Master) and the driver program: 8:28
    The AM launches the driver program within a container on a worker node.
    The driver program communicates with the AM for resource allocation and task scheduling.
    The AM acts as a bridge between the driver program and the cluster manager(YARN).

    • @lalghoda5293
      @lalghoda5293 19 days ago

      Can you explain the last point,
      "The AM acts as a bridge between the driver program and the cluster manager (YARN)"?
      As I understand it, the AM is created inside a container, which is on a worker node, and it then negotiates with the driver or master node for resources.
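
    A quick way to see the AM-versus-driver split discussed in this thread is the deploy mode chosen at submit time. Below is a minimal, illustrative sketch (the file name and the spark-submit commands in the comments are assumptions, not from the video): in YARN cluster mode the driver runs inside the Application Master's container on a worker node, while in client mode the driver stays on the submitting machine and the AM only negotiates executor containers with YARN.

        # Illustrative only -- how deploy mode decides where this driver code runs.
        #
        #   spark-submit --master yarn --deploy-mode cluster my_job.py
        #     -> the driver (this script's main flow) runs inside the
        #        ApplicationMaster container on a worker node
        #
        #   spark-submit --master yarn --deploy-mode client my_job.py
        #     -> the driver runs on the machine that called spark-submit;
        #        the AM on the cluster only requests executor containers from YARN

        from pyspark.sql import SparkSession

        spark = SparkSession.builder.appName("deploy-mode-demo").getOrCreate()

        # Everything outside the DataFrame operations executes in the driver process;
        # only the distributed work below is scheduled onto executors.
        total = spark.range(1_000_000).selectExpr("sum(id) as s").collect()[0]["s"]
        print(total)

        spark.stop()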

  • @deeksha6514
    @deeksha6514 4 months ago

    Superb explanation!

  • @Suraj_a_3405
    @Suraj_a_3405 1 year ago

    Thank you, this is perfect.

  • @nayanikamula7109
    @nayanikamula7109 1 month ago

    You are a wonderful teacher. You have a gift. Please start a DE bootcamp. You’ll see great success with it I’m sure

  • @harshitgupta355
    @harshitgupta355 7 months ago

    Thank you Manish bhai for this wonderful video.

  • @nitiksharathore5290
    @nitiksharathore5290 2 months ago

    Thank you so much for this explanation; please continue the good work.

  • @ujjalroy1442
    @ujjalroy1442 1 year ago

    Rightly said.... Very detailed 👏👏👍👍

  • @HarshKumar-adi
    @HarshKumar-adi 9 days ago

    Very Good.....

  • @shubhamwaingade4144
    @shubhamwaingade4144 5 months ago

    The video summary at the end is very useful for recalling everything from the video! Good thought, Manish...

  • @kavyabhatnagar716
    @kavyabhatnagar716 10 months ago

    Crystal clear. Thanks a lot. 👏

  • @rpraveenkumar007
    @rpraveenkumar007 1 year ago +3

    Thank you, Manish. It was an absolutely crystal clear explanation. Hoping to get more in-depth videos like this.

  • @siddhantmishra6581
    @siddhantmishra6581 1 month ago

    Brilliantly explained. Loads of thanks.

  • @atulbisht9019
    @atulbisht9019 7 months ago

    Thanks for the video, Manish.

  • @rishav144
    @rishav144 1 year ago +1

    very nice series

  • @anirbanadhikary7997
    @anirbanadhikary7997 1 year ago

    Wonderful explanation

  • @dhairyaarya2500
    @dhairyaarya2500 2 months ago

    the flow of explanation and engagement were on point 💯

  • @satyammeena-bu7kp
    @satyammeena-bu7kp 1 month ago

    So Helpful ! Really a Great Explanation !

  • @mrinalraj7166
    @mrinalraj7166 5 months ago

    Fantastic, Manish bhai. Really enjoyed it.

  • @Rakesh-if2tx
    @Rakesh-if2tx 1 year ago +1

    Please continue making videos like this with complete information... I appreciate your hard work. However long it takes, it's fine... the concepts should be clear... 😅

  • @amitgupta8179
    @amitgupta8179 5 months ago +2

    Bhai, you have covered the concept in depth,
    but I am still quite confused about containers...
    Even after rewatching, it is not clear to me.

  • @mahnoorkhalid6496
    @mahnoorkhalid6496 10 months ago

    Great

  • @himanshutrripathii2837
    @himanshutrripathii2837 1 year ago

    Salute to your hard work, but I hope in the next video you will come up with the practical part too.

  • @panyamaravind852
    @panyamaravind852 5 days ago

    ❤❤❤❤❤❤❤ excellent explanation bro

  • @Rakesh-if2tx
    @Rakesh-if2tx 1 year ago

    Thank you Manish bhai.... You're really doing great work 🙏🏻🙏🏻.... Please upload the videos in this series a bit faster... 😊

  • @shubham2881
    @shubham2881 8 months ago

    Stunning explanation bro 👍

  • @user-hr2tz6ny1v
    @user-hr2tz6ny1v 10 months ago

    explained very well

  • @Analystmate
    @Analystmate 10 months ago

    Really enjoyed it, bhai. You reminded me of Khan Sir 🙂
    Thanks

  • @dishanttoraskar2885
    @dishanttoraskar2885 11 months ago

    Very well explained 🤩

  • @SrihariSrinivasDhanakshirur
    @SrihariSrinivasDhanakshirur 4 months ago

    God level explanation!

  • @engineerbaaniya4846
    @engineerbaaniya4846 10 months ago +2

    Hi Manish, I watched this completely and understood it. But most of the time in interviews people ask about the SparkContext and the other view of the architecture, which you did not cover. Any thoughts on this?

  • @tanushreenagar3116
    @tanushreenagar3116 5 months ago

    PERFECT BEST ONE EVER

  • @shivakrishna1743
    @shivakrishna1743 1 year ago

    Thanks!

  • @shreyakeshari951
    @shreyakeshari951 12 days ago

    Hi, thank you for such informative videos.
    I am not able to find Lecture-6 in the Spark Fundamentals series; please guide me on where I can watch Lecture-6.

  • @user-dp1rw2qm1c
    @user-dp1rw2qm1c 10 months ago

    Hi Manish, great explanation. I have one doubt:
    is it possible to add more than one executor to a worker node?
    Asking because you demonstrated each executor going to a separate worker node.

  • @raghudeep8873
    @raghudeep8873 1 year ago

    👍👍👍👍

  • @analyticstamizan677
    @analyticstamizan677 6 months ago

    Great explanation, bro 👌👍.. It would be nice if you added subtitles.

  • @RakeshGupta-kx5qe
    @RakeshGupta-kx5qe 11 months ago

    Hi Manish, thank you very much for sharing great knowledge. I currently have 10.5 years of experience in IT, including SQL/PL-SQL (7 years), SQL Server T-SQL (1.5 years) and Snowflake query optimization (6 months). Two years ago I joined an MNC as a Data Engineer (Spark with Scala), but I was put on a T-SQL project. I only took the trainings, studied interview questions and cleared the interview. Now that I am on the bench, what decision should I take? Please suggest.

  • @RAHULKUMAR-px8em
    @RAHULKUMAR-px8em 1 year ago

    Bhaiya, please cover the full syllabus; I am following your Spark series.

  • @rajvirkumar4787
    @rajvirkumar4787 1 month ago

    good

  • @susreesuvramohanty261
    @susreesuvramohanty261 3 months ago

    When will the application driver stop working? Could you please explain that again?

  • @sagarmendhe8194
    @sagarmendhe8194 1 month ago

    Hi Manish sir, if an interviewer asks about cluster size, how should we answer?

  • @aniketnaikwadi6074
    @aniketnaikwadi6074 7 months ago

    Hello Manish Kumar,
    hope you're doing well. A very well explained concept and a very good Spark series. Can you provide a PDF or link to the notes?

  • @amitpatel9670
    @amitpatel9670 1 year ago +1

    Awesome video.. also, please share a playlist or course for SQL. Would really appreciate it.

    • @manish_kumar_1
      @manish_kumar_1  1 year ago

      You can follow the kudvenkat YouTube channel for SQL.

  • @SurajKumar-hb7oc
    @SurajKumar-hb7oc 7 months ago

    Hi @manish, I have two questions:
    1) What is the difference between the cluster manager and the resource manager?
    2) How does a developer specify requirements like RAM and cores?
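
    On question 2: the developer states the RAM/core requirement as plain configuration when submitting the job. Below is a minimal sketch, assuming YARN and purely illustrative numbers; the equivalent spark-submit flags are shown in the comment.

        from pyspark.sql import SparkSession

        # Illustrative values only; equivalent to:
        #   --num-executors 5 --executor-memory 25g --executor-cores 5 --driver-memory 4g
        spark = (
            SparkSession.builder
            .appName("resource-request-demo")
            .config("spark.executor.instances", "5")   # how many executors to ask YARN for
            .config("spark.executor.memory", "25g")    # RAM per executor container
            .config("spark.executor.cores", "5")       # CPU cores per executor
            .config("spark.driver.memory", "4g")       # RAM for the driver (usually set at submit time)
            .getOrCreate()
        )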

  • @krishnavamsirangu1727
    @krishnavamsirangu1727 3 months ago

    Hi Manish,
    I am learning Spark from your videos, but this video confused me a bit: you say the driver is present on a worker node, while the usual architecture diagram shows the driver on the master.
    Could you please clarify or elaborate on this?

  • @anupandey7888
    @anupandey7888 2 months ago

    Hi Manish, I have a question: why can't a UDF in PySpark be converted to Java code in the Application Master?

  • @shreyanshrathod2007
    @shreyanshrathod2007 2 months ago

    Thanks for the explanation, Manish. One quick question: here you created 5 executors on 5 different worker nodes. Is it possible to have more than one executor on the same worker node / machine?
    Thanks in advance

  • @rupeshreddy4408
    @rupeshreddy4408 3 months ago

    Great explanation! But I have a doubt about the driver. Will there be an extra worker node for the driver, or can it sit on any of the executors that process the data? What I mean is, for instance, if we want to process 10 GB and after calculation we want 16 executors, will it be 17 in total along with the driver, or am I missing something here?

  • @kudlamolka1429
    @kudlamolka1429 24 days ago

    Spark code can be written in Scala itself, right? Will we still need the application driver even if the code is written in Scala?

  • @quiet8691
    @quiet8691 4 months ago +1

    Bro, please share this lecture's notes in PDF format.

  • @audiobook-pustakanbarobara2603
    @audiobook-pustakanbarobara2603 4 months ago

    I have one doubt, please can someone resolve it:
    the PySpark driver is created only in the Application Master if we don't use any UDF (user-defined function). But we write our code in PySpark and it is processed in a distributed way on the worker nodes. So whether I use a UDF or not, our code is in PySpark only; then how do the worker nodes process the PySpark code when they have only a JVM and no Python worker?

  • @shubhamkhatri6908
    @shubhamkhatri6908 9 months ago

    Hi Manish, very informative video.
    I have one question: what exactly is an executor?
    As per my understanding, it is responsible for executing tasks and has cores in it for processing.
    Since each worker node has 20 cores, can I create an executor with any number of cores and any amount of memory?

    • @manish_kumar_1
      @manish_kumar_1  9 months ago +1

      You get some of the worker node's memory in the form of a container for your Spark job, and your executor runs inside that container with the memory you asked for. So let's say the worker node has 64 GB RAM and a 16-core CPU, and you can manage with just 10 GB and 3 cores; then that is all you will get. The remaining memory goes to some other job.
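
    To put the reply above in code: the executor you request and the worker node's total capacity are separate things, and YARN can place several such containers on one node. A minimal, illustrative sketch using the same 10 GB / 3-core figures (example values, not recommendations; the submit command in the comment is an assumption):

        # On a 64 GB / 16-core worker node, a 10 GB / 3-core executor container
        # uses only part of the node; YARN can pack further containers (from this
        # job or other jobs) into the remaining capacity.
        #   spark-submit --master yarn \
        #     --executor-memory 10g --executor-cores 3 --num-executors 4 my_job.py

        from pyspark.sql import SparkSession

        spark = SparkSession.builder.appName("container-sizing-demo").getOrCreate()

        # Confirm at runtime what this application actually asked for.
        conf = spark.sparkContext.getConf()
        print(conf.get("spark.executor.memory", "default"))
        print(conf.get("spark.executor.cores", "default"))

        spark.stop()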

  • @manish_kumar_1
    @manish_kumar_1  1 year ago

    Directly connect with me on:- topmate.io/manish_kumar25

  • @nitilpoddar
    @nitilpoddar 7 months ago

    done

  • @yashwantdhole7645
    @yashwantdhole7645 6 months ago

    Hi Manish. If the code from the PySpark driver is converted into equivalent Java code, won't the UDFs get converted too?
    If that is true, why do we need a Python worker in the executor again?

  • @vikasvk9174
    @vikasvk9174 1 year ago

    Great explanation.
    I have doubts: 1) What happens if we don't have 5 free workers in the cluster?
    2) What if we have 5 free workers but don't have enough CPU cores or memory that we requested?
    Thank you, and waiting for your reply.

    • @manish_kumar_1
      @manish_kumar_1  1 year ago +1

      You will have to wait in the queue. FIFO is applied by the resource manager.

  • @raviyadav-dt1tb
    @raviyadav-dt1tb 6 months ago

    I'm following along in 2024.

  • @worldthroughmyvisor
    @worldthroughmyvisor 6 months ago

    What if I try to provision more executors than are available on my cluster?
    Or what if I try to provision more RAM or CPU cores than my executors' capacity?
    Can you explain what would happen on a cluster, since I think it is more difficult to replicate this locally?

    • @manish_kumar_1
      @manish_kumar_1  6 months ago

      You can try it locally as well. Ask for more RAM than is available in your system and you will only get the available memory. If you ask for more than that, you are not going to get it because there is a hardware limit; you will be allocated the memory available in your cluster. If multiple jobs are already running, your job will wait in the queue for memory to become available. It runs in a FIFO manner.

  • @chiragsharma9430
    @chiragsharma9430 1 year ago

    Hi Manish, I have one question: I have seen some job descriptions mentioning Databricks. What does it mean when they say a candidate must know how to work on Databricks? What exactly do they mean by that, and what are the things one should know about Databricks?
    Looking forward to your reply.

    • @manish_kumar_1
      @manish_kumar_1  1 year ago

      You should know how to work with Databricks. It's just a tool which you can learn very easily once you start using it.

    • @chiragsharma9430
      @chiragsharma9430 1 year ago

      Alright, thanks for the reply, Manish. Really appreciate your response.

  • @khurshidhasankhan4700
    @khurshidhasankhan4700 8 months ago

    Sir, what do we do if one node fails? This was asked in an interview; please give me the answer, it trips me up a lot in interviews.

  • @bangalibangalore2404
    @bangalibangalore2404 1 year ago

    Hello Manish, what happens if we ask for more RAM or more cores than are available on a machine?

    • @manish_kumar_1
      @manish_kumar_1  1 year ago +1

      It would be a waste of resources, and they won't give you the extra resources anyway, because RAM is very costly.

    • @bangalibangalore2404
      @bangalibangalore2404 11 months ago

      One more question: how are the files brought in? I mean, the files will be lying in a distributed way across the same cluster, so will the executor be created where the file is, or will it be created randomly?
      Suppose the file abc.csv is on machines 4 and 5.
      When we ask YARN for resources, will it create the executor containers on machines 4 and 5 only, or randomly anywhere in the cluster?

  • @hafizadeelarif3415
    @hafizadeelarif3415 3 months ago

    Hello Brother,
    I have a question: Spark is a distributed processing framework and is fault-tolerant. However, what happens if the driver node fails?

  • @rajasekhar4023
    @rajasekhar4023 1 year ago +1

    I did not understand the JVM main(). Since Spark supports Python, why is a JVM needed to submit a Spark application? Please explain in detail.
    Thanks for the wonderful session.

    • @rajasekhar4023
      @rajasekhar4023 1 year ago

      What exactly is the use of the JVM, since Spark supports Python for coding?

    • @bangalibangalore2404
      @bangalibangalore2404 11 months ago +1

      Spark is written in Java/Scala, so by default Spark does not understand Python. Think of it as a language translator that changes the Python code into Java bytecode, which Spark understands. Thus the Python code is converted to Java code first and then run.
      Spark supports Python because of this translator.
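
    To make this thread concrete: built-in DataFrame functions are executed by the executor JVMs, but a Python UDF cannot be translated, so each executor hands those rows to a Python worker process. A minimal sketch, assuming Spark 3.x; the data and column names are made up for illustration.

        from pyspark.sql import SparkSession, functions as F
        from pyspark.sql.types import IntegerType

        spark = SparkSession.builder.appName("udf-demo").getOrCreate()
        df = spark.createDataFrame([(1,), (2,), (3,)], ["value"])

        # Built-in function: runs entirely inside the executor JVM.
        df.select((F.col("value") * 2).alias("doubled")).show()

        # Python UDF: the executor JVM streams each row to a Python worker
        # process, which runs this function and sends the result back.
        @F.udf(returnType=IntegerType())
        def double_py(x):
            return x * 2

        df.select(double_py(F.col("value")).alias("doubled")).show()

        spark.stop()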

  • @ETLMasters
    @ETLMasters 1 year ago

    Is studying the Spark ecosystem only necessary for cracking interviews, or does it also have some use in practical work?

    • @manish_kumar_1
      @manish_kumar_1  1 year ago

      You should know it to understand the overall picture.

  • @rukulraina7440
    @rukulraina7440 9 months ago

    Can anyone explain this to me if they understood it well?

  • @Home-so8gi
    @Home-so8gi 1 month ago

    Can't 5 containers of 20 GB each be created on one node?

    • @manish_kumar_1
      @manish_kumar_1  1 month ago

      Yes, they can. That is exactly what I said in the video: containers are created based on the workload.

  • @user-rh1hr5cc1r
    @user-rh1hr5cc1r 3 months ago

    Spark Architecture:
    Whenever a job is initiated, the Spark Session starts with its Spark Context. It connects with the Cluster Manager to work out how many Worker Nodes (slaves) are required, and once that information is received, the Driver Program (master) starts assigning tasks to the Worker Nodes. The Executors are responsible for doing all the tasks, intermediate results are stored in cache, and all the Worker Nodes are connected with each other so that they can share data and logic.
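
    The same flow can be read from code rather than the diagram: creating the SparkSession (which wraps the SparkContext) registers the application with the cluster manager and gets executors, transformations only build a plan on the driver, and an action triggers the distributed job. A minimal sketch assuming submission with spark-submit --master yarn; the input path and column name are made up.

        from pyspark.sql import SparkSession

        # 1. Driver side: building the SparkSession creates the SparkContext, which
        #    registers the application with the cluster manager and gets executors.
        spark = SparkSession.builder.appName("architecture-demo").getOrCreate()

        # 2. Transformations are only recorded as a plan on the driver;
        #    nothing runs on the executors yet. (Hypothetical input path.)
        df = spark.read.csv("hdfs:///data/abc.csv", header=True, inferSchema=True)
        filtered = df.filter(df["amount"] > 100)

        # 3. An action triggers the actual job: the driver breaks it into stages
        #    and tasks and schedules the tasks onto the executors.
        print(filtered.count())

        # 4. Stopping the session releases the executors back to the cluster.
        spark.stop()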

  • @shirikantjadhav4308
    @shirikantjadhav4308 11 months ago

    Hi, I'm following your videos and I need the PDF file; could you provide it to me?

    • @manish_kumar_1
      @manish_kumar_1  11 months ago

      I think you haven't watched the first video. I don't provide PDFs; you have to note things down yourself. That way, you are the one who benefits.

  • @moyeenshaikh4378
    @moyeenshaikh4378 9 months ago

    Bhai, the driver you said shuts down at the end... that would be the application driver, right?
    And is there just the one application driver here, or is there some other driver as well on the master?

    • @manish_kumar_1
      @manish_kumar_1  9 months ago +1

      One job will have only one driver, and once the driver shuts down the executors shut down as well.

  • @radheshyama448
    @radheshyama448 11 months ago

    ❤💌💯💢

  • @moyeenshaikh4378
    @moyeenshaikh4378 9 months ago

    Bhai, are the theory and practical playlists finished, or is something still left?

  • @sharma-vasundhara
    @sharma-vasundhara 5 months ago

    I have a question: in the video, we wanted 5 executors with 25 GB RAM and 5 cores each, and for those 5 executors you used w2, w3, w4, w7, and w8. Now, all of them have 100 GB RAM and 20 cores.
    Why can't we put 4 executors on a single machine? 4 x 25 = 100 GB, and 4 x 5 = 20 cores.
    That way, our resources (executors, driver) would be spread across fewer machines. I don't know what benefits/drawbacks that might have; just curious why we can't do this.

  • @TechnoSparkBigData
    @TechnoSparkBigData 10 months ago

    So the driver is our Application Master?

    • @manish_kumar_1
      @manish_kumar_1  10 months ago +1

      No. The application driver that is created inside the Application Master's container is the driver.

    • @TechnoSparkBigData
      @TechnoSparkBigData 10 months ago

      @@manish_kumar_1 thanks

  • @deepakpandey836
    @deepakpandey836 1 year ago

    I'll have to watch this again, lol.

  • @CodeInQueries
    @CodeInQueries 1 year ago

    Hi Manish, I need your LinkedIn profile link to connect with you... I need some guidance.

    • @manish_kumar_1
      @manish_kumar_1  1 year ago +1

      Check the description. You can find all of my social media handle links there.