Hadoop vs Spark | Lec-3 | In depth explanation

  • Published: 25 Mar 2023
  • In this video I compare Apache Spark and Hadoop and discuss the differences in detail. If you have any doubts, please post your questions in the comments section.
    Directly connect with me on:- topmate.io/manish_kumar25
    For more queries, reach out to me on the social media handles below.
    Follow me on LinkedIn:- / manish-kumar-373b86176
    Follow Me On Instagram:- / competitive_gyan1
    Follow me on Facebook:- / manish12340
    My Second Channel -- / @competitivegyan1
    Interview series Playlist:- • Interview Questions an...
    My Gear:-
    Rode Mic:-- amzn.to/3RekC7a
    Boya M1 Mic-- amzn.to/3uW0nnn
    Wireless Mic:-- amzn.to/3TqLRhE
    Tripod1 -- amzn.to/4avjyF4
    Tripod2:-- amzn.to/46Y3QPu
    camera1:-- amzn.to/3GIQlsE
    camera2:-- amzn.to/46X190P
    Pentab (Medium size):-- amzn.to/3RgMszQ (Recommended)
    Pentab (Small size):-- amzn.to/3RpmIS0
    Mobile:-- amzn.to/47Y8oa4 (You should definitely not buy this one)
    Laptop -- amzn.to/3Ns5Okj
    Mouse+keyboard combo -- amzn.to/3Ro6GYl
    21 inch Monitor-- amzn.to/3TvCE7E
    27 inch Monitor-- amzn.to/47QzXlA
    iPad Pencil:-- amzn.to/4aiJxiG
    iPad 9th Generation:-- amzn.to/470I11X
    Boom Arm/Swing Arm:-- amzn.to/48eH2we
    My PC Components:-
    intel i7 Processor:-- amzn.to/47Svdfe
    G.Skill RAM:-- amzn.to/47VFffI
    Samsung SSD:-- amzn.to/3uVSE8W
    WD blue HDD:-- amzn.to/47Y91QY
    RTX 3060Ti Graphic card:- amzn.to/3tdLDjn
    Gigabyte Motherboard:-- amzn.to/3RFUTGl
    O11 Dynamic Cabinet:-- amzn.to/4avkgSK
    Liquid cooler:-- amzn.to/472S8mS
    Antec Prizm FAN:-- amzn.to/48ey4Pj

Comments • 75

  • @pranavbhawane7591
    @pranavbhawane7591 5 months ago +2

    Manish bhai, what an amazing person you are! Your content and knowledge are excellent. Thank you for the videos.

  • @Lakshvedhi
    @Lakshvedhi 1 year ago +3

    I have been following your channel for a long time. I love your content. I am preparing for data engineering, and these videos are helping me very much. Thank you so much.

  • @yashbaviskar6317
    @yashbaviskar6317 1 year ago

    Amazing content, Manish bhaiya 🙌 Looking forward to more such exciting and informative videos.

  • @danishthev-log2264
    @danishthev-log2264 1 year ago

    You set it on fire, sir! I had already completed Spark earlier, but today I got to learn it in this much depth from your channel. Overwhelming content. 🙂🙂

  • @rishav144
    @rishav144 1 year ago +1

    Well explained. Thanks for the consistent videos.

  • @nilavnayan4521
    @nilavnayan4521 1 year ago

    Great content Manish bhai, really good comparison, good points!
    Thanks!

  • @rawat7203
    @rawat7203 1 year ago

    Thank you, Manish. I started following you recently. Amazing content. Keep up the good work.

  • @sunnyd9878
    @sunnyd9878 6 months ago

    Bhai, you explained it really well... excellent, thanks.

  • @vaibhavkamble3325
    @vaibhavkamble3325 3 months ago

    Right Class for individual. For beginners.❤❤❤
    Thank you.

  • @shreeb7352
    @shreeb7352 1 year ago

    thanks for explaining WHYs! very helpful!

  • @coding7241
    @coding7241 1 year ago

    I watched it 3 times... awesome video.

  • @nakulbageja2232
    @nakulbageja2232 1 year ago

    Great work, thank you👌🙌

  • @deeksha6514
    @deeksha6514 4 months ago

    Best playlist on the internet.

  • @pratikparbhane8677
    @pratikparbhane8677 5 months ago +3

    Attendance Marked

  • @talhaaziz4847
    @talhaaziz4847 2 months ago

    Outstanding... Keep it up. Very good, short, and informative videos. Please make more videos with more detail.
    Highly recommended for all.

  • @ujjalroy1442
    @ujjalroy1442 1 year ago

    Very detailed yaar.... Thanks

  • @SANJAYYADAV-hm2bs
    @SANJAYYADAV-hm2bs 5 months ago

    Manish brother, your content is really awesome.
    Feeling lucky to find your channel.

  • @pritiiBisht
    @pritiiBisht 6 months ago

    Really appreciated. I like the content.

  • @vedant_dhamecha
    @vedant_dhamecha 8 months ago

    I am watching this two hours before my university exams, and I can understand everything clearly! Hats off, man.

  • @ANJALISINGH-nr6nk
    @ANJALISINGH-nr6nk 7 months ago

    You are the best.

  • @journeyWithAshutosh
    @journeyWithAshutosh 6 months ago +1

    Sir, please make a playlist that covers the full PySpark syllabus.

  • @coding7241
    @coding7241 1 year ago

    Thanks

  • @ComedyXRoad
    @ComedyXRoad 3 months ago

    thank you brother

  • @chandrakantkumar1276
    @chandrakantkumar1276 5 months ago +1

    Namaste Sir,
    I have a doubt about the explanation at 21:00.
    Fault tolerance in HDFS works at the cluster level: if a node fails, recovery takes place, and the master node performs that recovery.
    But Spark is a compute engine. If the storage is still HDFS and a node fails, data recovery will happen the same way it did in the Hadoop ecosystem. So how does the DAG provide fault tolerance in Spark? As far as I understand, the DAG re-computes the data, but I do not understand under what circumstances Spark will have to use the DAG to re-compute or re-process something. Please explain if you have an example or use case.

  • @wellwisher7333
    @wellwisher7333 1 year ago

    Thanks bhai

  • @aryankhandelwal8517
    @aryankhandelwal8517 11 months ago

    GOOD VIDEO🤟

  • @sanooosai
    @sanooosai 4 months ago

    thank you sir

  • @dataman17
    @dataman17 5 months ago

    Brilliant explanations!

    • @rajandeshmukh3094
      @rajandeshmukh3094 5 months ago

      Are you a fellow data engineering aspirant?

  • @amanjha5422
    @amanjha5422 1 year ago

    Bhaiya, please continue this series.
    I have wanted to learn this for a long time, and your videos are great.

  • @navjotsingh-hl1jg
    @navjotsingh-hl1jg 1 year ago

    Bro, please upload a video every day so that we stay consistent.

  • @shubhajitadhikary1960
    @shubhajitadhikary1960 8 months ago

    🔥🙇🏻🙏🏻

  • @nitilpoddar
    @nitilpoddar 7 months ago

    done

  • @manish_kumar_1
    @manish_kumar_1 1 year ago

    Directly connect with me on:- topmate.io/manish_kumar25

  • @rajeshwarreddyracha4655
    @rajeshwarreddyracha4655 10 months ago

    Why would we use Hive if we already have Spark in our project? Is there any specific reason?

  • @lifelearningneo
    @lifelearningneo 1 year ago

    Bhaiya, in Hadoop fault tolerance exists only at the storage level, right? There is no fault tolerance at the application level? Correct me if I am wrong.

  • @harshi993
    @harshi993 4 months ago

    What in what ? Data storage or processing ?

  • @gchanakya2979
    @gchanakya2979 1 year ago +2

    Marking my attendance 🙏

  • @LOFI_WORLD_SONG
    @LOFI_WORLD_SONG 6 months ago

    I don't want to code. Can I learn data engineering, or should I go for DevOps engineering?

  • @chiragsharma9430
    @chiragsharma9430 1 year ago

    Hi Manish, could you also make a video on a Spark-related project that would be useful for aspiring data scientists, like the one you created specifically for data engineering?
    Thanks in advance!

  • @amitkumar-ij9sw
    @amitkumar-ij9sw 5 months ago

    Manish, Hadoop was developed by former Yahoo developer Doug Cutting, not by Google.

  • @anshukumari6616
    @anshukumari6616 1 year ago

    Thanks for the detailed explanation!

  • @ytsh9366
    @ytsh9366 1 year ago

    Hello Manish bhaiya, I have two years of experience in web development at a service-based company, and I want to switch to a data engineering profile. After watching your video I learned SQL and am now learning Python. My company does not allow internal role changes, so how can I switch to a data engineering role? Please answer.

    • @manish_kumar_1
      @manish_kumar_1 1 year ago

      Watch my video titled "How I bagged 12 offers".

  • @reachrishav
    @reachrishav 1 year ago

    Hi Manish, how do you make such notes in OneNote? What stylus or device is required for this? I want to purchase a similar device for digital note-taking. Please advise.

    • @manish_kumar_1
      @manish_kumar_1 1 year ago

      A pen tablet (Pentab) is required to write in a notebook or PPT. You can buy one online. I have the medium-size one. You can find the link in the description.

    • @reachrishav
      @reachrishav 1 year ago

      @manish_kumar_1 Is it the iPad pencil you're referring to? Will the Wacom One pen tablet work the same?

    • @manish_kumar_1
      @manish_kumar_1 1 year ago +1

      @reachrishav Yes, but it won't have a screen. You get a pad and a stylus. You write on the pen tablet with the stylus, and whatever you write shows up on the laptop in OneNote, PPT, or any other software you are using.

    • @reachrishav
      @reachrishav 1 year ago

      @manish_kumar_1 Thanks. I guess you are using a Wacom tablet and stylus for this video?

  • @Watson22j
    @Watson22j 1 year ago

    Bhaiya, 128 MB is the default block size, which we can customize according to our needs, right? So my question is: in which cases do we decrease the block size, and in which cases do we increase it?

    • @manish_kumar_1
      @manish_kumar_1 1 year ago

      If we have many small blocks, the total seek time (the time spent locating the data) goes up. Many small blocks are also a burden on the NameNode (master): the NameNode stores the metadata, so it has to keep track of every one of those blocks.
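The trade-off in the reply above can be made concrete with a little arithmetic. This is a minimal sketch, assuming a hypothetical 1 GB file and a helper named `block_count` that is not part of any Hadoop API: the smaller the block size, the more blocks the NameNode has to keep metadata for.

```python
MB = 1024 * 1024


def block_count(file_size_bytes: int, block_size_bytes: int) -> int:
    """Number of fixed-size blocks needed to hold a file (last block may be partial)."""
    if block_size_bytes <= 0:
        raise ValueError("block size must be positive")
    # Integer ceiling division avoids floating-point rounding on large sizes.
    return -(-file_size_bytes // block_size_bytes)


file_size = 1024 * MB  # hypothetical 1 GB file, chosen only for illustration

for block_mb in (16, 128, 256):
    n = block_count(file_size, block_mb * MB)
    print(f"block size {block_mb:>3} MB -> {n} blocks to track")
```

With these assumed numbers, dropping from 128 MB to 16 MB blocks multiplies the number of tracked blocks by eight for the same file.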

    • @Watson22j
      @Watson22j 1 year ago

      @manish_kumar_1 Thank you :)

  • @punkad2337
    @punkad2337 1 year ago

    Manish sir,
    Can you provide a Data Engineer course or tutorial videos?
    If you can, please share the link so that I can buy the tutorials or course.

    • @manish_kumar_1
      @manish_kumar_1 1 year ago +1

      I teach for free. Please watch my "12 offers" video. You will find all the free resources there.

  • @TheBest-yh1yj
    @TheBest-yh1yj 8 months ago

    Bhai, I have a question related to the DAG. If process 3 fails, the DAG knows the steps needed to regenerate the output of process 3. What happens when process 1 fails? How does the DAG recover from that? And what is a process?

    • @soumyaranjanrout2843
      @soumyaranjanrout2843 7 months ago +2

      If "process 1" fails in the DAG, the recovery would typically involve retrying or restarting "process 1" itself. The success of this recovery depends on whether "process 1" is independent or has dependencies. If it has dependencies, those may need to be reprocessed as well to ensure a consistent state in the workflow. Essentially, DAG recovery for a failed process involves identifying the failure point, addressing it, and potentially rerunning dependent processes to maintain the integrity of the workflow.
      Thanks
      ChatGPT
      Let me elaborate it:
      A Directed Acyclic Graph (DAG) in Spark represents a computational workflow where nodes denote tasks or operations, and directed edges illustrate dependencies between these tasks. In the context of fault tolerance, if a task like "process 1" fails, the DAG aids recovery by re-executing the failed task based on information collected from its dependencies, ensuring the computational flow continues.
      Consider a scenario where you apply five transformations to a DataFrame (DF). Each transformation creates a new DF as DFs are immutable. If, for instance, "transformation 4" fails during execution, Spark retrieves information from "transformation 3's" DF (its dependency) and then re-executes "transformation 4."
      Regarding your question about "process 1" failure, if it fails, recovery involves restarting "process 1." Given interdependencies between tasks, subsequent transformations won't proceed if the initial process fails. The DAG orchestrates this recovery process by ensuring the restarting of the failed task, allowing the entire workflow to progress seamlessly.
      If I am wrong, then someone please let me know, because I am also a beginner in the data domain.
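The recovery idea discussed in this thread can be illustrated with a toy model. This is a sketch in plain Python, not Spark code: `Node`, `compute`, and the four transformations `t1` to `t4` are hypothetical names. Each node remembers only its parent and the function applied to it, and a lost result is rebuilt by replaying those steps from the nearest cached ancestor, or from the source data when nothing is cached. Spark's actual lineage tracking works per partition and is considerably more involved.

```python
from typing import Callable, Dict, List, Optional


class Node:
    """One step in a toy lineage chain: a parent plus the function applied to it."""

    def __init__(self, name: str, parent: Optional["Node"],
                 fn: Callable[[List[int]], List[int]]) -> None:
        self.name = name
        self.parent = parent
        self.fn = fn


def compute(node: Node, source: List[int],
            cache: Dict[str, List[int]], replayed: List[str]) -> List[int]:
    """Rebuild a node's result by walking its lineage back to a cached result or the source."""
    if node.name in cache:
        return cache[node.name]      # a surviving result stops the walk here
    replayed.append(node.name)       # record every step that has to be re-run
    if node.parent is None:
        parent_data = source
    else:
        parent_data = compute(node.parent, source, cache, replayed)
    return node.fn(parent_data)


source = [1, 2, 3, 4]
t1 = Node("t1", None, lambda xs: [x * 2 for x in xs])
t2 = Node("t2", t1, lambda xs: [x + 1 for x in xs])
t3 = Node("t3", t2, lambda xs: [x for x in xs if x > 4])
t4 = Node("t4", t3, lambda xs: [x * 10 for x in xs])

# Nothing cached: rebuilding t4 replays the whole chain from the source.
steps_no_cache: List[str] = []
result_no_cache = compute(t4, source, {}, steps_no_cache)

# t3's result survived (for example, it was cached): only t4 is re-run.
steps_with_cache: List[str] = []
result_with_cache = compute(t4, source, {"t3": [5, 7, 9]}, steps_with_cache)

print(steps_no_cache, result_no_cache)
print(steps_with_cache, result_with_cache)
```

In this model, a failure of the first step simply means the walk goes all the way back to the source data, which is why every later step depends on it.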

    • @TheBest-yh1yj
      @TheBest-yh1yj 7 months ago

      @soumyaranjanrout2843 Thanks for the detailed explanation.
      What is the meaning of "Given interdependencies between tasks, subsequent transformations won't proceed if the initial process fails."?

    • @soumyaranjanrout2843
      @soumyaranjanrout2843 5 months ago

      @TheBest-yh1yj In simpler terms, if one step in a process fails, the following steps that depend on it also get stuck until the initial issue is resolved. To simplify it further: as we know, the tasks are interdependent, so if task 1 fails (as in your question), the remaining tasks that rely on its output cannot continue until the initial task completes successfully. Hope that makes sense 😊

  • @adityaanand835
    @adityaanand835 1 year ago

    I think the title should be MapReduce vs Spark. Both can be used within Hadoop, right?

    • @manish_kumar_1
      @manish_kumar_1 1 year ago

      Yes, it should be MapReduce vs Spark. But the term "Hadoop vs Spark" is more popular.

    • @adityaanand835
      @adityaanand835 1 year ago +1

      @manish_kumar_1 Don't go by popularity. Stick to the concept to avoid confusion.

  • @alakmarshafin9065
    @alakmarshafin9065 5 months ago

    Minor correction: Hadoop was created by Yahoo!, not Google.

  • @siddharthsinghh
    @siddharthsinghh 1 year ago

    Bhaiya, do I need to study Hadoop as well, or is Spark enough?

    • @manish_kumar_1
      @manish_kumar_1 1 year ago

      In Hadoop, study HDFS and YARN. MapReduce is not needed.

    • @siddharthsinghh
      @siddharthsinghh 1 year ago

      @manish_kumar_1 Yes bhaiya, I have covered that much from Great Learning. That is how I understood why MapReduce is slow.

  • @abidkhan.10
    @abidkhan.10 1 year ago

    Which course is this?

  • @youtubekk3003
    @youtubekk3003 1 year ago

    Bro, Hadoop was made by Yahoo engineers, not Google.

    • @sakarbakshi1977
      @sakarbakshi1977 4 months ago

      Bro, listen to the rest of the lecture content! That is what the interviewer will ask, not the thing you are correcting 😅