Hadoop Tutorial - High Availability, Fault Tolerance & Secondary Name Node

  • Published: 18 Sep 2024
  • Spark Programming and Azure Databricks ILT Master Class by Prashant Kumar Pandey - Fill out the google form for Course inquiry.
    forms.gle/Nxk8...
    -------------------------------------------------------------------
    Data Engineering is one of the highest-paid jobs today.
    It is going to remain in the top IT skills forever.
    Are you working in database development, data warehousing, ETL tools, data analysis, SQL, or PL/SQL development?
    I have a well-crafted success path for you.
    I will help you get prepared for the data engineer and solution architect role depending on your profile and experience.
    We created a course that takes you deep into core data engineering technology and helps you master it.
    If you are a working professional who wants to:
    1. Become a data engineer.
    2. Change your career to data engineering.
    3. Grow your data engineering career.
    4. Get Databricks Spark Certification.
    5. Crack the Spark Data Engineering interviews.
    ScholarNest is offering a one-stop integrated Learning Path.
    The course is open for registration.
    The course delivers an example-driven approach and project-based learning.
    You will be practicing the skills using MCQ, Coding Exercises, and Capstone Projects.
    The course comes with the following integrated services.
    1. Technical support and Doubt Clarification
    2. Live Project Discussion
    3. Resume Building
    4. Interview Preparation
    5. Mock Interviews
    Course Duration: 6 Months
    Course Prerequisite: Programming and SQL Knowledge
    Target Audience: Working Professionals
    Batch start: Registration Started
    Fill out the below form for more details and course inquiries.
    forms.gle/Nxk8...
    --------------------------------------------------------------------------
    Learn more at www.scholarnes...
    Best place to learn Data engineering, Bigdata, Apache Spark, Databricks, Apache Kafka, Confluent Cloud, AWS Cloud Computing, Azure Cloud, Google Cloud - Self-paced, Instructor-led, Certification courses, and practice tests.
    ========================================================
    SPARK COURSES
    -----------------------------
    www.scholarnes...
    www.scholarnes...
    www.scholarnes...
    www.scholarnes...
    www.scholarnes...
    KAFKA COURSES
    --------------------------------
    www.scholarnes...
    www.scholarnes...
    www.scholarnes...
    AWS CLOUD
    ------------------------
    www.scholarnes...
    www.scholarnes...
    PYTHON
    ------------------
    www.scholarnes...
    ========================================
    We are also available on the Udemy Platform
    Check out the below link for our Courses on Udemy
    www.learningjo...
    =======================================
    You can also find us on O'Reilly Learning
    www.oreilly.co...
    www.oreilly.co...
    www.oreilly.co...
    www.oreilly.co...
    www.oreilly.co...
    www.oreilly.co...
    www.oreilly.co...
    www.oreilly.co...
    =========================================
    Follow us on Social Media
    / scholarnest
    / scholarnesttechnologies
    / scholarnest
    / scholarnest
    github.com/Sch...
    github.com/lea...
    ========================================

Comments • 89

  • @ScholarNest
    @ScholarNest  3 years ago +3

    Want to learn more Big Data technologies? You can get lifetime access to our courses on the Udemy platform. Visit the link below for discounts and coupon codes.
    www.learningjournal.guru/courses/

  • @mdshahalam3010
    @mdshahalam3010 4 years ago +27

    Nothing is tough when you have a good teacher. Kudos for your work sir.

  • @mathisinav4267
    @mathisinav4267 4 years ago +10

    No one, I repeat, no one has explained Hadoop this perfectly. A million thanks.

  • @iwonazwierzynska4056
    @iwonazwierzynska4056 1 year ago +1

    The best explanation of the standby node on the Internet!!

  • @Van-pf2or
    @Van-pf2or 3 years ago +1

    Crisp, simple, and visual: that is what the best teaching looks like. You are an excellent tutor.

  • @labyrinth1991
    @labyrinth1991 3 years ago +1

    I have gone through so many tutorials, but the way you explain it, sir, makes Hadoop so easy to understand. Thanks a lot, sir!!

  • @prannoyroy5312
    @prannoyroy5312 4 years ago +2

    I have become a fan of your style of teaching. Thank you, sir. 😊

  • @moehijawe555
    @moehijawe555 5 years ago +3

    Thank you very much for covering these topics. I spent a lot of time reading books, but I couldn't understand anything until I watched your tutorials. Big thanks.

  • @vinitsunita
    @vinitsunita 1 year ago

    As you described, the role of the Secondary Name Node is to take a checkpoint at a configured interval, updating the on-disk FS image by applying the edit log entries captured since the last checkpoint. To further reduce its restart time, the primary Name Node runs the same checkpoint process: it reads the on-disk FS image stored by the SNN and applies the edit log entries to build the latest FS image in memory. A few questions about this:
    1. Where does the SNN store the FS image? Is it on disk in its local file system?
    2. How does the primary Name Node get access to the Secondary NN?

  • @rjlifeandtech8675
    @rjlifeandtech8675 4 years ago +1

    1. ZooKeeper election
    2. Split-brain concepts
    3. Hadoop 3, erasure coding and storage policies
    Could you please explain all of the above?

  • @arjunpandey1617
    @arjunpandey1617 4 years ago

    You make things very simple to understand. Hats off to your effort!!

  • @pubgkiller2903
    @pubgkiller2903 2 years ago

    You are the best teacher.. Thanks a lot

  • @NilanshuSharma1
    @NilanshuSharma1 5 years ago +1

    Thanks. It's very clear. A piece of advice for viewers: these tutorials can easily be watched at 2x speed.

  • @swatikorade5251
    @swatikorade5251 2 years ago

    I have been learning HDFS for the last 7 days and my concepts were still not clear, but after watching your video today everything is clear. Thank you.

  • @worthwatchingeslam
    @worthwatchingeslam 7 years ago +2

    Great explanation, thanks for your efforts :)

  • @maliknauman3566
    @maliknauman3566 2 years ago

    Excellent explanation, sir. Hats off.

  • @anilkumar-dp1jk
    @anilkumar-dp1jk 6 years ago +2

    Awesome Sir ..Thank You

  • @bgsuresh0
    @bgsuresh0 5 years ago

    Your explanation is very clear, thank you. Please keep uploading new videos.

  • @ramswaroop1520
    @ramswaroop1520 2 years ago

    What an Explanation 🙏🙏🙏🙏🙏🙏❤️❤️❤️❤️❤️❤️❤️

  • @tarunrey619
    @tarunrey619 7 years ago

    The explanation was clear.
    I have a few questions:
    1) When setting up a cluster using Hadoop 2, how does ZooKeeper initially elect the leader among the NameNodes?
    2) Can you explain the functionality of the NameNode failover controllers?

    • @joker-cy6qo
      @joker-cy6qo 4 years ago

      Bro, I would love to answer.
      When you set up a new cluster, the active NN will be the one you selected to be the NN.
      Later, if it fails, the ZKFC (ZooKeeper Failover Controller) is responsible for making the standby node the active node.
      Hope this helps.

    • @joker-cy6qo
      @joker-cy6qo 4 years ago

      When you set up a new cluster, the active NameNode will be the one you selected. If that NN goes down, ZooKeeper comes into play through the ZKFC (ZooKeeper Failover Controller), which is responsible for promoting the standby NameNode to active.
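
      The failover rule described in the two replies above can be sketched roughly as follows. This is a toy model, not how the ZKFC is implemented: the real controller coordinates through ZooKeeper, whereas this sketch only captures the decision "keep the current active NameNode while it is healthy, otherwise promote a healthy standby". The function `elect_active` and the names `nn1` and `nn2` are hypothetical.

```python
def elect_active(namenodes, healthy, current_active=None):
    """Toy failover decision (hypothetical helper, no ZooKeeper involved).

    namenodes: configured NameNode ids, in configuration order.
    healthy: set of ids that currently pass their health check.
    current_active: id of the NameNode that is active now, if any.
    """
    # A healthy active NameNode keeps its role; no failover is needed.
    if current_active in healthy:
        return current_active
    # Otherwise promote the first healthy NameNode in configuration order.
    for nn in namenodes:
        if nn in healthy:
            return nn
    # No healthy NameNode is left to promote.
    return None


namenodes = ["nn1", "nn2"]

# Fresh cluster: assume the first configured NameNode becomes active.
active = elect_active(namenodes, healthy={"nn1", "nn2"})

# nn1 fails: the standby nn2 is promoted.
after_failure = elect_active(namenodes, healthy={"nn2"}, current_active=active)

# nn1 comes back: nn2 stays active, nn1 rejoins as standby.
after_recovery = elect_active(namenodes, healthy={"nn1", "nn2"}, current_active=after_failure)
```

      Keeping the current active node when the failed one returns is a design choice of this sketch; it avoids a second, unnecessary role switch.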

  • @rohitbhagwat3031
    @rohitbhagwat3031 3 years ago

    Great work, sir. Thanks for the video.

  • @eldos11
    @eldos11 4 years ago +1

    This was beautiful! Thank you.

  • @wajay2006
    @wajay2006 4 years ago

    Very good tutorial. The only thing I want to add is that the fsimage is not only in memory but also stored on disk. Please excuse me if I am not correct on this point.

  • @kumarpolisetty3048
    @kumarpolisetty3048 4 years ago

    Really nice explanation. If you could walk through a practical, end-to-end POC project, it would be very useful for all of us. Thanks for your efforts and time.

  • @mahalaxmanraochappedi4690
    @mahalaxmanraochappedi4690 6 years ago

    The presentation and explanation were excellent.

  • @tusharmayekar4649
    @tusharmayekar4649 2 years ago

    It was very good information.

  • @amansehgal9917
    @amansehgal9917 7 years ago +1

    Highly recommended for anyone who wishes to learn how fault tolerance is managed in HDFS.
    In addition, I have a question: are block recovery, lease recovery, and pipeline recovery done in addition to the fault-tolerance methods described in the video, or are they done at a deeper level of those methods?

  • @kanmanik5674
    @kanmanik5674 4 years ago

    Very good tutorial. Easy to understand.

  • @surabhibtech
    @surabhibtech 3 years ago

    Very useful explanation.

  • @dineshshinkar2163
    @dineshshinkar2163 6 years ago

    Simply and superbly explained.

  • @BhimSella
    @BhimSella 4 years ago

    Thanks for the detailed explanation.

  • @anandansubash
    @anandansubash 7 years ago

    Thanks for your clear explanation. Awesome!

  • @aasthajain6148
    @aasthajain6148 5 years ago

    Awesome sir...great explanation👌👌

  • @renukaasodaria494
    @renukaasodaria494 7 years ago

    Very nice. I could not understand much about the Secondary Name Node, but I will try to understand it.

    • @ScholarNest
      @ScholarNest  7 years ago

      Why? Is it because the explanation is not clear? You can ask your questions if you have any.

    • @renukaasodaria494
      @renukaasodaria494 7 years ago

      No, the explanation is very nice, but fsimage and edit log are not clear to me.

    • @renukaasodaria494
      @renukaasodaria494 7 years ago

      And thank you very much.

  • @rajivraghu9857
    @rajivraghu9857 5 years ago

    Very nicely explained.

  • @aks8989
    @aks8989 5 years ago

    Very nice explanation!

  • @sridharthogaru4403
    @sridharthogaru4403 6 years ago

    Great and clear explanation, thanks.

  • @prometeo34
    @prometeo34 5 years ago

    Great Tutorial..thanks for sharing

  • @sureshm6906
    @sureshm6906 5 years ago

    Very informative. Thanks

  • @nagamanickam6604
    @nagamanickam6604 3 years ago

    Thank you sir

  • @sujitunim
    @sujitunim 6 years ago

    Awesome tutorial

  • @Sridevi-ht9nj
    @Sridevi-ht9nj 7 years ago

    very well explained

  • @uddisasuresh9264
    @uddisasuresh9264 6 years ago

    Great explanation

  • @avikthedrummer
    @avikthedrummer 1 year ago

    Hello! Is this video available in PPT format? I need to explain it to students. The presentation is superb.

  • @nguyen4so9
    @nguyen4so9 7 years ago

    Excellent !

  • @enriquewilliams8676
    @enriquewilliams8676 2 years ago

    Good.

  • @krishnap6035
    @krishnap6035 4 years ago

    Good lecture.

  • @dhirenmistry167
    @dhirenmistry167 6 years ago

    Superb...Thank you so much

  • @shramandas2721
    @shramandas2721 2 years ago

    Why can't we dump the fsimage directly to disk when the NameNode is restarted? After the restart it could read the fsimage and load it into memory, which would be faster.

  • @maheshkumar3657
    @maheshkumar3657 5 years ago

    nice tutorial sir

  • @aakashpatel1003
    @aakashpatel1003 5 years ago

    very good video

  • @worthwatchingeslam
    @worthwatchingeslam 6 years ago

    Great work. I have 2 questions:
    - Regarding the checkpoint activity, does the Secondary NN keep the on-disk FS image on its local disk, or is it on the active NN's disk?
    - Is the one-hour interval between checkpoints configurable?

  • @nitskrishna
    @nitskrishna 7 years ago

    nicely explained

  • @mrionutube
    @mrionutube 5 years ago

    Thank you for this excellent tutorial. I am new to this topic, and none of the tutorials or blogs I went through gave a clear picture of what happens in the checkpoint process of the SNN, or of the NN. Can you please confirm my understanding (for non-HA mode)?
    1) After every checkpoint run, does the SNN also clear the edit log on the Name Node, so that at any time the edit log on the NN holds data only since the last checkpoint run on the SNN?
    2) The fsimage of the NN is updated automatically in real time (i.e., as and when changes are made to the file system), which means the Name Node always has the latest fsimage in memory.
    3) At any given time, the fsimage on the Secondary Name Node holds the file system image as of the last checkpoint run.
    4) After a reboot, the Name Node picks up the fsimage from the Secondary Name Node and the edit log from the NN's local disk, and merges them to create a new fsimage that is up to date with all changes as of then.

  • @pc0riginal870
    @pc0riginal870 5 years ago

    Thank you so much ...

  • @aneksingh4496
    @aneksingh4496 4 years ago

    Can you make a video on why an RDD is immutable and what would have happened had it not been immutable?

  • @sharathkalaallapuram5941
    @sharathkalaallapuram5941 6 years ago

    Great explanation, sir. I have a question: how do you install Cloudera without internet access, and what are the parcel method and the packages method?

  • @sk-vs9nt
    @sk-vs9nt 6 years ago

    The topic was clear, thank you so much. Can you show it with an example?

  • @sabyasachiprasad8929
    @sabyasachiprasad8929 5 years ago

    Nice One

  • @nileshkharat1188
    @nileshkharat1188 3 years ago

    Why is there an odd number of JNs, 3 or 5?
    What's the reason behind that?

  • @prasadkv9936
    @prasadkv9936 5 years ago

    Thanks. Could you please explain how to create a Cloudera cluster? Nowadays many clients prefer Cloudera over Hortonworks.

  • @aniketamrutkar
    @aniketamrutkar 6 years ago

    Awesome

  • @coolguy171182
    @coolguy171182 6 years ago +1

    Sir, what will happen if DN-1 is slow and does not send its heartbeat as fast as the other nodes? Suppose the NN decides that DN-1 is down and starts replicating its data on a different node, say DN-2, and DN-1's heartbeat reaches the NN while that replication is in progress. Will it stop replicating the data on DN-2?

    • @ScholarNest
      @ScholarNest  6 years ago +2

      +Pranav Wagde, I think it is a hypothetical question. Either the NN gets the heartbeat within the expected interval or it doesn't; there is no concept of a slow heartbeat. If the NN realizes that a block is under-replicated, it will make more replicas to fix it. There is no concept of stopping in between. Later, when the NN realizes that the block is over-replicated, it will fix that too by throwing away some replicas.
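
      The behaviour described in this reply can be illustrated with a small sketch. It is only a model of the decision, not NameNode code: the function `replication_action`, the target of 3 replicas, and the replica counts in the scenario are assumptions made for illustration.

```python
def replication_action(live_replicas, target=3):
    """Decide how to bring one block back to its target replica count.

    Hypothetical helper: returns an (action, count) pair, where action is
    "add", "remove", or "none".
    """
    if live_replicas < target:
        # Under-replicated: schedule the missing copies.
        return ("add", target - live_replicas)
    if live_replicas > target:
        # Over-replicated: discard the surplus copies.
        return ("remove", live_replicas - target)
    return ("none", 0)


# DN-1 is declared dead, so one of three replicas is considered lost.
on_dn1_lost = replication_action(live_replicas=2)

# The new copy on DN-2 completes, then DN-1 reports in again: four replicas.
on_dn1_back = replication_action(live_replicas=4)

# After the surplus replica is dropped, nothing more needs to be done.
steady_state = replication_action(live_replicas=3)
```

      The re-replication is never cancelled halfway in this model; the surplus is corrected afterwards, which matches the order of events in the reply.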

    • @coolguy171182
      @coolguy171182 6 years ago

      Thanks for the explanation. Understood the concept.

  • @fasiuddin1874
    @fasiuddin1874 5 years ago

    helpful

  • @testingmakeseasy
    @testingmakeseasy 7 years ago +1

    Please upload some Hive and Pig related videos.

  • @iotmails9519
    @iotmails9519 4 years ago

    Can we have multiple replication factor for multiple tenants?

    • @ScholarNest
      @ScholarNest  4 years ago

      You can set it at the topic level, and I guess all tenants of the cluster are not going to share the topics. So the answer is yes.

  • @VishalYadav-lw2ky
    @VishalYadav-lw2ky 7 years ago

    Can we use a single node as both the NameNode and the Secondary NameNode?

  • @navinnayak007
    @navinnayak007 7 years ago

    How do the fsimage file and the edit log file communicate with each other?

    • @karthiknedunchezhiyan7935
      @karthiknedunchezhiyan7935 6 years ago

      The fsimage does not communicate with the edit log. During the checkpointing process, a new fsimage is created by merging the old fsimage with the new edit log entries.
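
      The merge described in this reply can be sketched in a few lines. This is a simplified model, not the HDFS on-disk format: the fsimage is represented as a dictionary from path to replication factor, and the edit log as a list of `(operation, path, value)` tuples. The function `apply_edits`, the operation names, and the paths are all hypothetical.

```python
def apply_edits(fsimage, edit_log):
    """Return a new image with every logged edit replayed in order.

    fsimage: dict mapping a path to its replication factor (toy model).
    edit_log: list of (op, path, value) tuples, oldest first.
    """
    image = dict(fsimage)  # work on a copy so the old image stays intact
    for op, path, value in edit_log:
        if op == "create":
            image[path] = value                # value is the replication factor
        elif op == "delete":
            image.pop(path, None)
        elif op == "rename":
            image[value] = image.pop(path)     # value is the new path
    return image


old_fsimage = {"/data/a.txt": 3}
edit_log = [
    ("create", "/data/b.txt", 2),
    ("rename", "/data/a.txt", "/data/c.txt"),
    ("delete", "/data/b.txt", None),
]

# Checkpoint: old image plus logged edits gives the new image.
new_fsimage = apply_edits(old_fsimage, edit_log)
print(new_fsimage)
```

      Once the new image is written, the replayed edits are no longer needed for recovery, which is why a checkpoint shortens the restart time.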

  • @joker-cy6qo
    @joker-cy6qo 4 years ago

    1.75 x

  • @Deshammanideep
    @Deshammanideep 5 years ago +2

    Hadoop is very fault tolerant. The only point of failure can be Maharashtra State Electricity Board.

    • @justACatOnYoutube
      @justACatOnYoutube 4 years ago

      Lol! You can keep a backup on inverters. It's not costly.

  • @ravikoganti227
    @ravikoganti227 7 years ago

    Great explanation