Hadoop HDFS Commands and MapReduce with Example: Step-by-Step Guide | Hadoop Tutorial | IvyProSchool

  • Published: 4 Aug 2024
  • In this video, we will demonstrate the Hadoop ecosystem and take a deep dive into the core Hadoop commands, providing clear explanations and practical examples of how to interact with the Hadoop Distributed File System (HDFS) and manage your data effectively.
    Next, we will explore MapReduce, a powerful programming model and algorithm that lies at the heart of Hadoop's data processing capabilities. You will learn how to execute MapReduce tasks to process and analyse vast amounts of data in a distributed manner, enabling parallel processing and maximising performance.
    Throughout the tutorial, we will walk you through some commands and MapReduce tasks, breaking them down into easy-to-follow steps and explaining the underlying concepts along the way. By the end of this tutorial, you will have a solid understanding of Hadoop commands and MapReduce, empowering you to confidently tackle big data challenges and extract valuable insights from your datasets.
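    The MapReduce model described above can be illustrated without a cluster at all. The sketch below is a minimal, framework-free simulation of the WordCount job run in the video: the map phase emits (word, 1) pairs, the shuffle phase groups pairs by key, and the reduce phase sums the counts. The function names and sample input are illustrative, not part of the Hadoop API; a real Hadoop job runs the same three phases distributed across cluster nodes.

```python
# Minimal sketch of the MapReduce word-count model (illustrative only).
from collections import defaultdict

def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle_phase(pairs):
    """Group emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Sum the grouped counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["hello hadoop", "hello mapreduce"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = reduce_phase(shuffle_phase(pairs))
print(counts)  # {'hello': 2, 'hadoop': 1, 'mapreduce': 1}
```

    On a real cluster, the input lines come from HDFS blocks and mappers run in parallel on the nodes holding those blocks, which is why the file must be uploaded to HDFS before the jar is run.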
    00:00:00 - Introduction
    00:00:35 - Launch the Hadoop cluster
    00:01:26 - Check the Hadoop cluster from a web browser
    00:02:00 - HDFS commands
    00:03:04 - Upload a file into HDFS from local
    00:04:07 - Upload a folder into HDFS from local
    00:04:37 - Run a WordCount program in Hadoop
    00:05:49 - Solve the Java heap space error
    00:08:23 - Copy a file from HDFS to local
    00:09:45 - Conclusion
    #hadooptutorial #mapreduce #dataengineering #hadoopcommands
    ❓Interested in a Data Science or Data Engineering career? Check out the Ivy Pro School’s top ranked Nasscom, Govt. of India and IBM accredited Data Science and Data Engineering courses: ivyproschool.com/
    ❓Why should you learn Data Science today? • Future Of Analytics & ...
    ❓Are you a Fresher? Watch this video to know the Data Science Journey for Freshers: • Data Science Journey f...
    ✅Don’t forget to Subscribe to Ivy Pro School’s YouTube Channel: / @ivyproschool
    ✅Learn from Ivy Pro School’s Data Science and Data Engineering Students’ Success Stories : • Student Success Story
    Liked the video? Check out more playlists below on Data Science and Data Engineering learning tutorials, alumni interview experiences, live data science case studies, etc.:
    ⏩Students Share Interview Experiences- • Student Interview Expe...
    ⏩Data Cleaning using excel- • Data Cleaning using Excel
    ⏩SQL Interview questions solved in Hindi - • Sql Interview Question...
    ⏩PowerBI Tutorials - • Power BI Tutorial
    ⏩Industry case studies and projects by students - • Industry Case Studies ...
    ⏩Data Science for Beginners - • Data Science for Begin...
    ⏩Excel VBA tutorial - • Excel VBA Tutorial
    ⏩Excel Tutorials by Eeshani Agrawal - • Excel for Beginners
    ⏩Tableau Tutorials by Eeshani Agrawal - • Tableau Tutorial by Ee...
    For more updates on courses and tips don’t forget to follow us on:
    - Instagram: / ivyproschool
    - LinkedIn: / ivy-professional-school
    - Facebook: / ivyproschool
    - Twitter: / ivyproschool
    - Telegram: t.me/learndswithexperts
    Want to take a Data Science course with Ivy Professional School? Visit us at ivyproschool.com. You can call us at +91-7676882222 or email info@ivyproschool.com.

Comments • 51

  • @leokook
    @leokook 1 year ago

    Thank you !!

  • @ardian9030
    @ardian9030 11 months ago +1

    Thank you so much, teacher. Please dedicate more time to teaching Hadoop :) God bless you

  • @shiva_9596
    @shiva_9596 6 months ago

    Can you attach the files as well?

  • @lokamanisowmya5858
    @lokamanisowmya5858 2 months ago

    Sir, Browse Directory opens, but I'm not able to find the temp folder or create a new folder.

  • @vksaisushmitha4203
    @vksaisushmitha4203 2 months ago

    I executed the WordCount successfully, but I am unable to see the application in the cluster. The output was produced and returned successfully in the HDFS path I gave.
    What could be the missing element?

  • @user-yn9mw2df8i
    @user-yn9mw2df8i 11 months ago +1

    Hi, thank you for your tutorial! When I execute the WordCount program, it keeps getting stuck with the map job at 100% but the reduce job at 0%... I installed and configured Hadoop following your previous video. Could you please help me with this?

    • @IvyProSchool
      @IvyProSchool  11 months ago

      Run jps and check whether the datanode is running. If it isn't, format the namenode and delete all files from the tmp folder.

    • @AnkitYadav-up4
      @AnkitYadav-up4 4 months ago

      ++ Bro, how did you get it resolved???

  • @viswanathvankadara4383
    @viswanathvankadara4383 1 year ago +1

    Hey, thanks a ton for your videos.
    I tried to replicate what you did in this video, but I'm encountering this error when trying to execute wordcount
    [2023-07-03 12:20:31.201]Container exited with a non-zero exit code 1. Last 4096 bytes of stderr :
    '"C:\Program Files\Java\jdk-11.0.17"' is not recognized as an internal or external command,
    operable program or batch file.
    Not sure where exactly this is going wrong

    • @IvyProSchool
      @IvyProSchool  1 year ago +1

      Java isn't configured properly. Try editing the Java path as "C:\Program/ Files\Java\jdk-11.0.17"; otherwise, install Java in a different folder whose name doesn't contain any spaces.

    • @viswanathvankadara4383
      @viswanathvankadara4383 1 year ago +1

      @@IvyProSchool thanks for the response. Didn't expect that to be the issue. Should heed the warnings from now on.
      Also, as of now the map job is 100% done but the reduce job is still at 0%. I'm not sure what the exact issue is, but I suspect Windows Firewall is not letting the file be transferred from the nodes where the mappers ran to the node where the reducer is trying to run.

    • @user-yn9mw2df8i
      @user-yn9mw2df8i 11 months ago

      Hey! I am facing the same issue! Could you please tell me what you did? @@viswanathvankadara4383

  • @AnkitYadav-up4
    @AnkitYadav-up4 4 months ago +1

    Sir, after running the jar command I am not able to create the output file for WordCount. It shows map 0% reduce 0%, but then it gives an error with the exception message "'/tmp/hadoop-Ankit' is not recognized as an internal or external command". I'm not able to find a solution; I've searched everywhere.

    • @IvyProSchool
      @IvyProSchool  3 months ago

      Watch the video from 5:50 to solve the memory issue, and remove that temporary folder.

  • @yaswanthmarni839
    @yaswanthmarni839 1 year ago

    When I run start-all.cmd, the namenode and datanode are not launched.
    Please guide me on how to resolve this.

    • @IvyProSchool
      @IvyProSchool  1 year ago

      Please format the namenode first, then use start-dfs.cmd to start the namenode and datanode, then start-yarn.cmd to start the resource manager.

  • @yallayaswanth5962
    @yallayaswanth5962 9 months ago

    Hey, when I try to run the hdfs dfs -ls / command it's not showing any tmp folder, but my tmp folder is being created on the C drive. Even when I check in Utilities it's not showing any tmp folder.

    • @IvyProSchool
      @IvyProSchool  8 months ago

      Your temporary files aren't created inside HDFS; they're created on your local disk drive, so no HDFS command can show them.

    • @thang4280
      @thang4280 3 months ago

      @@IvyProSchool I also ran into this situation! So is it okay? Is this temporary folder important? How can I fix it?

    • @thang4280
      @thang4280 3 months ago

      Have you fixed it yet?

  • @shahul68
    @shahul68 5 months ago +1

    Where are the practice files?
    I couldn't find them.
    Thanks, good job.

    • @chanchanman4426
      @chanchanman4426 5 months ago

      They should've provided a link for the files.

    • @IvyProSchool
      @IvyProSchool  4 months ago

      These datasets are huge; we can't share them here. If you want some data, search on kaggle.com.

  • @ridazouga4144
    @ridazouga4144 7 months ago

    Hello sir, when running the MapReduce job it shows an error with the exception "'tmp/hadoop-Rida' is not recognized as an internal or external command". I searched everywhere (Stack Overflow, YouTube, ...) for a solution, but no hope.

    • @IvyProSchool
      @IvyProSchool  7 months ago +1

      We need to see more details: go to localhost:8088, open the failed job, and check the error. Most probably it's a resource issue (CPU, memory). Try running a simple MapReduce job: "hadoop jar C:\hadoop\share\hadoop\mapreduce\hadoop-mapreduce-examples*.jar pi 2 10". If it still gives an error, delete the tmp folder from the C drive and format the namenode again.

    • @umeranwar8250
      @umeranwar8250 4 months ago

      Hey, was your error solved? If it was, please tell me how you did it.

    • @ridazouga4144
      @ridazouga4144 4 months ago

      @@umeranwar8250 Sorry, nope.

  • @lakshagajyothi258
    @lakshagajyothi258 1 year ago

    Only the localhost namenode page is not opening in the browser. Please help; what should I do?

    • @IvyProSchool
      @IvyProSchool  1 year ago

      Try using localhost:50070 or localhost:9870 to access it.

  • @rathnakumari5567
    @rathnakumari5567 1 month ago

    Hi,
    First of all, thanks for your support.
    I followed all the steps as mentioned, and yes, it is working.
    But when I execute the jps command it does not return anything;
    it just returns to the command prompt again with sbin.
    Can you please explain why?
    Both my localhosts are running.

    • @IvyProSchool
      @IvyProSchool  3 days ago

      Did the problem get solved, or is it still ongoing? Let us know, and we'll be happy to help.

  • @tyrakeech5018
    @tyrakeech5018 3 months ago

    The start-yarn.cmd command is not working. What should I do?

    • @IvyProSchool
      @IvyProSchool  3 months ago

      Check the yarn-site.xml properties from the video.

  • @user-sy9lm8uq3w
    @user-sy9lm8uq3w 3 months ago

    When I execute the hadoop jar command it shows Exception in thread "main" java.lang.UnsupportedOperationException: 'posix:permissions' not supported as initial attribute. Is there a solution?

    • @IvyProSchool
      @IvyProSchool  3 months ago +1

      Your Java path has a space; put '\' before the space.

    • @DevshriSanika
      @DevshriSanika 3 months ago

      @IvyProSchool Where exactly should I add '\' before the space? In the JAVA_HOME environment variable or in hadoop-env? Can you clarify, please?

  • @nabilazafar5657
    @nabilazafar5657 1 year ago

    MapReduce jobs get stuck in the Accepted state.
    Please guide me on how to resolve this.

    • @IvyProSchool
      @IvyProSchool  1 year ago

      Maybe your PC's resources are low; use a smaller input data size.

  • @ParasKumar-fs7vq
    @ParasKumar-fs7vq 11 months ago

    Bro, when I run the jps command it shows nothing, and port 8088 is also not running.

    • @ParasKumar-fs7vq
      @ParasKumar-fs7vq 11 months ago

      Only localhost 9870 is running.

    • @IvyProSchool
      @IvyProSchool  11 months ago

      Add the Java JDK bin path to the system Path variable.

    • @shreyasnaidu5203
      @shreyasnaidu5203 7 months ago

      Same problem, bro.
      I've added the path, but it is still not working.

  • @reemhasan3506
    @reemhasan3506 8 months ago

    jar returns me nothing, is that normal?

    • @IvyProSchool
      @IvyProSchool  8 months ago +1

      You mean jps. Yes, if the namenode and datanode are working then you're good to go. If you want jps to return something, go to Control Panel >> System >> Advanced system settings >> Environment Variables, click 'Path' under System variables, then click Edit. Now add the path (e.g. C:\Java\jdk1.8.0_72\bin). Then open a cmd window and run jps; it will work.

  • @SASIREKHAS-du2ns
    @SASIREKHAS-du2ns 2 months ago

    When I execute the hadoop jar command it shows Exception in thread "main" java.lang.UnsupportedOperationException: 'posix:permissions' not supported as initial attribute. Is there a solution?

    • @vksaisushmitha4203
      @vksaisushmitha4203 2 months ago

      Hi, I am encountering the same error. Did you find any solution?

    • @SASIREKHAS-du2ns
      @SASIREKHAS-du2ns 2 months ago

      @@vksaisushmitha4203 I changed to Hadoop version 3.2.1 and executed it.