Tutorial 7- Pandas-Reading JSON,Reading HTML, Read PICKLE, Read EXCEL Files- Part 3

  • Published: 29 Sep 2019
  • Hello All,
    Welcome to the Python Crash Course. In this video we will look at the Pandas library and how to read JSON, HTML, pickle and Excel files.
    github url : github.com/krishnaik06/Machin...
    Support me in Patreon: / 2340909
    Connect with me here:
    Twitter: / krishnaik06
    Facebook: / krishnaik06
    instagram: / krishnaik06
    If you like music support my brother's channel
    / @ultralifeproject
    Buy the Best book of Machine Learning, Deep Learning with python sklearn and tensorflow from below
    amazon url:
    www.amazon.in/Hands-Machine-L...
    You can buy my book on Finance with Machine Learning and Deep Learning from the below url
    amazon url: www.amazon.in/Hands-Python-Fi...
    Subscribe my unboxing Channel
    / @krishnaikhindi
    Below are the various playlist created on ML,Data Science and Deep Learning. Please subscribe and support the channel. Happy Learning!
    Deep Learning Playlist: • Tutorial 1- Introducti...
    Data Science Projects playlist: • Generative Adversarial...
    NLP playlist: • Natural Language Proce...
    Statistics Playlist: • Population vs Sample i...
    Feature Engineering playlist: • Feature Engineering in...
    Computer Vision playlist: • OpenCV Installation | ...
    Data Science Interview Question playlist: • Complete Life Cycle of...
    🙏🙏🙏🙏🙏🙏🙏🙏
    YOU JUST NEED TO DO
    3 THINGS to support my channel
    LIKE
    SHARE
    &
    SUBSCRIBE
    TO MY RUclips CHANNEL

Comments • 111

  • @aryamahima3
    @aryamahima3 4 years ago +76

    I took a Udemy course (1000 INR) on Python for data science. Your videos are far better and more intense than that course.
    Thanks a lot.

    • @nandhakishore8950
      @nandhakishore8950 4 years ago +1

      Exactly

    • @vijayjb1704
      @vijayjb1704 4 years ago

      Who is the instructor of your course? Because I also took one.

    • @sunitapatil381
      @sunitapatil381 3 years ago +2

      Thank you for telling us this. I was thinking of joining, but now I feel this is better. Thank you so much.

    • @Virus-ke8xj
      @Virus-ke8xj 3 years ago

      True

    • @shrirangsapate
      @shrirangsapate 3 years ago

      Very True.

  • @suchitanaik6728
    @suchitanaik6728 3 years ago +10

    As a fresh learner of Python, I find your videos a blessing. Once I finish all the videos, it will be easy to get through a proper certification course.

  • @rishabhtewari4357
    @rishabhtewari4357 3 years ago +2

    The way you explain it makes everything look easy and interesting. Thanks for providing all the material. I am following the same path you suggested. THANKS

  • @QaAutomationAlchemist
    @QaAutomationAlchemist 4 years ago +2

    Great content and especially reading html pages...thanks a lot!

  • @md.omankhan8648
    @md.omankhan8648 3 years ago

    I am soo glad that I did not skip this video, learned a lot

  • @veltechunivalumnidept2171
    @veltechunivalumnidept2171 9 months ago

    Learning day by day from your videos. Thank you so much. Learning from basics

  • @robyshah6879
    @robyshah6879 3 years ago +18

    In the video it is said that the wine.data file is in JSON format. It is not; it is in CSV format. Guys, please take note of this error.

    • @PriteshsRhymes
      @PriteshsRhymes 2 years ago +2

      Even I was thinking the same thing, Json was never used :(

    • @fariqjamil5484
      @fariqjamil5484 1 year ago +1

      That's why he used pd.read_csv("path"). But what is header?
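
On the `header` question above: `header=None` tells `read_csv` that the file has no header row, so pandas numbers the columns 0, 1, 2, ... instead of using the first data row as column names. A minimal sketch, using a few made-up comma-separated rows as a stand-in for `wine.data` (the values are illustrative, not the real dataset):

```python
import io

import pandas as pd

# Made-up rows in a comma-separated layout; not the real wine data.
raw = "1,14.2,1.7\n2,12.4,0.9\n3,13.1,2.5\n"

# header=None: the file has no header row, so columns are numbered 0, 1, 2.
df = pd.read_csv(io.StringIO(raw), header=None)

# Default (header=0): the first row is consumed as column names.
df_default = pd.read_csv(io.StringIO(raw))

print(df.shape)          # all three rows are kept as data
print(df_default.shape)  # one row fewer, because the first became the header
```

A JSON file would instead be read with `pd.read_json`; `read_csv` works here only because the file is comma-separated text.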

  • @arrooow9019
    @arrooow9019 3 years ago

    What an amazing video🤩🤩🤩

  • @gauravmarathe3730
    @gauravmarathe3730 4 years ago +20

    Everything is pretty much simple, pretty much easy 😆🤘🤣😁

  • @ashamaheshk7306
    @ashamaheshk7306 4 years ago +1

    I like ur teaching 🤘

  • @sudeeprajput1830
    @sudeeprajput1830 3 years ago

    Thanks brother for this informative video

  • @robinfelix3879
    @robinfelix3879 3 years ago

    vera level content bro

  • @mohdzain1741
    @mohdzain1741 4 years ago +5

    It would be really helpful if you could provide links to the datasets you are using 😃

  • @louerleseigneur4532
    @louerleseigneur4532 3 years ago

    Thanks Krish

  • @ranaasad6132
    @ranaasad6132 3 years ago

    Thank you,Sir
    Love from Pakistan

  • @21Gannu
    @21Gannu 4 years ago +2

    When is your full tutorial going live?

  • @nareshjanjirala472
    @nareshjanjirala472 4 years ago +1

    Hi Krish, please make a video on how to import data directly from a database into Python.
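
One way to load database rows straight into pandas is `pd.read_sql_query`. The sketch below assumes an in-memory SQLite database and a hypothetical `employees` table created just for the demonstration; a real project would connect to its own database instead.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database with a hypothetical "employees" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("James", 50000), ("Asha", 62000)],
)
conn.commit()

# read_sql_query runs the SQL and returns the result set as a DataFrame.
df = pd.read_sql_query("SELECT name, salary FROM employees ORDER BY name", conn)
conn.close()
print(df)
```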

  • @user-wt8sp2xk3n
    @user-wt8sp2xk3n 5 months ago

    How does your output look so nicely arranged, with shading? My output looks like a list of numbers when reading from HTML.

  • @saumyagupta2606
    @saumyagupta2606 2 years ago +1

    Hello Krish, after scraping the table from the web, how do I save the list to CSV?
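
Since `read_html` returns a list of DataFrames, one approach is to write each table to its own file with `to_csv`. In this sketch the list is built by hand as a stand-in for the scraped result, and the output folder and file names are hypothetical:

```python
import tempfile
from pathlib import Path

import pandas as pd

# Stand-in for the list returned by pd.read_html(...); values are illustrative.
dfs = [
    pd.DataFrame({"Country": ["India", "Japan"], "Code": [404, 440]}),
    pd.DataFrame({"Country": ["France"], "Code": [208]}),
]

out_dir = Path(tempfile.mkdtemp())  # hypothetical output folder

# Write each table to its own CSV; index=False skips the 0..n-1 row labels.
paths = []
for i, table in enumerate(dfs):
    path = out_dir / f"table_{i}.csv"
    table.to_csv(path, index=False)
    paths.append(path)

print(paths[0].read_text())
```

To save only one table, call `to_csv` on that single element, for example `dfs[0].to_csv(...)`.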

  • @srikanthchandana4485
    @srikanthchandana4485 3 years ago

    Hi Krish,
    If there are multiple tables with the same column headers (e.g. in the mobile country code data there are other tables with the same column headers), how do we extract one specific table? Kindly let us know. Thanks in advance.
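
When several tables match, `read_html` returns all of them, so the specific table has to be picked by position or narrowed down with `attrs`. A minimal sketch on a made-up HTML snippet (the `id` values are assumptions for illustration; a real page may not have usable attributes):

```python
import io

import pandas as pd

# Two tiny made-up tables that both contain the text "Country".
html = """
<table id="asia"><tr><th>Country</th><th>MCC</th></tr>
<tr><td>India</td><td>404</td></tr></table>
<table id="europe"><tr><th>Country</th><th>MCC</th></tr>
<tr><td>France</td><td>208</td></tr></table>
"""

try:
    # match= keeps every table whose text matches, so both come back.
    dfs = pd.read_html(io.StringIO(html), match="Country", header=0)
    # attrs= narrows the search to tables with these HTML attributes.
    europe = pd.read_html(io.StringIO(html), attrs={"id": "europe"}, header=0)[0]
except ImportError:
    # read_html needs an HTML parser such as lxml to be installed.
    dfs, europe = None, None
    print("No HTML parser available; install lxml and retry.")
else:
    print(len(dfs))
    print(europe)
```

If the tables carry no distinguishing attributes, inspecting `len(dfs)` and each `dfs[i].head()` is a simple way to find the right index.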

  • @dilippradhan94
    @dilippradhan94 4 years ago +1

    Bro please upload deep learning videos...

  • @tahabimuhammad4524
    @tahabimuhammad4524 4 years ago +8

    If I download wine.data from the given link, I see that it is already in CSV format rather than JSON, unlike what is said in the video. After applying df.to_csv, the newly generated wine.csv also contained a row index from 0 to 177 and a column index from 0 to 13.

    • @vishwajitbhagat9515
      @vishwajitbhagat9515 3 years ago

      But does it add column names too?

    • @AnjaliGupta-cm1zo
      @AnjaliGupta-cm1zo 3 years ago +2

      Also, in the video we read this file using read_csv, but he talks about a JSON file. I'm not able to understand.
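
The extra row and column numbers described above come from the defaults of `to_csv`, which writes both the index and the column labels. A small sketch with made-up rows (not the real wine data) showing the default output next to `index=False, header=False`:

```python
import io

import pandas as pd

raw = "1,14.2,1.7\n2,12.4,0.9\n"  # made-up rows, no header line
df = pd.read_csv(io.StringIO(raw), header=None)

# Defaults write the row index and the column labels (here 0, 1, 2).
with_labels = df.to_csv()

# index=False and header=False write only the data values.
data_only = df.to_csv(index=False, header=False)

print(with_labels)
print(data_only)
```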

  • @hometvfirestick
    @hometvfirestick 2 years ago

    thanks

  • @ankitgupta8797
    @ankitgupta8797 3 years ago

    If the wine data is in JSON format, how are you reading it as CSV using read_csv?

  • @manishpahuja8127
    @manishpahuja8127 3 years ago +1

    Hey Krish! Great video!
    You have mentioned that using pickle, we can avoid running the entire code every time while doing pre processing and model training which takes a lot of time especially for large datasets and multiple attempts at model building. Is there any video where this concept is explained in more detail? Thanks a lot!
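
The idea of skipping repeated preprocessing can be sketched as a simple cache: compute once, save the result with `to_pickle`, and load it with `read_pickle` on later runs. The function names and the cache file name below are hypothetical, and the "slow" step is only a stand-in:

```python
import tempfile
from pathlib import Path

import pandas as pd

cache_path = Path(tempfile.mkdtemp()) / "prepared.pkl"  # hypothetical cache file


def slow_preprocessing() -> pd.DataFrame:
    """Stand-in for an expensive cleaning or feature-engineering step."""
    df = pd.DataFrame({"x": [1, 2, 3]})
    df["x_squared"] = df["x"] ** 2
    return df


def load_prepared() -> pd.DataFrame:
    # Reuse the cached result when it exists; otherwise compute and save it.
    if cache_path.exists():
        return pd.read_pickle(cache_path)
    df = slow_preprocessing()
    df.to_pickle(cache_path)
    return df


first = load_prepared()   # computes and writes the pickle
second = load_prepared()  # loads from the pickle, skipping the slow step
```

Pickle files can execute code when loaded, so only unpickle files that come from a trusted source.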

  • @omkarkabade79
    @omkarkabade79 4 years ago +3

    2:36 [ is it possible to read json format file using read_csv?]

  • @shrirangsapate
    @shrirangsapate 3 years ago

    Sir, can you elaborate more about pickle?

  • @ashutoshsharma6883
    @ashutoshsharma6883 3 years ago

    Sir in the video you said that the wine data is in json but you are reading it with read_csv.

  • @roushanraj2654
    @roushanraj2654 4 years ago +4

    11:16 in case u get an import error, perform this: "pip install lxml"

    • @cartiktechnomechnobro9061
      @cartiktechnomechnobro9061 4 years ago

      It is still showing that module lxml is not found.
      I ran that command in the Windows command prompt, and when it wasn't working I ran it in the Anaconda Prompt as well, but there it shows the requirement is already satisfied.

    • @pqs403
      @pqs403 3 years ago +1

      @@cartiktechnomechnobro9061 try "conda install lxml" on anaconda prompt
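
A "requirement already satisfied" message together with "module not found" often means the package was installed for a different Python interpreter than the one running the notebook. This is an assumption about the cause, not a confirmed diagnosis; the sketch below only checks which interpreter is active and whether it can see `lxml`:

```python
import importlib.util
import sys

# True if the interpreter running this code can import lxml.
has_lxml = importlib.util.find_spec("lxml") is not None

print("Interpreter:", sys.executable)
print("lxml importable:", has_lxml)

if not has_lxml:
    # Installing through this exact interpreter avoids the mismatch.
    print(f'Try: "{sys.executable}" -m pip install lxml')
```

Inside a Jupyter notebook, `%pip install lxml` installs into the environment of the running kernel; restart the kernel afterwards.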

  • @praveenshenoy8064
    @praveenshenoy8064 3 years ago

    I am getting an "invalid syntax" error while working on the first JSON example statement. What may be the reason?

  • @hardikchoudhary3596
    @hardikchoudhary3596 4 years ago +1

    Pickle and to_csv are basically the same except for the extension, right? Is there any benefit to using pickle? Thanks

    • @krishnaik06
      @krishnaik06 4 years ago +4

      Pickle can be used for any Python object, e.g. a model or a data structure. It usually requires less space as well.
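
One concrete difference between the two formats is that CSV is plain text while pickle stores the Python object itself. The sketch below illustrates this with a datetime column and an arbitrary dict; it does not measure file size, so it says nothing about the space claim above:

```python
import io
import pickle

import pandas as pd

df = pd.DataFrame({"when": pd.to_datetime(["2019-09-29"]), "n": [1]})

# CSV is plain text: the datetime column comes back as strings unless re-parsed.
from_csv = pd.read_csv(io.StringIO(df.to_csv(index=False)))

# Pickle stores the Python object itself, so dtypes survive the round trip.
from_pickle = pickle.loads(pickle.dumps(df))

# Pickle also handles non-tabular objects, e.g. a dict of settings.
settings = pickle.loads(pickle.dumps({"alpha": 0.1, "layers": [8, 4]}))

print(from_csv.dtypes["when"], from_pickle.dtypes["when"])
```

The trade-off is that CSV can be opened by any tool, while a pickle file is Python-specific and should only be loaded from trusted sources.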

  • @sachinbairi6353
    @sachinbairi6353 3 years ago +1

    Sir, a doubt! You said [ 2:25 ] that wine.data is in JSON format, so why are you reading JSON data using read_csv?

  • @datascienceexpert6524
    @datascienceexpert6524 4 years ago

    Please show how to read multiple CSV or Excel files.
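
A common pattern for reading several files is to list them with `glob`, read each one, and stack the results with `pd.concat`. This sketch creates two small CSV files in a temporary folder first; the folder and file names are hypothetical:

```python
import tempfile
from pathlib import Path

import pandas as pd

folder = Path(tempfile.mkdtemp())  # hypothetical data folder

# Create two small CSV files to stand in for real data.
pd.DataFrame({"id": [1, 2]}).to_csv(folder / "part_a.csv", index=False)
pd.DataFrame({"id": [3]}).to_csv(folder / "part_b.csv", index=False)

# Read every matching file, then stack the frames into one.
files = sorted(folder.glob("*.csv"))
combined = pd.concat((pd.read_csv(f) for f in files), ignore_index=True)

print(combined)
```

The same loop should work for Excel files by globbing `*.xlsx` and calling `pd.read_excel`, which needs an Excel engine such as openpyxl installed.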

  • @rahul4upandey
    @rahul4upandey 4 years ago

    Can you please create a video on MongoDB?

  • @suraushareddy4454
    @suraushareddy4454 3 years ago

    Hi sir, I have a small doubt: is the Python covered in these videos enough?

  • @roshankumargupta46
    @roshankumargupta46 4 years ago +1

    (url_mcc, match='Country', header=0)
    What if two tables have the same column name? Which one will it select?

  • @jagajayaraman5200
    @jagajayaraman5200 4 years ago

    Hi sir
    How to convert html table to csv sir ?

  • @satyacenation5874
    @satyacenation5874 4 years ago

    Hi Krish,
    Im getting the below error...while trying to read the table from url....
    405 # this version of raise is a syntax error in Python 3
    URLError:

  • @karansaini7855
    @karansaini7855 2 years ago

    when i write any website address for data,i get .How to fix this

  • @sharathkumar1387
    @sharathkumar1387 3 years ago

    11:32 Sir please explain why we are using [0] index in dfs[0]??

    • @manishpahuja8127
      @manishpahuja8127 3 years ago +3

      Hey it seems that the read_html looks up all the tables available on the html page and gives out a list containing the different datasets(tables) on the said page. Using dfs[0] returns the first dataset in the list, which is what appears in Krish's code! Please let me know if this helps!
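
The explanation above can be checked on a small made-up page: `read_html` returns a plain Python list with one DataFrame per table, and `dfs[0]` picks the first one. The HTML and its contents are invented for illustration.

```python
import io

import pandas as pd

# A made-up page with two tables.
html = """
<table><tr><th>Bank</th></tr><tr><td>First Example Bank</td></tr></table>
<table><tr><th>City</th></tr><tr><td>Pune</td></tr></table>
"""

try:
    dfs = pd.read_html(io.StringIO(html))  # needs a parser such as lxml
except ImportError:
    dfs = None
    print("No HTML parser available; install lxml and retry.")
else:
    print(type(dfs), len(dfs))  # a list with one entry per table
    first = dfs[0]              # indexing picks a single DataFrame
    print(first)
```

Displaying `dfs` itself shows the list representation, which is why the output looks different from the formatted table shown for `dfs[0]` in a notebook.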

  • @indirajithkv7793
    @indirajithkv7793 2 years ago

  • @sunnysolanki2460
    @sunnysolanki2460 1 year ago

    at 2:30 you are using pd.read_csv to read a json file

  • @chiragkapoor32
    @chiragkapoor32 2 years ago +4

    at 11:11 i am getting error "No tables found"

    • @hrshtmlng
      @hrshtmlng 4 months ago +1

      before that i was getting error related to import of html5lib

  • @aryanrana5658
    @aryanrana5658 2 years ago +1

    When I read an Excel file using pd.read_excel(file name.xlsx), I still get the error "there is no such file or directory", even though the file is on my laptop.

    • @kojorichardson4283
      @kojorichardson4283 2 years ago

      Check to make sure that your Excel file is in your current working directory.
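
A quick way to debug a "no such file or directory" error is to print the current working directory and list what is actually in the folder. The helper `find_file` and the file name `sales.xlsx` below are hypothetical, and an empty placeholder file stands in for a real workbook:

```python
import tempfile
from pathlib import Path

work_dir = Path(tempfile.mkdtemp())         # stand-in for a project folder
(work_dir / "sales.xlsx").write_bytes(b"")  # empty placeholder file


def find_file(name: str, folder: Path) -> Path:
    """Return the full path to name inside folder, or explain what is there."""
    path = folder / name
    if not path.exists():
        present = sorted(p.name for p in folder.iterdir())
        raise FileNotFoundError(f"{name!r} not in {folder}; found: {present}")
    return path


print("Current working directory:", Path.cwd())
print(find_file("sales.xlsx", work_dir))
```

The file name passed to `pd.read_excel` must also be a quoted string, and passing the full path avoids depending on the working directory at all.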

  • @ashishbrahmankar2143
    @ashishbrahmankar2143 3 years ago

    I am getting below error while converting Data to json, could anyone help please ?
    df1=pd.read_json(Data)
    ValueError: Invalid file path or buffer object type:

  • @curious_bird
    @curious_bird 1 year ago

    Hi, You told that the next tutorial will be of mongodb but there is no mongodb tutorial.

  • @betsythomas5971
    @betsythomas5971 4 years ago

    Can anyone tell me why do we take dfs[0] in line 166

  • @imteyaz5160
    @imteyaz5160 3 years ago

    Hi sir when I practice my jupyter notebook showing
    Name error: name ' pd' is not defined
    Name error: name 'df' is not defined
    What I should do sir

  • @nidhijakhad128
    @nidhijakhad128 3 years ago

    Hello Sir , when I am using pd.read_json() it is giving a value error .
    Saying that : if using all scalar values , you must pass an index .
    Please help me out with this !
    Thanks

    • @pakhigupta2869
      @pakhigupta2869 3 years ago +2

      This is because the read_json() function has a parameter 'typ' which is 'frame' (DataFrame) by default, while a single JSON object holding only scalar values describes a Series rather than a table.
      So we either convert our data value to a DataFrame, or change the typ parameter:
      1. Convert data from Series to DataFrame, i.e. pass the JSON object inside [], so that each dict inside this list is treated as one row of the DataFrame:
      data='[{"a" : "name", "b" : "num"}]'
      pd.read_json(data)
      2. Change the 'typ' parameter:
      data='{"a" : "name", "b" : "num"}'
      pd.read_json(data, typ='series')

    • @nidhijakhad128
      @nidhijakhad128 3 years ago

      @@pakhigupta2869 Thank you

  • @ashwini4683
    @ashwini4683 3 years ago

    df1 = pd.read_json(Data) is showing error
    ValueError: Expected object or value
    please help

    • @suchitanaik6728
      @suchitanaik6728 3 years ago

      Check the variable name you passed; you may have used a different name. I ran the same code with a different assignment:
      df=pd.read_json(Jdata)

  • @adityanaik1800
    @adityanaik1800 1 year ago

    This type of error coming after i type..
    df = pd.read_html(url)

  • @sapnilpatel1645
    @sapnilpatel1645 2 years ago +1

    when i run this line -> df=pd.read_csv('archive.ics.uci.edu/ml/machine-learning-database/wine/wine.data',header=None) i am getting the error that 404:not found. so anyone have new link for the same data?

    • @soofishafiya2632
      @soofishafiya2632 2 years ago +1

      I am also getting this error...did you find any solution to it ??

    • @sapnilpatel1645
      @sapnilpatel1645 2 years ago

      Not yet.

    • @jn9281
      @jn9281 1 year ago

      In the link you typed "database"; it should be "databases".

  • @srinukondaveeti9558
    @srinukondaveeti9558 3 years ago

    Sir, while dealing with the JSON file, I got the following:
    data={"a" : "name", "b" : "num"}
    pd.read_json(data)
    ERROR: "Invalid file path or buffer object type" i don't understand this

    • @pakhigupta2869
      @pakhigupta2869 3 years ago +6

      First of all, read_json() requires a string value (or a file), not a dict. So data should be:
      data='{"a" : "name", "b" : "num"}'
      Then with this value, you will get a 'ValueError':
      ValueError: If using all scalar values, you must pass an index
      This is because the read_json() function has a parameter 'typ' which is 'frame' (DataFrame) by default, while a single JSON object holding only scalar values describes a Series rather than a table.
      So we either convert our data value to a DataFrame, or change the typ parameter:
      1. Convert data from Series to DataFrame:
      data='[{"a" : "name", "b" : "num"}]'
      pd.read_json(data)
      2. Change the 'typ' parameter:
      data='{"a" : "name", "b" : "num"}'
      pd.read_json(data, typ='series')
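
Both options from the reply above can be tried in a few lines. This sketch wraps the JSON text in `io.StringIO`, because passing a literal JSON string directly to `read_json` is deprecated in recent pandas releases; the data is the same two-key object used in the comment.

```python
import io

import pandas as pd

record = '{"a": "name", "b": "num"}'  # JSON text, not a Python dict

# Option 1: wrap the object in a list so it becomes one row of a DataFrame.
frame = pd.read_json(io.StringIO("[" + record + "]"))

# Option 2: keep the single object and ask for a Series instead.
series = pd.read_json(io.StringIO(record), typ="series")

print(frame)
print(series)
```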

    • @debayanmazumdar3056
      @debayanmazumdar3056 3 years ago

      @@pakhigupta2869 I was getting the same error, thank you so much for your reply it was very helpful!!

    • @ruchitpatel107
      @ruchitpatel107 3 years ago

      @@pakhigupta2869 thanks mannn

  • @EcExplorer
    @EcExplorer 2 years ago

    read_html does not work. Do I need something else to install as well?

    • @EcExplorer
      @EcExplorer 2 years ago

      ---------------------------------------------------------------------------
      ImportError Traceback (most recent call last)
      Input In [50], in ()
      1 url_mcc = 'en.wikipedia.org/wiki/Mobile_country_code'
      ----> 2 dfs = pd.read_html(url_mcc, match='Country', header=0)
      File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\util\_decorators.py:311, in deprecate_nonkeyword_arguments..decorate..wrapper(*args, **kwargs)
      305 if len(args) > num_allow_args:
      306 warnings.warn(
      307 msg.format(arguments=arguments),
      308 FutureWarning,
      309 stacklevel=stacklevel,
      310 )
      --> 311 return func(*args, **kwargs)

  • @kritikaverma3762
    @kritikaverma3762 4 years ago +1

    anyone else getting this error when try to read json file?
    File "", line 1
    jsonData = pd.read_json('C:\Users\kritika\ML\example_1.json')
    ^
    SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape

    • @sripadhavallabhagoud9590
      @sripadhavallabhagoud9590 4 years ago +1

      This error occurs because you are using a normal string as a path, so Python reads the \U in C:\Users as the start of a Unicode escape.
      You can add r before the string to make it a raw string: jsonData = pd.read_json(r"C:\Users\kritika\ML\example_1.json")
      or
      jsonData = pd.read_json("C:\\Users\\kritika\\ML\\example_1.json")
      or
      jsonData = pd.read_json("C:/Users/kritika/ML/example_1.json")
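
The three spellings above describe the same file. A short check, using the path from the comment and `PureWindowsPath` so that it runs on any operating system:

```python
from pathlib import PureWindowsPath

# The path from the comment, written three safe ways.
raw = r"C:\Users\kritika\ML\example_1.json"         # raw string: backslashes stay literal
escaped = "C:\\Users\\kritika\\ML\\example_1.json"  # each backslash doubled
forward = "C:/Users/kritika/ML/example_1.json"      # forward slashes

print(raw == escaped)
print(PureWindowsPath(raw) == PureWindowsPath(forward))
```

In an ordinary string literal, `\U` starts an eight-digit Unicode escape, which is why the unescaped `C:\Users` produces the "truncated \UXXXXXXXX escape" error.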

  • @samikshabharne1251
    @samikshabharne1251 3 years ago

    want to convert json file into dataframe but got this error: (Invalid file path or buffer object type: )

    • @samikshabharne1251
      @samikshabharne1251 3 years ago

      Got the answer; it was just a syntax error.

    • @samikshabharne1251
      @samikshabharne1251 3 years ago

      j_file='{"emp_name":"samiksha","email":"bharnesm@gmail.com","emp_address":[{"title":"mr.","name":"suhas"}]}' , i just forgot to write all the code in a single quotation mark

  • @lngwnd1
    @lngwnd1 3 years ago +3

    The url from the video didn't work for me... This is the correction @11:11 'www.fdic.gov/resources/resolutions/bank-failures/failed-bank-list/'

    • @parveenparveen9384
      @parveenparveen9384 3 years ago

      Thank you. But it throws error saying No tables found. How to solve this?

    • @parveenparveen9384
      @parveenparveen9384 3 years ago

      @@payaldhekwar2717, I resolved this error by using the requests module:
      import requests
      import pandas as pd
      url1= 'www.fdic.gov/resources/resolutions/bank-failures/failed-bank-list/'
      crypto_url = requests.get(url1)
      crypto_url
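
The snippet above stops after downloading the page. One possible continuation is to pass the downloaded HTML text to `read_html`. The sketch below is an assumption about what was intended, not the commenter's actual code: it uses the standard-library `urllib` instead of `requests`, the function name `fetch_tables` and the `User-Agent` value are hypothetical, and the function is only defined here, not called, because it needs network access.

```python
import io
import urllib.request

import pandas as pd


def fetch_tables(url: str, user_agent: str = "Mozilla/5.0") -> list:
    """Download a page with a browser-like User-Agent and parse its tables."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("URL must start with http:// or https://")
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=30) as response:
        html = response.read().decode("utf-8", errors="replace")
    # Parse the downloaded text; raises ValueError if no table is found.
    return pd.read_html(io.StringIO(html))
```

This is not guaranteed to work for every site: pages that build their tables with JavaScript contain no table markup in the downloaded HTML, so "No tables found" can still appear.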

    • @dreamday4810
      @dreamday4810 2 years ago

      @@parveenparveen9384

    • @dreamday4810
      @dreamday4810 2 years ago

      This is not working even after applying your method. I think the site has put some restriction on scraping.

  • @krishanpalsingh973
    @krishanpalsingh973 4 years ago +2

    I'm not getting playlist

  • @w.g.ogaming2210
    @w.g.ogaming2210 1 year ago

    HTML is not installing... Help someone 😔

  • @vineetkrpandey7641
    @vineetkrpandey7641 3 months ago

    tut-7//13/04/2024

  • @khushboochhabra2136
    @khushboochhabra2136 6 months ago

    You skipped the part that you couldn't answer! 1. In the second HTML page, there were other tables as well with "Country" as a column name, but you moved away from the explanation by quickly switching tabs. 2. You didn't mention installing lxml for reading HTML. I believe such small things are important to tell a newbie.

  • @palashmoon3808
    @palashmoon3808 4 years ago

    Why did we use dfs[0] instead of dfs? The result for both will be the same; only the format of dfs is different from that of dfs[0].

    • @alexanderryzhkov7421
      @alexanderryzhkov7421 4 years ago

      If you have several tables on the webpage, the result will be different. By using dfs[0] you choose the first table.

    • @manishpahuja8127
      @manishpahuja8127 3 years ago

      Hey it seems that the read_html looks up all the tables available on the html page and gives out a list containing the different datasets(tables) on the said page. Using dfs[0] returns the first dataset in the list, which is what appears in Krish's code! Please let me know if this helps!

  • @AvinashKumarMAD
    @AvinashKumarMAD 4 years ago

    Hello krish i am in very starting phase of learning python in which your channel is helping a lot from which i am learning continuously.
    I am just trying to execute the below code
    import pandas as pd
    Data = '{"employee_name":"James","email":"james@gmail.com"}'
    pd.read_json(Data)
    but giving error "If using all scalar values, you must pass an index"
    but with Data = '{"employee_name": "James", "email": "james@gmail.com", "job_profile": [{"title1":"Team Lead", "title2":"Sr. Developer"}]}' this is working fine.
    I am unable to identify the difference, can you please help

    • @gulshanarya1714
      @gulshanarya1714 4 years ago

      Since all of your values are strings, i.e. scalars, you must pass a list or dict as at least one value, e.g. '{"employee_name":"James","email":["james@gmail.com"]}'

    • @atifiqbalm
      @atifiqbalm 2 years ago

      the difference is [ and ]
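
To make the difference concrete: the list under `job_profile` gives pandas a length (one row), while the flat object has only scalars and therefore no length to build rows from. A minimal sketch using the two strings from the question; wrapping the parsed dict in a list is one possible workaround, not the only one:

```python
import io
import json

import pandas as pd

flat = '{"employee_name": "James", "email": "james@gmail.com"}'
nested = (
    '{"employee_name": "James", "email": "james@gmail.com", '
    '"job_profile": [{"title1": "Team Lead", "title2": "Sr. Developer"}]}'
)

# The list under "job_profile" has length 1, so pandas can size the frame.
df_nested = pd.read_json(io.StringIO(nested))

# For the flat object, parse it and wrap the dict in a list: one dict, one row.
df_flat = pd.DataFrame([json.loads(flat)])

print(df_nested.shape, df_flat.shape)
```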

  • @JacklinSibiyal
    @JacklinSibiyal 5 months ago

    Day 3 - 18/02/2024