Web Scraping Databases with Mechanical Soup and SQLite

  • Published: Jan 6, 2025

Comments • 221

  • @inconnumj4692
    @inconnumj4692 3 years ago +16

    Can Mechanical Soup scrape dynamic content ("JavaScript pages")?

    • @PythonSimplified
      @PythonSimplified 3 years ago +24

      nope! it's meant for HTML/XML scraping only, so it's optimal for websites with very little user interaction! 😊
      If you're scraping a JavaScript website - definitely go for Selenium!
      I have a bunch of videos explaining how to use it:
      ⭐ Web Scraping LinkedIn:
      ruclips.net/video/7aIb6iQZkDw/видео.html
      ⭐ Web Scraping Instagram:
      ruclips.net/video/iJGvYBH9mcY/видео.html
      ⭐ Web Scraping Facebook:
      ruclips.net/video/SsXcyoevkV0/видео.html
      Good luck and I hope it helps! 😀

    • @inconnumj4692
      @inconnumj4692 3 years ago +7

      @@PythonSimplified thank you for your reply and for the good job you are doing ! keep it up

    • @alb12345672
      @alb12345672 3 years ago +2

      Sometimes you can look at the network activity and call APIs. Every situation is different.

    • @KacperSieradziński
      @KacperSieradziński 3 years ago +1

      @@PythonSimplified My viewers are asking me if this is legal :D Do you have any answer on such questions? ;-)

    • @PythonSimplified
      @PythonSimplified 2 years ago +1

      @@KacperSieradziński I've actually filmed a short video on it a while back:
      ruclips.net/video/f0B6RdVGcM8/видео.html
      It doesn't really count as "legal advice", but as long as you use it within the "fair use" copyright clause you should be fine 😉

  • @TheDroc1990
    @TheDroc1990 2 years ago +1

    New subscriber!! I'm a Data Engineer with 10 years of experience. Can't wait to watch all your videos! 👏

  • @VisuallyExplained
    @VisuallyExplained 3 years ago +30

    This is probably one of the best and most thorough channels related to practical applications of Python. You have a very unique style, very well done!

    • @PythonSimplified
      @PythonSimplified 3 years ago +2

      Thank you so much for the incredible feedback!! I really enjoyed reading your comment! 😃😃😃

    • @codzlaw
      @codzlaw 3 years ago

      This is perfect for someone wanting to learn. You have what a lot of educational videos are missing: a simpler way of explaining. They teach as if you already know the material. Educators often forget that just because they get it doesn't mean everyone does. There are different ways of teaching, and people absorb information differently. Thanks for taking the time to explain it fully, step by step.

  • @chaghlarblabla5157
    @chaghlarblabla5157 3 years ago +5

    As a blind programmer, I find it very useful when you tell us what you typed. I've subscribed to your channel now. Keep it going.

    • @PythonSimplified
      @PythonSimplified 3 years ago +3

      Thank you so much for the lovely comment Chaghlar! 😃
      I'm always trying to make these videos as accessible as possible to everyone, and I'm so happy you found them helpful!!
      I really admire the fact that you're programming despite the challenge and I'm super excited to have you on board! 😁😁😁

    • @chaghlarblabla5157
      @chaghlarblabla5157 3 years ago +2

      Thank you for the kind words.

    • @chaghlarblabla5157
      @chaghlarblabla5157 3 years ago

      One thing I haven't understood in this code: there is value.text.replace inside the brackets. I understand its function and what it does, but I have no clue where you defined the value variable in the first place. Is it a method of the mechanicalsoup library, or am I missing something because I'm almost asleep?

    • @chaghlarblabla5157
      @chaghlarblabla5157 3 years ago

      That's a list comprehension, OK.

  • @Pradeep-kv9hp
    @Pradeep-kv9hp 2 years ago

    People say Python is an easy language, but your tutorial videos make Python even easier to understand. Thanks, keep it up, and please make more and more tutorial videos.

  • @CrypticPulsar
    @CrypticPulsar 2 years ago +1

    You are just too awesome! Simple, powerful, and fluent method to easily remember commands that you wouldn’t otherwise.. bravo, and thank you!

  • @jefferyandme3741
    @jefferyandme3741 3 years ago

    I'm hooked on this channel!! Easy to look at and her style resonates with me. She has a real way of helping me understand. Very refreshing!! Thank You!

  • @stay_stoic_be_stoic
    @stay_stoic_be_stoic 3 years ago

    When you said simplified, you were on the money. I needed to get a clear view of OOP, and your video literally cleared all my doubts.

  • @wslater56
    @wslater56 3 years ago

    wow - i am very impressed - so much easier than going to coding sites where the logic is harder to translate. Thanks for clearly explaining the logic within each line of code

  • @alisheik3076
    @alisheik3076 1 year ago

    Hi,
    You have a unique way of explaining the subject. I am really impressed by the way you explain. I have seen hundreds of videos about web scraping, but none of them is as simplified as this. Thank you so much. I had heard about Beautiful Soup, Scrapy, Selenium, and Puppeteer, but this is the first time I've heard of MechanicalSoup. I am looking forward to more videos about web scraping with the updated versions. Your teachings are as sweet as your voice.
    Thanks

  • @martinmiguez6153
    @martinmiguez6153 3 years ago +4

    Ohhh magnifique!!! Love love love this tutorial!!
    your work always improves. The way you explain and how you use the logic of programming is very clear.. thk you

    • @PythonSimplified
      @PythonSimplified 3 years ago +1

      Thank you so much for the lovely comment Martin!! 😁😁😁 I'm super happy you like my explanations! 😊

  • @giorgosiotis1557
    @giorgosiotis1557 3 years ago +1

    I wish all female minds worked a little like you. Congratulations from Greece

    • @PythonSimplified
      @PythonSimplified 3 years ago +1

      hahahahaha thank you so much Giorgos! 😁😁😁
      Cheers from Canada! 🍁

  • @bc4198
    @bc4198 2 years ago

    This is exactly the topic I was looking for, but it was crazy hard to find, so thank you!

  • @tonym5857
    @tonym5857 3 years ago

    You have a great talent for teaching us complex programs in the easy way. 👏👏👏🌻☘️🐱

  • @freedtmg16
    @freedtmg16 2 years ago

    i love your tutorials! I am in a code camp for web development but im finding the logic driven scripting my true love and following along with your videos makes it really easy to learn as you do an amazing job explaining whats going on! thank you!

  • @urbaneplanner
    @urbaneplanner 3 years ago +2

    Nice walk through - there are many ways to webscrape - I hadn’t come across mechanical soup before (I’ve used beautiful soup, selenium, and scrapy and I would approach this task a bit differently) and I really like this intuitive walk through as the thought process applies to the various programmes

    • @PythonSimplified
      @PythonSimplified 3 years ago +1

      Thank you so much! That's exactly what I was aiming for! 😊
      This video is meant to demonstrate a general approach to web scraping and using developer tools to select elements on the DOM!
      You can apply the same techniques with any other web scraping library, and I must admit that I'm much more of a Selenium kind of girl... but once in a while it's nice to diversify 😉

    • @urbaneplanner
      @urbaneplanner 3 years ago

      @@PythonSimplified I’m not a programmer, but I like to use a little programming to support my work - so this sort of approach to coding is actually better I think for a lot of people who aren’t necessarily doing programming at scale or worried about efficiency - learn the concepts, do something that works and if you want later on you can learn the more efficient ways of doing things. I find some of the more elegant examples are a bit too sophisticated for some people as they can be so efficient it’s hard for a non programmer to actually understand why they work

  • @zparihar
    @zparihar 3 years ago +1

    Definitely the cutest Python Teacher in existence!
    I'm officially giving up Ruby!

    • @PythonSimplified
      @PythonSimplified 3 years ago

      hahaha thank you so much Zubin! 😀
      And yeeeeey! My evil plan to convert everyone to Python is finally working!! muahahahaha!!! 🤣🤣🤣

    • @zparihar
      @zparihar 3 years ago

      @@PythonSimplified Found out you're in Van! Go Van!

  • @Jason-si1yd
    @Jason-si1yd 3 years ago +1

    Great job on presenting this. It was kept basic and very informative.

  • @ReadAlongClassics
    @ReadAlongClassics 1 year ago

    Thanks!

  • @jameselliott8671
    @jameselliott8671 11 days ago

    Great video! I'm adding a note in case anyone else has the same issue. I was getting the following error when trying to organise the columns: "All arrays must be of the same length". It turns out that I had my initial index for columns at 6 rather than 7, but it looked OK when printing it because both index 6 and 7 are 'AlmaLinux OS Foundation'.
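
pandas raises this error when the lists passed to pd.DataFrame differ in length, so one way to track it down is to compare the lengths before building the frame. The snippet below is a minimal sketch using only the standard library; the helper names and the sample values are made up for illustration and are not from the video.

```python
def column_lengths(columns):
    """Return {column name: number of values} for a dict of lists."""
    return {name: len(values) for name, values in columns.items()}


def mismatched_columns(columns):
    """Return the names of columns whose length differs from the most common one."""
    lengths = column_lengths(columns)
    counts = {}
    for n in lengths.values():
        counts[n] = counts.get(n, 0) + 1
    expected = max(counts, key=counts.get)
    return sorted(name for name, n in lengths.items() if n != expected)


# Hypothetical scraped columns: "Founder" picked up one extra cell
# because its slice started one index too early.
scraped = {
    "Distribution": ["Distro A", "Distro B"],
    "Founder": ["Foundation A", "Foundation A", "Team B"],
    "Maintainer": ["Foundation A", "Team B"],
}

print(column_lengths(scraped))
print(mismatched_columns(scraped))
```

Printing the lengths is more reliable than eyeballing the first value of each list, since neighbouring cells can hold the same text.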

  • @harkiranonline
    @harkiranonline 3 years ago

    If beautiful girls like you kept teaching, IT is bound to be more popular

  • @senseinfx9630
    @senseinfx9630 2 years ago +1

    I love you teacher👍😍😘.

  • @JaveGeddes
    @JaveGeddes 3 years ago +1

    Some pages have links on them that have to be joined to get the url with the info I'm looking for, I want to open then scrape those.. can you explain how to do that?
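
One possible approach to the joining step is urllib.parse.urljoin from the standard library, which resolves a relative href against the URL of the page it was found on. This is only a sketch: the page URL and hrefs are placeholders, and in a Mechanical Soup script the hrefs would presumably come from the anchor elements on browser.page, with each joined URL then passed to browser.open.

```python
from urllib.parse import urljoin


def absolute_links(page_url, hrefs):
    """Resolve every href (relative or absolute) against the page it came from."""
    return [urljoin(page_url, href) for href in hrefs]


# Placeholder values for illustration only.
page_url = "https://example.com/wiki/Comparison_of_Linux_distributions"
hrefs = ["/wiki/Debian", "Ubuntu", "https://other.example.org/page"]

for link in absolute_links(page_url, hrefs):
    print(link)
```

Absolute hrefs pass through unchanged, so the same helper can be applied to every link on a page without checking its form first.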

  • @domingezu4687
    @domingezu4687 3 years ago +1

    BB - Beautiful and brilliant! :)

  • @wragabrr
    @wragabrr 3 years ago +4

    Nice, thanks, but if we work with static numbers it breaks as soon as something is added to the table. Would there be a way to refer only to the specific table?

    • @PythonSimplified
      @PythonSimplified 3 years ago

      Absolutely! However not with Mechanical Soup but with Pandas! 🐼
      Checkout my notebook on Github (I'll film a tutorial about it shortly):
      github.com/MariyaSha/WebscrapingDatabases/blob/main/scraper_Pandas.ipynb
      The scraper object in the notebook code extracts all the tables from the page. Then you find the index of the table you're looking to scrape and you print it in the next cell.
      It's a shortcut that skips the entire scraping process and leaves it to Pandas, which is super handy! 😃
      I hope it helps 😊

  • @a43em18
    @a43em18 2 years ago +2

    Amazingly clear explanation - thanks a lot for the example. 100-point :-D !!!

  • @printdaniel
    @printdaniel 1 year ago +1

    This is new for me, thanks.

  • @shadyabdelhady-rm3sz
    @shadyabdelhady-rm3sz 2 years ago +1

    thank you so much, that's exactly what I'm looking for

  • @kamertonaudiophileplayer847
    @kamertonaudiophileplayer847 3 years ago

    Now I better understand how Google works. Indeed, it is a simple as a soup. Thank you.

  • @aquahoodjd
    @aquahoodjd 3 months ago +1

    OK, let's say that each entry in a table contains two or three small sample files (audio files of samples of a transmission); the wiki is sigwiki. It is a table of signals, modulation types, who they belong to, and samples of those signals. Can we have the scraper automatically download the files in the table?

    • @PythonSimplified
      @PythonSimplified 3 months ago

      Not sure I understood the question, but you can scrape the links from a table (or alternatively, just targeting all the anchor elements on a page) and once you scrape them - download them with something like wget 🙂
      Here's their homepage: www.gnu.org/software/wget/
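
As an alternative to calling wget, the same idea can be sketched in pure Python: keep only the scraped links that point at audio files, then fetch them with urllib.request.urlretrieve. The extension list, folder name, and example URLs are assumptions made for illustration; the download function is defined but not called here.

```python
import os
from urllib.parse import urljoin, urlparse
from urllib.request import urlretrieve

# Assumed set of audio extensions; adjust to what the wiki actually hosts.
AUDIO_EXTENSIONS = (".mp3", ".wav", ".ogg")


def audio_links(page_url, hrefs, extensions=AUDIO_EXTENSIONS):
    """Return absolute URLs for the hrefs whose path ends in an audio extension."""
    links = []
    for href in hrefs:
        url = urljoin(page_url, href)
        if urlparse(url).path.lower().endswith(extensions):
            links.append(url)
    return links


def download_all(urls, folder="samples"):
    """Download each URL into the folder, named after the last path segment."""
    os.makedirs(folder, exist_ok=True)
    for url in urls:
        filename = os.path.basename(urlparse(url).path)
        urlretrieve(url, os.path.join(folder, filename))


# Placeholder hrefs, as they might be collected from the table's anchor elements.
found = audio_links(
    "https://example.com/wiki/Signals",
    ["/files/a.mp3", "page.html", "b.WAV"],
)
print(found)
```

Calling download_all(found) would then save the files locally; adding a short pause between requests is a polite extra when a table holds many samples.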

  • @trtlphnx
    @trtlphnx 3 years ago +2

    I Found This Highly Informative and Very Interesting: Thanks Sweetness, Love You And your Channel ~

    • @PythonSimplified
      @PythonSimplified 3 years ago

      Thank you so much! I'm glad you enjoyed this tutorial! 😀😀😀

  • @dipeshrathore8842
    @dipeshrathore8842 3 years ago +1

    You are great Maria😍😍🥰

    • @PythonSimplified
      @PythonSimplified 3 years ago +1

      And so are you Dipesh!! Thank you so much 😊😊😊

  • @Tobs_
    @Tobs_ 3 years ago +2

    great video, thanks for sharing 👍 I always learn something new.

    • @PythonSimplified
      @PythonSimplified 3 years ago +1

      Thank you so much! glad you liked it Tobs! 😀

  • @ayoubcharbaji884
    @ayoubcharbaji884 8 months ago

    And if I already have a table that I want to insert a pandas DataFrame into, what should I do?
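
For an existing table, the usual pandas route is df.to_sql(table_name, connection, if_exists="append", index=False), which adds rows instead of recreating the table. The sketch below shows the same append with plain sqlite3 so it runs without pandas; the table, columns, and values are hypothetical, and with a real DataFrame the rows could come from df.itertuples(index=False, name=None).

```python
import sqlite3


def append_rows(connection, table, columns, rows):
    """Insert rows into an existing table, naming the target columns explicitly."""
    placeholders = ", ".join("?" for _ in columns)
    column_list = ", ".join('"' + c.replace('"', '""') + '"' for c in columns)
    sql = f'INSERT INTO "{table}" ({column_list}) VALUES ({placeholders})'
    connection.executemany(sql, rows)
    connection.commit()


# Hypothetical existing table with one row already in it.
connection = sqlite3.connect(":memory:")
connection.execute('CREATE TABLE linux ("Distribution", "Founder")')
connection.execute("INSERT INTO linux VALUES ('Distro A', 'Founder A')")

# Rows standing in for the DataFrame contents.
new_rows = [("Distro B", "Founder B"), ("Distro C", "Founder C")]
append_rows(connection, "linux", ["Distribution", "Founder"], new_rows)

print(connection.execute("SELECT COUNT(*) FROM linux").fetchone()[0])
```

The DataFrame's columns need to match the table's columns by name for either route to work.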

  • @autentikum
    @autentikum 2 years ago

    I get the error "All arrays must be of the same length" when trying to do pd.DataFrame(). Do you maybe know why?

  • @sibtainshah3376
    @sibtainshah3376 3 years ago

    It was really a helpful and inspirational experienced session for me to get an overview of different interesting fields in computing like this one .... 😊 Thanks a lot ... ☺

  • @ilyessbenmessaoud9272
    @ilyessbenmessaoud9272 2 years ago

    How can we do this for a table whose rows are spread across multiple pages?

  • @hanjielo4277
    @hanjielo4277 2 years ago

    Thank you, nice tutorial! I have a question: I come across "no module" errors for bs4 and mechanicalsoup on WayScript. Any advice?

  • @IsaacNewton80735
    @IsaacNewton80735 3 years ago

    Very entertaining. I want to start a web scraping project with Python. It was what I expected in many ways.

  • @mytoptechs
    @mytoptechs 3 years ago

    I keep getting an sqlite3.OperationalError with cursor.execute("create table linux (Distribution, " + ",".join(column_names)+ ")"). It says: near "(".
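
One likely cause (an assumption, since the actual column_names are not shown) is a scraped header that contains spaces or parentheses, which SQLite cannot parse as a bare column name. A minimal sketch of a workaround is to wrap every identifier in double quotes before joining; the header values below are made up for illustration.

```python
import sqlite3


def quote_identifier(name):
    """Wrap a table or column name in double quotes, escaping embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def create_table_sql(table, column_names):
    """Build a CREATE TABLE statement with every identifier quoted."""
    columns = ", ".join(quote_identifier(c) for c in column_names)
    return f"CREATE TABLE {quote_identifier(table)} ({columns})"


# Hypothetical scraped headers; the second one would break an unquoted statement.
column_names = ["Founder", "Security updates (years)", "Cost"]
sql = create_table_sql("linux", ["Distribution"] + column_names)

connection = sqlite3.connect(":memory:")
connection.execute(sql)
print(sql)
```

An alternative is to clean the headers themselves, for example by replacing spaces and brackets with underscores, which also makes later queries easier to type.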

  • @waylandchin
    @waylandchin 3 years ago +2

    So every time you ran the code to test out the script, it would reload the webpage into mechanical soup. If the website is huge, how do we save the page into a file so that we don’t have to request it from the site again.

    • @PythonSimplified
      @PythonSimplified 3 years ago +5

      Hi Wayland, it's a great question! 😊
      One solution could be scraping the entire content of the page with browser.page, converting it into a string (maybe replacing the tags with "," or something similar) and storing it in a csv or text file.
      with open('my_text_file.txt', 'w') as my_file:
          my_file.write(my_string)
      with open('my_csv_file.csv', 'w') as my_file:
          my_file.write(my_string)
      If you're selecting specific elements rather than the entire page, try browser.page.find_all(element_name) and you can store them in the exact same manner 😀 This way you are interacting with a file on your computer rather than with some kind of a web server.
      Good luck scraping and I hope it helps! 😄
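
The caching idea above can be sketched as a small "load from disk, otherwise fetch and save" helper. Everything here is illustrative: load_or_fetch and fake_fetch are made-up names, and in a real script the fetch function would presumably open the page with Mechanical Soup and return str(browser.page).

```python
import os
import tempfile


def load_or_fetch(path, fetch):
    """Return the cached HTML at path, calling fetch() only if no cache exists."""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as cached:
            return cached.read()
    html = fetch()
    with open(path, "w", encoding="utf-8") as cached:
        cached.write(html)
    return html


calls = []


def fake_fetch():
    """Stand-in for a real request; records how many times it was called."""
    calls.append(1)
    return "<html><body><table></table></body></html>"


cache_path = os.path.join(tempfile.mkdtemp(), "page.html")
first = load_or_fetch(cache_path, fake_fetch)   # fetches and writes the file
second = load_or_fetch(cache_path, fake_fetch)  # reads the file, no request
print(len(calls))
```

Deleting the cache file forces a fresh request the next time the script runs.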

  • @return_1101
    @return_1101 3 years ago +2

    Love your video! Content its great.

  • @pllemost8410
    @pllemost8410 3 years ago +1

    Adorable Mariya….!
    Beautiful soup, bs4.

  • @paulwratt
    @paulwratt 3 years ago

    If your code included variable assignments for those printed indexes, then you could still get all the value.text results you want even if the contents of the tables change. Also, if you are going to re-run this scraper code that writes to the DB, you need to DROP the table first (i.e. delete the table itself, not just the data in it).
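
The re-run point can be illustrated with a short sqlite3 sketch: dropping the table with IF EXISTS before recreating it lets the script run any number of times without a "table already exists" error or duplicated rows. The table layout and rows are placeholders, not the ones from the video.

```python
import sqlite3


def rebuild_table(connection, rows):
    """Drop the table if present, recreate it, and insert the scraped rows."""
    connection.execute("DROP TABLE IF EXISTS linux")
    connection.execute("CREATE TABLE linux (Distribution, Founder)")
    connection.executemany("INSERT INTO linux VALUES (?, ?)", rows)
    connection.commit()


connection = sqlite3.connect(":memory:")
rows = [("Distro A", "Founder A"), ("Distro B", "Founder B")]

rebuild_table(connection, rows)
rebuild_table(connection, rows)  # a second run neither fails nor duplicates rows

print(connection.execute("SELECT COUNT(*) FROM linux").fetchone()[0])
```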

  • @cubano100pct
    @cubano100pct 3 years ago +1

    What would be best for scraping websites that have a login, so I can download details of my orders? There is one page that has order summaries, and for each order I will click to get to the details to download. Which libraries would be best for this type of web scraping?

    • @PythonSimplified
      @PythonSimplified 3 years ago +2

      Hi Felix! 😃
      I personally go for Selenium whenever dealing with e-commerce sites. It allows you to scrape JavaScript (while Mechanical Soup is limited to HTML/XML) and it also gives you a certain degree of protection from security blockers. It opens a browser window from where all the scraping happens and because of this window - the server is convinced you are a legitimate user rather than a bot :)
      I have so many tutorials about Selenium (I believe you'll see a bunch of links in the pinned message on this vid) but here's one where I scrape Facebook:
      ruclips.net/video/SsXcyoevkV0/видео.html
      I hope it helps and good luck! 😊

  • @vishalvishwakarma7621
    @vishalvishwakarma7621 2 years ago

    Hey Mariya , would you tell me name of your large curve monitor, its totally insane like you

  • @neculaicristea8491
    @neculaicristea8491 3 years ago

    If a table is shown on multiple web pages (Next page ), could we still use this scraper? Thanks.

  • @semtex2987
    @semtex2987 3 years ago +2

    my first choice would be pandas read_html, but if you wanna do this kinda procedural in a more manual way, using static indexes is the worst way to go.
    just extract all tables and iterate over the content of each. ;) so nothing goes wrong if the contentsize changes
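
To make the "extract all tables and iterate" suggestion concrete, here is a rough sketch that uses only the standard library's html.parser rather than Mechanical Soup or pandas.read_html. The TableCollector class and the sample HTML are invented for illustration, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableCollector(HTMLParser):
    """Collect every table on a page as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


# Made-up page with two small tables.
html = """
<table><tr><th>Distribution</th><th>Founder</th></tr>
<tr><td>Distro A</td><td>Founder A</td></tr></table>
<table><tr><th>Distribution</th><th>Cost</th></tr>
<tr><td>Distro A</td><td>Free</td></tr></table>
"""

collector = TableCollector()
collector.feed(html)
for rows in collector.tables:
    header, *body = rows
    print(header, body)
```

Because each table keeps its own header row, a table can be picked by the headers it contains rather than by a fixed position in a flat list of cells.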

    • @PythonSimplified
      @PythonSimplified 3 years ago +1

      If the content size changes it shouldn't really matter as the index is not hard coded... we're performing a search to find in which index the value lives before we slice it so it should adjust accordingly (relevant only if the new distributions won't be added after Zorin OS though) 😊
      I definitely agree that read_html is a faster alternative! It's an incredible shortcut, while this video is more of a step by step demonstration of how to approach web scraping in general. I always like shortcuts, but in my opinion knowing the concept behind the shortcut is important as well 😉
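
The "search for the index, then slice" approach described above might look roughly like this. It is a sketch with made-up cell values; slice_between is a hypothetical helper, not code from the video.

```python
def slice_between(values, first, last):
    """Return the items from the first marker through the last marker, inclusive."""
    start = values.index(first)
    end = values.index(last, start)
    return values[start:end + 1]


# Made-up flat list of cell texts, before and after the page gains an extra element.
cells = ["Navigation", "Edit", "AlmaLinux", "Alpine Linux", "Zorin OS", "Footer"]
cells_later = ["Navigation", "New banner", "Edit", "AlmaLinux", "Alpine Linux", "Zorin OS", "Footer"]

print(slice_between(cells, "AlmaLinux", "Zorin OS"))
print(slice_between(cells_later, "AlmaLinux", "Zorin OS"))
```

Two limits are worth keeping in mind: list.index returns the first match, so a marker value that appears twice can shift the slice, and entries added after the last marker are missed.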

    • @finkyfreak8515
      @finkyfreak8515 1 year ago

      Thought the same thing. But for a fast and "dirty" way to scrape data it's fine.

  • @dravidaravindkumar4207
    @dravidaravindkumar4207 3 years ago +2

    Maam your teachings are awesome 👌👏..can you plss put a video on stadium seat booking system with python as front-end and SQL as backend...

  • @scpecialist
    @scpecialist 1 year ago

    Your videos are great and very informative.
    Are you planning to make a video about collecting data from forum sites? I also hope you make a video about how ChatGPT is made.

  • @lucadelpartita
    @lucadelpartita 2 years ago

    What do you think is the best solution for scraping a website with HTML/XML and reCAPTCHA, or a JavaScript website that gives us a limited list of 10 results, where you must click on each one to pop up an HTML window and see all the details?
    Do you think it is possible to scrape data from a database when you must enter at least a few characters in some fields?
    And what about a website with a database where you must fill a field with the EXACT string to query? For example, if the database holds many records of names, it's not possible to enter just PAUL and get all records for both PAUL and JEANPAUL.
    I am talking about four different websites, all four with a database.

  • @EmaMazzi76
    @EmaMazzi76 3 years ago

    Thank you! Your tutorials are amazing 🤩

  • @zhang73
    @zhang73 3 years ago

    What monitor are you using? I need to get one.

  • @goosechasing
    @goosechasing 3 years ago

    Great video as always! Thank you! I would love to see you make a tutorial on the Curses Module!

  • @lopii777
    @lopii777 3 years ago +1

    It is magic to watch you creating code, very informative and cool.
    I just have a question as an Excel-user that would like to learn how to code.
    What is the major difference and benefit between the Python code and Importing from a web-page in Excel ?
    Thank you !

  • @ΝικόλαςΣτρατηγός-Αφράτης

    How can I fix an "incorrect number of bindings supplied" error?
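
sqlite3 raises this error when the number of ? placeholders in the statement differs from the number of values in a row. A minimal sketch of one way to guard against it is to build the placeholders from the column count and check each row first; the table, columns, and values are hypothetical.

```python
import sqlite3


def insert_rows(connection, table_columns, rows):
    """Insert rows after checking that each one matches the column count."""
    expected = len(table_columns)
    for row in rows:
        if len(row) != expected:
            raise ValueError(f"row has {len(row)} values, expected {expected}: {row!r}")
    placeholders = ", ".join("?" * expected)  # e.g. "?, ?, ?" for three columns
    connection.executemany(f"INSERT INTO linux VALUES ({placeholders})", rows)
    connection.commit()


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE linux (Distribution, Founder, Cost)")

columns = ["Distribution", "Founder", "Cost"]
insert_rows(connection, columns, [("Distro A", "Founder A", "Free")])

print(connection.execute("SELECT COUNT(*) FROM linux").fetchone()[0])
```

A common variant of the same mistake is passing a bare string instead of a one-item tuple, in which case every character counts as a separate binding.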

  • @carlosarrieta1145
    @carlosarrieta1145 3 years ago

    Which monitor are you using?

  • @Ksys
    @Ksys 3 years ago

    What ultrawide monitor are you using?

  • @user-jchjkitv77896
    @user-jchjkitv77896 3 years ago

    Very nice 👌

  • @goodluckoriuwa1669
    @goodluckoriuwa1669 1 year ago

    Can you do a tutorial on how I can connect to the website databases, read data directly from the tables and update table data from the website url and this mechanical soup?

  • @oscaralejandropazbalderas4288
    @oscaralejandropazbalderas4288 3 years ago

    Excellent... I just subscribed to your channel... It's amazing!!!

  • @gazul05
    @gazul05 3 years ago

    Awesome! Thanks a lot... greetings from Mexico.

  • @dustinokelley156
    @dustinokelley156 3 years ago

    I'm Taking a python course at school right now and am struggling mightily. I have had zero prior experience in programming. Do you have any suggestions on materials i could pick up to help myself?

  • @furkanozata6775
    @furkanozata6775 1 year ago

    Great video. Thanks alot.

  • @Golledaman
    @Golledaman 3 years ago

    Great video, thanks for sharing!
    A bit offtopic, but what make and model is your monitor? I have been looking around for a 49" ultrawide for office use but it's difficult finding one that ticks all the boxes.

  • @Septounze
    @Septounze 3 years ago +1

    Very good video and I learned a lot from it. Love the way you explain everything.
    I have a question. What are the advantages of using Python to scrape the table from the web page instead of using Excel or Power Query?

  • @JorgeEscobarMX
    @JorgeEscobarMX 1 year ago

    I love how she says "Attributes"

  • @atillakoseoglu4089
    @atillakoseoglu4089 2 years ago

    Do you have a course building our coding skills?

  • @elvinrk
    @elvinrk 3 years ago

    Awesome video!

  • @zakyvids6566
    @zakyvids6566 3 years ago +1

    Hi
    Is it possible for you to make a short Python crash course, maybe 30 minutes to 1 hour long?
    I had actually mentioned this in one of the previous videos too

    • @PythonSimplified
      @PythonSimplified 3 years ago +1

      Hi Zaky! 😀
      That's a very specific time frame you got! hahahaha I'll definitely keep posting videos on the Python for beginners series from time to time.
      I'm just trying to keep up with all the tutorial ideas in my head and it's not an easy task because I have too many of them!! 😅😅
      I've filmed a roadmap video not too long ago, covering some basic stuff for about 8 minutes:
      ruclips.net/video/wFEC7VbWBZo/видео.html
      If you're looking for a good source of beginners tutorials other than my channel, checkout Rune's channel (and more specifically the 8 hours worth of Python lessons playlist):
      ruclips.net/video/ybeeuGXdhrQ/видео.html
      Good luck and I hope it helps! 😃

  • @BobRoyAce77
    @BobRoyAce77 3 years ago

    Thanks for your tutorials...always informative and well-presented, and love your personality. By the way, what monitor is that that you are using?

  • @granand
    @granand 3 years ago

    Thank you, Merry Xmas. Please can you always give links to the environment, editor, etc. that you are using? Is that WayScript? In a month, I will be following every video to do stuff. Thanks a ton.

  • @Alex-fl2yh
    @Alex-fl2yh 3 years ago

    are you using python 2 ?

  • @nikluz3807
    @nikluz3807 3 years ago +1

    “And boom” haha I always say that when I’m explaining code. Nice

    • @PythonSimplified
      @PythonSimplified 3 years ago

      hahaha it's the best way to announce a successful return statement! 😉

  • @catafest
    @catafest 3 years ago

    I need to scrape Instagram. Do you have a good tutorial about this? Thank you for sharing.

  • @DS-nr9zc
    @DS-nr9zc 3 years ago

    Your channel is so helpful! Can you do a tutorial on git?

  • @vigneshsuresh6003
    @vigneshsuresh6003 3 years ago +1

    Plz make a video to scrape data from flight fare sites like cleartrip and expedia.

  • @lostsecArmy
    @lostsecArmy 3 years ago

    Can you make a complete playlist on web scraping?

  • @aliahmad5834
    @aliahmad5834 3 years ago

    time to change my career path

  • @danielgomes3994
    @danielgomes3994 2 years ago +1

    Hi all! Just a warning: "Rocky Linux" doesn't have the class table-rh. That can lead to ValueError("arrays must all be same length").

    • @zagoguic
      @zagoguic 2 years ago

      how did you avoid this?

    • @danielgomes3994
      @danielgomes3994 2 years ago

      @@zagoguic hi, i need to check again

  • @alfblack2
    @alfblack2 3 years ago

    Simply awesome. A topic I wanted to do/research, presented excellently (best presentation I have seen for a noob like me). By a very pretty lady! But sadly the audio volume is not great. :(

  • @fadyelias
    @fadyelias 3 years ago

    I'm new on your channel thank you for this good and helpful tutorial but we need Django advanced tutorial

  • @mustlagzmustlagz333
    @mustlagzmustlagz333 2 years ago

    Wowwww, you are so competent ♥.
    Too bad there are no French subtitles.

  • @mibrahim4245
    @mibrahim4245 3 years ago +1

    Thanks beast

    • @PythonSimplified
      @PythonSimplified 3 years ago +1

      hahahaha thank you habibi! 😁😁😁

    • @mibrahim4245
      @mibrahim4245 3 years ago

      @@PythonSimplified 😍😍😍😍😍😍😍😍😍 you welcome habibti .. ❤

  • @pawelwalenda
    @pawelwalenda 3 years ago

    Great video as usual. But why do you use "pip" instead of "pip3". Is pip still supported?

  • @amadoucoulibaly2916
    @amadoucoulibaly2916 3 years ago

    Hi, can we have a tutorial, scraping a search result from a search bar? Thanks

  • @dipeshrathore8842
    @dipeshrathore8842 3 years ago +3

    I just want some tutorials solving real problem which I will have to solve at a python job!
    It will help me a lot😊😇😊

    • @PythonSimplified
      @PythonSimplified 3 years ago +1

      Hi Dipesh! 😀
      There are many jobs that involve Python! it's widely used in data science, financial analysis, cyber security, machine learning, etc.
      Almost all the videos on this channel will help you solving real problems which you may encounter on the job - it all depends on what industry you're aiming for 😉

    • @dipeshrathore8842
      @dipeshrathore8842 3 years ago +2

      @@PythonSimplified Yes I am trying to get into Data science and Cyber security.
      Your videos helps me a lot ❤

    • @PythonSimplified
      @PythonSimplified 3 years ago +2

      Awesome, I'm so happy to hear! 😊
      This tutorial is perfect for data science! many entry level jobs involve lots of web scraping (as you usually start from generating databases so that the senior data scientists can use them later)
      Good luck on your journey! 😁

    • @dipeshrathore8842
      @dipeshrathore8842 3 years ago +1

      @@PythonSimplified Thank you so much ❤

  • @sshroot5565
    @sshroot5565 3 years ago

    Can you do a video like this but for extracting video links from a website ? Thanks in advance

  • @danield.7359
    @danield.7359 3 years ago

    Subscribed 😊. Nice job. However, if any changes are made to the tables by adding or removing rows and/or columns, which is quite probable in an actively maintained table of Linux distros, the scraping algorithm won't work properly anymore. So instead of using hardwired indexes, I'd add some more code to analyze the table(s) and compute the indexes dynamically. Another potential issue for certain sites is the EU cookie consent popup that needs to be clicked away before getting to the content. So you'd need to remote-control a headless browser using JS. I don't know if Mechanical Soup can handle this. If yes, I'd be interested to learn how.

  • @yahyeabdirashid9716
    @yahyeabdirashid9716 3 years ago

    To be honest i was staring at you whole time 😍

  • @laondaradio2022
    @laondaradio2022 3 years ago

    Nice channel. Could you upload content about React Native, please?

  • @AndrewOBannon
    @AndrewOBannon 3 years ago +1

    Mariya, like and I have subscribed to your github.

    • @PythonSimplified
      @PythonSimplified 3 years ago +1

      Thank you so much Andrei!! 😃😃😃 it's a great way to find out the topics of the upcoming videos a few hours in advance 😉 (I usually load the lesson code to Github the night before a premiere)

  • @diwakar_tsn
    @diwakar_tsn 3 years ago +1

    Wow❤️🥰❤️🇳🇵

  • @acmilanlover6517
    @acmilanlover6517 2 years ago

    How to withdraw data programs .exe via web scraping databases

  • @UserName-ln5ol
    @UserName-ln5ol 1 year ago

    Dis very gud stuff right hur.

  • @ldandco
    @ldandco 3 years ago

    Thanks for the video! Where are you from?

  • @tejaspatil3978
    @tejaspatil3978 3 years ago +1

    hey, can you plzz make the video on Activation function in neural network...?

    • @PythonSimplified
      @PythonSimplified 3 years ago

      Hi Tejas, I've already covered 2 activation functions in my ML series 😃
      You can find the step function in my Perceptron tutorial:
      ruclips.net/video/-KLnurhX-Pg/видео.html
      And the Sigmoid function in my Gradient Descent video:
      ruclips.net/video/jwStsp8JUPU/видео.html
      Good luck and I hope it helps! 🙂

    • @tejaspatil3978
      @tejaspatil3978 3 years ago

      @@PythonSimplified thank you

  • @mohamedhindam1793
    @mohamedhindam1793 3 years ago

    Why not use pandas alone for scraping HTML tables like this?

  • @khaledhossain4642
    @khaledhossain4642 3 years ago

    Dear, thanks for the session; it really helped me understand some of the basics clearly. But while preparing my own tool, I failed to get it working. Is it possible to get some support from you?

  • @CurrentElectrical
    @CurrentElectrical 3 years ago +1

    How do you determine what type of website you are scraping from? I.E. Javascript, HTML, XML etc? Awesome tutorial! Hello from Ontario. :D

    • @PythonSimplified
      @PythonSimplified 3 years ago +4

      I don't know if there are strict rules to it, but usually when I need to scrape something, the first thing I consider is how interactive the site is.
      If there are many user interactions such as likes, shopping cart or highly scrollable content - I always go for Selenium which is incredibly powerful!
      Another consideration is: how likely is it that the website will block my bot? When dealing with e-commerce websites you will encounter lots of blockers to prevent scavengers, so Selenium will be the best option again, as it provides you with a GUI browser (which is considered a legitimate user/client from the website's perspective; it just doesn't know how to differentiate between a human and an automated process when this browser window exists) and it bypasses these blockers so your IP is not recognized as a bot.
      So Selenium is definitely my favorite, but sometimes it's an overkill! 😀
      If you're scraping sites with very simple user interaction, for example: Google, Wikipedia, or even the weather forecasts on Environment Canada site (🍁🍁🍁). You'll notice that the majority of user interaction consists of links/anchor elements, which can easily be implemented with a markup language like HTML rather than a scripting one like JS.
      Also, these sites do not load more and more content when you scroll down: they have a pre-defined end-of-page section and you can't scroll beyond it (unlike Instagram and Facebook feeds, for example).
      In these cases the best solution is probably Beautiful/Mechanical Soup or other XML/HTML scrapers with a headless browser (such as the stateful one from this tutorial) 😊
      It's so easy to work with and there's no need to download a special web driver to get it going, just pip install MechanicalSoup and you can start working! 😁
      Good luck with scraping, and I hope it helps!