Want To Learn Web Scraping? Start HERE

  • Published: 2 Jul 2024
  • Thanks to Oxylabs for sponsoring this video
    oxylabs.go2cloud.org/aff_c?of...
    Add code "JR15" at checkout to save 15%
    I wanted to dump as much information as I could about how to get started with web scraping! This video includes the top 3 methods I use, the main tools I use, and some basic pitfalls that beginners get stuck on. If you found any of this interesting, you'll find dedicated videos on all of it for free on my YouTube channel. I'll have loads more Python, web and Node.js videos coming up soon too.
    Support Me:
    Patreon: / johnwatsonrooney (NEW)
    Amazon UK: amzn.to/2OYuMwo
    Hosting: Digital Ocean: m.do.co/c/c7c90f161ff6
    Gear Used: jhnwr.com/gear/ (NEW)
    -------------------------------------
    Disclaimer: These are affiliate links and as an Amazon Associate I earn from qualifying purchases
    -------------------------------------
  • Science

Comments • 46

  • @randyjd3706 2 years ago +6

    The only YouTuber I have my notifications turned on for!! Thanks JR

  • @androiduser457 1 year ago +1

    Thank you John. I started my scraping journey with scrapy, playwright, scrapy-playwright, proxies with scraperapi.

  • @CodePhiles 2 years ago +5

    Thank you, John, for this wonderful journey through the various web scraping techniques, which you illustrated in a professional and easy way. Keep going, and we wish you the best.

  • @maggiekay1 2 years ago +4

    Thanks so much John! I learned so much from your content. Keep going, you are amazing!

  • @juricadevic7337 2 years ago +2

    John I am really grateful for your new tutorial on web scraping. Regards from Croatia.

  • @TheHerisatry 2 years ago +1

    My new fav channel on Python :) Since yesterday I can't stop watching.

  • @melih.a 2 years ago +2

    This video is so good! Thank you

  • @MrAdmiralJo 2 years ago +8

    Nice video! Could you tell us more about how you work with the scraped data afterwards?
    Do you put it in a data lake? (If so, which one do you use? Azure, AWS, G-Cloud, something self-hosted, ...)
    Or do you only save the CSV files in a system like Apache Hadoop?
    So in general, what does the system around the scraping part look like?
    I'm very interested to hear more about that from you!
    Best wishes!

  • @wangdanny178 2 years ago +2

    Thanks for this guide. You are a brilliant teacher. I came from whiskey webscraping. And I learned a lot from the comment section as well. This is a great community. I will move on to the Patreon soon. Nice job.

    • @JohnWatsonRooney 2 years ago +1

      Thank you for watching, I’m glad you have got some good value and learning from my videos!

  • @RonWaller 10 months ago +2

    Thanks John. I jumped in (having scraped in the past), but picking the right tool for the job is a big one. Knowing how the tool works helps too. I've been using Selenium to scrape Yahoo Finance and getting frustrated. I have been learning Python and coming up with projects to get better at it. Thanks for your videos.

    • @JohnWatsonRooney 10 months ago +2

      Hi Ron - thanks for watching I appreciate it and glad you are finding my videos helpful!

    • @RonWaller 10 months ago

      @JohnWatsonRooney Yes I am. I just started trying Playwright and Selectolax. It worked well. Still a lot to learn. Thanks again

  • @Robls501510 2 years ago +1

    Excellent video. Your cool and easy to understand delivery is a real pleasure.

  • @tobi02 2 years ago +1

    Nice video. I wish I'd had this when I started web scraping. Basically everything you need to know is included.
    I would appreciate it if you made a video about your relationship to web scraping. Is it your hobby, or do you need it for work, e.g. as a data scientist? At the moment it's a hobby for me, but I'm a mechanical engineer and I'm trying to move a bit away from classic product design towards using big data technologies.

  • @rodolforaquion166 1 year ago +1

    Thank you for this video, John. Do you know any courses that teach web scraping the professional way? I ask because I've watched tutorials I couldn't follow, since I was being blocked even with headers (I only learned that after searching and reading a lot).

  • @lautarob 2 years ago +1

    Thanks for all these enlightening videos. Among your videos, I haven't seen a scrape of Facebook or any other social media site. Is that possible? If so, which would be the best approach to achieve fast results?

  • @HoustonKhanyile 2 years ago +1

    This is great.

  • @futuregootecks 2 years ago +2

    Super helpful thank you!

  • @datag1199 1 year ago

    Hi John, of the three main methods for scraping, which would you recommend for the following use case? I need to access a URL to download four data files. The site requires a user name and credentials to log in and access the files. The files are updated on a weekly basis. The files use the same download link; it's the data that changes. I guess you could say it's a dynamic link to a data file. I am thinking this could work with Requests and BeautifulSoup (and os to land the files in my local directory), but I'm wondering if you have any recommendations. Thanks

  • @jamesbrownenglishforprofes1596 9 months ago +1

    Hi John, I've subscribed to your channel!

  • @Valentin439 2 years ago

    Thanks for the video! I have a question though... Can you point me in the right direction for scraping Amazon as fast as possible? What method should I use? Thank you

  • @funkdoc2001 1 year ago

    Hey John, I’ve never scraped data before but really want to scrape RightMove for sold house prices in specific areas. Any suggestions on the best approach for this?

  • @irfanshaikh262 1 year ago

    I can say it proudly and out loud,
    "I have an amazing teacher now".
    Thank you John
    Just one question: you've been talking a lot about using premium proxies like Oxylabs and IPRoyal.
    I'm confused and don't actually know how to use a proxy service in code.
    Would you please be kind enough to guide us through that process as well?
    Thanks again
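
A minimal sketch of what "using a proxy in code" can look like, for readers with the same question. It uses only Python's standard library so it runs without extra installs. The host, port, username and password below are placeholders and not real provider details, and the helper names `build_proxy_url` and `make_opener` are made up for this illustration. Providers usually show the actual endpoint and credentials in their dashboard.

```python
from urllib.parse import quote
from urllib.request import ProxyHandler, build_opener


def build_proxy_url(host: str, port: int, username: str, password: str) -> str:
    """Return an HTTP proxy URL with percent-encoded credentials."""
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"http://{user}:{pwd}@{host}:{port}"


def make_opener(proxy_url: str):
    """Create a urllib opener that sends HTTP and HTTPS traffic via the proxy."""
    handler = ProxyHandler({"http": proxy_url, "https": proxy_url})
    return build_opener(handler)


# Placeholder values (assumptions): substitute the details from your provider.
proxy_url = build_proxy_url("proxy.example.com", 8080, "my_user", "p@ss word")
opener = make_opener(proxy_url)

# Uncomment to send a real request through the proxy:
# with opener.open("https://httpbin.org/ip", timeout=10) as resp:
#     print(resp.read().decode())
```

With the requests library, the same URL would typically be passed as `proxies={"http": proxy_url, "https": proxy_url}` to `requests.get`.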

  • @sampatankar1977 2 years ago +1

    Hey John, all interesting. Could we now have a curated list of videos (maybe filling in the gaps) that covers your curriculum step by step, please?

    • @JohnWatsonRooney 2 years ago +2

      Yes good idea I will sort a playlist

    • @sampatankar1977 2 years ago

      @JohnWatsonRooney Thank you!! Have a lovely Christmas!

  • @nebupulickal 2 years ago +1

    Thanks for the informative video. How do I crawl all URLs associated with a website using Scrapy?

    • @JohnWatsonRooney 2 years ago +1

      Check out my channel for my video on Scrapy's crawl spider, that is what you want to use

  • @ashwinalvin4997 2 years ago

    Hi John,
    Are you on any social media? I'm a beginner web scraper and just want to clear up some doubts about my planned personal web scraping projects.
    It would be very helpful to get your insights.
    Thanks for the inspiring videos 🤗

  • @Cheerfulnag 2 years ago +4

    Great video. Where were you when I was starting in June :D Btw, just out of pure interest, how much time would you say a person needs (approximately, if they're persistent and spend enough time) to go from the start (knowing only the basics of Python) to the point where they can start working as a web scraper? For me it took 3.5 months to get my first contract, but it seems that it can be done much faster.

    • @JohnWatsonRooney 2 years ago +2

      I’d say still at least a few months. There’s a lot to learn if you want to do paid work as you’ll need to be comfortable with things I didn’t talk about in this video, like running scripts from a server and saving stuff to databases. But like everything keep practising and learning

    • @DittoRahmat 2 years ago +2

      1.5 months for me. But I do have basic programming skills from other languages (C# & Delphi), so I wasn't exactly starting from 0.
      Usually, when coming from zero programming background, learning about algorithms & data structures can be a bit challenging

    • @dhankkhush1088 2 years ago +1

      @JohnWatsonRooney A video on freelancing requirements would be awesome! What's the highest quality/proudest project you've done for a client? And what skills/methods did you need to complete it?

  • @syedameen8828 2 years ago

    Hi John, I have a question. In HTML5 we can have data attributes on an 'a' tag, for example data-index. How do I scrape those data elements? When I try to use it, it gives an error. Could you help me?
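
For anyone with the same question, here is a small sketch of reading `data-*` attributes from `a` tags. It uses the standard library's `html.parser` so it runs without installing anything. The sample HTML and the class name `DataAttrParser` are invented for this illustration. In Beautiful Soup the equivalent lookup is `tag.get("data-index")`, which returns `None` when the attribute is missing, whereas `tag["data-index"]` raises a `KeyError`. That may or may not be the error described above.

```python
from html.parser import HTMLParser


class DataAttrParser(HTMLParser):
    """Collect the href and any data-* attributes from every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # Keep only attributes whose name starts with "data-".
        data = {k: v for k, v in attr_map.items() if k.startswith("data-")}
        self.links.append({"href": attr_map.get("href"), **data})


# Made-up sample markup; the last link has no data-index attribute.
html = (
    '<a href="/item/1" data-index="0">First</a>'
    '<a href="/item/2" data-index="1">Second</a>'
    '<a href="/about">About</a>'
)

parser = DataAttrParser()
parser.feed(html)

# .get() avoids an exception for links without the attribute.
indexes = [link.get("data-index") for link in parser.links]
print(indexes)
```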

  • @jacknobles8272 1 year ago +2

    you say start here, but you're communicating as if we all started way before here. You're talking as if I'm already somewhat knowledgeable on web scraping.

  • @tommifish322 2 years ago +1

    Mate, you need a full fledged udemy course or something. Nobody else comes close to your tutorials. 😁

  • @ammaralzhrani6329 2 years ago

    the link is not working

  • @prod.kashkari3075 2 years ago

    Hey John love your channel, I’ve been having this issue when I try and paginate with BS4 on some sites, where the page number will change (I loop over the numbers and change the url or click the button) but the items on the page will not change.

  • @piggyraccoon5464 2 years ago

    Beautiful soup is such a nonsense name