Request Headers for Web Scraping

  • Published: 29 Sep 2024
  • Every HTTP request carries headers that contain information about that request. We can manipulate these with requests, or whichever web scraping tool we are using with Python, to change how the server responds to us. In this video I'll show you the basics of how they work and what they look like, and then demo how to change the most important ones in your code.
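As a rough sketch of what changing headers looks like in code, the snippet below attaches a custom User-Agent and Accept-Language to a request. It uses only the standard library's urllib.request so that nothing is sent over the network; the URL and header values are placeholders chosen for illustration, not taken from the video. With requests, the same dictionary would be passed as requests.get(url, headers=headers).

```python
import urllib.request

# Placeholder values for illustration only; copy real ones from your
# browser's developer tools (Network tab) for the site you are scraping.
headers = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Accept-Language": "en-GB,en;q=0.5",
}

# Building the Request object does not contact the server.
req = urllib.request.Request("https://example.com/", headers=headers)

# urllib stores header names capitalised as "User-agent", "Accept-language".
for name, value in req.header_items():
    print(f"{name}: {value}")
```

Printing the headers before sending anything is a cheap way to confirm what the server will actually see.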

Comments • 63

  • @b.1851
    @b.1851 3 years ago +9

    You are the Scrapy GOAT. Keep up the content!!

  • @snowbaby93
    @snowbaby93 3 years ago +2

    You read my mind! This was the exact video I needed for today! Thank you for making all these videos!

  • @Zale370
    @Zale370 3 years ago +3

    I hope you plan to make a Discord channel or a Telegram group so the scraper community has a place to exchange ideas :) Awesome content btw!

  • @rrahll
    @rrahll 3 years ago +4

    Thank you dude! Interesting and useful content! Keep it up and good luck ;)

  • @Brlitzkreig
    @Brlitzkreig 9 months ago

    You make great content bro

  • @alivakili5408
    @alivakili5408 3 years ago

    Nice, thanks. Please talk about handling cookies that change on every request.

  • @multigladiator384
    @multigladiator384 3 years ago +2

    8:28 I like you man :D Sorry for the many comments. I am just very interested in this and you have the best content I have found so far!
    Can you please make an in-depth video about cookies and how to use them with "request"? :D

  • @abukaium2106
    @abukaium2106 3 years ago +2

    Great tutorial. I follow your every video. Would you like to give us some tips to prevent blocking when scraping websites?

    • @r0bi100
      @r0bi100 2 years ago

      Did you find a solution?

  • @gouemoregis195
    @gouemoregis195 3 years ago +1

    Great one

  • @axelamoe
    @axelamoe 1 year ago +1

    Man, is there any way I can email you? I am stuck scraping a certain website for my business: the cookie is being set by the website under Set-Cookie and I can't for the life of me get it to change. Pleaseee

    • @JohnWatsonRooney
      @JohnWatsonRooney 1 year ago

      Hey, my email is on my main YT page. I'll do my best to help if I can.

  • @huzaifaameer8223
    @huzaifaameer8223 3 years ago

    Hi there! Can you make a video on a chatbot?

  • @multigladiator384
    @multigladiator384 3 years ago

    // Standard for general requests
    Accept: */*
    // Standard for navigation-requests in browsers
    Accept: text/html, application/xhtml+xml, application/xml;q=0.9, */*;q=0.8
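To make the q-values in those Accept headers concrete, here is a small sketch that splits an Accept value into media types and their weights. The helper name parse_accept is hypothetical (it is not part of requests or the standard library), and it deliberately ignores parameters other than q.

```python
from __future__ import annotations


def parse_accept(value: str) -> list[tuple[str, float]]:
    """Split an Accept header into (media_type, q) pairs, highest q first."""
    parsed = []
    for part in value.split(","):
        media_type, *params = [p.strip() for p in part.split(";")]
        q = 1.0  # q defaults to 1 when it is omitted
        for param in params:
            key, _, val = param.partition("=")
            if key.strip() == "q":
                q = float(val)
        parsed.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their original order.
    return sorted(parsed, key=lambda item: item[1], reverse=True)


GENERAL = "*/*"
NAVIGATION = "text/html, application/xhtml+xml, application/xml;q=0.9, */*;q=0.8"

print(parse_accept(GENERAL))
print(parse_accept(NAVIGATION))
```

The navigation-style value tells the server the client prefers HTML, will accept XML at a slightly lower weight, and falls back to anything else at 0.8.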

  • @RideableEntertainment
    @RideableEntertainment 1 year ago +2

    Just want to take the time to thank you for the high quality, easy to understand information in such a compressed amount of time in this video

  • @ugurdev
    @ugurdev 3 years ago +2

    Hey mate, look into selenium-wire to grab cookies for using to scrape APIs.

  • @EvgeniySakharov
    @EvgeniySakharov 1 year ago +2

    Boy, you have great content. I am studying at the International Internet Academy. I'm learning Python. Your videos help me well in my studies. Thank you very much! Keep making good videos, don't stop. With best wishes from Russia!

  • @JackyVSO
    @JackyVSO 8 months ago

    Is it also a good idea to employ custom headers when sending requests to a REST API?

  • @Seedrelywagon
    @Seedrelywagon 6 months ago +1

    Thanks for the solution, just started with Python and scraping 🤞

  • @celerystalk390
    @celerystalk390 3 years ago +3

    Thank you John for the great explanation of this important topic in web scraping! Also, the comparison between requests and requests-html is nice.

  • @marcelocpereira123
    @marcelocpereira123 2 years ago +1

    John, congratulations on the excellent video and explanation. Could you make a video using this method with Playwright?

  • @ISAAKKUSH
    @ISAAKKUSH 2 years ago +1

    Thanks for the explanation. I didn't get stumped on this topic thanks to this video.

  • @tonisun4785
    @tonisun4785 1 year ago

    ❤🧡💛💚💙💜🤎🖤🤍 Starts 4:00

  • @Wabuh-Wabuh
    @Wabuh-Wabuh 3 years ago +1

    Do you intentionally fade out of the images to annoy us?

  • @tubelessHuma
    @tubelessHuma 3 years ago +1

    Nice Change 🙄 Informative Dear👍

  • @Angel24112411
    @Angel24112411 1 year ago +1

    + like for the guitars 🎸

  • @EvolutionMedicalCare
    @EvolutionMedicalCare 1 year ago +1

    BOOM! Great share John :)

  • @restitutoochea5723
    @restitutoochea5723 3 years ago +1

    I have a question: I get a response code 200, so it's good to go, but when I start scraping the result is None??

    • @JohnWatsonRooney
      @JohnWatsonRooney 3 years ago

      I think maybe you are being sent a captcha page or similar. Try printing the whole soup text and see what it is.
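A 200 status only says the server returned a page, not that it is the page you wanted. One hedged way to act on the advice above is to look at the start of the body and check it for tell-tale wording. The helper name and the marker strings below are assumptions for illustration; real block pages vary from site to site.

```python
def looks_like_block_page(html, markers=("captcha", "access denied", "unusual traffic")):
    """Return True if the page text contains any of the (guessed) block markers."""
    lowered = html.lower()
    return any(marker in lowered for marker in markers)


# Stand-ins for response.text; in a real scraper these come from the server.
blocked_html = "<html><body>Please complete the CAPTCHA to continue</body></html>"
normal_html = "<html><body><h1>Product list</h1></body></html>"

for body in (blocked_html, normal_html):
    # Printing the first few hundred characters is often enough to spot a captcha page.
    print(body[:200])
    print("blocked?", looks_like_block_page(body))
```

If the check fires, the selectors returning None are a symptom of the page being different, not of a bug in the parsing code.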

  • @8anime_to723
    @8anime_to723 1 year ago +1

    This was a great video

  • @alibaba2746
    @alibaba2746 3 years ago +1

    Great.. John Bro, You are the Man... GBU

  • @marcelagoda4635
    @marcelagoda4635 3 years ago +1

    Why do request headers let us make our programs more human-like? Thanks!

    • @JohnWatsonRooney
      @JohnWatsonRooney 3 years ago

      We can send headers that a browser would normally send automatically, to make it look more like we are a normal browser.
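To see why the defaults stand out, the sketch below prints the User-Agent that Python's standard-library urllib sends when none is set, then swaps in browser-like values. The browser strings are placeholders picked for illustration. requests behaves similarly: its default User-Agent starts with python-requests/.

```python
import urllib.request

opener = urllib.request.build_opener()

# Out of the box the client announces itself as Python, not as a browser.
default_headers = list(opener.addheaders)
print(default_headers)

# Placeholder browser-like values; real browsers send more headers than this.
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0"
opener.addheaders = [
    ("User-agent", BROWSER_UA),
    ("Accept", "text/html, application/xhtml+xml, application/xml;q=0.9, */*;q=0.8"),
    ("Accept-Language", "en-GB,en;q=0.5"),
]
print(opener.addheaders)
```

Any request made through this opener would now carry the browser-like headers instead of the Python default.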

  • @stefenangga5662
    @stefenangga5662 3 years ago +1

    Thank you so muchh John!!!!

  • @GrandRapingBeats
    @GrandRapingBeats 2 years ago

    How do I scrape info from the payload after it's submitted in the code? I'm making an account creator that needs to print the token from the account. How can I scrape this from the headers/payload?

  • @ge0rgeth0mas
    @ge0rgeth0mas 1 year ago

    🙏 Thank you

  • @anishjain1874
    @anishjain1874 3 years ago

    Hey man,
    Is there any way to extract the cookie value into Microsoft Excel automatically? If yes, please share something about it. Thanks in advance.

  • @ezraakran7158
    @ezraakran7158 3 years ago +1

    Once I've written the code in python, does that mean I can go to my browser and the details will transfer to the relevant website?

    • @JohnWatsonRooney
      @JohnWatsonRooney 3 years ago

      Not sure what you mean, but the code does everything itself; it won't make anything in your browser change.

    • @ezraakran7158
      @ezraakran7158 3 years ago

      @JohnWatsonRooney No worries mate, still new to all of this. I solved it in the end.

  • @aogunnaike
    @aogunnaike 3 years ago +1

    Nice hoodie bro

  • @rajesha2770
    @rajesha2770 3 years ago

    Hi, can we add an underscore in a request header? By default the underscore is converted to a hyphen; can we prevent this? Please suggest.

  • @ThomasBass
    @ThomasBass 3 years ago +1

    Hi John, did you receive my email?

  • @leniedor733
    @leniedor733 2 years ago

    Isn't DNT deprecated? What would be an alternative?

  • @alexdin1565
    @alexdin1565 3 years ago +1

    the best like before watching

  • @alibaba2746
    @alibaba2746 3 years ago

    Can you please explain how we can use both Selenium and Requests together?

  • @Gh0stwrter
    @Gh0stwrter 3 years ago +1

    Great video man, Thank you

  • @ahmadaboeleneen3357
    @ahmadaboeleneen3357 1 year ago

    Thank you, helpful and fruitful

  • @ferilukmansyah3037
    @ferilukmansyah3037 3 years ago

    Hi John, thanks for the best tutorials. Since we are discussing headers, I want to ask how to rotate headers so I can avoid captchas. I am facing a case where I have to change the headers when making requests and the form data parameter changes each time. How can I get the form data automatically, so that I don't have to replace it manually when running the scraper?

    • @pr0skis
      @pr0skis 3 years ago +2

      Rotating headers doesn't defeat reCAPTCHAs - you should read this: www.blackhat.com/docs/asia-16/materials/asia-16-Sivakorn-Im-Not-a-Human-Breaking-the-Google-reCAPTCHA-wp.pdf
      To rotate headers, have a list of them in a txt file, then:
      with open('your file name.txt', 'r') as file:
          useragent_list = [item.strip('\n') for item in file]
      Then you can use random.choice(useragent_list) in the relevant place for your header...
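Putting that suggestion together, a minimal sketch of picking a random User-Agent per request might look like the following. The list is kept in memory rather than read from a file so the example is self-contained; the User-Agent strings, URLs, and the build_request helper are all placeholders for illustration. As the comment above notes, this varies what the server sees but should not be expected to defeat reCAPTCHA.

```python
import random
import urllib.request

# Placeholder User-Agent strings; in practice these could be loaded from a text file.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/115.0",
]


def build_request(url, user_agents=USER_AGENTS):
    """Create a Request carrying a randomly chosen User-Agent (nothing is sent yet)."""
    chosen = random.choice(user_agents)
    return urllib.request.Request(url, headers={"User-Agent": chosen})


for url in ("https://example.com/page/1", "https://example.com/page/2"):
    req = build_request(url)
    print(url, "->", req.get_header("User-agent"))
```

With requests, the same idea is a fresh headers dictionary passed to each requests.get call.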

  • @narjesatia
    @narjesatia 3 years ago

    Hi John, thank you for this nice video (as usual). Please, I need some help: can we store scraped image data in a CSV file? Thanks again.

  • @cyy1117
    @cyy1117 3 years ago

    The channel is growing. Keep it up John!