Raspberry Pi + Squid: Building a Proxy Server with your Raspberry Pi for Web-scraping

  • Published: Oct 2, 2024
  • Discover the simplicity of setting up a proxy server on your Raspberry Pi using the user-friendly and open-source software known as Squid. In this tutorial, we provide a step-by-step guide, demonstrating its application for web scraping. However, the advantages of establishing a proxy server extend beyond this, encompassing enhanced security, efficient caching, accelerated networking requests, and streamlined connection management. Unlock the potential of your Raspberry Pi with this comprehensive tutorial on Squid proxy server setup!
    Link to Blog Post:
    shillehtek.com...
    Link to Post on Medium:
    / building-a-simple-prox...
    You can donate here:
    www.buymeacoff...
    Join this channel to get access to perks:
    / @mmshilleh

Comments • 14

  • @jaxjax7318 2 months ago +1

    I loved using the tail -f command on the Squid log, so I could see in real time the IPs and connections the client was making. It was very helpful for diagnosing which IPs I needed to block in the firewall.
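    The log-watching approach described here can also be summarized after the fact. Below is a minimal Python sketch, assuming Squid's default native access log format (ten whitespace-separated fields) and the usual Debian log location /var/log/squid/access.log; both are assumptions about the setup, and the helper names are hypothetical.

```python
from collections import Counter


def parse_access_line(line):
    """Parse one line of Squid's native access.log format.

    Assumed field order: time, elapsed, client, result/status, bytes,
    method, URL, ident, hierarchy/peer, content type.
    Returns None for lines that do not have at least ten fields.
    """
    fields = line.split()
    if len(fields) < 10:
        return None
    return {
        "client": fields[2],
        "result": fields[3],
        "method": fields[5],
        "url": fields[6],
        "peer": fields[8],
    }


def destinations_by_client(lines):
    """Count how often each (client IP, requested URL) pair appears."""
    counts = Counter()
    for line in lines:
        entry = parse_access_line(line)
        if entry is not None:
            counts[(entry["client"], entry["url"])] += 1
    return counts
```

    To use it, open the log file and pass the file object to destinations_by_client; for HTTPS traffic the URL field is typically just host:port from the CONNECT request.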

    • @mmshilleh 2 months ago

      Thanks for the info!

  • @jorgevillarreal2245 5 months ago +5

    But if the Raspberry Pi proxy is on the same local network, isn't it using the same public IP address for web scraping? Shouldn't the Raspberry Pi proxy be on a network with a different ISP-provided public IP so that it can actually help avoid throttling?

    • @mmshilleh 5 months ago +2

      That's a great question! It's possible to still have some success with web scraping even when using a proxy on the same local network, especially if the websites being scraped aren't actively blocking or throttling requests based on IP addresses. However, for more reliable and efficient web scraping, it's generally recommended to use a proxy with a different public IP address. This helps to distribute requests effectively and reduces the likelihood of being blocked or throttled by websites. If you've had luck with your setup on your local network, it might be because the websites you're scraping haven't implemented strict IP-based blocking or throttling measures. Just keep in mind that as your scraping activities increase or as you target different websites, you may encounter limitations or restrictions.
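      One way to check the point raised in this thread is to compare the public IP seen with and without the proxy. The sketch below is a hedged illustration using only the Python standard library; the echo service URL (api.ipify.org) and the example proxy address are assumptions, not part of the tutorial, and the function names are hypothetical.

```python
import urllib.request

# Assumption: any service that returns the caller's public IP as plain text.
IP_ECHO_URL = "https://api.ipify.org"


def make_opener(proxy_url=None):
    """Build an opener that goes direct (None) or through proxy_url."""
    if proxy_url is None:
        # An empty mapping disables proxies, including environment ones.
        handler = urllib.request.ProxyHandler({})
    else:
        handler = urllib.request.ProxyHandler(
            {"http": proxy_url, "https": proxy_url}
        )
    return urllib.request.build_opener(handler)


def public_ip(proxy_url=None, timeout=10):
    """Return the public IP the echo service sees for this route."""
    with make_opener(proxy_url).open(IP_ECHO_URL, timeout=timeout) as resp:
        return resp.read().decode().strip()


def shares_egress(direct_ip, proxied_ip):
    """True when both routes leave through the same public address."""
    return direct_ip == proxied_ip
```

      For example, shares_egress(public_ip(), public_ip("http://192.168.1.100:3128")) would be expected to return True when the Pi sits behind the same router as the client, since both routes exit through the same ISP connection.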

  • @MASKDANTE 1 month ago +1

    My internet connection goes through a proxy at 192.168.49.1:8000, and I have to configure that proxy to get online. How do I configure the same thing on the Raspberry Pi 4? I haven't been able to use the internet over Wi-Fi: the Pi 4 connects to the Wi-Fi and is assigned an IP automatically, but it can't browse because I haven't configured the proxy the way I would on a client.

    • @mmshilleh 1 month ago

      I am not sure, my friend.

    • @JamminJosh7 1 month ago +1

      Did you try manually setting a static IP in the router admin page?

    • @mmshilleh 1 month ago

      @JamminJosh7 I have done that before, yes.

  • @jaredpullman1173 3 months ago +1

    How can you adjust squid.conf to allow remote use of the proxy? If I set the proxy I made as my browser proxy on my laptop, it works great when I'm connected to the same Wi-Fi network. But if I'm at a friend's house on their Wi-Fi and try to use it as my browser proxy, it never prompts for user authentication, so the remote connection fails.

    • @mmshilleh 3 months ago

      You can do these sorts of things easily with Tailscale. I recommend looking into that.

  • @tsriramaraju 3 months ago +1

    Great tutorial! I was wondering if we could use 3-4 4G LTE modems and rotate the IPs whenever there's a block. Do you have any suggestions on how to achieve this? Please provide some guidance.

    • @mmshilleh 3 months ago

      Wow, you definitely could; it sounds like an interesting project. You would need some error handling in your Python code. I have done something similar, and I do not think it is that complicated.
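      The error handling mentioned here could look roughly like the sketch below: try each proxy in turn and move to the next one when a response looks like a block. This is a hedged illustration, not the author's code. Treating HTTP 403 and 429 as block signals is an assumption, the function names are hypothetical, and each proxy URL is assumed to front a different uplink (for example, one Squid instance per LTE modem on Squid's default port 3128).

```python
import itertools
import urllib.error
import urllib.request

# Assumption: these status codes indicate the target site is blocking us.
BLOCK_CODES = {403, 429}


def fetch_via(proxy_url, url, timeout=10):
    """Fetch url through a single HTTP proxy and return the body bytes."""
    handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=timeout) as resp:
        return resp.read()


def fetch_with_rotation(url, proxies, fetch=fetch_via, max_attempts=None):
    """Try proxies in order, rotating on blocks or connection errors.

    Returns (proxy_used, body). Raises RuntimeError if every attempt fails.
    """
    attempts = max_attempts or len(proxies)
    last_error = None
    for proxy in itertools.islice(itertools.cycle(proxies), attempts):
        try:
            return proxy, fetch(proxy, url)
        except urllib.error.HTTPError as exc:
            # HTTPError must be caught before its parent class URLError.
            if exc.code not in BLOCK_CODES:
                raise  # a real error such as 404: rotating will not help
            last_error = exc
        except urllib.error.URLError as exc:
            last_error = exc  # proxy unreachable: try the next one
    raise RuntimeError(f"all {attempts} attempts failed") from last_error
```

      The fetch parameter is injectable so the rotation logic can be exercised without a network, and a scraper would likely add a delay between attempts.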

  • @davidbotham7090 8 months ago +1

    Great tutorial! I would love to connect a few of these to a single RPI to monitor my filament containers. Do you know if that is possible? And, if so, maybe where there is a howto on something like that?

    • @mmshilleh 8 months ago

      Hey man, no, I do not know off the top of my head how to do that. You probably need more sensor apparatus than just a simple Pi. If you would like to discuss this in detail, feel free to book a consulting slot via the Buy Me a Coffee link on my RUclips profile. Sounds involved yet interesting!