GO for Beginners - Web Scraping with Golang Tutorial

  • Published: 13 Aug 2024
  • In this Golang video tutorial I will guide you through an excellent example of a beginner project for those who have started learning Go. This basic HTML web scraper will introduce you to some of the Go fundamentals, and hopefully also make for a slightly more interesting project that you can use to expand your Go skills.
    We will learn how to set up our project and use a web scraping framework called Colly - go-colly.org/ to extract some data from a test site.
    Patreon: / johnwatsonrooney
    Donations: www.paypal.com/donate/?hosted...
    Proxies: iproyal.club/JWR50
    Hosting: Digital Ocean: m.do.co/c/c7c90f161ff6
    Gear I use: www.amazon.co.uk/shop/johnwat...
    -------------------------------------
    Disclaimer: These are affiliate links and as an Amazon Associate I earn from qualifying purchases
    -------------------------------------
  • Science

Comments • 68

  • @rouabahoussama
    @rouabahoussama 2 years ago +9

    Hey great work,
    Actually, struct fields have zero values, so you don't have to specify a value for every field if you don't have one.
    For example:
    package main

    import "fmt"

    type Item struct {
        Name  string
        Price string
    }

    func main() {
        // Price is omitted, so it takes the zero value "".
        item := Item{Name: "John"}
        fmt.Printf("%+v\n", item)
    }
    try it and let me know what you have learned.

    • @JohnWatsonRooney
      @JohnWatsonRooney 2 years ago +3

      Ahh ok thanks that makes sense!

    • @rouabahoussama
      @rouabahoussama 2 years ago

      @JohnWatsonRooney Keep going, John. Go is so great, I love it.

  • @M1911Original
    @M1911Original 2 years ago +4

    Best guide for web scraping on the internet!!! God bless you mate. Because of you I've been able to start so many projects that I was scared of starting because I thought web scraping was too complex. But you simplified so many things for me

  • @anisarebecca
    @anisarebecca 6 months ago +1

    Great video, John, thank you! You strike a perfect balance of moving at a good pace but clearly explaining what you're doing.

  • @harkirehal258
    @harkirehal258 1 year ago +1

    This is the way to learn a new language. Simple, concise programs. Great video!

  • @hsider
    @hsider 1 year ago +2

    I was already familiar with Golang syntax but hadn't coded anything interesting with it until today. Wow, the speed and simplicity of Golang, plus this library that comes packed with all the features I need for scraping: amazing discovery. I don't think I'm going back to Python for this kind of project. Nice video mate, as always 👌

    • @JohnWatsonRooney
      @JohnWatsonRooney 1 year ago

      thanks! it's very different but i really like it, and am learning more and more GO in my free time

  • @comfixit
    @comfixit 2 years ago +6

    I really like how you present the information and structure your videos. Would you ever consider doing a video on how you prep to make videos?

    • @JohnWatsonRooney
      @JohnWatsonRooney 2 years ago +3

      Thanks! Sure, I could look at doing something like that

  • @lamhintai
    @lamhintai 2 years ago +3

    Great content as always! I’m pleasantly surprised that you are covering new languages, as I really like your succinct style of presentation. I have always heard good things about Golang but had the impression that it’s mostly for web backend stuff. Your video sparked my interest to actually pick it up now!

  • @mondesinxi9397
    @mondesinxi9397 2 years ago +2

    Great walk-through! I used this to get a web-scraper going. I had done some web scraping with Python using selenium to get product info for a local supermarket chain, the code was much longer and getting it working was a lot more involved. It would probably have been neater with Scrapy. Anyway, this makes for a nice Part 1 of a mini-project. Part 2 would be to get your data into a database, sqlite as a start. Finally, Part three would create a RESTful API to expose your scraped data.

  • @mujeebishaque
    @mujeebishaque 2 years ago +5

    Please do a comparison of speed between Python and Golang scraping/frameworks/libraries. Thank you. Love your content. I hit the like button first and watch later, top quality content as always.

    • @JohnWatsonRooney
      @JohnWatsonRooney 2 years ago +2

      Thank you very much! Yes I am going to do a comparison video!

    • @GlauberSilva333
      @GlauberSilva333 2 years ago +1

      I'm a Python programmer and I'm switching to Golang for these cases. The way each one accesses hardware resources is hugely different. Golang is faster than Python by far.

  • @LeviElekes
    @LeviElekes 5 months ago

    Thanks for the video

  • @realB12
    @realB12 1 year ago +1

    At 11:30 it starts to get really interesting, when the Colly framework starts auto-navigating to different pages. However, the details of doing so might not be that obvious for beginners like me (it took me some minutes) and could have been explored in a bit more detail and at more length. All in all, great vid. I have learned a lot in no time!!

    • @JohnWatsonRooney
      @JohnWatsonRooney 1 year ago

      Sure that's a fair comment. I'm glad you enjoyed the video - I will do some more on GO and go into it a bit deeper!

  • @cryptojunkie591
    @cryptojunkie591 1 year ago +1

    Great content, John. It seems Go/Colly will make things easier for the personal project I've been working on since I first thought about trying scraping a couple of months ago. I'd only just managed to get what I needed from Python, for the most part, by following your vids, adapting the process for the site I want data from, and using others' code I'd found online. Getting the URLs I need to be able to dive in further was certainly a lot easier with Go.

    • @JohnWatsonRooney
      @JohnWatsonRooney 1 year ago

      Thanks. I really like Go and colly is great for these things. But Python is still a simpler language and easier to pick up - but give Go a chance and see how you feel

  • @coffeeintocode
    @coffeeintocode 1 year ago +1

    Great video. Need more Go content please

  • @thisguyisnotable
    @thisguyisnotable 8 months ago +1

    awesome tutorial, really enjoyable to follow along! thanks, you earned a sub!

  • @mjacfardk
    @mjacfardk 2 years ago +1

    Thank you brother.
    As always great content from great mind.
    Keep going brother 🤗.

  • @janisvelbergs6394
    @janisvelbergs6394 2 years ago +1

    Hi John,
    This was great content! Go with Go ;)

  • @dwiatmokopurbosakti1193
    @dwiatmokopurbosakti1193 2 years ago

    Good content man, please do more scraping with Go, especially on dynamic websites, ty so much

  • @ekoaripurnomo3651
    @ekoaripurnomo3651 10 months ago +2

    simple and provement code.... 👍👍👍

  • @Danny67483
    @Danny67483 1 year ago

    This is an excellent tutorial, John.
    At the end of your demonstration you mentioned using it with a proxy. I found that intriguing. Could you please elaborate on this? Or if you know of any tutorial online that I can read, that would be great.
    Cheers.

  • @valuetraveler2026
    @valuetraveler2026 1 year ago +1

    Much easier to get started with than Scrapy used to be. General comment: such frameworks are good for structured sites, but what do you think about cases where there are no next links and no sitemap?

    • @JohnWatsonRooney
      @JohnWatsonRooney 1 year ago

      It makes it a bit more tricky, but you can either use a range loop for pages, or generate page numbers to add to URLs and use c.Visit(). It really depends on how the page works. But yes, you are correct, scraping frameworks like Colly are really best when it's structured HTML sites
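The page-number approach mentioned in the reply above could look roughly like this. It is only a sketch: the URL pattern and the page range are made-up assumptions, and the Colly call is shown as a comment so the example runs with the standard library alone.

```go
package main

import "fmt"

// pageURLs builds one URL per page number by filling the %d placeholder
// in pattern. It returns nil when the range is empty.
func pageURLs(pattern string, first, last int) []string {
	if last < first {
		return nil
	}
	urls := make([]string, 0, last-first+1)
	for page := first; page <= last; page++ {
		urls = append(urls, fmt.Sprintf(pattern, page))
	}
	return urls
}

func main() {
	// Assumed URL scheme; real sites may use /page/N/, offsets, and so on.
	for _, u := range pageURLs("https://example.com/products?page=%d", 1, 3) {
		// In a Colly scraper, this is where c.Visit(u) would be called.
		fmt.Println(u)
	}
}
```

How the last page number is found depends on the site; it might be read from the first page or discovered by stopping when a page returns no items.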

  • @David-mj9st
    @David-mj9st 2 years ago +1

    I am learning from your Python videos, I need to keep up the pace, haha!

    • @JohnWatsonRooney
      @JohnWatsonRooney 2 years ago

      Ha thanks I just fancied a change, back to more Python next week!

  • @aminkhodayari68
    @aminkhodayari68 1 year ago +1

    great video !

  • @CrazyFanaticMan
    @CrazyFanaticMan 2 years ago +1

    Oh my god, I need to learn Go...

    • @JohnWatsonRooney
      @JohnWatsonRooney 2 years ago

      I'm learning more and more of it now and really like it so far

  • @dpljs
    @dpljs 1 year ago

    Really good work, exactly what I was searching for. Also, what color scheme do you use?

  • @sebwylleman
    @sebwylleman 10 months ago +1

    Cool video and setup mate! What font are you using, and is that the Gruvbox theme?

    • @JohnWatsonRooney
      @JohnWatsonRooney 10 months ago

      Thanks! I change my setup a lot, but yes, this is Gruvbox, or "Gruvbox Material", which I used to use a lot. As for the font, I'm not sure, but I think it's "m plus 2m" from adobe

    • @sebwylleman
      @sebwylleman 10 months ago +1

      @JohnWatsonRooney Top man! I see that you moved over to Neovim, if I am not mistaken. That's my goal eventually; I'm currently using Vim motions in VS Code as an easier transition :)

    • @JohnWatsonRooney
      @JohnWatsonRooney 10 months ago +1

      @sebwylleman I have, I've been using it for a while now - sometimes I still use VS Code for demos, but with Vim motions!

  • @eduardosalles9212
    @eduardosalles9212 1 year ago +1

    awesome!

  • @anwar587
    @anwar587 2 years ago +1

    Thanks for the video, I really liked it.
    But I have a question: can we do web scraping with C/C++? If yes, how? And sorry for my bad English

    • @JohnWatsonRooney
      @JohnWatsonRooney 2 years ago +1

      Hey! I’m sure you can, however I don’t know anything about C or C++ so I’m afraid I can’t really help

  • @budi0580
    @budi0580 2 years ago +2

    Is it possible to scrape a page that needs a login first?

    • @JohnWatsonRooney
      @JohnWatsonRooney 2 years ago +2

      Yes it is, however you will need to find out how the login form works and make a POST request to log in first, saving that in a session object. Possible, but more complex than what I covered in this video

  • @salimbo4577
    @salimbo4577 2 years ago +1

    A lot of people are learning Go, but I keep googling Rust libraries LOL. My history is full of "rust rest api framework", "rust web scraping library", "rust machine learning library".

    • @JohnWatsonRooney
      @JohnWatsonRooney 2 years ago +1

      I also looked at Rust for a second language, but after searching a bit I chose Go and am liking it so far!

  • @pr0xy_
    @pr0xy_ 2 years ago +3

    I heard that Go is really fast. Does that apply in the case of web scraping too? I have very big scraping projects with millions of pages. Even Python multithreading and async aren't enough. I was wondering which other language might work instead.

    • @JohnWatsonRooney
      @JohnWatsonRooney 2 years ago +3

      It absolutely can be; it adds complexity, but yes. Colly states 1k plus pages a minute - I’m sure this can be achieved with Scrapy too, which is built on the Twisted framework. Interesting though, I will do some testing

    • @mattbass4807
      @mattbass4807 2 years ago

      @JohnWatsonRooney would be a great video benchmarking stuff!

    • @mondesinxi9397
      @mondesinxi9397 2 years ago +1

      I guess having native concurrency with Goroutines doesn't hurt either
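To make the goroutine point above concrete, here is a small sketch of a worker pool that processes several URLs at once. It is not how Colly works internally and it makes no speed claims; the `fetchAll` name, the worker count, and the fake fetch function are all assumptions chosen so the example runs without network access.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchAll runs fetch for every URL using at most `workers` goroutines
// and returns the results keyed by URL. fetch stands in for real
// download-and-parse logic.
func fetchAll(urls []string, workers int, fetch func(string) string) map[string]string {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan string)
	results := make(map[string]string, len(urls))
	var mu sync.Mutex // guards results, which the workers share
	var wg sync.WaitGroup

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for u := range jobs {
				body := fetch(u)
				mu.Lock()
				results[u] = body
				mu.Unlock()
			}
		}()
	}
	for _, u := range urls {
		jobs <- u
	}
	close(jobs) // lets the workers' range loops finish
	wg.Wait()
	return results
}

func main() {
	urls := []string{
		"https://example.com/1",
		"https://example.com/2",
		"https://example.com/3",
	}
	// Fake fetch so the sketch needs no network.
	fake := func(u string) string { return "page body for " + u }
	results := fetchAll(urls, 2, fake)
	fmt.Println(len(results), "pages fetched")
}
```

Capping the number of workers keeps the load on the target site bounded, which matters more for large crawls than raw language speed.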

    • @JohnWatsonRooney
      @JohnWatsonRooney 2 years ago

      @mondesinxi9397 sure, all stuff I am still learning about but so far so good

  • @tatsamui
    @tatsamui 2 years ago

    Which is more flexible compared to Python?

  • @PerfectmindAMV
    @PerfectmindAMV 2 years ago +2

    Is a Go scraper faster than a Python scraper? 😐

  • @wtfdoiputhere
    @wtfdoiputhere 1 year ago +1

    Hope you're still enjoying go :)

    • @JohnWatsonRooney
      @JohnWatsonRooney 1 year ago +1

      Yes very much so!

    • @wtfdoiputhere
      @wtfdoiputhere 1 year ago

      @JohnWatsonRooney any remarkable limitations / drawbacks on Windows? I've only heard it mentioned once in some talk

    • @JohnWatsonRooney
      @JohnWatsonRooney 1 year ago +1

      @wtfdoiputhere to be honest I'm not sure, I don't use Windows so much anymore

  • @SecurityM1nd
    @SecurityM1nd 2 years ago

    Hey, what is your VS Code theme name, please?

    • @JohnWatsonRooney
      @JohnWatsonRooney 2 years ago

      Hey it’s one of the GitHub themes, dark medium or dark soft I think

  • @mohammedaljahwari1165
    @mohammedaljahwari1165 2 years ago +1

    Xpath + regex are much better ❤️

  • @shaf8200
    @shaf8200 1 year ago

    do we have to do an os.Exit anywhere?