Scraping LinkedIn Jobs with Python Scrapy (2022)

  • Published: 22 Dec 2024

Comments • 35

  • @pit5335 · 1 year ago · +4

    I followed you at every step and it was a success. Thank you for your time and dedication. Regards, +1 follower.

    • @scrapeops · 1 year ago

      Great, thanks for the feedback!

  • @tingwang5009 · 1 year ago · +2

    Followed you step by step and the code works perfectly, thanks!

  • @Alex-zz8vk · 1 year ago · +3

    Hey, I found a bug. If you open the jobs-guest page and try to scrape all the links to job postings by changing the start parameter, you can get some data duplicated and some data missed. Any ideas how to fix that?
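One likely cause: the listing order shifts between paged requests, so adjacent pages overlap (duplicates) while other postings slide into gaps you never request (misses). A minimal mitigation is to deduplicate on the job ID within the run; the sketch below assumes item dicts carry a "url" field and that the URL's final segment ends in the numeric job ID (that URL shape is an assumption):

```python
from scrapy.exceptions import DropItem


def job_id_from_url(url: str) -> str:
    """Extract the numeric ID that ends a LinkedIn job URL (assumed shape),
    e.g. ".../jobs/view/python-developer-at-acme-3754412832" -> "3754412832".
    """
    return url.split("?")[0].rstrip("/").rsplit("-", 1)[-1]


class DropDuplicateJobsPipeline:
    """Item pipeline that drops job postings already seen during this run."""

    def __init__(self):
        self.seen_ids = set()

    def process_item(self, item, spider):
        job_id = job_id_from_url(item["url"])
        if job_id in self.seen_ids:
            raise DropItem(f"Duplicate job {job_id}")
        self.seen_ids.add(job_id)
        return item
```

Enable it via ITEM_PIPELINES in settings.py. It handles the duplicates; re-running the search (or slightly overlapping the start offsets) helps recover the missed postings.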

  • @sumedhpatil1474 · 1 year ago · +4

    How do I get the full job description?
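One answer to the question above: the list endpoint only returns job cards, so the full description needs a second request to each job's detail page. A minimal sketch, assuming the tutorial's jobs-guest search URL and a show-more-less-html__markup description block on the detail page (both are assumptions LinkedIn can change):

```python
import scrapy


class JobDescriptionSpider(scrapy.Spider):
    """Sketch: follow each job card to its detail page for the full text."""
    name = "job_descriptions"
    start_urls = [
        "https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/"
        "search?keywords=python&location=United%20States&start=0"
    ]

    def parse(self, response):
        # The list endpoint only exposes card-level data (title, company, URL).
        for href in response.css("li a::attr(href)").getall():
            yield response.follow(href, callback=self.parse_detail)

    def parse_detail(self, response):
        yield {
            "url": response.url,
            "title": response.css("h1::text").get(default="").strip(),
            # Assumed selector for the description block on guest job pages:
            "description": " ".join(
                response.css("div.show-more-less-html__markup *::text").getall()
            ).strip(),
        }
```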

  • @PujanPanoramicPizzazz · 6 months ago

    Seems the solution is outdated, as the jobs-guest filter does not work right now; it's voyager now, but more complicated, and I cannot get that URL.

  • @ramensusho · 1 year ago · +1

    Followed the same code from the blog post and video, but it is always returning an empty list (not-found). How do I solve this?

    • @nishabhakar9207 · 1 year ago · +1

      Same with me; if you find a solution, kindly let me know!
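Regarding the empty-list / "not-found" reports in this thread: before rewriting selectors, it helps to confirm what HTML actually came back, since a block page and a changed layout both parse to nothing. A minimal debugging sketch (the URL, selectors, and dump file name are assumptions):

```python
import scrapy


class DebugJobsSpider(scrapy.Spider):
    """Sketch: log and dump the response when every field parses empty."""
    name = "debug_jobs"
    handle_httpstatus_all = True  # let non-200 responses reach parse()
    start_urls = [
        "https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/"
        "search?keywords=python&start=0"
    ]

    def parse(self, response):
        jobs = response.css("li")
        if not jobs:
            self.logger.warning("No job cards: status=%s, body length=%s",
                                response.status, len(response.text))
            # Dump the body so it can be opened in a browser and inspected.
            with open("debug_response.html", "wb") as f:
                f.write(response.body)
            return
        for job in jobs:
            yield {"title": job.css("h3::text").get(default="not-found").strip()}
```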

  • @harrisonjameslondon · 6 months ago

    Has anyone had issues with running the final scrapy list? I am only getting 'quotes' instead of linked_jobs! Please help, as I have just spent 4 hours on this!

  • @John-BruceGreen · 1 year ago

    Hello. I've really enjoyed your videos. I have a question about LinkedIn scraping: when I go to LinkedIn, it seems to require that I log in or sign up, and I do not see any way around this. When I inspect their top-level URL page, I do not see the elements you refer to. Can you advise? Has their site changed since your recording, so as to make it inaccessible to guests? Thanks in advance.
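For the login-wall question: the video does not scrape the logged-in site at all; it uses LinkedIn's public jobs-guest endpoint, which returns job-card HTML without a session, so the elements won't match what you see on the top-level page. A minimal sketch of that request (the keywords/location values are placeholders, and as other commenters note, the endpoint itself may have changed since recording):

```python
import scrapy


class GuestJobsSpider(scrapy.Spider):
    """Sketch: query the public jobs-guest endpoint instead of the main site."""
    name = "guest_jobs"

    def start_requests(self):
        # Returns a fragment of job cards; no login or cookies required.
        url = (
            "https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/"
            "search?keywords=python&location=United%20States&start=0"
        )
        yield scrapy.Request(url, callback=self.parse)

    def parse(self, response):
        for job in response.css("li"):
            yield {
                "title": job.css("h3::text").get(default="not-found").strip(),
                "company": job.css("h4 a::text").get(default="not-found").strip(),
                "url": job.css("a::attr(href)").get(default="not-found"),
            }
```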

  • @ahmedelsaid8368 · 1 year ago · +1

    If you get a 403 error, that means your request didn't have an API key, the API key was invalid, or you haven't validated your email address.

    • @ThePilosyankaren · 1 year ago

      Saved me from frustration leading to an existential crisis :D
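For context on the 403s in this thread: the tutorial routes requests through the ScrapeOps proxy, and the proxy itself returns 403 when the api_key parameter is missing, invalid, or attached to an unverified account. A minimal sketch of the URL-wrapping pattern; the proxy endpoint shown follows the ScrapeOps docs, so confirm it there before relying on it:

```python
from urllib.parse import urlencode

import scrapy

SCRAPEOPS_API_KEY = "YOUR_API_KEY"  # must be valid and email-verified, else 403


def proxy_url(target_url: str) -> str:
    """Wrap a target URL so the request is routed through the ScrapeOps proxy."""
    query = urlencode({"api_key": SCRAPEOPS_API_KEY, "url": target_url})
    return f"https://proxy.scrapeops.io/v1/?{query}"


class ProxiedJobsSpider(scrapy.Spider):
    name = "proxied_jobs"

    def start_requests(self):
        target = ("https://www.linkedin.com/jobs-guest/jobs/api/"
                  "seeMoreJobPostings/search?keywords=python&start=0")
        yield scrapy.Request(proxy_url(target), callback=self.parse)

    def parse(self, response):
        # A 403 here points at the key/account, not at LinkedIn itself.
        self.logger.info("Proxy responded with status %s", response.status)
```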

  • @forexhunter2040 · 1 year ago

    Always the best

    • @scrapeops · 1 year ago

      Thank you!

    • @forexhunter2040 · 1 year ago · +1

      @scrapeops I realised that after you scroll for a while, the jobs data stops being loaded automatically and you encounter a "See More Jobs" button. Is there a way my bot can detect that the data is no longer being loaded automatically and start following the button instead?
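A possible answer, staying with the Scrapy approach from the video rather than browser scrolling: the jobs-guest endpoint pages via the start parameter, so "data stopped loading" simply shows up as a response with zero job cards, and there is no button to click. A minimal sketch of that stop condition (the page size of 25 is an assumption):

```python
import scrapy


class PagedJobsSpider(scrapy.Spider):
    """Sketch: page the guest endpoint until a request returns no job cards."""
    name = "paged_jobs"
    base_url = ("https://www.linkedin.com/jobs-guest/jobs/api/"
                "seeMoreJobPostings/search?keywords=python&start=")

    def start_requests(self):
        yield scrapy.Request(self.base_url + "0", cb_kwargs={"start": 0})

    def parse(self, response, start):
        jobs = response.css("li")
        if not jobs:
            # The endpoint's equivalent of running out of "See More Jobs".
            self.logger.info("No jobs returned at start=%s; stopping.", start)
            return
        for job in jobs:
            yield {"title": job.css("h3::text").get(default="not-found").strip()}
        yield scrapy.Request(self.base_url + str(start + 25),
                             cb_kwargs={"start": start + 25})
```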

  • @CarolineTravelsAround · 10 months ago

    Hi! I am getting the errors "Unknown command: list" and "Unknown command: crawl". Does anyone know how I can fix this? Thanks!

    • @mugiwareng · 9 months ago

      Maybe it's because Scrapy cannot find an active project in the directory where you are trying to run the 'crawl' command.

    • @mugiwareng · 9 months ago

      If you have already activated the virtual environment, then make sure you are in the root directory of your Scrapy project when you run the 'crawl' command. That is the directory where the 'scrapy.cfg' file is located.

    • @mugiwareng · 9 months ago

      If you are following the video: first cd into basic-scrapy-project, then run 'scrapy list'.

  • @Wideloot · 1 year ago

    Amazing, recommended!

  • @pit5335 · 1 year ago

    How do I automate the extraction so that it does not repeat the data? I need to perform searches every hour to be able to apply quickly. Many thanks for everything.

    • @pit5335 · 1 year ago

      I export it to JSON and then use pandas to make it an xlsx. It would be nice to be able to do everything automatically; I am looking for information and await your comments. I would like to automate the process. Thanks.

    • @scrapeops · 1 year ago · +2

      To schedule it to run every hour, you could deploy it to a server using the ScrapeOps scheduler (free). Here is a video on how to do it: scrapeops.io/docs/servers-scheduling/digital-ocean-integration/
      To make sure it only extracts new data, you need to create a pipeline that checks a database for what you have already scraped before adding new data. Check out this guide: scrapeops.io/python-scrapy-playbook/scrapy-save-data-postgres/#only-saving-new-data
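The linked guide uses Postgres; here is the same "only save new data" idea as a minimal SQLite sketch, keying on the job URL (the file, table, and field names are assumptions):

```python
import sqlite3

from scrapy.exceptions import DropItem


class OnlyNewJobsPipeline:
    """Sketch: persist seen job URLs so hourly runs only emit new postings."""

    def open_spider(self, spider):
        self.conn = sqlite3.connect("jobs_seen.db")
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS seen_jobs (url TEXT PRIMARY KEY)"
        )

    def close_spider(self, spider):
        self.conn.commit()
        self.conn.close()

    def process_item(self, item, spider):
        try:
            # The PRIMARY KEY rejects URLs that a previous run already stored.
            self.conn.execute(
                "INSERT INTO seen_jobs (url) VALUES (?)", (item["url"],)
            )
            self.conn.commit()
        except sqlite3.IntegrityError:
            raise DropItem(f"Already scraped: {item['url']}")
        return item
```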

    • @pit5335 · 1 year ago · +1

      @scrapeops You are very generous, thanks for the work!

  • @manoj8415 · 1 year ago

    It is giving me an error saying that the HTTP status is not handled or allowed.

    • @scrapeops · 1 year ago

      What is the status code?

    • @hakeembashiru5615 · 1 year ago

      @scrapeops 403

    • @scrapeops · 1 year ago · +2

      @hakeembashiru5615 That means your request didn't have an API key, the API key was invalid, or you haven't validated your email address.

    • @hakeembashiru5615 · 1 year ago · +1

      @scrapeops Oooh, I didn't actually validate my email address; I'll have to do that, thanks. But I got a workaround using scrapy_selenium with selenium.webdriver and the Chrome driver.

    • @ahmedelsaid8368 · 1 year ago

      @scrapeops Thanks a lot, this was the solution to the error for me; I only needed to validate my email.

  • @BazaroffM · 1 year ago · +2

    Hey, that's great content and the easiest setup I've ever seen, but man, the pricing is hella pricey: it spends 1000 credits for 250 jobs. Does it really cost that much? I'd love to support you, but paying 100 bucks for that kind of makes me look for alternatives to the proxy. Just wanted to share some feedback with you.

    • @scrapeops · 1 year ago · +1

      Unfortunately, LinkedIn is one of the hardest websites to scrape out there! It costs 70 times more than scraping an Amazon page (or another normal website). The average cost is approx. $7 to scrape 1,000 profile/job pages.

  • @somethingfeto · 10 months ago

    I don't know if it still works or not, but I enjoyed the video. Well played!

  • @1622roma · 1 year ago

    I did my best to keep up, but it was quite challenging. Why make it so complicated? If there's a video tutorial, why not include explanations of the code? It's nearly impossible for someone without a background in web scraping to follow along.

    • @CambiacapasYT · 1 year ago

      It is challenging. When I find something like this, I drop down a level and then come back: try a web scraping fundamentals tutorial, then a basic Scrapy tutorial, then learn APIs, and you'll be ready for this one.