How to Webscrape Linkedin Profiles 2022 with Selenium & Beautiful Soup

  • Published: 21 Dec 2024

Comments • 13

  • @nizardad1031 2 years ago +2

    That intro 😭 i felt ur pain thanks top G for such a precious help

  • @JUANCAMILOVEGAPEREZ 8 months ago

    Great video!! Is there a way to extract only the experience information? I tried your code, but when extracting experience or education information, the ' ' spaces are 18 for both topics.

  • @mansisharma3624 1 year ago

    Can you also specify the format in which we need to write our login.txt?

  • @mikew2883 1 year ago

    Excellent video!👍

  • @sherlyhartono8024 2 years ago

    Do you know how to bypass the security verification check? It gives me the verification prompt after I run the program several times.

  • @muneebahmedabbasi6840 2 years ago

    Great work.

  • @ayush_aksingh 2 years ago

    Github link please. Just now please

    • @aiwithaz 2 years ago

      github.com/thelazyaz/linkedin-web-scraping

  • @freddyrodriguezalvear6590 1 year ago

    Hello, I have a problem when extracting the data
    Warning (from warnings module):
    File "C:\Users\fredd\OneDrive\Imágenes\linkedin-web-scraping-main (1)\linkedin-web-scraping-main\linkedin_employee_scraper.py", line 11
    driver = webdriver.Chrome(path)
    DeprecationWarning: executable_path has been deprecated, please pass in a Service object
    Warning (from warnings module):
    File "C:\Users\fredd\OneDrive\Imágenes\linkedin-web-scraping-main (1)\linkedin-web-scraping-main\linkedin_employee_scraper.py", line 42
    source = BeautifulSoup(driver.page_source)
    GuessedAtParserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.
    The code that caused this warning is on line 42 of the file C:\Users\fredd\OneDrive\Imágenes\linkedin-web-scraping-main (1)\linkedin-web-scraping-main\linkedin_employee_scraper.py. To get rid of this warning, pass the additional argument 'features="lxml"' to the BeautifulSoup constructor.
    Traceback (most recent call last):
    File "C:\Users\fredd\OneDrive\Imágenes\linkedin-web-scraping-main (1)\linkedin-web-scraping-main\linkedin_employee_scraper.py", line 176, in <module>
    searchable = getProfileURLs(company)
    File "C:\Users\fredd\OneDrive\Imágenes\linkedin-web-scraping-main (1)\linkedin-web-scraping-main\linkedin_employee_scraper.py", line 53, in getProfileURLs
    title = invisibleguy.findNext('div', class_='lt-line-clamp lt-line-clamp--multi-line ember-view').contents[0].strip('\n').strip(' ')
    AttributeError: 'NoneType' object has no attribute 'contents'
    Greetings from Ecuador

    • @aiwithaz 1 year ago

      This is a common problem: the program can't find the chromedriver path. Download the chromedriver exe that matches your Chrome version from chromedriver.chromium.org/downloads, put the exe file in the same directory as the program, and set the path string to the name of the chromedriver exe.
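
The two warnings quoted in the report above each name their own fix: pass a Service object to the driver, and name a parser for BeautifulSoup. Below is a minimal sketch of both, assuming Selenium 4 and Beautiful Soup 4 are installed. The helper names make_driver and parse_page are hypothetical and do not come from the repository. "html.parser" is swapped in for the "lxml" the warning suggests, so that no extra package is needed. This only silences the warnings; it does not address the AttributeError at the end of the traceback.

```python
from bs4 import BeautifulSoup


def make_driver(chromedriver_path):
    """Start Chrome through a Service object instead of a bare path."""
    # Imported here so parse_page() below can be used without Selenium
    # or a browser installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    # chromedriver_path is an assumption: wherever the downloaded
    # chromedriver executable lives on your machine.
    service = Service(executable_path=chromedriver_path)
    return webdriver.Chrome(service=service)


def parse_page(html):
    """Parse HTML with an explicitly named parser."""
    # "html.parser" ships with Python; features="lxml" also works
    # if the lxml package is installed.
    return BeautifulSoup(html, features="html.parser")
```

In the script this would replace the two flagged lines, roughly as `driver = make_driver("chromedriver.exe")` and `source = parse_page(driver.page_source)`.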

  • @Lapookie 1 year ago

    Thanks! It no longer works as of June 2023.

    • @aiwithaz 1 year ago +1

      What's annoying is that LinkedIn changes its UI frequently to prevent web scrapers like us from harvesting data. A program like this needs to be constantly maintained.

    • @Lapookie 1 year ago

      @aiwithaz yeah really hard ^^
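
When a class name on the page changes, a lookup like the findNext call in the traceback earlier in this thread returns None, and the chained .contents access raises AttributeError. Below is a minimal sketch of a guarded lookup, assuming Beautiful Soup 4. The helper name first_text and the sample HTML are hypothetical, made up for illustration; the real LinkedIn markup may differ and will keep changing.

```python
from bs4 import BeautifulSoup


def first_text(anchor, tag, css_class):
    """Return the stripped text of the next matching tag, or None."""
    node = anchor.find_next(tag, class_=css_class)
    if node is None:
        # The selector no longer matches; let the caller decide
        # whether to skip this profile or stop and update the selector.
        return None
    return node.get_text(strip=True)


# Hypothetical markup standing in for a search-result entry.
html = """
<span class="visually-hidden">View profile</span>
<div class="lt-line-clamp lt-line-clamp--multi-line ember-view">
  Data Engineer
</div>
"""
soup = BeautifulSoup(html, "html.parser")
anchor = soup.find("span", class_="visually-hidden")

title = first_text(anchor, "div",
                   "lt-line-clamp lt-line-clamp--multi-line ember-view")
missing = first_text(anchor, "div", "renamed-class")
```

Returning None instead of raising lets the scraper skip or log a profile whose layout has changed, which makes it easier to see which selector needs updating.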