How I save my Scraped Data to a Database with Python! Beginners sqlite3 tutorial

  • Published: 27 Nov 2024

Comments • 76

  • @mauisam1
    @mauisam1 2 years ago +16

    Great video! Just an FYI, there is a CREATE TABLE IF NOT EXISTS TABLE_NAME (column_name datatype, column_name datatype); command, so you don't have to comment it out and rerun your script without errors. There is also a DROP TABLE IF EXISTS TABLE_NAME; as well if you want to recreate it with fresh data over and over.
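    A quick sketch of both statements with Python's built-in sqlite3 module (the `products` table and its columns are made up for illustration; an in-memory database keeps the sketch self-contained):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory DB, nothing written to disk
cur = conn.cursor()

# Safe to run repeatedly: only creates the table if it is missing
cur.execute("CREATE TABLE IF NOT EXISTS products (name TEXT, price REAL)")

# To start over with fresh data: drop it (if present), then recreate it
cur.execute("DROP TABLE IF EXISTS products")
cur.execute("CREATE TABLE IF NOT EXISTS products (name TEXT, price REAL)")

cur.execute("INSERT INTO products VALUES (?, ?)", ("apple", 0.55))
conn.commit()
count = cur.execute("SELECT COUNT(*) FROM products").fetchone()[0]
print(count)  # 1
conn.close()
```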

  • @eka4015
    @eka4015 4 years ago +21

    I don't follow many channels, but this one is gold. Thanks for your extremely well explained tutorials. Keep doing them, it helps a lot

    • @raymondmichael4987
      @raymondmichael4987 4 years ago

      That makes two of us.
      For such explanation,
      I subbed

  • @omerthebear
    @omerthebear 2 years ago +2

    Around 4:35 I like that you're using variables here to emulate real-life code instead of teaching it like a textbook. A lot of people learn the same way I do, we need to be taught with applicable examples, thank you for that!

  • @vishnuprasad4107
    @vishnuprasad4107 2 years ago +1

    This channel is so underrated.

  • @sounakchatterjee6417
    @sounakchatterjee6417 3 years ago +3

    Your videos are genuinely helpful and always to the point. I hope you never stop delivering such content and always remain motivated to deliver such stuff. These are very good.. thanks John. 🙏

  • @a991ejack
    @a991ejack 1 year ago +1

    Thank you for adding pandas to this video. It was exactly what I needed to learn.

  • @prashantbhosale6745
    @prashantbhosale6745 2 years ago +1

    Great job brother… you have every problem's solution ❤ love from India… keep growing

  • @travelselects272
    @travelselects272 4 years ago +3

    This is great. I'm working on a personal project along these lines and this will help a ton. Thanks John (you rock!)

  • @karolkleckovski6473
    @karolkleckovski6473 2 years ago +1

    WOW. Words cannot do justice to how well this has all been explained. :O Subscribed, please teach me more! :D The only sad thing I noticed is that you didn't say why it is important to close the connection at the end.

  • @raymondmichael4987
    @raymondmichael4987 4 years ago +2

    Good job brother,
    Thanks so much.
    I subbed
    Greetings from Tanzania 🇹🇿

  • @tubelessHuma
    @tubelessHuma 4 years ago +2

    Thanks John, it's a really useful tutorial.

  • @kanaal2411
    @kanaal2411 2 years ago +1

    Thanks dude for the tutorial!

  • @jasond580
    @jasond580 2 years ago +1

    Great videos John!! Any suggestions on pulling dynamic data from APIs (a data set that is updated maybe weekly) and updating the existing records in the database?

    • @JohnWatsonRooney
      @JohnWatsonRooney  2 years ago

      Hey, thanks. Sure, before you load to the database, check for an existing entry with the same details (ID or name or something unique); if it exists, update it with the new data, and if it doesn't, add it in
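      One way to sketch that check-then-update pattern in sqlite3 is an UPSERT on the unique column (table and column names here are hypothetical; ON CONFLICT needs SQLite 3.24 or newer):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# id stands in for the "ID or name or something unique" in the reply
cur.execute("CREATE TABLE items (id TEXT PRIMARY KEY, price REAL)")

def upsert(item_id, price):
    # If a row with this id already exists, update it; otherwise insert it
    cur.execute(
        "INSERT INTO items (id, price) VALUES (?, ?) "
        "ON CONFLICT(id) DO UPDATE SET price = excluded.price",
        (item_id, price),
    )

upsert("sku-1", 9.99)  # first pull: inserts a new record
upsert("sku-1", 8.49)  # next weekly pull: updates the existing record
conn.commit()

price = cur.execute("SELECT price FROM items WHERE id = ?", ("sku-1",)).fetchone()[0]
print(price)  # 8.49
conn.close()
```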

  • @emilseyfi9988
    @emilseyfi9988 3 years ago +2

    Great work! I wonder, will it work the same if we use find_all() instead of find()? I think it would be tougher

  • @jacekberkowski5542
    @jacekberkowski5542 4 years ago +2

    Absolutely fantastic, thank you.

  • @nsnilesh604
    @nsnilesh604 3 years ago +1

    Awesome video for saving and creating a database, Sir 👌

  • @aogunnaike
    @aogunnaike 4 years ago +2

    Can this be done with a MySQL database? I'm thinking of web scraping into a MySQL database and using PHP to view it on a web page

  • @zeeshantaj55
    @zeeshantaj55 2 years ago

    How do you append a dictionary with different keys each time in a for loop to the same SQLite database in Python?

  • @ismailyyy7453
    @ismailyyy7453 3 years ago +2

    You are my man

  • @findmyhome3846
    @findmyhome3846 2 years ago +1

    Thank you so much for this amazing content.

  • @saifazeem8158
    @saifazeem8158 3 years ago +1

    Well Explained, Thanks man!

  • @hmak5423
    @hmak5423 11 months ago

    I do have an urgent question:
    I'm currently working on a technical assessment for a job interview. Everything seems quite simple so far - they need me to make a data pipeline using Python and SQL. Python needs to be the tool used for data (raw) pull and Data Quality checks. The rest of the pipeline is made using SQL (normalise, dimensions, etc.). My question is regarding the step between Python and SQL. According to the requirements, I need to make a mock Datamart where I can store the tables that are created using Python code. These tables that go into the mock DataMart are then pulled (queried) again using SQL. As mentioned before, SQL will then be used to normalise and analyse. What is a DataMart and how can I make a mock version of it? Are they simply asking me to make a database or data warehouse? I've heard of DataMarts before but never used them at university, job or even when I am doing coding sessions in my own free time.

  • @Rway27
    @Rway27 4 years ago +1

    John, your content is great, keep it up! Do you program or code for a living? I saw in your bio that you are self taught and that is what I am working on right now!

    • @JohnWatsonRooney
      @JohnWatsonRooney  4 years ago +1

      Thanks! I work in e-commerce and use code to help me but I’m not a developer as such

  • @Eijgey
    @Eijgey 4 years ago +1

    Really looking forward to the flask app, will you still be doing that one? Loving the content, one of the best channels I've found in a while!

    • @JohnWatsonRooney
      @JohnWatsonRooney  4 years ago +2

      Absolutely! Things are just taking me a little longer at the moment!

    • @raymondmichael4987
      @raymondmichael4987 4 years ago

      @@JohnWatsonRooney, I'll be waiting

  • @siraj.udlla_
    @siraj.udlla_ 2 months ago

    What do you think, should I learn SQL to store web scraping data in a database?

  • @alexcrowley243
    @alexcrowley243 4 years ago +1

    I have two functions, one scrapes a series code, episode title and url (from Family Guy transcripts) and the other function scrapes the actual transcripts text. How can I add these all to the same database as the variables are defined in different functions? Thanks!

  • @wangdanny178
    @wangdanny178 2 years ago

    Hey John, why after SELECT do the print results look like this (each item has a comma inside the parentheses)?
    [('apple',), ('banana',), ('orange',), ('orange',)]
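    Those trailing commas appear because sqlite3 returns every row as a tuple, even when you select a single column; a minimal sketch (table name made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE fruit (name TEXT)")
cur.executemany("INSERT INTO fruit VALUES (?)", [("apple",), ("banana",)])

rows = cur.execute("SELECT name FROM fruit").fetchall()
print(rows)  # [('apple',), ('banana',)] -- each row is a 1-tuple

# Unpack the single column to get plain strings
names = [name for (name,) in rows]
print(names)  # ['apple', 'banana']
conn.close()
```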

  • @stewart5136
    @stewart5136 2 years ago

    Thanks JWR for the excellent vids. I'm in NA and this website doesn't load up. Used Selenium to see what was happening and I could make it load with my VPN set to the UK. I'm fine with following this video and I'll use a different site, however I was also running into the issue of accepting cookies and wondered if there's a method using BS4 to address the prompt and setting cookies to actually scrape?
    Assume this wasn't an issue when you first released the video. Thanks again.

    • @stewart5136
      @stewart5136 2 years ago

      Watched your video (fantastic) on Insomnia and that got me everything required. Now works fine if my VPN is set to the UK - but not if scraping from outside the UK. Not a big deal, but wondered if there's an obvious way in the code to spoof my location and not have to use the VPN?

  • @hedgehog8229
    @hedgehog8229 3 years ago +1

    I have a use case question. How can I stop the insertion into the db if the same name already exists there? For example, apple is already in the db, so I don't want apple to be included there again.

    • @JohnWatsonRooney
      @JohnWatsonRooney  3 years ago

      Sure, you need to use a primary key on the column of the table you don't want to insert twice, then change the SQL command to "insert or ignore"
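      A minimal sketch of that approach, with a hypothetical fruit table whose name column is the primary key:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# name is the primary key, so a duplicate insert would normally fail
cur.execute("CREATE TABLE fruit (name TEXT PRIMARY KEY)")

# INSERT OR IGNORE silently skips the second 'apple'
for name in ["apple", "banana", "apple"]:
    cur.execute("INSERT OR IGNORE INTO fruit (name) VALUES (?)", (name,))
conn.commit()

rows = cur.execute("SELECT name FROM fruit ORDER BY name").fetchall()
print(rows)  # [('apple',), ('banana',)]
conn.close()
```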

  • @deekshagokarn2783
    @deekshagokarn2783 4 years ago +1

    Hello John,
    I have scraped the data, created a database table and inserted the respective values; unfortunately the values are being inserted twice. I'm not able to figure out why. Could you please let me know about this issue?
    Thanks in advance

    • @JohnWatsonRooney
      @JohnWatsonRooney  4 years ago +1

      Sounds odd. Inserted into the database twice? Check you don't have multiple conn.commit() calls in your code

    • @deekshagokarn2783
      @deekshagokarn2783 4 years ago

      @@JohnWatsonRooney it worked now. I made a rookie mistake of calling the function again inside a class. Thanks for the quick response 😇. Looking forward to other videos.

  • @rosalyna_24
    @rosalyna_24 4 years ago +1

    Can you do a video on saving the scraped data in a JSON file?

  • @samuelmino2536
    @samuelmino2536 3 years ago

    Hey man, I believe I'm on the right track with your vid, I just have trouble with how to connect it to a Firestore database. Could you please give me a hand?

  • @wangdanny178
    @wangdanny178 2 years ago

    Hey John, could you have a look at the overclocker website? I use Selenium and Playwright, but it doesn't load when I watch the process without headless mode.
    The Selenium page source and Playwright cookies are all learned from your previous videos. I am frustrated now that none of them work out. :( Sorry for sharing this feeling 🤣.

  • @PyMoondra
    @PyMoondra 3 years ago

    Where is a good place to host the database?

  • @armanwirawan7099
    @armanwirawan7099 1 year ago

    How do you introduce randomness so they don't think you are web scraping them?

  • @tdye
    @tdye 3 years ago +1

    How do you update the github file on the linux server after you've pushed edits to github? I tried 'git pull origin master' but it wouldn't let me pull and overwrite the existing github repository on the linux server.

    • @JohnWatsonRooney
      @JohnWatsonRooney  3 years ago

      I use git fetch --all, then git pull to sync all changes

  • @Error-Solver.
    @Error-Solver. 8 months ago

    By using sqlite3, do we get the updated value?

  • @abbasnoufal9225
    @abbasnoufal9225 3 years ago

    I made a spider and connected it to sqlite3 and it works great! But when I crawl a different url again, it overwrites the previously scraped data :/ How do I fix that, if you don't mind answering :) Thank you in advance

  • @ktrades2898
    @ktrades2898 3 years ago

    Great video. I was wondering if you prefer this over SQLAlchemy!

  • @matteomannini1205
    @matteomannini1205 3 years ago +1

    r.content or r.text? Why did you decide to use r.content? Thanks

    • @JohnWatsonRooney
      @JohnWatsonRooney  3 years ago +1

      content is the raw bytes - if we call .text it changes it into a text string for us. It's best to use .text if you want the text only
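      The difference can be sketched without making a request (no real URL here, and requests' encoding detection is more involved than a plain decode):

```python
# Simulated: r.content holds raw bytes, while r.text is roughly
# those bytes decoded with the response's detected encoding.
raw = "<p>£9.99</p>".encode("utf-8")  # what r.content would hold
text = raw.decode("utf-8")            # roughly what r.text gives back

print(type(raw).__name__)   # bytes
print(type(text).__name__)  # str
```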

  • @saman27gold72
    @saman27gold72 4 years ago

    Hi. I created a database and I have a question: how can I use SUM to total the price column in my table?

  • @True-pak
    @True-pak 3 years ago

    Hi, can anyone help me with this error in this project?
    The error is:
    AttributeError: 'NoneType' object has no attribute 'text'
    Thanks in Advance

  • @lorenzocastagno1305
    @lorenzocastagno1305 3 years ago +2

    Thanks!

  • @Grinwa
    @Grinwa 1 year ago

    I'm planning to make a stock market bot that checks for updates in prices and sends notifications to users via Telegram. Is it worth it, or is it a stupid idea?

  • @navarajpokharel8980
    @navarajpokharel8980 3 years ago

    Have you done a project with SQLAlchemy?

  • @mohdsalmanansari5992
    @mohdsalmanansari5992 4 years ago +2

    Make a video on scrapy + pymongo

  • @TheChaos2711
    @TheChaos2711 11 months ago

    Thank you💌

  • @Grinwa
    @Grinwa 1 year ago

    Is there a possible way to host the database file on a free server like Google Drive?

  • @suravighosal9934
    @suravighosal9934 2 years ago

    What is a header?

  • @cccccc864
    @cccccc864 3 years ago +1

    Why aren't you following PEP 8?

    • @JohnWatsonRooney
      @JohnWatsonRooney  3 years ago

      Just for demonstration purposes; as long as it's clear, I don't worry about PEP 8 so much

    • @cccccc864
      @cccccc864 3 years ago

      @@JohnWatsonRooney I think you shouldn't. Even more for demonstration. Many will see your content as one of their first contacts with Python. They will probably try to replicate it. It doesn't cost much to follow good practices and it makes a huge difference.

  • @daveisdead
    @daveisdead 4 years ago +1

    How do you add the id primary key to this? Does it do it automatically?

    • @JohnWatsonRooney
      @JohnWatsonRooney  4 years ago

      Sure. When creating the table, add PRIMARY KEY after the data type (TEXT or INTEGER etc.)
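      A sketch of an auto-assigned id (hypothetical products table): in SQLite, an INTEGER PRIMARY KEY column becomes an alias for the built-in rowid, so ids are filled in automatically when omitted on insert.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# INTEGER PRIMARY KEY aliases SQLite's rowid: ids are assigned
# automatically when the column is omitted on insert
cur.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
cur.executemany("INSERT INTO products (name) VALUES (?)", [("apple",), ("pear",)])
conn.commit()

rows = cur.execute("SELECT id, name FROM products").fetchall()
print(rows)  # [(1, 'apple'), (2, 'pear')]
conn.close()
```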

  • @lukmarnhakeem9278
    @lukmarnhakeem9278 4 years ago +1

    John, I sent you an email. Please check.

  • @Leonardo_A1
    @Leonardo_A1 1 year ago

    Is it possible to use SQLite as a Web-Server from a PC-Client? If yes, please return the value of the connection = conn= ... (Server.../ Https://...). Is this possible?

  • @eduardomatsumoto
    @eduardomatsumoto 2 years ago +1

    hello everyone, I am new to Python.
    I am trying to run the code, but I got an attribute error: "'NoneType' object has no attribute 'text'". Does anyone know how I can fix it? (I thought it was related to the url, but I updated the url to another product on the same website, with no success).
    peace.