Great video! Just an FYI, there is a CREATE TABLE IF NOT EXISTS TABLE_NAME (column_name datatype, column_name datatype); command, so you don't have to comment it out to rerun your script without errors. There is also a DROP TABLE IF EXISTS TABLE_NAME; as well, if you want to recreate it with fresh data over and over.
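For anyone following along, a minimal sketch of how those two statements might look with Python's sqlite3 (the table and column names here are just placeholders):

import sqlite3

conn = sqlite3.connect("scraped.db")  # creates the file if it doesn't exist yet
cur = conn.cursor()

# Optional: wipe the table to start over with fresh data
cur.execute("DROP TABLE IF EXISTS products")

# Safe to rerun - only creates the table when it isn't already there
cur.execute("CREATE TABLE IF NOT EXISTS products (name TEXT, price REAL)")

conn.commit()
conn.close()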
I don't follow many channels, but this one is gold. Thanks for your extremely well-explained tutorials. Keep doing them, it helps a lot
That makes two of us.
For such an explanation,
I subbed
Around 4:35 I like that you're using variables here to emulate real-life code instead of teaching it like a textbook. A lot of people learn the same way I do: we need to be taught with applicable examples, thank you for that!
This channel is so underrated.
Your videos are genuinely helpful and always to the point. I hope you never stop delivering content like this and always stay motivated to make it. These are very good, thanks John. 🙏
Thank you for adding pandas to this video. It was exactly what I needed to learn.
Great job brother… you have a solution for every problem ❤ love from India… keep growing
This is great. I'm working on a personal project along these lines and this will help a ton. Thanks John (you rock!)
WOW. Words cannot do justice to how well this has all been explained. :O Subscribed, please teach me more! :D The only sad thing I noticed is that you didn't say why it is important to close the connection at the end.
Good job brother,
Thanks so much.
I subbed
Greetings from Tanzania 🇹🇿
Thank you!
Thanks John, it's a really useful tutorial.
Thanks dude for the tutorial!
Great videos John!! Any suggestions on pulling dynamic data from APIs (a data set that is updated maybe weekly) and being able to update the existing records in the database?
Hey thanks. Sure, before you load to the database, check for an existing entry with the same details, an ID or name or something unique; if it exists, update it with the new data, and if it doesn't, add it in
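A rough sketch of that check-then-update idea with sqlite3 (the products table and its id column are just assumptions for the example):

import sqlite3

def upsert_product(conn, product_id, name, price):
    """Update the row if product_id already exists, otherwise insert it."""
    cur = conn.cursor()
    cur.execute("SELECT 1 FROM products WHERE id = ?", (product_id,))
    if cur.fetchone():
        cur.execute(
            "UPDATE products SET name = ?, price = ? WHERE id = ?",
            (name, price, product_id),
        )
    else:
        cur.execute(
            "INSERT INTO products (id, name, price) VALUES (?, ?, ?)",
            (product_id, name, price),
        )
    conn.commit()

On SQLite 3.24+ the same thing can be done in one statement with INSERT ... ON CONFLICT(id) DO UPDATE, provided id is a primary key or has a unique constraint.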
Great work! I wonder, would it still work the same if we used find_all() instead of find()? I think it would be trickier
Absolutely fantastic, thank you.
Awesome video on creating a database and saving to it, Sir 👌
Thank you!
Can this be done with a MySQL database? I'm thinking of web scraping into a MySQL database and using PHP to view it on a web page
How do you append a dictionary with different keys each time in a for loop to the same SQLite database in Python?
You are my man
Thank you so much for this amazing content.
Well explained, thanks man!
I do have an urgent question:
I'm currently working on a technical assessment for a job interview. Everything seems quite simple so far - they need me to make a data pipeline using Python and SQL. Python needs to be the tool used for the raw data pull and data quality checks. The rest of the pipeline is made using SQL (normalise, dimensions, etc.). My question is regarding the step between Python and SQL. According to the requirements, I need to make a mock data mart where I can store the tables that are created using Python code. These tables that go into the mock data mart are then pulled (queried) again using SQL. As mentioned before, SQL will then be used to normalise and analyse. What is a data mart and how can I make a mock version of it? Are they simply asking me to make a database or data warehouse? I've heard of data marts before but never used them at university, at work, or even when coding in my own free time.
John your content is great and keep it up! Do you program or code for a living? I saw in your bio that you are self taught and that is what I am working on right now!
Thanks! I work in e-commerce and use code to help me but I’m not a developer as such
Really looking forward to the flask app, will you still be doing that one? Loving the content, one of the best channels I've found in a while!
Absolutely! Things are just taking me a little longer at the moment!
@@JohnWatsonRooney, I'll be waiting
What do you think, should I learn SQL to store web scraping data in a database?
I have two functions: one scrapes a series code, episode title and URL (from Family Guy transcripts), and the other scrapes the actual transcript text. How can I add these all to the same database when the variables are defined in different functions? Thanks!
Hey John, why after SELECT are the print results like this (each item has a trailing comma inside the parentheses)?
[('apple',), ('banana',), ('orange',), ('orange',)]
Thanks JWR for the excellent vids. I'm in NA and this website doesn't load up. I used Selenium to see what was happening and could make it load with my VPN set to the UK. I'm fine with following this video and I'll use a different site, but I was also running into the issue of accepting cookies and wondered if there's a method using BS4 to handle the prompt and set cookies so I can actually scrape?
I assume this wasn't an issue when you first released the video. Thanks again.
Watched your video (fantastic) on Insomnia and that got me everything required. Now works fine if my VPN is set to the UK - but not if scraping from outside the UK. Not a big deal, but wondered if there's an obvious way in the code to spoof my location and not have to use the VPN?
I have a use case question. How can I stop the insertion into the db if the same name already exists there? For example, apple is already in the db, so I don't want apple to be included again.
Sure, you need to use a primary key on the column of the table you don't want to insert twice, then change the SQL command to "INSERT OR IGNORE"
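A minimal sketch of that, reusing the apple example from the question (table and column names are made up):

import sqlite3

conn = sqlite3.connect("scraped.db")
cur = conn.cursor()

# The PRIMARY KEY on name is what makes duplicates get rejected
cur.execute("CREATE TABLE IF NOT EXISTS fruit (name TEXT PRIMARY KEY, price REAL)")

# INSERT OR IGNORE silently skips rows whose primary key already exists
cur.execute("INSERT OR IGNORE INTO fruit (name, price) VALUES (?, ?)", ("apple", 0.5))
cur.execute("INSERT OR IGNORE INTO fruit (name, price) VALUES (?, ?)", ("apple", 0.6))  # ignored

conn.commit()
conn.close()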
Hello John,
I have scraped the data, created a database table and inserted the respective values, but unfortunately the values are being inserted twice. I'm not able to figure out why. Could you please let me know what might cause this?
Thanks in advance
Sounds odd. Inserted into the database twice? Check you don't have multiple conn.commit() calls in your code
@@JohnWatsonRooney it worked now. I made a rookie mistake of calling the function again inside a class. Thanks for the quick response 😇. Looking forward to other videos.
Can you do a video on saving the scraped data in a JSON file?
Yes good idea, I’ll have a look into it
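In the meantime, a quick sketch of what that usually looks like with the standard library json module (the items list is just example data):

import json

items = [{"name": "apple", "price": 0.5}, {"name": "banana", "price": 0.3}]

# Write the scraped items out to a JSON file
with open("results.json", "w", encoding="utf-8") as f:
    json.dump(items, f, indent=2)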
Hey man, I believe I'm on the right track with your vid, I just have trouble with how to connect it to a Firestore database, could you please give me a hand?
Hey John, could you have a look at the overclocker website? I use Selenium and Playwright, but it doesn't load when I watch the process without headless mode.
The Selenium page source and Playwright cookies approaches were all learned from your previous videos. I'm frustrated now that none of them work out. :( Sorry for sharing this feeling 🤣
Where is a good place to host the database?
How do you introduce randomness so they don't think you are web scraping them?
How do you update the GitHub file on the Linux server after you've pushed edits to GitHub? I tried 'git pull origin master' but it wouldn't let me pull and overwrite the existing repository on the Linux server.
I use git fetch --all then git pull to sync all changes
By using sqlite3, do we get the updated value?
I made a spider and connected it to sqlite3, and it works great! But when I crawl a different URL again it overwrites the previously scraped data :/ How can I fix that, if you don't mind answering? :) Thank you in advance
Great video. I was wondering if you prefer this over SQLAlchemy!
r.content or r.text? Why did you decide to use r.content? Thanks
.content is the raw bytes - if we call .text it decodes it into a text string for us. It's best to use .text if you just want the text
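A quick way to see the difference for yourself (example.com is just a stand-in URL):

import requests

r = requests.get("https://example.com")

print(type(r.content))  # <class 'bytes'> - the raw response body
print(type(r.text))     # <class 'str'>   - decoded using the response encoding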
Hi. I created a database and I have a question: how can I use SUM to total the price column in my table?
Hi, can anyone help me with this error in this project?
The error is:
AttributeError: 'NoneType' object has no attribute 'text'
Thanks in advance
Thanks!
I'm planning to make a stock market bot that checks for updates in prices and sends notifications to users via Telegram. Is it worth it or is it a stupid idea?
Have you done a project with SQLAlchemy?
Make a video on scrapy + pymongo
Good idea yes
Thank you💌
Is there a possible way to host the database file on a free server like Google Drive?
What is a header?
Why aren't you following pep8?
Just for demonstration purposes as long as it’s clear I don’t worry about pep8 so much
@@JohnWatsonRooney I think you shouldn't. Even more so for demonstrations. Many people will see your content as one of their first contacts with Python. They will probably try to replicate it. It doesn't cost much to follow good practices, and it makes a huge difference.
How do you add the id primary key to this? Does it do it automatically?
Sure - when creating the table, add PRIMARY KEY after the data type (TEXT or INTEGER etc.)
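For example, something like this (the table and column names are assumptions) - in SQLite an INTEGER PRIMARY KEY column is filled in automatically when you don't supply a value for it:

import sqlite3

conn = sqlite3.connect("scraped.db")
cur = conn.cursor()

# id gets an auto-assigned value because it is an INTEGER PRIMARY KEY
cur.execute("CREATE TABLE IF NOT EXISTS products (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
cur.execute("INSERT INTO products (name, price) VALUES (?, ?)", ("apple", 0.5))

conn.commit()
conn.close()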
John, I sent you email. Please check.
Sure I’ll have a look
@@JohnWatsonRooney Thanks John, your video saved me.
Is it possible to use SQLite as a web server from a PC client? If yes, what would the connection value look like (conn = ... with a Server.../https://... address)? Is this possible?
Hello everyone, I am new to Python.
I am trying to run the code, but I get an attribute error: 'NoneType' object has no attribute 'text'. Does someone know how I can fix it? (I thought it was related to the URL, but I updated the URL to another product on the same website, with no success.)
Peace.