Yesterday, before this video, I was testing this subject and I got an error and it took me a while to figure it out.
The problem was that my Excel conversion to CSV gave a ; as the delimiter instead of a comma (,).
The solution was df = pd.read_csv('data.csv', sep=';')
Thanks for sharing your wisdom ;-)
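A minimal sketch of the fix described in the comment above, using an in-memory string in place of the exported file. The sample rows and the `decimal=","` argument are assumptions: locales that export `;` as the separator often also write decimal commas, but that may not apply to every file.

```python
import io

import pandas as pd

# Hypothetical sample standing in for an Excel "Save as CSV" export made
# under a locale that uses ';' as the list separator.
raw = "product;price\nWidget;1,50\nGadget;2,25\n"

# sep=';' is the fix from the comment. decimal=',' is an extra assumption
# for exports that also use a decimal comma.
df = pd.read_csv(io.StringIO(raw), sep=";", decimal=",")

print(df)
print(df.dtypes)
```

With a real file, `io.StringIO(raw)` would simply be replaced by the path, for example `'data.csv'`.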
As someone more used to VBA, I'm starting to learn Python as my employer is going to deprecate VBA due to security concerns, so this is really useful.
However, I have yet to find any tutorials at all, anywhere on youtube, that show how to actually deploy for end users within an Excel environment. My users aren't going to have an IDE, they need to be able to click a button and set code running.
Microsoft are in the process of launching Python in Excel, which lets you use Python directly in Excel. It’s currently in developer preview, but it sounds like what you are looking for.
Thank you very much for the gentle introduction to the pandas library. It was very useful for me.
Hi John Watson! I'm watching your channel regularly and updating my skills. You are my real teacher. Thanks!
Absolutely one of the most underrated YouTubers out there. I guess Google's ML algo probably identifies John as a conservative and that's why his channel hasn't exploded in subscriber count yet.
Nice to get tutorials from an actual work flow rather than a reinterpretation of the manual.
Masterly content and presentation, thanks. The ideal pace for making the viewer understand what is being done and how.
It's a very useful tip to get a list of df column names
I learned a lot about web scraping and data handling methods from you in a very short time. Thank you
That’s great glad I could help!
Very useful common operations for data science projects. 👍💖
Really useful from start till the end. Highly appreciated.
So when you do something (like when you changed the price from having a $ to not having it), does the memory save it? Like that part is stored so it remembers you did it? The reason I'm asking is because you deleted the lines of code after you made that change. So it must remember that you originally made that change to get rid of the $, right?
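A small sketch of what the question above is getting at, assuming the code in the video was run in an interactive session. The column name and values are made up for illustration. The assignment rewrites the column in the DataFrame that is held in memory, so the cleaned values stay there even if the line is later removed from the editor.

```python
import pandas as pd

# Hypothetical data: prices stored as text with a leading "$".
df = pd.DataFrame({"item": ["A", "B"], "price": ["$10.50", "$3.00"]})

# This assignment overwrites the column in the in-memory DataFrame.
df["price"] = df["price"].str.replace("$", "", regex=False).astype(float)

# From here on df holds floats. Deleting the line above from the editor
# does not change df in a session that has already run it. Rerunning the
# whole script from scratch would start again from the "$" strings.
print(df["price"].tolist())
```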
This video is really helpful. Thank you so much.
You could have used the jupyter extension of vscode for easy interaction
Yeah absolutely. I don’t know why I’ve just never been a fan of notebooks. I should probably revisit that idea though
Thanks John for this helpful content.
Thank you very kind
Thats perfect how you explained.
Hi John, very well explained, and it covered most topics that are used in Excel. Superb. Could you make one on using Python to create a pivot table and paste it into Excel? That is, how to break the original data down into smaller data which can then be used to create a chart in Excel, so I don't have to rely on formulas or Excel's Python to make charts. Python would simply process the larger dataset and format the data in a way that can be put into Excel and used for charts.
did you figure it out?
@@bearingoutward1302 no
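One possible shape for the pivot-table request above, as a sketch rather than anything shown in the video. The column names and figures are invented. `pd.pivot_table` does the summarising, and the commented-out `to_excel` call would write the small summary table to a workbook for charting, which needs an Excel engine such as openpyxl to be installed.

```python
import pandas as pd

# Hypothetical raw data, standing in for the larger dataset.
sales = pd.DataFrame(
    {
        "region": ["North", "North", "South", "South", "South"],
        "product": ["A", "B", "A", "A", "B"],
        "amount": [100, 150, 200, 50, 75],
    }
)

# One row per region, one column per product, summed amounts in the cells.
summary = pd.pivot_table(
    sales,
    index="region",
    columns="product",
    values="amount",
    aggfunc="sum",
    fill_value=0,
)
print(summary)

# Writing the summary to a workbook (assumes an engine such as openpyxl):
# summary.to_excel("summary.xlsx", sheet_name="pivot")
```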
Just wanted to say a huge THANKYOU as you have taught me so much!
That’s great!
Thanks a lot for your videos. These are really helpful and easy to understand.
Thank you very kind
I am getting started with using Python for my Excel files and I'm glad I found this video.
I can easily follow this through.
Looking forward to more :)
Thank you glad it helped!
what theme are you using? it looks really nice
this is gruvbox material i believe!
@@JohnWatsonRooney thank you!
Easy to follow tutorial, appreciate it john!!!!
Very new to python. How do I get the same user interface as you? When I downloaded it, it just gave me the shell/IDLE
Download VS Code from Microsoft
You just make it so easy, nice tutorial. Python seems to be the best alternative to VBA.
I never got into VBA but I heard it’s very powerful for this… but Python is so good for most things :)
@@JohnWatsonRooney Yes it is. And not only for Excel/Office things: the lack of support has made VBA a zombie, dead but it will exist as long as Excel does. The support and new libraries are making me curious about the Python world.
John, Can you target a specific sheet within an excel workbook to create a new db?
great video, going to check if you do an in-depth video using pandas
Hi - what do you do when you want to merge the data but some of the data in the references tab doesn't find a corresponding match in the original merged spreadsheet?
Hi John, Thank you for the informative video! Top job
Excellent video, thank you
Hey John. Long time watcher. First time student( spending my Sunday time programming instead of watching ). . So, John, Why doesn’t the code from selenium IDE for chrome work; when it’s generated for an ?
I want to add the code to a Python script.
The gets recorded, but doesn’t work when I replay it.
you are amazing man! keep going
Great walkthrough
Not sure if you'll see this... but I just noticed you're running Ubuntu in WSL?
That would be an interesting series to do - especially when production level scrapers almost always need to use a rotating IP and that's usually only possible in Linux.
I'm still doing the good old VM way of using Linux haha
Sure, I’ve used WSL or dual booted into Linux ever since I’ve been coding properly. I just got used to the commands. I could look at doing a video on the benefits
Great.awesome video...great learning time
Awesome video! Can we get the Real Python link? I couldn't find it in your description.
Hi, thanks for the tutorial, where can i get the csv files you are using ?
Hi I never put them up online but I generated them from a free service called mockaroo
Is there a way to save the lines of code, or wrap them in a function, for repeated tasks with new data sets that come in, let's say, weekly? So essentially have something like a .exe or .bat file, or even a GUI with a run button that, when clicked, automates the process and gives results fast.
Hey, what I do for weekly reports is tailor my script to read them from a specific folder, put the new files in there, and run the script from the terminal when ready.
Hello John, can you make a video about the VS Code debugger: how to set it up, how to use it, its settings and all that stuff? Thanks in advance.
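The folder-based workflow described in the reply above could look roughly like this. It is a sketch under assumptions: the reports are CSV files, they share the same columns, and `load_reports` and the `incoming` folder name are hypothetical names chosen for illustration.

```python
from pathlib import Path

import pandas as pd


def load_reports(folder):
    """Read every CSV file in `folder` and stack them into one DataFrame."""
    files = sorted(Path(folder).glob("*.csv"))
    if not files:
        # Nothing dropped into the folder yet.
        return pd.DataFrame()
    # Tag each row with the file it came from, then combine.
    frames = [pd.read_csv(f).assign(source_file=f.name) for f in files]
    return pd.concat(frames, ignore_index=True)


# Usage from a script run in the terminal (hypothetical folder name):
# combined = load_reports("incoming")
# combined.to_csv("weekly_summary.csv", index=False)
```

Saved as a script, this can be run from the terminal each week, or wrapped in a .bat file that calls `python` on it for users who prefer to double-click.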
great job . how about Heroku ? is it in your plans for future contents ?
Yes 100%
Excellent. Thanks!
Your tutorials are sublime :) :)
However, where can I get access to the excel test files, as I want to reproduce your demo.
Thanks! Ahh I’ll try to find them, but all my fake data comes from mockaroo
thanks rooney, a true ace!
Very nice job. Good luck.
Thanks bro... need help. I am getting real-time data from the broker terminal as percentages. I want a condition that checks whether the previous percentage is less than the current percentage, and vice versa, and I want it checked every 5 min.
great material, thanks
Hello sir.
I watched your videos on how to read Google Sheets data using pandas. I got it. But after getting the data I want to clear my data from the database automatically. What can I do for that?
Thanks again, really good video.
Thank you!
Awesome bro, it saves my time
whew! that was a good one
Great Video Sir.
Hi John, glad to see a pandas video here. Very good one, the timestamps are really appreciated. John, could you make a video only about the vlookup, merge, iloc, drop_duplicates? Even when I manage to use them, I couldn't say I really understand them.
Thanks. Sure I can do a more deep dive on those
Can you create a Telegram group to discuss more about web scraping and Python?
Thanks a lot for your videos. These are really helpful and easy to understand.
Can we connect? I had something to discuss.
Hello John
We read in supplier invoices for customers, but with a number of suppliers' invoices we have problems reading them in. We have tried sep='\t' but got no result.
For now we first open the file in Excel and save it to CSV, which changes the separator to ';', and then we read that in.
What are we doing wrong when reading this CSV format?
Greetings, Alex
Hi Alex. Hard to say without seeing the file and how it reads in, what file type is the first invoice? Csv, xlsx or something else?
@@JohnWatsonRooney Hello John
can i send you the file?
Sure, my email is on my main YouTube page
Maybe the code below will work:
dataframe_name = pd.read_csv('filename.extension', delimiter='\t')
@@Arvinth14 Thanks for your suggestion; unfortunately that doesn't work either. What I find strange is that if I first split it into columns in Excel and then write it to CSV, it works fine. This only happens with the Dutch supplier; with CSVs from the web I have no problems.
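Without seeing the file it is hard to say what is going wrong in the thread above, but one way to narrow it down is to check which delimiter the file actually uses before guessing at `sep`. The sketch below uses the standard library's `csv.Sniffer` on a made-up sample. The rows, the candidate delimiter list and the `decimal=","` setting are all assumptions, not details of the real supplier file.

```python
import csv
import io

import pandas as pd

# Hypothetical stand-in for the first few lines of the supplier file.
sample = (
    "naam;bedrag;datum\n"
    "Jansen;12,50;01-02-2023\n"
    "De Vries;7,00;03-02-2023\n"
)

# Restrict the guess to common delimiters so ordinary letters are ignored.
dialect = csv.Sniffer().sniff(sample, delimiters=";,\t|")
print(repr(dialect.delimiter))

# Read with the detected delimiter. decimal=',' is an assumption for
# files that write amounts with a decimal comma.
df = pd.read_csv(io.StringIO(sample), sep=dialect.delimiter, decimal=",")
print(df)
```

If the detected delimiter looks right and the file still does not load, the encoding would be the next thing to check.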
Do you have a book you'd recommend to learn Pandas for this type of work? Most I'm seeing is heavy on the math. I mostly need to be able to find duplicates and in a new column assign the duplicates a new id. So Company A may have 10 records. I want to automate the process of finding them and assigning all of them a CompanyID.
Maybe export to a CSV file and use gawk, a fast, lightweight utility for processing text that is easy to learn and use... Courtesy of ChatGPT, prompt: write a gawk script to find duplicates and in a new column assign the duplicates a new id.
Here's a simple example of a gawk script to find duplicates and assign each group of duplicates its own ID:
BEGIN {
    FS=","       # Set the input field separator to comma
    OFS=","      # Set the output field separator to comma
    next_id=1    # ID to hand out to the next line not seen before
}
{
    if (!($0 in seen)) {
        seen[$0]=next_id    # First time this line appears: give it a fresh ID
        next_id++
    }
    print $0, seen[$0]      # Every copy of a line gets the same ID
}
This script sets the input field separator (FS) and output field separator (OFS) to a comma and initializes one variable, next_id. The BEGIN block runs before any input is processed.
The main body of the script uses an associative array (seen) to map each distinct line to its ID. If the current line ($0) is not yet in the seen array, the script stores it with the value of next_id and increments next_id. It then prints the line followed by the ID stored for it, so repeated lines all get the same ID. To match on a single column instead of the whole line, for example the company name in the first column, use $1 in place of $0 in the seen lookups.
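The same task can also be done in pandas, which fits the rest of this thread. The sketch below is one way to do it, not the only one. The column names and rows are invented, and real company names would probably need tidying first (trimming spaces, consistent case) before grouping.

```python
import pandas as pd

# Hypothetical records; the same company can appear many times.
records = pd.DataFrame(
    {
        "company": ["Acme", "Globex", "Acme", "Initech", "Globex", "Acme"],
        "invoice": [101, 102, 103, 104, 105, 106],
    }
)

# ngroup() numbers each group starting at 0. With sort=False the groups
# are numbered in order of first appearance, so +1 gives IDs from 1.
records["CompanyID"] = records.groupby("company", sort=False).ngroup() + 1

# Flag every row whose company appears more than once.
records["is_duplicate"] = records.duplicated("company", keep=False)

print(records)
```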
Good video thanks
how do you output data that's not able to merge with the reference data?
Got it, df[reference].isna()
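A short sketch of the approach in the reply above, with made-up tables and column names. After a left merge, rows with no match in the reference data have NaN in the reference columns, so `isna()` picks them out. The `indicator=True` option of `merge` is an alternative that labels each row explicitly.

```python
import pandas as pd

# Hypothetical data: one SKU ("B2") has no entry in the reference table.
orders = pd.DataFrame({"sku": ["A1", "B2", "C3"], "qty": [5, 2, 7]})
reference = pd.DataFrame({"sku": ["A1", "C3"], "name": ["Widget", "Gizmo"]})

# indicator=True adds a "_merge" column: "both" or "left_only" here.
merged = orders.merge(reference, on="sku", how="left", indicator=True)

# Rows that found no match in the reference data.
unmatched = merged[merged["_merge"] == "left_only"]

# Equivalent for this data: the reference column is NaN where nothing matched.
unmatched_by_nan = merged[merged["name"].isna()]

print(unmatched[["sku", "qty"]])
```

The `isna()` version assumes the reference column has no missing values of its own; the indicator column does not depend on that.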
Cool stuff bro
Thanks
Awesome brother
subscribed 🤩
Thanks!
Hi johny😊.
I got a job recently as a data analyst, but the work is quite boring: copy-pasting in Excel....
So I want to know, is there any tool like ChatGPT which can do this work for me....
Thanks❤
Your content is amazing, you need to work on your thumbnails to get more impressions and clicks.
Thanks, I’ve tried a few different types of thumbnail, if you have some good examples you can link them to me?
@@JohnWatsonRooney I just found out YouTube deleted my comment because I linked to a thumbnail photo 😑
Where can we get this sample data
21:28 : how do you get the output colorized ?
Rainbow CSV addon
@@bhavik15 Thank you kindly
Installed and it works ;-)
Good video
Thanks
please provide the sample file?
21:12 you don't say John😊
How can i contact you.
why my comment is being deleted?
Was it a link? I haven’t deleted any comments from this video, it must be YouTube
@@JohnWatsonRooney Ohh there was a link indeed! I didn't know YouTube doesn't allow it. I had asked if you could help me to solve a scrape issue. I'm trying to scrape a supermarket webpage (carrefouruae) to get the name and the price of a product, but because the data is rendered through JavaScript and my script is running inside a container where it doesn't have a browser, I don't know which library I can use to scrape. Thank you in advance! Thank you for your awesome videos!
You can render the page with requests-html or have a look in the page source code for “next_data” you might be able to get something useful from the script tag there
@@JohnWatsonRooney thanks for the reply! I've tried and I got "Access Denied" :(
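For the script-tag suggestion a couple of comments up: sites built with Next.js often embed their page data as JSON in a script tag with the id `__NEXT_DATA__`, which can be read without a browser. The sketch below parses a hard-coded HTML string so it runs on its own. The JSON layout (`props`, `pageProps`, `products`) is hypothetical and differs from site to site, and it does nothing about a site that refuses the request in the first place.

```python
import json
import re

# Hypothetical page source; a real page would come from an HTTP response.
html = """<html><body>
<script id="__NEXT_DATA__" type="application/json">
{"props": {"pageProps": {"products": [{"name": "Milk 1L", "price": 5.25}]}}}
</script>
</body></html>"""

# Grab the contents of the script tag and parse it as JSON.
match = re.search(
    r'<script id="__NEXT_DATA__"[^>]*>(.*?)</script>', html, re.DOTALL
)
data = json.loads(match.group(1)) if match else {}

# The key path below is an assumption; inspect the real JSON to find it.
products = data.get("props", {}).get("pageProps", {}).get("products", [])
for product in products:
    print(product["name"], product["price"])
```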
Hi John,
Thank you for all your videos. They really help.
I have one request: can you help us with how to scrape a dynamic website? I mean a website which changes its query string parameters on pagination.
If you want, I can share the link with you.
Just drop me your email ID.
Thanks. I am stuck and not able to understand how to scrape that website.
Hey, glad you enjoy the videos. My email is on my main YouTube page if you want to drop me a message, and I will have a look when I get a chance.
Thank you for taking the time to respond to my comment. I have dropped you an email regarding the website; kindly have a look whenever it is convenient, and please keep making amazing, knowledgeable content :D
Yo...
The video title says "Automate Excel Work..." but its content is all about csv data. This manipulation is disappointing.
No wonder it's called “data manipulation” ;)