Hi Doug, many appreciations! You are a real teacher that makes things smooth to understand.
Really good example and explanation, I learned a lot. Thanks for sharing this class.
You are a great example of a good teacher! Learned so much from video. Thank you!
Hi Doug. Really learning a lot from your ETL tutorial. I had a data mismatch problem with the date and drove myself crazy until I realized that you have a capital Y in %Y-%m-%d.
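For anyone hitting the same mismatch, here is a minimal sketch of the difference between %Y and %y using only the standard library. The sample date string is made up for illustration and is not taken from the video.

```python
from datetime import datetime

raw = "2020-01-15"  # hypothetical sample value, not from the tutorial data

# %Y expects a four-digit year, so this parses cleanly.
parsed = datetime.strptime(raw, "%Y-%m-%d")

# %y expects a two-digit year, so the same string fails to parse.
try:
    datetime.strptime(raw, "%y-%m-%d")
    lowercase_ok = True
except ValueError:
    lowercase_ok = False

# Formatting shows the same difference in the other direction.
short = parsed.strftime("%y-%m-%d")
print(parsed.date(), lowercase_ok, short)
```

If the format on one side of a comparison uses %y and the other uses %Y, the date strings will never match even though they describe the same day.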
This is clear and concise. I love your teaching👍👍
Very good presentation. Easy to understand.
Glad it was helpful!
The talk really increased my knowledge. Thank you.
It was great thank you
We are waiting for more
Thank you! I am just learning python but I can follow your very detailed explanation! Thanks!
This just saved my life thanks!
Excellent presentation!
That was extremely helpful and interesting! thanks for that!
the left join part was interesting :)
More videos and examples please
Brilliant video!! Thank you ❤️ But what is that module you were referring to before 41:00 that accepts formulae and that you can do a lot with? Pandas? NumPy? Openpyxl? Mongo, or mundle I think it's called (on Medium recently)? I need to look it up. Wait a sec, I'll be back.
No, it can't be... Mitosheet? Bamboolib? I don't think so (but those are cool; I have yet to try them. I need to reimport/merge my JupyterLab from the old machine and its environment variables). But I would be interested to know what you were referring to.
Or did you just mean Python itself? (Which does have formula ability, of course!)
Definitely tell me, though, if you were referring to something else. All the best and thanks for your video. It was absolutely amazing!!
petl is the module we're using here, but the capabilities around Excel are drawn from openpyxl.
@elduggio99 ah !! Ok thank you !!
Thanks a lot for the presentation. It has been very useful for building my data warehouse during my BI course.
this is so cool, thank you so much
I'm not sure it's necessary to have all the try-except blocks. If an exception is thrown, and there is no try block surrounding it, then Python automatically aborts the script.
That's a fair observation. For a one-shot ETL script like this it depends on who will be running the script and whether you want to provide friendlier feedback than a stack trace.
An except block will let the user know exactly which block failed, in a user-friendly message.
@ASHISH517098 If it's an error that the user can do something about, then I agree with you. But if the user is just going to forward the error to a programmer to fix, then a stack trace is perfectly adequate.
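To make the trade-off in this thread concrete, here is a small sketch. The function names and the file path are hypothetical and are not from the tutorial script. It shows one step wrapped in try/except so the user gets a plain message for an error they can act on.

```python
import sys


def load_expenses(path):
    # Hypothetical ETL step: read a source file. Left unwrapped, a missing
    # file raises FileNotFoundError and Python aborts with a stack trace.
    with open(path) as f:
        return f.read()


def load_expenses_friendly(path):
    # Same step, but an error the user can fix (wrong path, no permission)
    # is reported as a readable message instead of a stack trace.
    try:
        return load_expenses(path)
    except OSError as e:
        print(f"Could not read the expenses file '{path}': {e}", file=sys.stderr)
        return None


result = load_expenses_friendly("missing_dir_for_demo/expenses.xlsx")
```

Catching only OSError here is a deliberate choice for the sketch: anything unexpected still surfaces as a stack trace for a programmer to read.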
Super tutorial Bro! Thanks
Hello, you mentioned near the beginning of the talk that you did a similar demo using SSIS on the exchange rate data and had shared the link in the chat. Could you share it here as well please?
Sure thing: github.com/dsartori/SSIS-API-Demo
@meanmedianandmoose9276 thanks!
Can you share some more APIs like this Bank of Canada API that we can get data from?
Check it out here: www.bankofcanada.ca/valet/docs
Thank you Doug!
This was awesome. Thank you so much for putting this together. May I ask a question?... I'm new to python. On line 35 when printing out the BOCResponse variable, you did print(BOCResponse.text)... what is the '.text' line or what does that refer to? Is that part of the Requests module or a function of the Requests Module?
text is a property of the Response class in the Requests module, which BOCResponse is an instance of: docs.python-requests.org/en/master/api/#requests.Response.text
@meanmedianandmoose9276 Thank you for the reply!
Hi Doug, nice tutorial.
However, I don't see anything that couldn't be achieved using pandas and requests primarily. Also, instead of doing the outer join and then filtering the rows that you don't want, couldn't we have just done a left join by keeping the expenses table as the left one? It would've made the code simpler.
Hi Mohammed thanks for watching! We can't use a left join because the exchange rate API does not return a value for every date - weekends and holidays have no updates. We can have expenses on days without exchange data, so we first need to fill down to fix up missing values. Pandas could be used in this example because the dataset is very small, but it is not something I recommend for automated ETL as it is less memory-efficient than petl. Great tool for exploring data though.
WHAT A WORK!
Tks a lot for this!
This is so cool! This video is a bit beyond me though as a new Python learner, but I see the utility. I'm trying to move up in my analyst career. How do I learn this stuff? Should I go back to school for a BS in CS, or just learn by doing projects like this on my own?
I think you’re better off self-learning, as this is at most intermediate-level Python.
Beautiful!
I need to load ready API data into MongoDB. Please make a video on that.
If you used the "join" instead of "outerjoin" you would not need the filtering step, "join" in PETL seems to be equivalent to SQL INNER JOIN while "outerjoin" to SQL LEFT JOIN.
Thanks for commenting! We can't use an inner join because the exchange rate API does not return a value for every date - weekends and holidays have no updates. We can have expenses on days without exchange data, so we first need to fill down to fix up missing values.
@meanmedianandmoose9276 Got it. I think I missed the part where the Excel-like "filldown" functionality is explained.
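The reasoning in this thread can be sketched in plain Python. This is not the petl code from the video, and the dates and rates are made up for illustration. It shows why an inner join drops a weekend expense, while an outer join followed by a fill down and a filter keeps it.

```python
# Hypothetical sample data: rates exist for Friday and Monday only.
rates = {"2020-01-03": 1.30, "2020-01-06": 1.31}
# One expense falls on Saturday, a day with no published rate.
# This sketch assumes at most one expense per date.
expenses = [("2020-01-04", 100.0), ("2020-01-06", 50.0)]

# Inner join: only dates present in both tables survive,
# so the Saturday expense is lost.
inner = [(d, usd, rates[d]) for d, usd in expenses if d in rates]

# Outer join: take every date from either table, in order.
all_dates = sorted(set(rates) | {d for d, _ in expenses})
expense_by_date = dict(expenses)

filled = []
last_rate = None
for d in all_dates:
    # Fill down: a date with no rate inherits the most recent one.
    rate = rates.get(d, last_rate)
    last_rate = rate
    # Filter: keep only the rows that actually carry an expense.
    if d in expense_by_date:
        filled.append((d, expense_by_date[d], rate))

print(inner)
print(filled)
```

A left join from the expenses table would keep the Saturday row, but its rate would be empty and Friday's rate row would not be in the result to fill down from.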
Thank you sir
Nice, but the code looks very complex. Great work.
Hi Doug. It is not a good idea to try to invent hot water. There are many tools on the market for creating ETL in a very efficient way. Anyway, for Python teaching purposes it is a good exercise. Regards
Not all organizations will buy third-party ETL tools.
Good
petl is better than pandas to work with data transformation!
Won't the outer join be highly memory-inefficient?