ETL with Python

  • Published: 23 Dec 2024

Comments

  • @haghjoomohsen
    @haghjoomohsen 2 years ago +4

    Hi Doug, many appreciations! You are a real teacher that makes things smooth to understand.

  • @arejoshi1009
    @arejoshi1009 8 months ago +1

    Really good example and explanation, I learned a lot. Thanks for sharing this class.

  • @amritasrivastva
    @amritasrivastva 1 year ago

    You are a great example of a good teacher! Learned so much from the video. Thank you!

  • @mjpender9443
    @mjpender9443 3 years ago +4

    Hi Doug. Really learning a lot from your ETL tutorial. I had a data mismatch problem for the date and drove myself crazy until I realized that you have a capital Y in %Y-%m-%d.
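
A minimal sketch of the %Y versus %y difference mentioned above, using only the standard library. The sample date string is a made-up value, not data from the video: %Y expects a four-digit year, while %y expects a two-digit one, so the same string parses with one and fails with the other.

```python
from datetime import datetime

date_text = "2020-01-15"  # hypothetical sample value

# %Y expects a four-digit year, so this parses as intended.
parsed = datetime.strptime(date_text, "%Y-%m-%d")

# %y expects a two-digit year, so the same string does not match the format.
try:
    datetime.strptime(date_text, "%y-%m-%d")
    lowercase_ok = True
except ValueError:
    lowercase_ok = False

print(parsed.date(), lowercase_ok)
```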

  • @ogechimaryann1082
    @ogechimaryann1082 1 year ago

    This is clear and concise. I love your teaching👍👍

  • @marc-antoinej-l1829
    @marc-antoinej-l1829 3 years ago +2

    Very good presentation. Easy to understand.

  • @sajadsafarveisi4512
    @sajadsafarveisi4512 3 years ago

    The talk really increased my knowledge. Thank you.

  • @Mahmoud-ys1kt
    @Mahmoud-ys1kt 2 years ago

    It was great, thank you.
    We are waiting for more.

  • @mercydabbs3330
    @mercydabbs3330 2 years ago +1

    Thank you! I am just learning python but I can follow your very detailed explanation! Thanks!

  • @nocoldizTV
    @nocoldizTV 3 years ago

    This just saved my life thanks!

  • @DineshCTech
    @DineshCTech 1 year ago

    Excellent presentation!

  • @nimrodr7675
    @nimrodr7675 3 years ago

    That was extremely helpful and interesting! Thanks for that!
    The left join part was interesting :)
    More videos and examples, please.

  • @redfeather22sa
    @redfeather22sa 2 years ago

    Brilliant video!! Thank you ❤️ But what is that module you were referring to before 41:00 that accepts formulae and lets you do a lot? Pandas? NumPy? Openpyxl? Mongo, or "mundle" I think it's called (on Medium recently)? I need to look it up. Wait a sec, I'll be back.
    No, it can't be... Mitosheet? Bamboolib? I don't think so (but those are cool; I have yet to try them. I need to reimport/merge my Jupyter labs from the old machine and its environment variables). But I would be interested to know what you were referring to.
    Or did you just mean Python itself (which of course does have formula ability)?
    Definitely tell me if you were referring to something else. All the best, and thanks for your video. It was absolutely amazing!!

    • @elduggio99
      @elduggio99 2 years ago +2

      petl is the module we're using here, but the capabilities around Excel are drawn from openpyxl.

    • @redfeather22sa
      @redfeather22sa 2 years ago

      @elduggio99 Ah! OK, thank you!

  • @eliotsygha9489
    @eliotsygha9489 3 years ago +2

    Thanks a lot for the presentation; it has been very useful for building my data warehouse during my BI course.

  • @darkknight0258
    @darkknight0258 3 years ago

    this is so cool, thank you so much

  • @MisterKorihor
    @MisterKorihor 3 years ago +2

    I'm not sure it's necessary to have all the try-except blocks. If an exception is thrown, and there is no try block surrounding it, then Python automatically aborts the script.

    • @elduggio99
      @elduggio99 3 years ago +1

      That's a fair observation. For a one-shot ETL script like this it depends on who will be running the script and whether you want to provide friendlier feedback than a stack trace.

    • @ASHISH517098
      @ASHISH517098 2 years ago

      The except block lets the user know exactly which block failed, in a user-friendly message.

    • @MisterKorihor
      @MisterKorihor 2 years ago

      @ASHISH517098 If it's an error that the user can do something about, then I agree with you. But if the user is just going to forward the error to a programmer to fix, then a stack trace is perfectly adequate.
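
One way to sketch the middle ground discussed in this thread: catch only the errors a user could act on and let everything else propagate with its stack trace. The function names, the file name, and the message wording below are hypothetical and are not taken from the video's script.

```python
import sys


def load_rates(path):
    """Hypothetical ETL step: read a rates file and return its lines."""
    with open(path, encoding="utf-8") as handle:
        return handle.read().splitlines()


def run_step(step, *args):
    """Run one step; return (result, None) or (None, friendly message)."""
    try:
        return step(*args), None
    except OSError as error:
        # Something the user can act on: a missing or unreadable file.
        return None, f"Could not read input: {error}"
    # Any other exception is not caught here, so it propagates with its
    # full stack trace, which is what a programmer would want to see.


result, message = run_step(load_rates, "no_such_rates_file.csv")
if message:
    print(message, file=sys.stderr)
```

With this split, a missing file produces one readable line, while a genuine bug still stops the script with the traceback intact.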

  • @mkgm001
    @mkgm001 2 years ago

    Super tutorial Bro! Thanks

  • @dizetoot
    @dizetoot 3 years ago +1

    Hello, you mentioned near the beginning of the talk that you did a similar demo using SSIS on the exchange rate data and had shared the link in the chat. Could you share it here as well please?

  • @muhammadshoaibkhan2378
    @muhammadshoaibkhan2378 2 years ago

    Can you share some more APIs like this Bank of Canada API that we can get data from?

  • @neonatal123
    @neonatal123 3 years ago

    Thank you Doug!

  • @WillB0212
    @WillB0212 3 years ago

    This was awesome. Thank you so much for putting this together. May I ask a question?... I'm new to python. On line 35 when printing out the BOCResponse variable, you did print(BOCResponse.text)... what is the '.text' line or what does that refer to? Is that part of the Requests module or a function of the Requests Module?

    • @meanmedianandmoose9276
      @meanmedianandmoose9276 3 years ago +2

      text is a property of the Response class in the Requests module, which BOCResponse is an instance of: docs.python-requests.org/en/master/api/#requests.Response.text

    • @WillB0212
      @WillB0212 3 years ago

      @meanmedianandmoose9276 Thank you for the reply!

  • @mohammedsafiahmed1639
    @mohammedsafiahmed1639 2 years ago

    Hi Doug, nice tutorial.
    However, I don't see anything that couldn't be achieved using pandas and requests primarily. Also, instead of doing the outer join and then filtering out the rows that you don't want, couldn't we have just done a left join, keeping the expenses table as the left one? It would've made the code simpler.

    • @meanmedianandmoose9276
      @meanmedianandmoose9276 2 years ago +2

      Hi Mohammed thanks for watching! We can't use a left join because the exchange rate API does not return a value for every date - weekends and holidays have no updates. We can have expenses on days without exchange data, so we first need to fill down to fix up missing values. Pandas could be used in this example because the dataset is very small, but it is not something I recommend for automated ETL as it is less memory-efficient than petl. Great tool for exploring data though.
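
The reasoning in this reply can be sketched in plain Python, without petl, to show why every rate date has to be present before filling down. The rates and expenses below are made-up sample values, and the variable names are illustrative only; this is not the code from the video.

```python
# Hypothetical sample data: rates exist only for business days.
rates = {"2020-01-03": 1.2988, "2020-01-06": 1.2969}         # Friday, Monday
expenses = [("2020-01-03", 10.00), ("2020-01-04", 20.00)]    # Friday, Saturday

# Group expenses by date so several expenses can share one day.
expense_by_date = {}
for date, usd in expenses:
    expense_by_date.setdefault(date, []).append(usd)

# Outer join: keep every date that appears in either table, in date order.
# ISO-formatted date strings sort chronologically.
all_dates = sorted(set(rates) | set(expense_by_date))

converted = []
last_rate = None
for date in all_dates:
    # Fill down: carry the most recent rate forward over gaps.
    if date in rates:
        last_rate = rates[date]
    # Filter: emit only rows that actually have an expense and a usable rate.
    for usd in expense_by_date.get(date, []):
        if last_rate is not None:
            converted.append((date, usd, round(usd * last_rate, 2)))

for row in converted:
    print(row)
```

A left join from expenses alone would leave the Saturday expense with no rate and nothing above it to fill from, which is the gap the reply describes.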

  • @maybenew7293
    @maybenew7293 1 year ago

    WHAT A WORK!

  • @meowreacti
    @meowreacti 2 years ago

    Thanks a lot for this!

  • @nathangarst1310
    @nathangarst1310 2 years ago

    This is so cool! This video is a bit beyond me though as a new Python learner, but I see the utility. I'm trying to move up in my analyst career. How do I learn this stuff? Should I go back to school for a BS in CS, or just learn by doing projects like this on my own?

    • @skinnymanolo4400
      @skinnymanolo4400 2 years ago

      I think you’re better off self-learning as this is at most intermediate level python.

  • @ernestoflores3873
    @ernestoflores3873 6 months ago

    Beautiful!

  • @arjunjadhav3062
    @arjunjadhav3062 3 years ago

    I need to load ready API data into MongoDB; please make a video on that.

  • @zberteoc
    @zberteoc 1 year ago

    If you used the "join" instead of "outerjoin" you would not need the filtering step, "join" in PETL seems to be equivalent to SQL INNER JOIN while "outerjoin" to SQL LEFT JOIN.

    • @meanmedianandmoose9276
      @meanmedianandmoose9276 1 year ago +1

      Thanks for commenting! We can't use an inner join because the exchange rate API does not return a value for every date - weekends and holidays have no updates. We can have expenses on days without exchange data, so we first need to fill down to fix up missing values.

    • @zberteoc
      @zberteoc 1 year ago

      @meanmedianandmoose9276 Got it. I think I missed the part where the Excel-like "filldown" functionality is explained.

  • @tuandino6990
    @tuandino6990 2 years ago

    Thank you sir

  • @sanishthomas2858
    @sanishthomas2858 11 months ago

    Nice, but the code looks very complex. Great work.

  • @raulaguilar3023
    @raulaguilar3023 1 year ago

    Hi Doug. It is not a good idea to try to reinvent hot water. There are many tools on the market for building ETL very efficiently. Anyway, for Python teaching purposes it is a good exercise. Regards

    • @dingaroo2003
      @dingaroo2003 5 months ago

      Not all organizations will buy third-party ETL tools.

  • @lequi3259
    @lequi3259 1 year ago

    Good

  • @joaopedrosenna1
    @joaopedrosenna1 3 years ago +2

    petl is better than pandas to work with data transformation!

  • @BhaveshKumar-dz8hq
    @BhaveshKumar-dz8hq 10 months ago

    Won't the outer join be highly memory-inefficient?