Redfin Analytics|python ETL pipeline with airflow|Data Engineering Project|Snowpipe|Snowflake|Part 1

  • Published: Oct 17, 2024

Comments • 39

  • @stevenlomon
    @stevenlomon 6 months ago +6

    God bless you and your beautiful soul for taking the time and effort to make this and make it available for free 🙏

  • @backgrounding4821
    @backgrounding4821 5 months ago +1

    Hello, you are the best Data Engineering instructor here on YouTube. I want to continue learning from your end-to-end project, but unfortunately I am having a problem starting Airflow.
    scheduler | [2024-05-05 04:41:04 +0000] [3014] [INFO] Booting worker with pid: 3014
    scheduler | [2024-05-05T04:41:04.168+0000] {settings.py:60} INFO - Configured default timezone UTC
    scheduler | [2024-05-05 04:41:04 +0000] [3016] [INFO] Booting worker with pid: 3016
    scheduler | [2024-05-05T04:41:04.405+0000] {manager.py:393} WARNING - Because we cannot use more than 1 thread (parsing_processes = 2) when using sqlite. So we set parallelism to 1.
    This is the error that came from my terminal. I hope you can assist me, sir. Thank you and more power.

    • @Ashokkumar-ru4im
      @Ashokkumar-ru4im 4 months ago

      Consider switching from SQLite to a more robust database.

  • @Nari_Nizar
    @Nari_Nizar 1 year ago

    You are the best instructor out there on YouTube, thank you for explaining everything. I would like you to do another data engineering project with Spark and EMR. Also, maybe do some videos on data analysis and data processing. Finally, please keep going and let us know how we can support you to continue on this amazing path!!!

    • @tuplespectra
      @tuplespectra  1 year ago +2

      Thanks so much for your kind words. It means a lot to me, and yes, we should be having a data engineering project on Spark and EMR soon. There are different ways you can support my channel. You can support me by becoming a member of this channel. Please watch this short video where I explain the JOIN membership button. ruclips.net/video/I-72fMtbEbE/видео.html . You could also support me with a "Super Thanks", which you can see at the bottom of each of my videos. Another way to support me is by liking my videos and sharing them with others so that they can also benefit from them. Thanks so much!

  • @madhusudanpatil7010
    @madhusudanpatil7010 8 months ago +1

    Very, very nice detailed explanation, which is very helpful for understanding the topic clearly... Thank you so much, please keep posting videos, sir.

    • @tuplespectra
      @tuplespectra  8 months ago

      You are welcome and thanks for your comment. It means a lot to me.

  • @shumengshi5925
    @shumengshi5925 7 months ago

    Great content! I will definitely follow along! Quick question: how much will the EC2 t2.xlarge instance cost for this whole project?

  • @deborahjohnson3775
    @deborahjohnson3775 9 months ago

    Thank you so much for this video. I use a Mac and I wanted to know if I could select macOS instead of Ubuntu, and why Ubuntu is better.

  • @assieneolivier5560
    @assieneolivier5560 9 months ago

    This is a great project, you are doing a great job. I am waiting for this same project with EMR!!!

    • @tuplespectra
      @tuplespectra  9 months ago

      Thanks so much for your kind words. Here is the EMR project: ruclips.net/video/k3lIQZzsaWY/видео.htmlfeature=shared and ruclips.net/video/PeaLln90YXg/видео.htmlfeature=shared

  • @koladearisekola3650
    @koladearisekola3650 1 year ago

    Great video, can't wait for part 2

    • @tuplespectra
      @tuplespectra  1 year ago

      Glad you enjoyed it. Thank you! See you soon in part 2.

  • @abhishekkumar-ce4zs
    @abhishekkumar-ce4zs 2 months ago

    Awesome explanation

  • @dipankarmodak1092
    @dipankarmodak1092 7 months ago

    How did you make the connection with Airflow? Is it HTTP?

  • @vaibhavverma1340
    @vaibhavverma1340 1 year ago

    Thank you so much for the video. I am getting the error "ModuleNotFoundError: No module named 'boto3'" even though I installed boto3 and followed everything as you instructed. Could you please share your thoughts on this?

    • @tuplespectra
      @tuplespectra  1 year ago +1

      Make sure to activate your virtual environment before doing "pip install boto3".
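
      A quick way to confirm this is to ask the interpreter that runs your code whether it lives inside a virtual environment and whether it can see boto3. The sketch below uses only the standard library; the helper names (`in_virtualenv`, `module_available`, `report`) are made up for illustration and are not part of the video's project. Run it with the same `python` you use to start Airflow.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` can be imported here."""
    return importlib.util.find_spec(name) is not None


def report(module: str = "boto3") -> str:
    """Summarise which interpreter is running and whether it sees `module`."""
    where = "inside a venv" if in_virtualenv() else "outside any venv"
    status = "importable" if module_available(module) else "NOT importable"
    return f"{sys.executable} ({where}): {module} is {status}"


if __name__ == "__main__":
    print(report())
```

      If the report says the module is not importable, the package was most likely installed into a different interpreter than the one running the script.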

  • @ImBatmanYT_CODM
    @ImBatmanYT_CODM 1 year ago

    Hi there, first of all I just want to thank you for helping people like us in our self-learning journey. I closely follow your projects and implement them whenever I can.
    For this tutorial as well, I followed everything as you instructed, but the final step, loading the data to S3, failed for some reason. My raw data bucket was empty, but the transformed data (867 MB) was saved successfully. Can you point me in the right direction here? Also, I'm not using the root account, and my user has admin-level access.
    ~Thank you for your time :)

    • @tuplespectra
      @tuplespectra  1 year ago

      Did you remember to refresh your raw data S3 bucket to check whether the file was saved there?
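
      Besides refreshing the console, the bucket contents can be checked from code. This is a minimal sketch, assuming boto3 is installed and AWS credentials are configured; `iter_keys` is a hypothetical helper and the bucket name in the usage comment is a placeholder, not the one used in the video. It relies on the S3 client's `list_objects_v2` call and follows its continuation token so that buckets with many objects are listed completely.

```python
from typing import Any, Dict, Iterator


def iter_keys(s3_client: Any, bucket: str, prefix: str = "") -> Iterator[str]:
    """Yield every object key in `bucket` whose name starts with `prefix`."""
    kwargs: Dict[str, str] = {"Bucket": bucket, "Prefix": prefix}
    while True:
        page = s3_client.list_objects_v2(**kwargs)
        # "Contents" is missing from the response when no objects match.
        for obj in page.get("Contents", []):
            yield obj["Key"]
        if not page.get("IsTruncated"):
            break
        # More results remain: ask for the next page.
        kwargs["ContinuationToken"] = page["NextContinuationToken"]


# Usage (placeholder bucket name, needs boto3 and AWS credentials):
#   import boto3
#   keys = list(iter_keys(boto3.client("s3"), "my-raw-data-bucket"))
#   print(keys if keys else "bucket is empty")
```

      An empty result here means the upload step itself did not write the file, which points back to the task logs rather than to a stale console view.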

  • @quishzhu
    @quishzhu 8 months ago

    very very very helpful!!! thank you so much!!

  • @kenneth1691
    @kenneth1691 1 year ago

    Great, let's do this!

  • @joealtona2532
    @joealtona2532 8 months ago

    You don't have to wait 5 minutes for DAGs to be reloaded. Run this in a shell: > airflow dags reserialize
    There is no need to stop Airflow; just open another shell and activate the venv.

  • @luiscamilofranco8238
    @luiscamilofranco8238 1 year ago

    Amazing!!

  • @errrbrrr3821
    @errrbrrr3821 1 year ago

    great project!

  • @jerbear97
    @jerbear97 1 year ago +4

    Tip: do not use Ctrl+C to copy the Airflow password from the terminal, as this will also shut the server down 😂

  • @wepyk6552
    @wepyk6552 6 months ago +1

    I like your accent

  • @AbhinavKumar-e6q
    @AbhinavKumar-e6q 7 months ago

    dag is not showing

  • @latabharti8175
    @latabharti8175 10 months ago +1

    dag is not showing

    • @tuplespectra
      @tuplespectra  10 months ago

      It might take about 5 minutes before the DAG shows. Otherwise, you might have some errors.

    • @joealtona2532
      @joealtona2532 8 months ago

      > airflow dags reserialize
      This reloads dags immediately
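
      If the DAG still does not show after a reload, an error in the DAG file is a likely cause. As a rough first check, the sketch below scans a DAGs folder for Python files that fail to compile. It is only an illustration: `find_syntax_errors` is a hypothetical helper, the `~/airflow/dags` path in the usage comment is an assumed default location, and the check catches syntax errors only, not missing packages or runtime failures. Airflow's own `airflow dags list-import-errors` command gives a more complete picture.

```python
from pathlib import Path
from typing import Dict


def find_syntax_errors(dags_dir: str) -> Dict[str, str]:
    """Map each .py file under `dags_dir` that fails to compile to its error."""
    errors: Dict[str, str] = {}
    for path in sorted(Path(dags_dir).expanduser().rglob("*.py")):
        try:
            # Compile without executing, so no DAG code actually runs.
            compile(path.read_text(encoding="utf-8"), str(path), "exec")
        except SyntaxError as exc:
            errors[str(path)] = f"line {exc.lineno}: {exc.msg}"
    return errors


# Usage (assumed default DAGs folder; adjust to your setup):
#   for file, problem in find_syntax_errors("~/airflow/dags").items():
#       print(file, "->", problem)
```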