Hey Jay, thank you for the video. I'd be happy to see you doing more ELT pipelines and focus on your thought's process ( I can watch longer format 1-2 hours) - why you do things in that way, why is it important and whatnot; and you can throw some explainers to anything else you do and the reason behind it. I think senior DE and others with experience do things bit automatically and it takes time for the newbies to pick up on those skills. So, your thought process for doing things instead of just doing the things is priceless for anyone watching, including me. Appreciate your video, dude :)
Thank you! Will try to create more useful content
Completely agree 😊
Error solved!!!!
for anyone facing this error:
Runtime Error
Database error while listing schemas in database "dbt_db"
Database Error
250001: Could not connect to Snowflake backend after 2 attempt(s).Aborting
Try the second method to update the account name for your project inside the profiles.yml file:
account_locator-account_name
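For reference, a minimal sketch of what that change looks like in profiles.yml (the account identifiers below are placeholders, not real values):

```yaml
data_pipeline:
  outputs:
    dev:
      type: snowflake
      # Instead of the bare account locator (e.g. ab12345), use the hyphenated
      # identifier shown on Snowflake's account details page:
      account: myorglocator-myaccountname
```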
Thank you !
Thank you!
AN ABSOLUTE GOLDMINE OF INFORMATION WHICH NO UDEMY OR YOUTUBE TUTOR HAS PROVIDED YET!
I have been struggling with dbt and airflow for a long time. For some reason I could not connect the dots. Having some mixture of knowledge - I landed on this tutorial and it just glued all my scattered dots well. Thanks Jayzern!!! Really appreciate the efforts :)
Hello, thanks for this tutorial. At the very beginning, when trying to run the "dbt deps" command I'm getting this error : "Encountered an error loading local configuration: dbt_cloud.yml credentials file for dbt Cloud not found. Download your credentials file from dbt Cloud to `C:\Users\a.schirina\.dbt`". I'm using dbt command locally and my profiles.yml in the .dbt folder is
data_pipeline:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: jpb45436
      # User/password auth
      user: alices
      password: mypassword
      role: dbt_role
      database: dbt_db
      warehouse: dbt_wh
      schema: dbt_schema
      threads: 4
      client_session_keep_alive: False
Does anyone know the problem?
honestly never knew about dbt and glad to learn it here thank you
Thanks @jayzern. This tutorial is awesome. I will be recommending it to folks who struggle with connecting dbt with any database engine.
Thank you for the video jayzern. When I push code into Git, should I push code of dbt only, or I need to push all code of dbt-dag ?
I'm struggling with the step to load the dbt data_pipeline; it did not show up in the Airflow DAG. What could I be doing wrong? Can you help?
So to my understanding, singular tests really check that the query being tested returns nothing.
If the test passes, the query returned no rows - great, your data is fine.
If it fails, you should run that query to see exactly which rows came back.
Confusing at first but makes sense now.
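As a concrete illustration of the comment above: a singular test in dbt is just a SQL file under tests/ that selects the *bad* rows, and the test passes when the query returns nothing (the file and column names here are assumptions based on the video's project):

```sql
-- tests/fct_orders_discount.sql (hypothetical file name)
-- Passes when zero rows are returned; any returned row is a failure you can inspect.
select *
from {{ ref('fct_orders') }}
where item_discount_amount > 0  -- discounts are stored as negative values
```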
Thank you so much, it is 100% worth it and useful... expecting some more videos in detail... like prod deployment through Git and Git integration with Airflow
Just wonder in the real world scenario, where are all raw data stored? In AWS s3?
Hi Jay, thanks for the video. I'm having an issue connecting to the Snowflake backend at the stage where you first perform 'dbt run' at 14:50.
This is the error I get:
15:17:54 Encountered an error:
Runtime Error
Database error while listing schemas in database "dbt_db"
Database Error
250001: Could not connect to Snowflake backend after 2 attempt(s).Aborting
I've checked the profiles.yml file and all details are correct. Please help!
facing the same issue!!!!!! can anyone please help I've restarted and tried everything possible to figure out but failed
@MalvinSiew
I solved one of the two errors I was facing. I did not have Git installed in my system. You can simply ask AI for prompts to guide you through the installation process.
Had the same problem; when passing the account value with 'dbt init' I wasn't able to connect using the account URL value, only with the second option, which was the - value
did u solve it? I have the same problem. what is the solution?
@@oreschz could you solve it?
i'm new to snowflake, dbt and airflow,
this is an awesome tutorial, I got to learn a lot
thank you jayzern
Is data engineering dead with advent of AI ? What is the future of data engineering careers in your opinion ?
Awesome video! I already recommended this to my entire team. Please make more like this, they are extremely helpful.
Idea for next video: dbt for Snowflake (again) but with Data Vault 2.0 modeling. I would love to see the logic behind creating dim and fact tables, how you define the stg files for creating the hubs/satellites/links.
Oof yea I did consider doing a Data Vault model where we showcase how hubs, satellites and links work but didn't think ppl would be interested. Thanks for raising 👍
Jay! Thanks for the video and content very cool to see. Curious why Airflow over something like FiveTran besides the ability to self host? Any gotchas?
FiveTran is not really an orchestration tool - it's really meant for the "Extract Load" part only. It's great because of Unix philosophy, i.e. "do one thing, do one thing well only", whereas Airflow is more of a generalist, task-based orchestrator. Another thing is FiveTran is super expensive, unless you're working on something enterprise-y
Thanks bro for your efforts ❤
Hello, did anyone else face this error in Airflow after 32:50?
Broken DAG: [/usr/local/airflow/dags/dbt-dag.py]
Traceback (most recent call last):
File "/usr/local/lib/python3.12/site-packages/cosmos/operators/base.py", line 361, in __init__
self.full_refresh = full_refresh
^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/airflow/models/baseoperator.py", line 1198, in __setattr__
if key in self.__init_kwargs:
^^^^^^^^^^^^^^^^^^
AttributeError: 'DbtRunLocalOperator' object has no attribute '_BaseOperator__init_kwargs'. Did you mean: '_BaseOperator__instantiated'?
please send help
I am facing the exact same error. Please post a reply, if you were able to figure out the fix. I'll do the same if I find a solution.
Ok, so I think I was able to find the thread related to this issue. It's still open as of 8/18/2024 11pm PT:
github.com/astronomer/astronomer-cosmos/issues/1161
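Until that issue is resolved, a common workaround for this kind of operator/Airflow incompatibility is pinning compatible versions in the Astro project's requirements.txt (the exact pin below is an assumption, not a confirmed fix; check the issue thread for the currently recommended combination):

```
# requirements.txt
# Pin astronomer-cosmos and the dbt adapter to versions known to work together
astronomer-cosmos[dbt-snowflake]==1.5.1
```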
Couldn't run int_order_items.sql because it returns a strange error. It says: "The selection criterion 'int_order_items.sql' does not match any enabled nodes". And if I run "dbt run" it says: "unexpected '.' in line 1" at 20:22
Thank you, thank you THANK YOU! This was so helpful, easy to follow and made perfect sense.
Dude, where did you even mention the dbt_project.yml file? In part 2 of the video, you jump directly to VS Code
what are the details ??
Thanks Jay! Could you also upload into the Notion document the code for the dbt_dag.py file for the Airflow deployment? That's still missing 🙏🏻
Totally forgot about that, thanks for the reminder!
No worries, I realized you used it from the Cosmos GitHub repo so I managed to find it there and finally was able to wire up everything and deploy it. 🤓 Thanks Jay. It's a super helpful tutorial. @jayzern
26:00 item_discount_amount is supposed to be negative because the macro defined it as such. I also checked the data on snowflake and they're all negative amounts. Did I miss something?
So supportive - I'm completing the project.
Thank you for your video. I'm stuck at the last step when connecting to Airflow. Though I followed every step, it shows this error:
Broken DAG: [/usr/local/airflow/dags/dbt_dag.py]
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/cosmos/converter.py", line 211, in __init__
project_config.validate_project()
File "/usr/local/lib/python3.11/site-packages/cosmos/config.py", line 206, in validate_project
raise CosmosValueError(f"Could not find {name} at {path}")
cosmos.exceptions.CosmosValueError: Could not find dbt_project.yml at /usr/local/airflow/dags/dbt/data_pipelin/dbt_project.yml
Do you have any idea why I get this error? Thank you in advance!
same error here.... no solution yet.
In your case I think you have a typo: the 'e' is missing in /usr/local/airflow/dags/dbt/data_pipelin
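A quick way to confirm that the path Cosmos is given actually contains dbt_project.yml is to check it with pathlib (the directory below is the one from the error message, with the typo corrected):

```python
from pathlib import Path

# The directory passed to Cosmos must contain dbt_project.yml.
project_dir = Path("/usr/local/airflow/dags/dbt/data_pipeline")
print(project_dir / "dbt_project.yml")
print((project_dir / "dbt_project.yml").exists())  # should be True inside the scheduler container
```

Run it inside the Airflow scheduler container (e.g. via `astro dev bash`) so you are checking the same filesystem the DAG parser sees.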
please how did you fix this issue?
I got the same error too. Please let me know how you fixed it.
@@aarthithinakaran6655 yes I fixed it everything works
Hi Jay. Question: Once you have created the Fact table, how does this process work if I run it again? Is it going to append new records and update the existing ones? Or is it going to drop and create the Fact table over again?
Thank you very much. This is very nice and concise tutorial, exactly what I need.
Hello.. thanks for the tutorial.
I know Airflow runs the tasks/DAGs; however, I cannot follow one thing: how do we determine the order of the action items at 35:36 within dbt (I believe it is determined on the dbt side), since we have only one DAG running in this example? I'd appreciate it if anyone replies.
This is such an amazing video @jayzern! The project taken was not overly complex but also not barebones and covered a lot of important stuff! Thanks for being thoughtful and including the code along link (else some of formatting issues would have bugged many newbies)!
I think you should keep creating more videos as you are a good teacher. The only suggestion I have is to maybe include a bit more explanation, which will help beginners even more! Kudos!
I materialized marts as tables but int_order_items, int_order_items_summary and fct_orders are created as views instead of tables. How do I convert these views to tables?
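A likely cause is the materialization config in dbt_project.yml not matching the folder names. A sketch of the relevant section (project and folder names follow the video's structure; the folder keys must match your models/ subdirectories exactly):

```yaml
# dbt_project.yml
models:
  data_pipeline:
    staging:
      +materialized: view
    marts:
      +materialized: table
```

After fixing it, re-run `dbt run` and dbt will replace the views with tables.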
Thank you very much
Thanks Jayzern,! if I can be of some help for your next video let me know!
thank you so much for this tutorial. hope you have more videos in the future
Thanks man!
This code along session (starting from scratch with environment setup, codebase structure ...) is soooooooo helpful. Hope to see more examples like this. Keep up the work my man
I watched the video "How you start learning Data Engineering..." and was wondering: can you do a live coding session that steps through all those aspects (from SQL and command lines... to Kafka...) in one project? I think it would help a lot...
Glad to hear it's helpful! 👍
It's great to hear feedback on what type of live coding videos you find insightful. Will keep note on Kafka and Command lines
Hi guys, kindly help me out: are Snowflake and dbt alone enough, or do I have to learn Hadoop, Spark, etc.? I have been working as a data analyst for the last year and am planning to switch to DE
Great video and explanation. we need more videos from you.
This video is like a gold mine for building a portfolio, especially for someone starting out as a Data Engineer like me!... Many thanks and kudos to you!.. Love from India
Hey, how did you use Snowflake? Did you buy it? It shows me that it is paid software
Dude this is so good :)
nice
Thank you, love your work
Thanks for sharing this dbt tutorial! It’s definitely super hot rn and useful to learn. 🎉
wait till you learn about sqlmesh
Excellent tutorial!!!
Extremely useful content, i especially liked live googling and debugging parts
Thank you for the support! Hope other people find it useful too.
Amazingly explained 👌
At 32:21, how did you copy the dbt folders to airflow project?
Hey, thanks for the project tutorial. I was wondering what the best way is to deploy Airflow in a cloud environment... I see a lot of EC2 or EKS (Kubernetes). But maybe I could work with ECS + Fargate? Which deploy method would you recommend for a production scenario (beyond studies, thinking about a daily job task)? Thank you mate
Airflow + EKS is probably the most common in the industry because of cost reasons and vertical scaling. You could use ECS + Fargate too, but fargate is really expensive!
I don't have any recs atm, but will try to create more examples on production DAGs next time. Check out youtube.com/watch?v=Xe8wYYC2gWQ in the meantime!
Thanks very much for posting this! Definately earned another subscriber/viewer
This is great! At what point would you need to dockerize the files though? Sorry, new to data engineering. Thank you!
You can Dockerize it at the beginning, or once you have a baseline model working. I've seen cases where Data engineers start with Docker, or Dockerize it halfway! I personally prefer the latter
WOW!! ,Thank you so much for this wonderful video, Please keep making dbt + airflow videos,
I have one doubt: I can see that one task in Airflow, stg_tpch_orders, has run + test in your DAG, but it is not showing up in mine.
Have you added any tests on stg_tpch_orders but maybe missed showing it in the video?
Hmm it's hard to tell without looking at ur code, but there is a generic test for stg_tpch_orders that looks at the relationship between fct_orders and stg_tpch_orders. Check your generic_tests.yml file to confirm
Thanks for the support man!
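For reference, the relationships test described above is declared in YAML roughly like this (the file name and column names are assumptions based on the video):

```yaml
# models/marts/generic_tests.yml
version: 2
models:
  - name: fct_orders
    columns:
      - name: order_key
        tests:
          - relationships:
              to: ref('stg_tpch_orders')
              field: order_key
```

Cosmos picks tests like this up automatically, which is why the task shows as run + test in the DAG.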
I'm struggling with the Airflow connection to Snowflake; can you make another video to elaborate on it more?
For sure, I didn't explain the airflow integration with snowflake as much as I wanted to
Hi, I would like to ask about the singular test: we want to check for negative values in the test, so why do we use a condition that selects positive ones?
How do you know your username? jayzer? I went back to my profile but it did not work. Where can I find the name of my user?
In Snowflake you should see your user name in the bottom left corner. It'll be the top bolded value
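If the bottom-left value is ambiguous, you can also confirm it directly in a worksheet:

```sql
-- Run in any Snowflake worksheet; returns the exact user and account values
select current_user(), current_account();
```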
Please make complete videos on a dbt with Snowflake migration project with real-time scenarios, bro. Thank you ❤ Nicely explained
Thank you man! Will take that into consideration
How could i get the project folder structure?
90 minutes? That's a long time
1.5 hours?
Hi Jay, good one. I'm trying it the same way but getting the error below: "1 of 1 ERROR creating view model dbt_schema.stg_tpch_line_items................. [ERROR in 0.04s]
06:17:33
06:17:33 Finished running 1 view model in 2.02s.
06:17:33
06:17:33 Completed with 1 error and 0 warnings:
06:17:33
06:17:33 Compilation Error in model stg_tpch_line_items (models\staging\stg_tpch_line_items.sql)
06:17:33 'dict object' has no attribute 'type_string'
06:17:33
06:17:33 > in macro generate_surrogate_key (macros\sql\generate_surrogate_key.sql)
06:17:33 > called by macro default__generate_surrogate_key (macros\sql\generate_surrogate_key.sql)
06:17:33 > called by model stg_tpch_line_items (models\staging\stg_tpch_line_items.sql)"
Try checking if your dbt_utils version is correct. There seems to be a compile time error with calling generate surrogate key. The code is available in notion page.
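The 'type_string' attribute error is typically a version mismatch between dbt-core and dbt_utils. Pinning a compatible dbt_utils in packages.yml (the exact version below is an assumption; check the dbt_utils release notes against your dbt-core version) and re-running `dbt deps` usually resolves it:

```yaml
# packages.yml
packages:
  - package: dbt-labs/dbt_utils
    version: 1.1.1
```

Then re-run `dbt deps` followed by `dbt run`.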
I got the same error. How did you solve it?
One question here: since we have the dbt jobs feature available in dbt Cloud and it is very easy to create a job there, why do we need to use Airflow?
Yea that's great question! In theory dbt cloud can trigger jobs too, but in practice you'd want to decouple your orchestration tool away from your transformation tool for a myriad of reasons: ability to orchestrate other tools together with dbt, avoid vendor lock from dbt, many companies are comfortable with Airflow etc. It really depends on your tech stack
Thank you so much for this, I've been trying to learn how to do this and you helped me solve this
Do you have trainings!!
Thanks man! Yea I'm working on live trainings too so stay tuned 🙌
hey, I have a small request
can you please make a video on how to make use of pyspark efficiently in low spec system with huge amount of data
Low compute Spark + high volumes of data is challenging but will take note. Thx for the suggestion
love this! thanks for sharing this tutorial, very useful
Can you tell me why we used Airflow, since dbt Cloud has a feature to schedule jobs?
If your company only uses dbt and no other tooling, dbt cloud works too
However in the real world, it's hard to control your CRON schedule when you have many tools in your stack. An orchestrator's job is to focus on scheduling. Unix philosophy of do one thing, do one thing well. TLDR
Please post more videos, your videos are awesome and very instructive
I need a longer video, please!
Hi @jayzern, thanks a lot for your video, really valuable content!
Great video.
I would love to see a complex ETL pipelines.
How did he start? Did he create a worksheet? I tried it but it did not work. What are the very first steps?
yes you need to write the queries in a worksheet
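For anyone lost at that step, the first commands are the kind of setup SQL you paste into a new worksheet. This is a sketch, not the video's exact script; object names follow the tutorial's conventions, and the user name comes from the example profiles.yml in a comment above:

```sql
-- Run as an admin in a Snowflake worksheet to bootstrap the tutorial objects
use role accountadmin;
create warehouse if not exists dbt_wh with warehouse_size = 'x-small';
create database if not exists dbt_db;
create role if not exists dbt_role;
grant usage on warehouse dbt_wh to role dbt_role;
grant role dbt_role to user alices;  -- 'alices' is a hypothetical user
create schema if not exists dbt_db.dbt_schema;
```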
I haven't a lot from this tutorial...
Thank you
Great tutorial, I've learned a lot, thanks!
WOW! That is an amazing tutorial, thanks a lot.
Do I need to pay for Astro if I want to use this for a prod environment?
Overall great, the airflow orchestration felt a bit clunky especially given that the source code had to be kept in the same directory.
Thx for the feedback 👍 ideally should wrap this in a container image, but for simplicity decided to keep it as code
@@jayzern Makes sense, any good resources on self hosting dbt core?
well done! great tutorial!
excellent video, thank you
You should check out Meltano
I've heard great things about Meltano!
Thanks Jayzern
Need more content like this!!! Really amazing video. Just one suggestion: before diving into the coding part, it would be better if you could provide a real-world scenario and reference it while writing your code. Thanks
Appreciate the feedback man 🙏 will try to incorporate more real-world context before and during the live coding part, that's a great idea
@@jayzern thanks a lot, waiting for some more tutorials😃
Nice explanation
100% worth it
I am not sure why I cannot open the notes, can anyone help?
I double checked the link and it's working, try this
bittersweet-mall-f00.notion.site/Code-along-build-an-ELT-Pipeline-in-1-Hour-dbt-Snowflake-Airflow-cffab118a21b40b8acd3d595a4db7c15?pvs=74
Let me know what error you see
thank you very much
Good
Great video Jay
Hii mr.prasad garu are you data engineer too?
@@RohithPatelKanchukatla Hi there. I am a Data Scientist
Are we in 2024? it all looks a lot like old century ... coding is necessary but come on, this is another story
What do you mean? There’s very little actual coding here to be honest. Most of it is trying to get the different services and database talking to each other and exchanging information. And then automating it. There’s a lot of moving parts.
what do you mean?
wdym
Yeah, it's a lot just for building tables and populating them
Can you please post more videos like this? Really appreciate it. Helps me understand the Dbt/Snowflake/Airflow a lot
Yes sir am working on future videos right now!
Doing this for the second time and for some reason dbt is only creating views and not tables FML
same. Its not creating tables. Just views.
@@Rajdeep6452 Hi, I fixed it just by running dbt run again and also checking first that dbt_project.yml says table for the marts
@@duvanzapata6761 Verified all of that and reran dbt run and still I have only views. any ideas?
I had the same error and realized the issue came from my dbt_project.yml: I had a typo, and instead of data_marts I wrote data_mart. The name must be the same as the data_marts folder you created in the folder structure