Hey Jay, thank you for the video. I'd be happy to see you doing more ELT pipelines and focusing on your thought process (I can watch longer formats, 1-2 hours) - why you do things that way, why it's important and whatnot; and you can throw in explainers for anything else you do and the reason behind it. I think senior DEs and others with experience do things a bit automatically, and it takes time for newbies to pick up on those skills. So, your thought process for doing things instead of just doing the things is priceless for anyone watching, including me. Appreciate your video, dude :)
Thank you! Will try to create more useful content
Completely agree 😊
honestly never knew about dbt and glad to learn it here thank you
This code along session (starting from scratch with environment setup, codebase structure ...) is soooooooo helpful. Hope to see more examples like this. Keep up the work my man
I watched the video "How you start learning Data Engineering..." and I'm wondering if you can do a live coding session that steps through all those aspects (from SQL and command lines to Kafka...) in one project? I think it would help a lot...
Glad to hear it's helpful! 👍
It's great to hear feedback on what type of live coding videos you find insightful. Will keep a note on Kafka and command lines
I was struggling to simplify airflow and DBT integration and this tutorial really helped me get through the finish line. Thank you!
Awesome video! I already recommended this to my entire team. Please make more like this, they are extremely helpful.
Idea for next video: dbt for Snowflake (again) but with Data Vault 2.0 modeling. I would love to see the logic behind creating dim and fact tables, how you define the stg files for creating the hubs/satellites/links.
Oof yea I did consider doing a Data Vault model where we showcase how hubs, satellites and links work but didn't think ppl would be interested. Thanks for raising 👍
thank you so much for this tutorial. hope you have more videos in the future
Thanks man!
This video has the exact answer to my questions as I'm diving into data modeling for analytics. I'm sure everyone doing this for the first time will find this video super helpful.
Would be cool to see dbt with Cosmos for smoother operation 👌
EDIT: I was literally just getting into the Deployment part of the video, and there you introduce using Cosmos for Airflow. Kudos!!
Extremely useful content, I especially liked the live googling and debugging parts
Thank you for the support! Hope other people find it useful too.
Thanks for sharing this dbt tutorial! It’s definitely super hot rn and useful to learn. 🎉
wait till you learn about sqlmesh
When ETL came about the Cloud did not exist, I was writing shell scripts and SQL almost 30 years ago to do ETL. Useful video thanks!
Concise and to the point. It was very helpful. Thanks, please show more end to end complex projects like this
Amazing video, very clear in explaining how Snowflake, dbt, Airflow and Cosmos are all linked together to provide data transformation and orchestration.
I have been struggling with dbt and airflow for a long time. For some reason I could not connect the dots. Having some mixture of knowledge - I landed on this tutorial and it just glued all my scattered dots well. Thanks Jayzern!!! Really appreciate the efforts :)
Great video and explanation. we need more videos from you.
very good session, helped me get a much more concrete idea of what those tools look like and how they work together
AN ABSOLUTE GOLDMINE OF INFORMATION WHICH NO UDEMY OR YOUTUBE TUTOR HAS PROVIDED YET!
Thank you very much. This is a very nice and concise tutorial, exactly what I need.
brilliant tutorial, thanks for this!
This video is like a gold mine for building a portfolio, especially for someone starting out as a Data Engineer like me!... Many thanks and kudos to you!.. Love from India
Hey, how did you use Snowflake? Did you buy it? It shows me that it is paid software
@@adityakulkarni3798 I am wondering the same thing
I'm new to Snowflake, dbt and Airflow,
this is an awesome tutorial, got to learn a lot
thank you jayzern
Thank you, thank you THANK YOU! This was so helpful, easy to follow and made perfect sense.
Can you please post more videos like this? Really appreciate it. Helps me understand the dbt/Snowflake/Airflow stack a lot
Yes sir am working on future videos right now!
Great video.
I would love to see more complex ETL pipelines.
Thanks @jayzern. This tutorial is awesome. I will be recommending it to folks who struggle with connecting dbt with any database engine.
This is such an amazing video @jayzern! The project taken was not overly complex but also not barebones, and covered a lot of important stuff! Thanks for being thoughtful and including the code-along link (else some of the formatting issues would have bugged many newbies)!
I think you should keep creating more videos as you are a good teacher. The only suggestion I have is to maybe include a bit more explanation, which will help beginners even more! Kudos!
Please post more videos, your videos are awesome and very instructive
It's beautiful! Thx man!
Need more content like this!!! Really amazing video. Just one suggestion I would like to make: before diving into the coding part, it would be better if you could provide a real-world scenario and reference it while writing your code. Thanks
Appreciate the feedback man 🙏 will try to incorporate more real-world context before and during the live coding part, that's a great idea
@@jayzern thanks a lot, waiting for some more tutorials😃
Thanks very much for posting this! Definitely earned another subscriber/viewer
thank you so much, it is 100% worth it and useful... expecting some more videos in detail... like prod deployment through Git and Git integration with Airflow
Thank you so much for this, I've been trying to learn how to do this and you helped me solve this
Do you have trainings!!
Thanks man! Yea I'm working on live trainings too so stay tuned 🙌
Thank you for this content
Hi @jayzern, thanks for the video. Is Airflow running the singular tests as well? Where did we mention "dbt test" in the Airflow DAG?
Thanks for the rich content!
Dude this is so good :)
Hi! Really enjoyed your tutorial; I would like to see a tutorial on how to create a data CI/CD pipeline, starting from pulling the latest branch, running data tests on staging, and deploying changes to production after the tests complete, since not a lot of youtubers explain this
This is actually a brilliant idea, thanks for the rec!
26:00 item_discount_amount is supposed to be negative because the macro defined it as such. I also checked the data on snowflake and they're all negative amounts. Did I miss something?
Thank you for the video jayzern. When I push code into Git, should I push the dbt code only, or do I need to push all of the dbt-dag code?
Great tutorial, I've learned a lot, thanks!
So supportive, and helpful for completing the project.
Jay good job 🎉
Thanks bro for your efforts ❤
I have learnt a lot from this tutorial...
Thank you
love this! thanks for sharing this tutorial, very useful
Hi Jay, thanks for the video. I'm having an issue connecting to the Snowflake backend at the stage where you first perform 'dbt run' @ 14:50.
This is the error I get:
15:17:54 Encountered an error:
Runtime Error
Database error while listing schemas in database "dbt_db"
Database Error
250001: Could not connect to Snowflake backend after 2 attempt(s).Aborting
I've checked the profiles.yml file and all details are correct. Please help!
Facing the same issue!!!!!! Can anyone please help? I've restarted and tried everything possible to figure it out but failed
@MalvinSiew
I solved one of the two errors I was facing. I did not have Git installed in my system. You can simply ask AI for prompts to guide you through the installation process.
Had the same problem; when passing the account value with 'dbt init' I wasn't able to connect using the account URL value, only with the second option, which was the account_locator-account_name value
did u solve it? I have the same problem. what is the solution?
@@oreschz could you solve it?
Excellent tutorial!!!
Amazingly explained 👌
amazing tutorial
Thank you, love your work
well done! great tutorial!
WOW! That is an amazing tutorial, thanks a lot.
Great video Jay
Hii Mr. Prasad garu, are you a data engineer too?
@@RohithPatelKanchukatla Hi there. I am a Data Scientist
WOW!! Thank you so much for this wonderful video, please keep making dbt + airflow videos.
I have one doubt: I can see that one task in Airflow, stg_tpch_orders, has run + test in your DAG, but it is not showing up in mine.
Have you added any tests on stg_tpch_orders but maybe missed showing it in the video?
Hmm it's hard to tell without looking at ur code, but there is a generic test for stg_tpch_orders that looks at the relationship between fct_orders and stg_tpch_orders. Check your generic_tests.yml file to confirm
Thanks for the support man!
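For anyone comparing notes, that kind of relationship check can also be written as a singular SQL test - a sketch, assuming the order_key column from the video's models:

```sql
-- tests/fct_orders_relationship.sql (hypothetical file name)
-- selects the "bad" rows: orders in fct_orders that are missing from
-- stg_tpch_orders; the test passes only when zero rows come back
select f.order_key
from {{ ref('fct_orders') }} f
left join {{ ref('stg_tpch_orders') }} o
    on f.order_key = o.order_key
where o.order_key is null
```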
At 32:21, how did you copy the dbt folders to airflow project?
Your guide is a gem. But the airflow part is not very clear; I had to deep dive so many times to fix it hahaha
Hi Jay. Question: Once you have created the Fact table, how does this process work if I run it again? Is it going to append new records and update the existing ones? Or is it going to drop and create the Fact table over again?
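(For anyone with the same question: dbt's default table materialization drops and recreates the table on every run. If you want append/update behavior instead, the usual route is an incremental model - a rough sketch, where unique_key and order_date are assumptions rather than the video's exact code:)

```sql
-- fct_orders.sql rewritten as an incremental model (sketch)
{{ config(
    materialized='incremental',
    unique_key='order_key'
) }}

select *
from {{ ref('int_order_items_summary') }}
{% if is_incremental() %}
  -- on re-runs, only process rows newer than what the table already holds;
  -- rows with a matching order_key are updated, new ones are inserted
  where order_date > (select max(order_date) from {{ this }})
{% endif %}
```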
Hello.. thanks for the tutorial.
I know Airflow runs the tasks/DAGs, however I cannot follow one thing: how is the order of the action items at 35:36 determined within dbt (I believe it is determined on the dbt side), since we have only one DAG running in this example? I'd appreciate it if anyone replies.
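(In case it helps anyone: the order isn't defined in the DAG file at all. Cosmos reads the dependency graph that dbt builds from the ref() calls inside each model, so the task order inside the single Airflow DAG mirrors the model lineage. A sketch, with model and column names assumed from the video:)

```sql
-- models/marts/fct_orders.sql (sketch)
-- these ref() calls are what tell dbt - and therefore Cosmos/Airflow -
-- that the staging and intermediate models must run before this one
select
    o.order_key,
    i.gross_item_sales_amount
from {{ ref('stg_tpch_orders') }} o
join {{ ref('int_order_items_summary') }} i
    on o.order_key = i.order_key
```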
Hi, I am trying your project and got stuck here, can you help?
21:32:24 Unable to do partial parsing because saved manifest not found. Starting full parse.
21:32:25 Encountered an error:
Compilation Error
Model 'model.DATA_PIPELINE.stg_tpch_orders' (models/staging/stg_tpch_orders.sql) depends on a source named 'tpch.orders' which was not found
Hi @jayzern, thanks a lot for your video, really valuable content!
Great video! What text editor are you using?
VS Code
Couldn't run int_order_items.sql because it returns a strange error. It says: "The selection criterion 'int_order_items.sql' does not match any enabled nodes". And if I run "dbt run" it says: "unexpected '.' in line 1" at 20:22
So to my understanding, singular tests really check that the query being tested returns nothing.
If the test passes, the query returned nothing - great, your data is fine.
If it fails, you should run that query to see exactly which rows came back.
Confusing at first but makes sense now.
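A concrete sketch of that pattern, based on the discount test from the video (the column name is an assumption):

```sql
-- tests/fct_orders_discount.sql (sketch)
-- discounts are stored as negative amounts, so any positive value is bad data;
-- the test passes only when this query returns zero rows
select *
from {{ ref('fct_orders') }}
where item_discount_amount > 0
```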
excellent video, thank you
Jay! Thanks for the video and content very cool to see. Curious why Airflow over something like FiveTran besides the ability to self host? Any gotchas?
FiveTran is not really an orchestration tool - it's really meant for the "Extract Load" part only. It's great because of Unix philosophy, i.e. "do one thing, do one thing well only", whereas Airflow is more of a generalist, task-based orchestrator. Another thing is FiveTran is super expensive, unless you're working on something enterprise-y
Thanks Jay! Could you also upload into the Notion document the code for the dbt_dag.py file for the Airflow deployment? That's still missing 🙏🏻
Totally forgot about that, thanks for the reminder!
No worries, I realized you used it from the Cosmos GitHub repo so I managed to find it there and finally was able to wire up everything and deploy it. 🤓 Thanks Jay. It's a super helpful tutorial. @@jayzern
Please make complete videos on dbt with Snowflake migration projects with real-time scenarios bro, thank you ❤ nicely explained
Thank you man! Will take that into consideration
This is great! At what point would you need to dockerize the files though? Sorry, new to data engineering. Thank you!
You can Dockerize it at the beginning, or once you have a baseline model working. I've seen cases where Data engineers start with Docker, or Dockerize it halfway! I personally prefer the latter
Thanks Jayzern,! if I can be of some help for your next video let me know!
Nice Explanation
Just wondering, in a real-world scenario, where is all the raw data stored? In AWS S3?
Thanks Jayzern
Hi Jay, good one.. I'm trying it the same way but getting the below error: " 1 of 1 ERROR creating view model dbt_schema.stg_tpch_line_items................. [ERROR in 0.04s]
06:17:33
06:17:33 Finished running 1 view model in 2.02s.
06:17:33
06:17:33 Completed with 1 error and 0 warnings:
06:17:33
06:17:33 Compilation Error in model stg_tpch_line_items (models\staging\stg_tpch_line_items.sql)
06:17:33 'dict object' has no attribute 'type_string'
06:17:33
06:17:33 > in macro generate_surrogate_key (macros\sql\generate_surrogate_key.sql)
06:17:33 > called by macro default__generate_surrogate_key (macros\sql\generate_surrogate_key.sql)
06:17:33 > called by model stg_tpch_line_items (models\staging\stg_tpch_line_items.sql)"
Try checking if your dbt_utils version is correct. There seems to be a compile-time error when calling generate_surrogate_key. The code is available on the Notion page.
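For reference, a minimal sketch of how the macro call should look with a recent dbt_utils (the TPCH column names are assumptions):

```sql
-- models/staging/stg_tpch_line_items.sql (sketch)
select
    -- generate_surrogate_key takes a list of column names
    -- and hashes them into a single key
    {{ dbt_utils.generate_surrogate_key(['l_orderkey', 'l_linenumber']) }}
        as order_item_key,
    l_orderkey as order_key,
    l_partkey as part_key
from {{ source('tpch', 'lineitem') }}
```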
I got the same error. How did you solve it?
Hey, thanks for the project tutorial. I was wondering what the best way is to deploy Airflow in a cloud environment... I see a lot of EC2 or EKS (Kubernetes). But maybe I could work with ECS + Fargate? Which deployment method would you recommend for a production scenario? (like beyond studies, thinking about a daily job task). Thank you mate
Airflow + EKS is probably the most common in the industry because of cost reasons and vertical scaling. You could use ECS + Fargate too, but fargate is really expensive!
I don't have any recs atm, but will try to create more examples on production DAGs next time. Check out ruclips.net/video/Xe8wYYC2gWQ/видео.html in the meantime!
I'm struggling with the step to load the dbt data_pipeline; it did not show up in the Airflow DAG. What could I be doing wrong, can you help?
100% worth it
Make a video related to star schema and dimensional modeling
Hi, I would like to know about the singular test: we want to check for negative values in the test, so why do we use a positive condition?
thank you very much
Dude, where did you even mention the dbt_project.yml file? In part 2 of the video, you jump directly to VS Code.
What are the details??
I'm struggling with the Airflow connection to Snowflake, can you make another video to elaborate on it more?
For sure, I didn't explain the airflow integration with snowflake as much as I wanted to
any prerequisites for this
I need a longer video, please.
Do I need to pay for Astro if I want to use this for a prod env?
I cannot run my dbt project. I’m still a beginner but I do not understand why this happens, considering that my macros directory is empty except for a .gitkeep file:
Compilation Error
dbt found two macros named "materialization_table_default" in the project
"dbt".
To fix this error, rename or remove one of the following macros:
- macros/materializations/models/table/table.sql
- macros/materializations/models/table.sql
hey, I have a small request
can you please make a video on how to use PySpark efficiently on a low-spec system with a huge amount of data
Low compute Spark + high volumes of data is challenging but will take note. Thx for the suggestion
I materialized marts as tables but int_order_items, int_order_items_summary and fct_orders are created as views instead of tables. How do I convert these views to tables?
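One way to do that, as a sketch: force the materialization at the top of each model file, which overrides the project-level setting in dbt_project.yml (the model body here is illustrative):

```sql
-- top of int_order_items.sql: after adding this, the next `dbt run`
-- rebuilds the model as a table instead of a view
{{ config(materialized='table') }}

select *
from {{ ref('stg_tpch_line_items') }}
```

A likely cause of the original problem is that the `materialized: table` setting in dbt_project.yml was applied to a folder path that doesn't match where these models actually live.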
How can I get the project folder structure?
Can you tell me why we used Airflow, since dbt Cloud has a feature to schedule jobs?
If your company only uses dbt and no other tooling, dbt cloud works too
However in the real world, it's hard to control your CRON schedule when you have many tools in your stack. An orchestrator's job is to focus on scheduling. Linux philosophy of do one thing and do it well, TLDR
Hi guys, kindly help me out: are Snowflake and dbt alone enough, or do I have to learn Hadoop, Spark, etc.? I have been working as a data analyst for the last year and am planning to switch to DE
Error solved!!!!
for anyone facing this error:
Runtime Error
Database error while listing schemas in database "dbt_db"
Database Error
250001: Could not connect to Snowflake backend after 2 attempt(s).Aborting
Try the second method to update the account name for your project inside the profiles.yml file:
account_locator-account_name
Thank you !
Thank you!
Hey! How did you go about updating the account name (or resolving the error)? I can't find the profiles.yml file.
One question here: as we have the dbt jobs feature available in dbt Cloud, and it is very easy to create a job there, why do we need to use Airflow?
Yea that's great question! In theory dbt cloud can trigger jobs too, but in practice you'd want to decouple your orchestration tool away from your transformation tool for a myriad of reasons: ability to orchestrate other tools together with dbt, avoid vendor lock from dbt, many companies are comfortable with Airflow etc. It really depends on your tech stack
How did he start? Did he create a worksheet? I tried it but it did not work. The very first steps?? What are they?
yes you need to write the queries in a worksheet
Tell me one thing: is data engineering a good job profile for freshers?
nice
Is data engineering dead with the advent of AI? What is the future of data engineering careers in your opinion?
Overall great, the airflow orchestration felt a bit clunky especially given that the source code had to be kept in the same directory.
Thx for the feedback 👍 ideally should wrap this in a container image, but for simplicity decided to keep it as code
@@jayzern Makes sense, any good resources on self hosting dbt core?
You should check out Meltano
I've heard great things about Meltano!
hi! I'm having trouble connecting to Snowflake. Can someone please help me resolve it? I just started learning dbt and Snowflake.
Runtime Error
Database error while listing schemas in database "dbt_db"
Database Error
250001: Could not connect to Snowflake backend after 2 attempt(s).Aborting
worth checking your snowflake credentials again, I got the same error due to an incorrect account id
I am not sure why I cannot open the notes, can anyone help?
I double checked the link and it's working, try this
bittersweet-mall-f00.notion.site/Code-along-build-an-ELT-Pipeline-in-1-Hour-dbt-Snowflake-Airflow-cffab118a21b40b8acd3d595a4db7c15?pvs=74
Let me know what error you see
Hello, did anyone else face this error in Airflow after @32:50?
Broken DAG: [/usr/local/airflow/dags/dbt-dag.py]
Traceback (most recent call last):
File "/usr/local/lib/python3.12/site-packages/cosmos/operators/base.py", line 361, in __init__
self.full_refresh = full_refresh
^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/airflow/models/baseoperator.py", line 1198, in __setattr__
if key in self.__init_kwargs:
^^^^^^^^^^^^^^^^^^
AttributeError: 'DbtRunLocalOperator' object has no attribute '_BaseOperator__init_kwargs'. Did you mean: '_BaseOperator__instantiated'?
please send help
I am facing the exact same error. Please post a reply, if you were able to figure out the fix. I'll do the same if I find a solution.
Ok, so I think I was able to find the thread related to this issue.. It's still open as of 8/18/2024 11pm PT..
github.com/astronomer/astronomer-cosmos/issues/1161