Merging your data in a modern lakehouse data warehouse
- Published: Oct 15, 2024
- Learn how you can move your data through different tiers using MERGE with either PySpark or SQL in Azure Synapse Analytics. Stijn Wynants walks you through the different steps.
Stijn Wynants
/ sqlstijn
pyspark.sql.SparkSession.createDataFrame
spark.apache.o...
Table deletes, updates, and merges
docs.delta.io/...
Delta Lake Documentation
docs.delta.io/...
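As a rough illustration of what a MERGE does when moving rows from one tier to the next, here is a minimal sketch in plain Python. It does not use Spark or Delta Lake: dictionaries stand in for the bronze batch and the silver table, and the names `merge_upsert`, `silver`, and `bronze_batch` are hypothetical, not taken from the video. In the Delta Lake Python API, the two branches correspond roughly to `whenMatchedUpdateAll()` and `whenNotMatchedInsertAll()` on `DeltaTable.merge(...)`.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def merge_upsert(target: Dict[Any, Row], source: List[Row], key: str) -> Dict[Any, Row]:
    """Return a copy of `target` with MERGE-style upsert semantics applied.

    `target` maps key value -> row (a stand-in for the silver table);
    `source` is a list of incoming rows (a stand-in for a bronze batch).
    """
    # Copy each row so the original target table is left untouched.
    merged = {k: dict(v) for k, v in target.items()}
    for row in source:
        k = row[key]
        if k in merged:
            # Analogous to: WHEN MATCHED THEN UPDATE SET *
            merged[k].update(row)
        else:
            # Analogous to: WHEN NOT MATCHED THEN INSERT *
            merged[k] = dict(row)
    return merged


# Hypothetical sample data: one existing row changes, one row is new.
silver = {
    1: {"id": 1, "name": "Ada", "city": "London"},
    2: {"id": 2, "name": "Bram", "city": "Antwerp"},
}
bronze_batch = [
    {"id": 2, "name": "Bram", "city": "Ghent"},    # matches id 2, so it updates
    {"id": 3, "name": "Chen", "city": "Seattle"},  # no match, so it is inserted
]

silver_after = merge_upsert(silver, bronze_batch, key="id")
```

One difference worth noting: if several source rows match the same target row, this sketch silently keeps the last one, whereas a Delta Lake MERGE generally rejects that case, so the source is usually deduplicated on the key first.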
📢 Become a member: guyinacu.be/me...
*******************
Want to take your Power BI skills to the next level? We have training courses available to help you with your journey.
🎓 Guy in a Cube courses: guyinacu.be/co...
*******************
LET'S CONNECT!
*******************
-- / guyinacube
-- / awsaxton
-- / patrickdba
-- guyinacube.com
**Gear**
🛠 Check out my Tools page - guyinacube.com...
#AzureSynapse #merge #GuyInACube
Great Video. I'd love to see a future one addressing how to handle type 2 SCDs.
Great video! That's exactly what I was searching for earlier.
Thanks a lot!
Thank you for the video, this is amazing! I've been following this playlist, "Building a modern data lakehouse in Azure Synapse". Up to this video, it demonstrates inserts and updates, but what about deletes from the Bronze layer to the Silver layer? Can you please make a video about deletes in Synapse SQL? Thanks.
Great vid, Thanks guys!
Let me ask you, is there any difference in the cost of execution when you run a notebook using PySpark or SQL?
Hi Gabriel, both use the same Spark API in the background, so they should cost and perform about the same. The cost is driven by how long your cluster runs!
@stijnwynants7307 thank you very much Stijn!!
Question: when my data is merged with new data, do I need to rebuild the database in the workspace, or does the database in the workspace read the Delta files and pick up the changes? Please help me.
Hi Stijn - how do you keep running the merge statement so that you get live data in your SILVER layer?
Why is Delta not supported in the Azure Synapse dedicated SQL pool?
It's a SQL analytics engine. Delta is an Apache Spark feature.
Great video.
Supercool!
It's not clear how this ties in with the first-time data load to bronze.
Nice 👍
Cool 😎
This is great stuff