Great episode - thanks for going deeper into the subject and not just covering strictly DP-203 topics. It really helps people who really want to learn Azure data engineering. Looking forward to the next episodes!
Exactly, exactly... a very honest exam. You really have to know it in depth, not just the surface.
I really like the presentation of information, very structured and consistent, I don’t feel like I’m getting scraps of information🙂
Thank you for starting such a valuable course, man. Your way of explaining various topics is fantastic for me. Please keep doing this!
I have been struggling for a while to find good material to learn data engineering. This is one of the best channels that I have found; your explanations are simple and awesome, and they come with a demonstration. Thank you so much!
Greetings from Colombia.
Thanks Camilo for those kind words, it really means a lot to me.
You are a good tutor. You can explain tech issues in plain English.
I really love this content. You are a true teacher! I appreciate your desire to teach your understanding rather than rattling off a list of features like I see in other training video series. I hope you can keep doing this.
Thanks a lot!
Wonderful… the way you cover topics is awesome, and even a beginner can grasp things easily. Thanks for all your efforts.
Wonderful lecture. Very useful. Thank you for making these videos.
It's surprising that you have only 300 subscribers given the quality of the content on your channel.
I think it takes time to build the audience. But it is getting bigger every day.
Thank you for making learning so interesting. Looking forward to completing the playlist. I am confident that I can clear the certification.
Very concise, simple, and to-the-point explanations, with good technical storytelling; wonderful content!!
Thank you Piotr!
Liked & subscribed immediately 🤛
Many thanks!
Wow, it started from the root and went to the top. I have not found a better explanation of that hierarchical checkbox function.
Liked and subscribed for this explanation...
Keep up this great series! Looking forward to the next video.
Thanks mate! Comments like this really motivate me to record new episodes.
Excellent explanation. It makes sense to everyone. Hats off to you. Please keep it up.
Loving your content, gonna watch all the series!
Your explanation made a lot of sense to me. Thank you.
Great explanation. Glad to find this playlist
Glad it was helpful!
I wish I got hold of your channel at the start of my Data career!!
I wish I started that channel 5 years ago :)
Thanks a lot. I have followed the first three videos so far and they are really useful. I hope that by the end of the 41 videos I will have a good understanding of most Azure data engineering concepts. Thanks for your effort and for making my life easier :-)
You are welcome. BTW: there will be around 50 episodes in total.
@@TybulOnAzure, thanks a lot for letting me know. I will look forward to the remaining 9 episodes being uploaded. I am helping one of my friends who is seeking employment, and I am building my own knowledge as well. I will recommend these videos to anyone who is looking for a data engineer job. The videos start with an introduction and cover most of the topics. Very well done, thanks again 🙂
@@TybulOnAzure So sad that my exam is scheduled for next week :(. I finished the whole DP-203 course on Coursera for that 50% exam discount, only to find their course confusing, messy, and, more importantly, outdated by more than two years. Now I'm gonna have to try to speed through your 40 videos within a week 😢
Or you might reschedule it and take your time to prepare.
Mister cat behind Piotr makes the learning curve rocket up!
And the best part is there's a lot more of my cats in the background just being cats and doing their usual cat stuff.
Great, thanks for this lucid delivery.
You are welcome
Thanks Man, waiting for the next one 😊
Thanks. The next episode will be released this Thursday.
Keep up the great work!
Nice explanation - you're an addiction. :)
Love your training videos, sir :)
Glad you like them!
Great content. You earned a follower!
Is it common practice to name the folders like this: Year=2023, like you did at 1:06:40?
Yes, it is. I'll talk more about this in one of the future episodes.
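For readers who want to see the naming convention in code, here is a minimal Python sketch of building such a folder path from a date. Only the `Year=2023` style comes from the video; the function name `partition_path`, the `Month`/`Day` folder names, and the `raw/sales` root are made up for illustration.

```python
from datetime import date


def partition_path(root: str, d: date) -> str:
    """Build a key=value style folder path for the given date.

    The folder names (Year, Month, Day) are an assumption for this sketch;
    pick whatever convention your team has agreed on.
    """
    return f"{root}/Year={d.year:04d}/Month={d.month:02d}/Day={d.day:02d}"


# Hypothetical root folder, used only for this example.
example = partition_path("raw/sales", date(2023, 6, 21))
```

Zero-padding the month and day keeps the folders sorted correctly when they are listed alphabetically.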
Got a good idea of how storage accounts differ for different data types/sources.
Incredible content! Really loving it (and definitely not because it's free). I'm just in the third class, so could someone tell me if the teacher will propose exercises and solve them with us in the next classes?
Yes, at the end of every major step, the teacher records a milestone episode that summarises the covered area and gives viewers a challenge that requires them to apply that knowledge. However, there is no completed solution from the teacher, as he wants viewers to try implementing it on their own.
@@TybulOnAzure thanks for the answer! I would just like to recommend that you record an explanation of the exercises, because it can be difficult for us to get everything done. Sometimes it's a simple mistake that keeps us from completing everything. Maybe you could even sell it separately (I would buy it for sure).
Beautiful. Thanks!
What’s the best approach when a table in the source has changed?
Let's say I have to reload all the data. To my understanding, all historical data will be saved in a directory with today's date (as it's the extraction date).
Is it possible to extract data from the source and save it in directories based on the date when each row was added to the table, instead of loading all data into the directory with today's date?
So it would look like this:
Y=2022
Y=2023
instead of having all data dropped in the directory
Y=2024 MM=06 DD=21
What's the best approach in such a case?
What I thought could be done:
1. Save data directly in the proper directories during extraction
2. Save it all under today's extraction date
2.1 Load it to a different container with properly dated directories
In the first phase I usually partition the data based on the ingestion date just to know what data was extracted when. Only then (if needed) I would repartition it based on a change date - but it would be done in the next data lake layer. Organizing data by a change date might be beneficial if you have huuuge amounts of data and you want to process only those parts that have changed.
But in the end, it's your decision how to organize the data in a data lake according to your requirements.
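The two-step idea from this reply can be sketched in plain Python. This is only an illustration under assumptions, not the setup used in the course: the `raw/` and `curated/` layer names, the `orders` table, and the `modified_on` column are all hypothetical, and in practice the data would be moved by a pipeline rather than held in a list.

```python
from collections import defaultdict
from datetime import date


def ingestion_path(table: str, extracted_on: date) -> str:
    """First layer: one extraction lands under its extraction (ingestion) date."""
    return (
        f"raw/{table}/Y={extracted_on.year:04d}"
        f"/MM={extracted_on.month:02d}/DD={extracted_on.day:02d}"
    )


def repartition_by_change_year(table, rows, change_key="modified_on"):
    """Next layer: group the same rows by the year in which each row changed."""
    partitions = defaultdict(list)
    for row in rows:
        folder = f"curated/{table}/Y={row[change_key].year:04d}"
        partitions[folder].append(row)
    return dict(partitions)


# Hypothetical rows from a full reload done on 2024-06-21.
rows = [
    {"id": 1, "modified_on": date(2022, 3, 14)},
    {"id": 2, "modified_on": date(2023, 11, 2)},
    {"id": 3, "modified_on": date(2023, 1, 9)},
]
landing = ingestion_path("orders", date(2024, 6, 21))
curated = repartition_by_change_year("orders", rows)
```

Here all three rows first land in a single folder for the extraction date, and only the second step spreads them across `Y=2022` and `Y=2023`, so you still know which extraction every row came from.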
What is the difference between containers and directories in the data lake?
Thanks for this question. I'll cover that in one of the future episodes.
Nicely done!
I have one question.
Why would anyone still use gen1 for logs, if you could not use hierarchical structure in gen2?
You shouldn't use ADLS gen1 at all, as it is legacy. It will be retired in 2024, and you might not even be able to provision one in your subscription.
Hi Tybul,
Do you have courses on any platform other than YouTube?
Nope, only YT.
Do you use any other learning platforms? If so, which ones?
Great episode! Can you please share which program you are using to draw those red squares?
Sure, it's ZoomIt: learn.microsoft.com/en-us/sysinternals/downloads/zoomit
@@TybulOnAzure thank you so much!
Awesome cat cameo
How would the File Sync feature work from an ingress/egress perspective?
Regular bandwidth pricing applies: azure.microsoft.com/en-us/pricing/details/bandwidth/.
Is there any extra charge for checking the hierarchical namespace checkbox?
Sorry for the late response - somehow I overlooked your question.
Anyway, the price is similar in both cases except for metadata-related costs, which apply only when hierarchical namespace is enabled. Take a look here for details:
1. Pricing for block blobs in Blob Storage: azure.microsoft.com/en-us/pricing/details/storage/blobs/
2. Pricing for ADLSg2: azure.microsoft.com/en-us/pricing/details/storage/data-lake/
Also, a very good way of comparing costs is to start playing with Azure Pricing Calculator (azure.microsoft.com/en-us/pricing/calculator/) and selecting various options to see how they impact the final price.
Keep going, thanks.
It's the best explanation I have found, thank you so much.
Glad it was helpful!
Love from India, awesome content.
Thanks and welcome to the channel.