Azure Data Factory | Copy multiple tables in Bulk with Lookup & ForEach
- Published: 20 Apr 2020
- With Azure Data Factory Lookup and ForEach activities you can perform dynamic copies of your data tables in bulk within a single pipeline.
In this episode I will show you how to perform bulk copies with ADF.
Source code for demos: github.com/MarczakIO/azure4ev...
This episode includes a live demo of the ADF Lookup activity in a SQL-to-Blob export scenario, and shows how to control the copy using a metadata table.
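As a rough sketch of the pattern shown in the video (activity, dataset, and column names here are illustrative, not taken from the demo), the metadata-driven bulk copy boils down to a Lookup over a metadata table feeding a ForEach that runs a parameterized Copy per row:

```json
{
  "name": "CopyTablesInBulk",
  "properties": {
    "activities": [
      {
        "name": "LookupTableList",
        "type": "Lookup",
        "typeProperties": {
          "source": {
            "type": "AzureSqlSource",
            "sqlReaderQuery": "SELECT schema_name, table_name FROM dbo.TableMetadata"
          },
          "dataset": { "referenceName": "MetadataDataset", "type": "DatasetReference" },
          "firstRowOnly": false
        }
      },
      {
        "name": "ForEachTable",
        "type": "ForEach",
        "dependsOn": [
          { "activity": "LookupTableList", "dependencyConditions": [ "Succeeded" ] }
        ],
        "typeProperties": {
          "items": {
            "value": "@activity('LookupTableList').output.value",
            "type": "Expression"
          },
          "activities": [
            {
              "name": "CopyTableToBlob",
              "type": "Copy",
              "inputs": [
                {
                  "referenceName": "SqlTableDataset",
                  "type": "DatasetReference",
                  "parameters": {
                    "schemaName": "@item().schema_name",
                    "tableName": "@item().table_name"
                  }
                }
              ],
              "outputs": [
                {
                  "referenceName": "BlobCsvDataset",
                  "type": "DatasetReference",
                  "parameters": {
                    "fileName": "@concat(item().table_name, '.csv')"
                  }
                }
              ],
              "typeProperties": {
                "source": { "type": "AzureSqlSource" },
                "sink": { "type": "DelimitedTextSink" }
              }
            }
          ]
        }
      }
    ]
  }
}
```

The key idea: the Lookup returns an array of rows in `output.value` (because `firstRowOnly` is false), and each ForEach iteration exposes one row as `@item()`, which parameterizes the source and sink datasets.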
Next steps for you after watching the video
1. ADF bulk copy scenario
- docs.microsoft.com/en-us/azur...
2. Lookup Activity docs
- docs.microsoft.com/en-us/azur...
3. ForEach Activity docs
- docs.microsoft.com/en-us/azur...
Want to connect?
- Blog marczak.io/
- Twitter / marczakio
- Facebook / marczakio
- LinkedIn / adam-marczak
- Site azure4everyone.com
Thank you, Adam. I rewatch this video at least twice a year. Thank you for all you do.
Cannot thank you enough for your incredibly well laid out, thorough explanations. The world needs more folks like you :)
I really, really like how you guide us step by step like this; it is quite easy to understand. You are the best "trainer" I've seen. I really appreciate the time you spend creating these useful videos.
You're the best Adam! Thanks for all the help, been watching your tutorials on ADF and they're very helpful. Keep them coming!
My pleasure! Thanks Alberto!
Thank you Adam! I had been trying to follow some other written content to do exactly what you showed with no success. Your precise steps and explanation of the process were so helpful. I am successful now.
Excellent, Excellent video. This has truly cemented the concepts and processes you are explaining in my brain. You are awesome, Adam!
You are the example for how teaching should be. Just awesome 👍
Wow, thank you!
What wonderful content you have placed on social media. What a world-class personality you are. People certainly fall in love with your teaching.
Brilliant teaching style Adam. Very watchable. I particularly like how you explain the background. I've subscribed and will watch more of your videos.
Your videos are great. This is the best channel on YouTube to learn about ADF. THANKS 🙏😊
Awesome explanation, the way you teach assuming in layman terms is pretty great, thanks!!
The videos are very clear for people who would like to learn and practice. Thanks a lot. Your hard work is appreciated.
Brilliant tutorial. Easy to follow and it all works like a charm. Thank you!!
Adam, thanks for all your great videos! I appreciate your work very much! Keep up the great work!
My pleasure! Thanks!
Great and simple walk through, good job Adam
Thank you, I appreciate it! :)
Fantastic clear-cut explanation. Nice job!
Glad it was helpful!
Thank you Adam!! These videos are really very helpful and build the foundation for understanding ADF.
My pleasure!
You have in-depth knowledge of every service. I am learning from scratch using your channel. Keep posting. Thank you and God bless you.
Awesome, thanks!
Incredibly simple to learn... Great!!
You are a very good teacher.
Thank you! 😃
The way you explain is super Adam. Really nice
Thanks Adam, I've been waiting for a video like this on ADF. Please post regularly...
You got it!
Thanks Adam, your tutorials are very useful, hope to see more in the future
Glad you like them! Will do more!
Very professionally demonstrated and very clear to understand. Thank you very much
It's my pleasure Paul! :)
fantastic video Adam!! Really helpful to understand the parametrisation in ADF.
Great to hear that!
Hi Adam
Your videos are just brilliant. This is a subscription I wouldn't mind paying for to support you. Your lessons are invaluable for learning.
Awesome, thank you!
Wow! What a great video, with very easy step-by-step tutorials and explanations. Well done!
THANK YOU SO MUCH for this! The step-by-step really helped with what I needed to do.
I was looking for this video. Thanks for making this. It helps a lot. Thanks again.
Great video, easy to follow and to the point, really helped me to quickly get up a running with data factory.
Glad it helped!
This video was really helpful! you have leveled up my Azure skills, Thank you sir, you have gained another subscriber
very simple yet powerful explanation
Glad you think so!
Your videos are awesome man. Gave me a firm grasp and encouraged me to get an azure subscription and play around some more.
That is amazing to hear! Thank you!
I am a beginner in Azure Data Engineering and you made it simple to learn all the tactics.. thanks
Glad to hear that!
your skills are in the tops thanks, love to see your channel grow
I appreciate that!
Thank you so much for the clear and nice explanation, I am new to ADF and learning a lot from your channel
Great to hear!
It was so perfect, I was able to follow along and copy data on my first attempt. Thanks!
You are a legend. Next level editing and explanation
Thanks Adam, amazing workshop, very clear and easy to follow, thanks for helping, i am wiser now :)
Perfect! Thank you!
Adam you are just awesome man! The way you are teaching is excellent. Keep it up.. you are the best...
Thanks! 😃
Amazingly simple and informative!
It is extremely hard to find information online about this topic. Thank you for making it easy!
Glad it was helpful! Thanks!
Very well explained. Thank you so much!
Wow this was explained so well. Thank you!!!
Excellent video and knowledge sharing. Great Job!
Glad you enjoyed it!
Very interesting video, Adam. I found your idea of storing metadata quite enlightening. It could probably be maintained separately, tracking the last record loaded, so we could use it as input for delta loads through queries instead of reloading the full table on each run.
You can use either the watermark or the change tracking pattern. Check this out: docs.microsoft.com/en-us/azure/data-factory/tutorial-incremental-copy-overview?WT.mc_id=AZ-MVP-5003556
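To make the watermark pattern from that tutorial concrete, here is a hedged sketch (table, column, and activity names are hypothetical): a Lookup reads the last stored watermark, the Copy source query filters on it, and a stored procedure afterwards advances the watermark.

```json
{
  "name": "CopyDelta",
  "type": "Copy",
  "typeProperties": {
    "source": {
      "type": "AzureSqlSource",
      "sqlReaderQuery": {
        "value": "SELECT * FROM @{item().schema_name}.@{item().table_name} WHERE LastModifiedTime > '@{activity('LookupOldWatermark').output.firstRow.WatermarkValue}'",
        "type": "Expression"
      }
    },
    "sink": { "type": "DelimitedTextSink" }
  }
}
```

After the copy succeeds, a Stored Procedure activity would write the new high-water mark back to the watermark table so the next run only picks up newer rows.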
You are awesome Adam. Thank you so much for detailed explanation.
My pleasure!
Thank you! I really appreciate all you share, it truly helps me
Awesome Adam, there can't be a better way to explain this.
You are really best Adam! Your tutorial helped me a lot. Thanks
Happy to hear that!
@@AdamMarczakYT You are welcome. Please keep up the good work.
Great session!! Thanks Adam.
My pleasure!
Thank you so much for sharing these valued knowledge. It's very helpful for me.
Glad it was helpful!
You're awesome Adam, thanks for such a great tutorial. I also tweeted this video. Thanks.!!
Awesome, thank you!
Great content, easy to follow!!
Glad you think so!
Thank you so much for this. It helped a lot
Thank you! It's under appreciated how important it is to name things something other than "demoDataset", but it makes a big difference both for understanding concepts, and maintainability.
Glad it was helpful! You are of course correct: outside of demos, take good care of your naming conventions.
It is really cool that you make it so simple :)
Thank you! 😊
Awesome Adam!! you are the best. Thank you so much
My pleasure! :)
Awesome. Thank you so much Adam!
My pleasure!
Adam, You are the best!. Thanks man!
Thank you :)
These tutorials are so useful!
Glad you like them!
Really great stuff, sir. This is what I was looking for on YouTube.
Thanks a ton!
Really helpful! you made it very easy!
Glad you think so!
Thank You so much.... Very good explanation, Just Awesome
Thanks for your awesome video, it helped me out a great deal
Glad I could help
Thank you for your great videos, they have been super helpful. I'm working on a proof of concept similar to this video, however it's SQL to Azure SQL. Any links or references you can offer to help me with parameterizing the Azure SQL sink side?
Very nice video with good explanation.
Glad you liked it
Thank you Adam, a very informative video.
My pleasure!
👍👍👍 very good explanation.. 👍👍.
Hey, one thing about English - please guys correct me if I am wrong, but I am pretty sure what I am talking about - you shouldn't say inside a sentence "how does it work", but "what it works". Despite that, the content is awesome!
You can if you ask a question. "How does it work" is a question structure, not a statement. it should be "how it works" if I'm stating a fact. You wrote "What it works" but I assume that's a typo. It's one of my common mistakes, my English teacher tries to fix it but it is still a common issue for me ;) Thanks for watching!
Hi Adam, very nice work. I built this for a client of mine and found out one important thing: within the ForEach, not all blocks execute as if they belonged together atomically. What I mean is: if you run two iterations in parallel and the ForEach contains two parameterized blocks, say A and B, using item() values X and Y, then block A starting with item X will not necessarily be followed by block B using that same item X, even though the blocks are connected.
So I want to add one extra piece of advice: use at most one parameterized block inside a ForEach, or, if you need more than one block, start a separate pipeline within the ForEach that contains the multiple blocks. These pipelines are started as separate children and do the work in the correct order.
With kind regards,
Jeroen
Hey, not sure I understood what you meant here. Using parameters is not making any connection between the actions.
@@AdamMarczakYT I'm using a ForEach loop to load tables with dynamic statements. If I need more than one block (like a logging call to SQL Server, a copy block to load the data, and a logging block after loading is done), those blocks can sit in the ForEach itself, but if you load multiple tables in parallel the blocks will not follow each other sequentially; they run interleaved, so the logging will not belong to the copy block, for example. I will see if I can make an example if I find the time. To solve this I always start another pipeline within the ForEach and put the blocks in that pipeline. This creates child pipelines inside the ForEach loop, ensuring the right order of execution of the blocks (logging start, copy, and logging end).
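The child-pipeline pattern described above can be sketched like this (pipeline and parameter names are illustrative): the ForEach contains only a single Execute Pipeline activity, so each iteration's activities run in order inside their own child pipeline run.

```json
{
  "name": "ForEachTable",
  "type": "ForEach",
  "typeProperties": {
    "items": {
      "value": "@activity('LookupTableList').output.value",
      "type": "Expression"
    },
    "activities": [
      {
        "name": "RunLoadSingleTable",
        "type": "ExecutePipeline",
        "typeProperties": {
          "pipeline": { "referenceName": "LoadSingleTable", "type": "PipelineReference" },
          "waitOnCompletion": true,
          "parameters": {
            "schemaName": "@item().schema_name",
            "tableName": "@item().table_name"
          }
        }
      }
    ]
  }
}
```

The hypothetical `LoadSingleTable` pipeline would then contain the log-start, copy, and log-end activities chained with normal dependencies, so they always execute in order for each table.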
Hi Adam, this is an awesome session. One question: where in the Azure documentation can I find the list of all possible output properties of a given activity? E.g. when you were explaining the Lookup activity, you talked about the property "firstRow". Where can I find the properties supported by each activity in the Azure documentation?
Thanks! For details, always check the docs by googling "ADF" plus the activity name. For Lookup, this page explains everything you asked: docs.microsoft.com/en-us/azure/data-factory/control-flow-lookup-activity
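For illustration (activity, variable, and column names here are hypothetical), the Lookup output has two shapes: with `firstRowOnly` set to true you reference `output.firstRow`, and with it set to false you reference the array `output.value`. A later activity, such as Set Variable, consumes it like this:

```json
{
  "name": "SetFirstTableName",
  "type": "SetVariable",
  "typeProperties": {
    "variableName": "firstTableName",
    "value": {
      "value": "@activity('LookupTableList').output.firstRow.table_name",
      "type": "Expression"
    }
  }
}
```

With `firstRowOnly: false`, the same reference would instead be `@activity('LookupTableList').output.value`, typically passed to a ForEach's `items` property.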
Great Explanation !!!!
Hey Adam. Great video! Two questions regarding the pipeline itself. 1. How do we approach Source Version Control of the pipeline? In SSIS we could export a package and commit to Git or use TFS. How do we approach versioning in Azure? 2. What is the approach to deploy this pipeline in upper environment? Assuming that this pipeline was created in dev, how do I approach deployment in i.e. UAT?
I think this page describes and answers both of your questions. docs.microsoft.com/en-us/azure/data-factory/continuous-integration-deployment?WT.mc_id=AZ-MVP-5003556 thanks for watching :)
You are very good 👍 explained well thanks 😊
Amazing explanation Adam! Thank you for this! qq- Can the For Each activity run things concurrently? i.e. in this example, can it pass the 3 table_name, schema_name values to the Copy Data activity at the same time?
Hey Adam, awesome work and explanation! Do you have another video explaining how to deal with massive data copies from tables in bulk using ADF, one that addresses the limits on the amount or number of rows of data? Can you make a video with a demo covering the kinds of scenarios you said were "a story for another day"? Thanks a lot in advance!! =D
Thanks. Well, Lookup shouldn't be used for data but for a metadata-driven approach, so the 5,000-row limit is perfectly fine here. It is rare to copy over 5,000 tables/files with different structures, etc. If you do, there are different techniques, but in those cases I would probably shift the approach entirely. Will think about this.
Thanks Adam! very clear!
Glad it was helpful!
Thanks! Very helpful videos!
Thanks a bunch!
Fine tutorial. Thanks.
Glad it was helpful!
Thanks, this is beautiful.
Very well explained & succinct. One request: if possible, create a video on loading an ADW (Synapse) data warehouse with ADF.
Thanks! I'm waiting for synapse new workspace experience to be released to make video about it ;)
Hi Adam, I really appreciate your videos. Thanks for making them! I hope you can also create a video with ODBC as a data source.
amazing work. thanks.
Thanks for the videos. I let the ads run till the end :)
Awesome thank you for your support! ♥
Thank you! well done.
Hi Adam, thanks for the video! I am wondering if it is possible to specify just one row from the table by id and copy it? Thanks in advance!
Excellent!!!!!
Many thanks! Cheers!
Very good and clear..
Glad it was helpful!
Thanks for your great video. Is there any limitation in terms of the number of tables or table size when copying multiple tables in bulk, in your experience?
Great Video Adam!! Simple and elegant. Thanks!!
Quick question: is it possible to make the copy activities run in parallel instead of in a sequential loop?
Also, it would be great if you could make another video showing how to incrementally update the data in these tables from Blob to a sink/SQL Server. Maybe you have already done it; I just couldn't find it.
Yes you can! Actually, by default ForEach runs in parallel unless you select the 'Sequential' checkbox on it.
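The parallelism settings live on the ForEach activity itself. A minimal sketch (activity names are illustrative; the Copy activity's details are omitted for brevity): `isSequential: false` enables parallel iterations, and `batchCount` caps how many run at once — per the ForEach activity docs, at the time of writing the default is 20 and the maximum is 50.

```json
{
  "name": "ForEachTable",
  "type": "ForEach",
  "typeProperties": {
    "isSequential": false,
    "batchCount": 20,
    "items": {
      "value": "@activity('LookupTableList').output.value",
      "type": "Expression"
    },
    "activities": [
      { "name": "CopyTableToBlob", "type": "Copy" }
    ]
  }
}
```

So for the question above about 40+ tables: they can all be queued in one ForEach, with `batchCount` controlling how many copies execute concurrently.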
@@AdamMarczakYT - Thank you for answering my question.
@@AdamMarczakYT, Hi Adam. Thank you for the awesome videos. I have a requirement to copy in parallel; I have more than 40 tables to copy. How many tables can I run in parallel at a time? Please let me know. Thank you so much.
Good one, Adam sir.
Thank you :)
Thank you for the great information. I wonder how much of this gui-based development can be replaced by code only. In my experience it is easier to troubleshoot a thousand lines of code than gui settings across 50 forms.
Thanks Adam. Awesome videos for Azure! Do you have one for Azure CI/CD pipeline?
Not yet! I plan to work on this but it's tricky with data factory to get it right. In the meanwhile check out MS article on this docs.microsoft.com/en-us/azure/data-factory/continuous-integration-deployment
Super cool video, sir. You are my first stop for Azure questions after Stack Overflow, LOL.
Sir, what about the cost? Is it better to use a loop or parallel execution? Because a sequential loop adds extra running time.
Thanks for all your helpful videos, thank you so much. I have one query:
How can we run a pipeline in parallel to copy data from 5 different sources to 5 different targets, respectively? 1. Is it possible by passing 5 different source and target connection strings? 2. Can we have a master pipeline where a ForEach activity calls this one pipeline and runs it in parallel for the 5 different source-to-target movements?
This is really great stuff, thank you for the really easy explanation. It looks like you are using some kind of tool to access the Azure Portal; what is this tool?
I use only chrome browser. Thanks for checking the video out Rafal! ;)
Hi Adam, how would you add a system/custom column in a bulk copy? For example I want to add a pipeline name, date or the value '1' in a column that is shared on all tables.
There are many ways to do this. Simplest and most similar would be loading data into staging tables and calling stored procedure with merge in it. Then you can apply any additional logic you need.
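Besides the staging-table-plus-stored-procedure approach suggested above, the Copy activity source also supports an `additionalColumns` option that can append static or dynamic columns to every copied row. This is a hedged sketch (column names are hypothetical; check the Copy activity docs for support on your particular source type):

```json
{
  "type": "AzureSqlSource",
  "additionalColumns": [
    {
      "name": "PipelineName",
      "value": { "value": "@pipeline().Pipeline", "type": "Expression" }
    },
    {
      "name": "LoadedAtUtc",
      "value": { "value": "@utcnow()", "type": "Expression" }
    },
    {
      "name": "StaticFlag",
      "value": "1"
    }
  ]
}
```

Because the same source definition is reused for every iteration of the ForEach, these columns would land in all copied tables without any per-table logic.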
Great one!
Awesome video Adam.
I would like to understand the next step on how to loop through the files and load into tables. Do you have a video on that or could you point me to a link with that info?
No video on this, but it's very similar, just use GetMetadata activity instead of the lookup :)
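A minimal sketch of that file-looping variant (dataset and activity names are illustrative): a GetMetadata activity with `childItems` in its field list returns the files in a folder, and the ForEach then iterates over them.

```json
{
  "name": "ListBlobFiles",
  "type": "GetMetadata",
  "typeProperties": {
    "dataset": { "referenceName": "BlobFolderDataset", "type": "DatasetReference" },
    "fieldList": [ "childItems" ]
  }
}
```

The ForEach's `items` would then reference `@activity('ListBlobFiles').output.childItems`, and each `@item()` exposes a `name` and a `type` (`File` or `Folder`) that can parameterize the per-file copy into a table.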