Very nice explanation and walk-through. I will look for more of your videos.
Hello Annu mam, I am new to Azure. Your video is very helpful for me. Thank you so much, very good content.
Welcome 🙂
Nice method to copy data in Azure
Good explanation Annu
Keep up the good work😊✌️
Thank you
Thank you. This video made things easy 👍🏿
Good explanation beta☺
Thank you Maa 😊
Hi Annu, thanks for the detailed explanation.
Can you also please create a video on how to copy all files from all subfolders in one ADLS location and store them with proper names in another ADLS folder? Thanks in advance.
I have tried everything, but I am failing to achieve this, please guide
@@himanshuarora6822
hey, in the Copy Activity, in the Source tab, use the "Wildcard File Path" option and put * in the last box (the file name box). Ensure your source dataset points to the folder, not to any specific file, and that "Recursive" is enabled in the Source tab of the Copy Activity.
This will load the structure of the source container/folder as-is.
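In case it helps, here is a minimal JSON sketch of the Copy Activity typeProperties for this setup (Binary datasets and names are placeholders, not from the video): recursive reads every subfolder, and the PreserveHierarchy copy behavior keeps the folder structure as-is in the sink.
"typeProperties": {
  "source": {
    "type": "BinarySource",
    "storeSettings": {
      "type": "AzureBlobFSReadSettings",
      "recursive": true,
      "wildcardFileName": "*"
    }
  },
  "sink": {
    "type": "BinarySink",
    "storeSettings": {
      "type": "AzureBlobFSWriteSettings",
      "copyBehavior": "PreserveHierarchy"
    }
  }
}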
good information.
Thank you 😊
Very informative. Just a query: can we copy images and video files from cloud storage (AWS/Google) to ADLS Gen2 using ADF?
Hi Annu, please can you do a very detailed video on parameters? Parameters are created at different levels, i.e. linked service level, dataset level, and at pipeline run level, and I have a doubt about when to use which one.
Thank you +1 to your follower mam
Hello Annu, very nice explanation. I am doing the same copy activity, but I have multiple folders with files in them that need to be copied from source to destination. In my case the source is the output of a Filter activity. Could you please help me with this?
Thank you for the explanation. I had the following doubt: as you mentioned, you are copying all the files at once by using the batch count. In that case the copy activity would need to be executed only once to copy the four files, but it shows the copy activity executed four times. How is that possible? Correct me if I am wrong.
If you use a wildcard, there's no need to iterate using ForEach, and a single copy activity can do the same job for multiple files at once.
But here we are fetching each file name, iterating over it, and copying it. Here also a single copy activity is used, but since it's inside the loop, it will copy one file in each iteration.
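For reference, a rough sketch of how that loop looks in pipeline JSON, assuming a Get Metadata activity named GetFileList and parameterized datasets DS_Source_File / DS_Sink_File (all names here are placeholders):
{
  "name": "ForEachFile",
  "type": "ForEach",
  "dependsOn": [ { "activity": "GetFileList", "dependencyConditions": [ "Succeeded" ] } ],
  "typeProperties": {
    "items": { "value": "@activity('GetFileList').output.childItems", "type": "Expression" },
    "isSequential": false,
    "batchCount": 4,
    "activities": [
      {
        "name": "CopyOneFile",
        "type": "Copy",
        "inputs": [ { "referenceName": "DS_Source_File", "type": "DatasetReference",
                      "parameters": { "fileName": "@item().name" } } ],
        "outputs": [ { "referenceName": "DS_Sink_File", "type": "DatasetReference",
                       "parameters": { "fileName": "@item().name" } } ],
        "typeProperties": {
          "source": { "type": "DelimitedTextSource" },
          "sink": { "type": "DelimitedTextSink" }
        }
      }
    ]
  }
}
With isSequential false and batchCount 4, the four iterations can run in parallel, which is also why monitoring shows the Copy activity executed four times, once per file.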
Can we simply use wildcards if we want to copy all files?
Yes, here is the video: ruclips.net/video/zwey2ZVeROg/видео.html
Hi Annu - thanks for the content. Why do we have to create two source datasets (one for Get Metadata and one for Copy)? Can't we do it with one source dataset?
The dataset used in the Get Metadata activity points to the folder so that we can retrieve all the file names inside it. The one in the Copy activity needs to point to the files dynamically, so we have parameterized that dataset. So yes, we need two different datasets here. To achieve the same requirement in the most optimized way, you can use a wildcard pattern in ADF. Watch this video: ruclips.net/video/zwey2ZVeROg/видео.html
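To make the "pointed to files dynamically" part concrete, here is a rough sketch of such a parameterized source dataset, assuming an ADLS Gen2 linked service LS_ADLS and a container/folder named input/source (all placeholders):
{
  "name": "DS_Source_File",
  "properties": {
    "type": "DelimitedText",
    "linkedServiceName": { "referenceName": "LS_ADLS", "type": "LinkedServiceReference" },
    "parameters": { "fileName": { "type": "string" } },
    "typeProperties": {
      "location": {
        "type": "AzureBlobFSLocation",
        "fileSystem": "input",
        "folderPath": "source",
        "fileName": { "value": "@dataset().fileName", "type": "Expression" }
      }
    }
  }
}
Inside the ForEach, the Copy activity passes @item().name into the fileName parameter, while the Get Metadata dataset simply omits the file name and points at the folder.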
Hi Annu - could you please explain how to add a Delete activity here to delete the successfully processed files from the source folder? (Only the files, not the folder; the folder should still be present after pipeline execution.) BTW, I am adding my query here from another video as it's more relevant to ask it here.
Hi Annu,
Very informative video.
My question is related to merging multiple files into one file. Suppose, as in your case, there are 4 files on the source side and I want to club those into 1 file.
Could it be possible using the "Merge files" copy behavior option? Since you are iterating over the files one by one, will merge files still work?
Please suggest the approach.
Thanks
Hi Shikhar
Yes, the "Merge files" option in copy behavior will do the job, provided the schema of all the files is the same. No need to iterate through the files; point the source to the folder and the sink to a file.
Thanks for the reply.
So you mean to say there is no need for a ForEach loop; we can do the same with the Get Metadata activity and pass its output directly as an input in ADF.
Is my assumption correct?
@@100shikhargupta Yes, no need for Get Metadata either. Only one Copy activity will do the job. Point the source dataset to the folder level, select the file path type as "Wildcard file path", and debug the pipeline.
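If it helps, a minimal sketch of that single Copy activity's typeProperties (the file pattern and DelimitedText format are assumptions; this only works if all files share the same schema):
"typeProperties": {
  "source": {
    "type": "DelimitedTextSource",
    "storeSettings": {
      "type": "AzureBlobFSReadSettings",
      "recursive": true,
      "wildcardFileName": "*.csv"
    }
  },
  "sink": {
    "type": "DelimitedTextSink",
    "storeSettings": {
      "type": "AzureBlobFSWriteSettings",
      "copyBehavior": "MergeFiles"
    }
  }
}
The source dataset points at the folder, the sink dataset points at the single output file, and copyBehavior "MergeFiles" concatenates everything into it.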
@@100shikhargupta I will create a detailed video soon, thanks.
If this is the case, let me try it today and I will update the same here. Many thanks.
Can we do the same using AzCopy?
If yes, please share the steps.
Can you provide all the files/tables you used in the example?
I'm looking to configure an ADF job to copy files at the source based on wildcards and transfer them to separate folders. The job will take files from the source container using the Copy activity and transfer them to another container, into their respective folders. For example:
Source Container A
Directory
-fileA-20240201.txt
-fileB-20240302.txt
-fileB-20230201.txt
-fileC-20230201.txt
Sink Container B
Directory
-Folder_A
-Folder_B
-Folder_C
Transfer fileA* using a wildcard to Folder_A.
Transfer fileB* using a wildcard to Folder_B.
Transfer fileC* using a wildcard to Folder_C.
Looking for help on config details on how I would set up the ADF job.
Hi.
I think we can achieve this like this:
1. Use Get Metadata to get all the files from the folder.
2. Run a ForEach loop over the child items.
3. Add a Copy Data activity. The source will be the item from the loop, and in the sink the file name will be dynamic.
4. In the sink, the dynamic expression for the folder will be something like @if(contains(item().name,'fileA'),'Folder_A', if(contains(item().name,'fileB'),'Folder_B','Folder_C'))
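To make step 4 concrete, one way to wire it up is to give the sink dataset a folder parameter and pass the routing expression from inside the ForEach. A rough sketch, assuming a sink dataset named DS_Sink_Routed (all names are placeholders):
In the sink dataset location:
  "folderPath": { "value": "@dataset().targetFolder", "type": "Expression" },
  "fileName":   { "value": "@dataset().fileName", "type": "Expression" }
In the Copy activity's sink dataset reference inside the ForEach:
  "parameters": {
    "fileName": "@item().name",
    "targetFolder": "@if(contains(item().name,'fileA'),'Folder_A', if(contains(item().name,'fileB'),'Folder_B','Folder_C'))"
  }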
thank you
Thanks much
Can we do it by selecting wildcard *?
Yes, here are more details: ruclips.net/video/zwey2ZVeROg/видео.htmlsi=K3P6O7EtBxVPRJyT