Really great content and explanation.
Really nice explanations. Thanks
Great work. Keep up the good work Annu✌️
Welcome back with a bang!! Hope to see more and more real-time videos in ADF. Thanks @Annu
Thanks Madhur ! :)
Please add videos frequently. I am waiting for playlist completion.
Hi Anu, I have been watching all your videos. Your explanation is really extraordinary.
In this video, I couldn't understand one thing. During runtime, why does it create a new file with the current time?
🔥🔥🔥
If we want to load all tables from ADLS to a dedicated SQL pool, what will change in the pipelines?
In the control table, we have a column IsActive; for all the tables, you can set IsActive = 'Y'.
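For anyone rebuilding this, here is a minimal sketch of what such a control (metadata) table could look like — the table and column names below are only illustrative, not necessarily the exact ones used in the video:

-- Hypothetical control table that drives the metadata-driven pipeline
CREATE TABLE dbo.ControlTable
(
    SchemaName VARCHAR(50),
    TableName  VARCHAR(200),
    IsActive   CHAR(1)   -- 'Y' = the pipeline picks this table up, 'N' = it is skipped
);

-- To load every table from ADLS to the dedicated SQL pool, flag them all as active:
UPDATE dbo.ControlTable SET IsActive = 'Y';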
How does only one new file get created at 9:25? Your active flag = 1 in the first lookup applies to only one row in the metadata table, the one having flag 1.
And what about incremental load?
Hi Anuj
There are 3 instances of the copy activity and 3 of the get metadata activity, proportional to the number of active tables. One file was created in each folder, so in total 3 files in 3 separate folders. The next video will be about incremental load. Thanks
Hi, thanks for the video. I need a suggestion. We are using a pre-copy script to truncate the previously loaded table data. My query is: what if the copy data activity fails due to some unexpected reason? We would then have neither the old data in the table nor the new data. Any suggestion on this scenario, please?
Hi Pavan,
Thank you for watching the video.
This is a very good question indeed. Here you can use the concept of schema transfer; I will try to create a video related to that in the near future. Suppose your target table is stg.table1. First of all, before starting the ETL, introduce a dummy schema, say test, and create the table test.table1.
Copy the data into test.table1 and then transfer the schema between test and stg. On every subsequent load, do not truncate stg; only truncate the test schema, then load into test and transfer the schema between test and stg again, as in the sketch below.
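For readers following along, a rough sketch of the one-time setup and the start of each run — the table definition here is only an example and your real columns will differ; the transfer statement itself is covered further down the thread:

-- One-time setup: a mirror of the target table in the dummy schema
CREATE SCHEMA test;
CREATE TABLE test.table1 (id INT, loaded_on DATETIME2);   -- same definition as stg.table1

-- At the start of every run: clear only the test copy, never stg
TRUNCATE TABLE test.table1;
-- ... the copy activity then loads the new extract into test.table1 ...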
@azurecontentannu6399 I am amazed at such a fast response 🙂 Thanks for the quick reply.
Your suggestion works perfectly in my scenario.
But as a cost optimization I have another dependent query here.
What if we have a huge volume of data, say billions of records, across a huge number of tables?
Having both a staging and a test table is a kind of duplication, which again adds to the cost. Any optimization here?
And is there any difference between a dedicated pool and Synapse Analytics?
When I use the dedicated pool I am getting a connection issue, whereas Synapse Analytics is working fine, but only with bulk insert and not with PolyBase or the COPY command.
Hi @PavanKumar-eu5qg
There is no duplication of data going to happen. It is just a schema transfer, not a copy of the data. You will just have one set of data.
First run: stg has 0 records.
Create the test table (0 records), load the data into test (1M records), then transfer the schema to stg.
After the transfer: stg has 1M records, test has 0 records.
There are two kinds of SQL pool: serverless and dedicated. You can't have physical tables in a serverless pool.
Can you please guide us on how to transfer the schema in Synapse?
@PavanKumar-eu5qg
The syntax is the same as in SQL Server.
To change the schema of any object, use the following syntax:
ALTER SCHEMA [new_schema] TRANSFER [old_schema].[object_name];
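To make that concrete for the stg/test example discussed above, one common way to apply it is the three-step swap below. This is only a sketch: the holding schema name hold is illustrative and has to be created once up front, because two objects with the same name cannot sit in the same schema at the same time:

ALTER SCHEMA hold TRANSFER stg.table1;    -- move the previously loaded table out of stg
ALTER SCHEMA stg TRANSFER test.table1;    -- promote the freshly loaded table into stg
ALTER SCHEMA test TRANSFER hold.table1;   -- park the old table in test, ready to be truncated on the next run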
Hi,
Is there any way we can load partitioned data from ADLS to Dedicated SQL Pool using Synapse?
Thanks for the video! One question though. Let's say we receive hundreds of files every day. As time passes, won't it be inefficient to get the max timestamp for every ingestion?
Can we use a landing zone, pick the files from there for ingestion/copy, and once we are done, move the files to cold storage? That way the landing zone will always contain unprocessed files only. Please share your views on this.
Hi Shahinur
Yes, you are right! This approach is good for small transaction volumes. The most efficient way would be to make the files replaceable by overwriting them with the latest file on each load. There can be multiple ways of doing things; this is just one of them.