Data Factory Tutorial - How to add an Pre Copy Script

Поделиться
HTML-код
  • Опубликовано: 16 окт 2024
  • Learn how to add an Pre-copy script to your pipelines in this Data Factory Tutorial. In this video I show you how to truncate a table using an Pre-copy script within a copy activity. This is useful if you are loading data to a staging table that will then be processed further, we don't want to just keep adding data and a Pre-copy script can be used to truncate the sink or target table.
    I also discuss adding permissions for the principal that Data Factory runs under to the database so that it can truncate tables in the database by granting alter permission on the object.

Комментарии • 13

  • @RenukaDeviPadmanaban-s3c
    @RenukaDeviPadmanaban-s3c 7 месяцев назад +2

    I want to truncate the stage table only once, before the For loop start. how to write that in precopyscript?

    • @BeardedDevData
      @BeardedDevData  7 месяцев назад

      I would recommend using a script activity.

  • @NawazShareefMohammed
    @NawazShareefMohammed 9 месяцев назад +1

    how to truncate multiple tables, if i use multiple truncate statetements it is copy only one and the rest are left with 0 records

    • @BeardedDevData
      @BeardedDevData  9 месяцев назад

      The first thing you need to consider is do you want to truncate all tables or one at a time, this is important to consider because if you get any failures you will be left with blank tables, there are a couple of approaches to copying multiple datasets, the first is multiple copy activities, these can either be dependent in a chain or ran in parallel, the second option is metadata driven Pipelines where you can store the tables to copy within a table then call that table and look through.

  • @RenukaDeviPadmanaban-s3c
    @RenukaDeviPadmanaban-s3c 7 месяцев назад +1

    how can we write a precopyscript for truncating the Stage table once before the For loop start.

    • @BeardedDevData
      @BeardedDevData  7 месяцев назад

      I would recommend using a script activity.

  • @kreds7733
    @kreds7733 Год назад +1

    My trigger does not fail, but it does not truncate. Data keeps duplicating. Please help.
    I am using following in prescript.
    TRUNCATE TABLE [dbo].[TableName];

    • @BeardedDevData
      @BeardedDevData  Год назад

      That's unusual, if it was a permissions problem, we should see an error when trying to run the pipeline. Can you try running the truncate statement manually in a tool like SSMS and check that it actually clears the data?

    • @kreds7733
      @kreds7733 Год назад

      @@BeardedDevData Yes, truncate works if I run it manually. But not in prescript.

  • @neeraj_dama
    @neeraj_dama 2 года назад +1

    Nice video. can we write pre query query like -> delete from tableName where date=$$someDate ?

    • @BeardedDevData
      @BeardedDevData  2 года назад +1

      Yes I'm not aware of any limitations, as long as the Data Factory has permission to perform the operation on the table and the query is valid, it will run.

  • @fernastyi
    @fernastyi 2 года назад +1

    You should explain better that this table is a dimensional table and not a fact table.
    Given that Customer Table in Csv is a table which is not always refreshed and does not have incremental data or events data, that is why you use a truncate table, because you are "Updating" this table.
    But what it you have a fact table that comes from an API with a start_date and end_date , for example from Microsoft_Graph API ?

    • @BeardedDevData
      @BeardedDevData  2 года назад +1

      Appreciate that, what I did try to explain is that this data is going through staging; validation, cleaning and transformations before landing in the final table. That process can happen with both dimensions and fact tables, we would truncate the staging tables on each load, it's only on the insert into the final table that we will apply an SCD or store the event. Of course if you need to keep the data then you shouldn't truncate.