7. Remove Duplicate Rows using Mapping Data Flows in Azure Data Factory

  • Published: 22 Oct 2024

Comments • 44

  • @anithasantosh6729
    @anithasantosh6729 3 years ago +2

    Thanks for the video. I was trying to use GroupBy and the rest of the columns in a stored procedure. Your video made my job easy.

  • @susmitapandit8785
    @susmitapandit8785 2 years ago +2

    In the output file, why is EmpID not sorted even though we used the Sort transformation?

  • @rajkiranboggala9722
    @rajkiranboggala9722 3 years ago +1

    Well explained!! Thank you. If I have only one CSV file and I want to delete the duplicate rows, I guess I can do the same by self-unioning the file. I'm not sure if there's any other simpler method.

  • @mankev9255
    @mankev9255 3 years ago +1

    Good concise tutorial with clear explanations. Thank you.

  • @nareshpotla2588
    @nareshpotla2588 1 year ago

    Thank you Maheer. If we have 2 identical records with a unique empid, we use last($$)/first($$) to get either one. But if we have 3 records like
    1,abc
    2,xyz
    3,pqr, then first($$) will give 1,abc and last($$) will give 3,pqr. How do we get the middle one (2,xyz)?
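
One way to pick the Nth record of a duplicate group is a Window transformation followed by a Filter, rather than the Aggregate's first($$)/last($$). A minimal, untested sketch in data flow script, where dupKey is a hypothetical placeholder for the column(s) you group on:

    source1 window(over(dupKey),
        asc(empid, true),
        rn = rowNumber()) ~> NumberRows
    NumberRows filter(rn == 2) ~> KeepMiddle

With three rows per group, rn == 2 keeps the middle one; any other N works the same way.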

  • @luislacadena9689
    @luislacadena9689 3 years ago

    Excellent video. Do you think it is possible to eliminate the duplicates while keeping, for example, the one that has the higher department id/number? I've seen that you kept the first record by using first($$), but I'm curious whether you can remove duplicates in the RemoveDuplicateRows box based on other criteria. Is it possible to keep only the duplicate with the higher department id?

  • @marcusrb1048
    @marcusrb1048 2 years ago

    Great video, it's clear. But what happens with new records? If you use a Union and only upsert, it only checks for duplicate rows, doesn't it? I tried the same as yours, but the new records were removed in the final step. I figured out it is an issue needing INSERT, UPDATE and DELETE in three separate steps; how could I achieve it? Thanks
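
Inserts, updates and deletes can be tagged in a single flow with an Alter Row transformation before the sink (the sink then needs the matching allow insert/update/delete options and a key column). A minimal sketch in data flow script, where action is a hypothetical column holding the intended operation:

    source1 alterRow(insertIf(action == 'insert'),
        updateIf(action == 'update'),
        deleteIf(action == 'delete')) ~> TagRows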

  • @lehlohonolomakoti7828
    @lehlohonolomakoti7828 2 years ago +1

    Amazing video, super helpful. It allowed me to remove duplicates from a REST API source and create a reference table inside my DB.

  • @maheshpalla2954
    @maheshpalla2954 3 years ago +1

    How do you know which function to use, since we are not sure about the duplicate rows if we have millions of records in the source?

  • @PhaniChakravarthi
    @PhaniChakravarthi 4 years ago +1

    Hi, thank you for the sessions. They are wonderful. Just one query: can you make a video on identifying the delta change between two data sources and capturing only the mismatched records within ADF?

    • @WafaStudies
      @WafaStudies  4 years ago +3

      Sure. I will plan one video on this.

    • @benediktbuchert9002
      @benediktbuchert9002 2 years ago

      You could use a window function to mark all duplicates, and then use a filter to filter them out.
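
A minimal sketch of that window-and-filter approach in data flow script, assuming EmpID is the duplicate key (names are illustrative, untested): rowNumber() numbers the rows within each EmpID group, and the filter keeps only the first row of each group. Sorting descending on a different column instead would keep, for example, the row with the highest value in that column.

    source1 window(over(EmpID),
        asc(EmpID, true),
        rn = rowNumber()) ~> MarkDupes
    MarkDupes filter(rn == 1) ~> Deduped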

  • @EmmaSelma
    @EmmaSelma 3 years ago

    Hello Wafa,
    Thank you so much for this tutorial, it's very helpful. New subscriber here.
    Thinking of scenarios to use this, I have a question please: is it correct to use this to get the latest data from ODS to DWH in the case of a full load (only insertion occurring in ODS and no truncate), much like a row-number partition-by?
    Thanks in advance.

  • @gsunita123
    @gsunita123 4 years ago +3

    The data in the output consolidated CSV is not sorted on EmployeeID. We did use the Sort before the sink, so why is the data not sorted?

    • @Anonymous-cj4gy
      @Anonymous-cj4gy 3 years ago

      Yes, it is not sorted. The same thing happened with me.

    • @Aeditya
      @Aeditya 2 years ago

      Yeah it's not sorted

  • @rohitkumar-it5qd
    @rohitkumar-it5qd 2 years ago

    How do I update the records in the same destination, both the updated record and the new record, without having any duplicates on ID? Please suggest.

  • @Anonymous-cj4gy
    @Anonymous-cj4gy 3 years ago +1

    In the output file the data is still not sorted, if you look at it. The same thing happened with me: even after using Sort, the data is still unsorted.

  • @ACsakvith
    @ACsakvith 1 year ago

    Thank you for the nice explanation

  • @swapnilghorpadewce
    @swapnilghorpadewce 1 year ago

    Hi, I am trying to bulk load multiple JSON files into Cosmos DB. Each JSON file contains a JSON array of 5000 objects, and the total data size is around 120 GB.
    I have used Copy Data with a ForEach iterator. It throws an error for the affected file but still inserts some records from it.
    I am not able to skip incompatible rows, and I am also not able to log the skipped rows. I have tried all the available options. Can you please help?

  • @kajalchopra695
    @kajalchopra695 3 years ago

    How can we optimize the cluster start-up time? It is taking 4m 48s to start a cluster, so how can I reduce that?

  • @AkshayKumar-ou8in
    @AkshayKumar-ou8in 2 years ago +1

    Thank you for the video, very good explanation.

  • @battulasuresh9306
    @battulasuresh9306 2 years ago

    What if we want to remove both duplicate rows?
    Point 2: what if you specifically want to pick a row by something like a latest-modified-date column?

  • @arifkhan-qe4td
    @arifkhan-qe4td 2 years ago

    The Aggregate is not allowing me to add $$ as an expression. Any suggestions please?

  • @karthike1715
    @karthike1715 1 year ago

    Hi, I have to check for duplicates across all the columns. How do I handle that in the Aggregate transformation? Please help me.
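
For duplicates across every column, the documented distinct-rows pattern groups on a hash of all columns and keeps the first row of each group; roughly (verify against your data flow version):

    source1 aggregate(groupBy(mycols = sha2(256, columns())),
        each(match(true()), $$ = first($$))) ~> DistinctRows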

  • @pachinkosv
    @pachinkosv 1 year ago

    I don't want to import a few of the columns into the data table; how is that done?

  • @krishj8011
    @krishj8011 2 years ago +1

    great tutorial...

  • @MigmaStreet
    @MigmaStreet 1 year ago

    Thank you for this tutorial!

  • @vishvesbhesania7767
    @vishvesbhesania7767 3 years ago

    Why is the data not sorted in the output file, even after using the Sort transformation in the data flow?

  • @DataForgeAcademy
    @DataForgeAcademy 2 years ago

    Why didn't you use the sort function with the remove-duplicates option?

  • @vishaljhaveri7565
    @vishaljhaveri7565 1 year ago

    In the aggregation step, choose Column pattern and write name != 'columnnameonwhichyougroupedby'. Basically this will filter out all the columns that are mentioned in the Group By step and will perform the aggregation on the rest of the columns. Write $$ if you don't want to change the name of the main column, and write first($$) or last($$) as per your requirement.
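
Expressed as data flow script, that Aggregate looks roughly like this, assuming EmpID is the group-by column as in the video's example (untested sketch):

    source1 aggregate(groupBy(EmpID),
        each(match(name != 'EmpID'), $$ = first($$))) ~> RemoveDuplicateRows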

  • @MrSuryagitam
    @MrSuryagitam 3 years ago

    If we have multiple files in ADF, how do we remove the duplicates across them in ADF in a single run?

  • @varun8952
    @varun8952 1 year ago

    super

  • @paulhepple99
    @paulhepple99 3 years ago

    Great Vid - thnx

  • @isanayang6338
    @isanayang6338 4 years ago +1

    Can you speak slowly and clearly?

    • @WafaStudies
      @WafaStudies  4 years ago +1

      Sure. Thanks for the feedback. I will try to improve on it.

  • @isanayang6338
    @isanayang6338 4 years ago

    Your strong accent makes it so difficult to understand you.

  • @kumarpolisetty3048
    @kumarpolisetty3048 4 years ago

    Suppose we have more than 2 records for one empid, and I want to take the Nth record. How can I do that?