How to Identify and Delete Duplicate Records in SQL | Easy Explanation with Real Time Scenarios

Поделиться
HTML-код
  • Опубликовано: 28 дек 2024

Комментарии • 81

  • @VikashKumar0409
    @VikashKumar0409 4 дня назад

    watching this lec exactly after 2 years in December,helpful concept

  • @avi8016
    @avi8016 2 года назад +2

    Great video sir💯
    Most of the time I have faced duplication issues with entire row
    So I used row_number to take care of that

  • @soumyagobbani5336
    @soumyagobbani5336 Год назад +1

    Great explanation on different approaches of deleting duplicates.
    How to delete duplicates without creating backup table when there are duplicates with exact same rows. Below would give an error saying cte does not exist.
    with cte as (select *, row_number() over(partition by emp_id order by emp_id) rwn from employee)
    delete from cte where rwn > 1

    • @Rekha_m_angadi
      @Rekha_m_angadi 9 месяцев назад +2

      cte is a temporary table, that you have created in this query, it doesnt have existence unlike employee table, thats why you cant delete records from it

    • @pr-rb6jl
      @pr-rb6jl 2 месяца назад

      @@Rekha_m_angadi But it worked for me with CTE ...i was using MSSMS.

    • @Rekha_m_angadi
      @Rekha_m_angadi 2 месяца назад

      Not sure if its bcoz of Mssms it worked ..but conceptually CTE is a temporary table to view right? How would delete work on a temp table.. may be @ankit can respond on this..​@pr-rb6jl

  • @ApoorvaSharma-x8b
    @ApoorvaSharma-x8b 2 месяца назад +1

    Very well explained 😊

  • @tusharshivhare9651
    @tusharshivhare9651 Год назад +5

    Hello sir,
    Can we use this approach also to remove multiple duplicates
    with cte as(
    select *,
    row_number() over(partition by emp_id order by timestamp desc) as rn
    from employee)
    delete from cte where rn>1

    • @mayankgupta7548
      @mayankgupta7548 Год назад +1

      No you deleting from cte
      Cte are views only

    • @pr-rb6jl
      @pr-rb6jl 2 месяца назад

      @@mayankgupta7548 No Bro i was thinking same but it gets deleted from table only ..i have verified it by doing myself on table data....the rows gets deleted from table as well .....

    • @pr-rb6jl
      @pr-rb6jl 2 месяца назад

      yes you can use this approach ...this will work..

    • @wonderoverload0
      @wonderoverload0 4 дня назад

      @@mayankgupta7548 CTEs are also temporary results just like subqueries but stored separately on top of your main query. Using CTE to delete duplicates will work perfectly and will delete duplicate records from the table.

  • @ShreyaMishra0307__
    @ShreyaMishra0307__ Год назад +1

    Great Video Sir !!
    Also, for SQL Server, the best approach would be Row_Number(), that will solve both the problems.

  • @kishanthacker9999
    @kishanthacker9999 10 дней назад

    Great video though 1 question and a possible video request:
    What if we want to remove pure duplicates without creating backup table? Deleting from cte is not reliable across DBMS as per other's comments it is working in few DBMS and not in other. Please confirm 1 robust method of deleting duplicates by creating a back and not creating a back up which can work across most of the popoular DBMS like MySQL,SQL server,Big Query etc Thanks

  • @sudarsanrout5787
    @sudarsanrout5787 Год назад +2

    Great video sir❤

  • @souravsinha5330
    @souravsinha5330 Год назад +2

    Thanks a lot greatly explained...

  • @sarathmaya6083
    @sarathmaya6083 2 года назад +2

    Nice explanation. May i know which IDE you are using?

  • @khushishrivastava331
    @khushishrivastava331 2 года назад +1

    Thank you for making such insightful videos on sql. Please advise some do's and don'ts in an interview or how to approach a problem statement.

    • @ankitbansal6
      @ankitbansal6  2 года назад +1

      Noted

    • @shubhamgoyal3358
      @shubhamgoyal3358 Год назад

      @@ankitbansal6 Instead of giving links of so many videos try to give the code script

  • @Artouple
    @Artouple 2 года назад +2

    Great approach!

  • @ankimedia5225
    @ankimedia5225 Год назад

    Please provide optimised query for to detect duplicates when we will having million rows and want query to run fast

  • @manishasaxena9829
    @manishasaxena9829 Месяц назад

    So, each null is treated as a diff value, how can we remove duplicate null from a table? Because when we use the row_number() logic it won't keep the duplicate nulls in single partition, right?

  • @sauravrajchaudhary9036
    @sauravrajchaudhary9036 Год назад +2

    Nice explnation

  • @florincopaci6821
    @florincopaci6821 2 года назад +2

    MERGE INTO your_table tgt
    USING (
    SELECT ROWID AS rid, name,
    ROW_NUMBER() OVER (
    PARTITION BY name
    ORDER BY ROWNUM
    ) As rn
    FROM your_table
    ) src
    ON (src.rid = tgt.ROWID AND src.rn > 1)
    WHEN MATCHED THEN
    UPDATE
    SET name = name
    DELETE WHERE 1 = 1;
    Unfortunatlly this will not work in Sql Server or at last me i couldn t make it work but in Oracle work like a charm.
    Happy Holydays.

  • @sandipjamdade4564
    @sandipjamdade4564 Год назад +1

    Thank You So So Much Ankit for putting in such great efforts....💯💯 But I have one Question How to delete pure duplicates without taking a backup of the table? Is It Possible?

    • @ankitbansal6
      @ankitbansal6  Год назад +1

      There is no standard way of doing it. Depending on databases it can be done.

  • @akshayshirbhate7
    @akshayshirbhate7 Год назад

    Hey, what about if i have duplicate record but its id is different. In that case how would we delete that record?

  • @subhashdixit5167
    @subhashdixit5167 2 года назад

    Please provide create table and insert statement like you used to provide in your odl videos

    • @ankitbansal6
      @ankitbansal6  2 года назад

      There is nothing in the table. You can create table and data as per the video

  • @sahilummat8555
    @sahilummat8555 Год назад +1

    Hello Ankit
    Just wanted a little help . How is the course different from all the videos you have put . What additional knowledge will we get ?

    • @ankitbansal6
      @ankitbansal6  Год назад

      The course is in a structured manner covers from basic to advanced all the concepts along with assignments. Also covers 2 projects and premium subscriptions to practice SQL problems.

    • @sahilummat8555
      @sahilummat8555 Год назад

      @@ankitbansal6 when will the next batch start and is live batch any different from the recorded one you currently have on the website

  • @mbkamath3720
    @mbkamath3720 2 года назад +1

    Mind blowing🤯🤯🤯..!
    Why using the alias A when you havnt used it anywhere like A.empid or A.salary??
    This doubt follows me evrywhere..😓😔😔

    • @ankitbansal6
      @ankitbansal6  2 года назад

      In SQL server it's mandatory to give aliases to a sub query.

  • @mbkamath3720
    @mbkamath3720 2 года назад +4

    Please give tips on not going blank during intrvws. The intrvwer acts hurry hurry and i end up saying 'i dont know' even though later i realise i knew the answer. So many opportunities missed due to this fear.😓😔😔😔

    • @rishav144
      @rishav144 2 года назад +4

      Maybe u need to practice more questions before Interview ...For hurry , You can tell interviewer to give u 1 minute to think ... Ask him " May i think for a minute " and try to recall the answer in ur mind... and then answer accordingly ....This worked for me in 6-7 interviews of Data Analyst

    • @skillup638
      @skillup638 2 года назад

      @@rishav144 hi rishav what are the Major skills required for data analyst and how to apply for companies ?

    • @LoveIndia3
      @LoveIndia3 Год назад

      i thought I am the only one who is facing this problem..i jumbled even after knowing the answers which make bad impression to interviewers which lead to rejection unnecessarily

    • @muralidhar9266
      @muralidhar9266 Год назад

      It's same with me also. knewing the right answer ,still can't recall at that moment, mind runs in hurry to impress or something else

  • @rahulgautam511
    @rahulgautam511 Год назад

    Ankit, your videos are just fabulous my friend. I am using SQL Server, While deleting duplicate records with multiple columns I am getting an error ("An expression of non-boolean type specified in a context where a condition is expected, near ','). Is it because that ,SQL Server doesn't support multiple column subqueries? Please confirm.
    Thanks in advance..

    • @ankitbansal6
      @ankitbansal6  Год назад

      Yes it doesn't support multiple columns. You can concatenate the 2 columns and then use in filter

  • @yuktibhatnagar9955
    @yuktibhatnagar9955 Год назад

    With cte as
    (Select*, row_number() over(partition by emp_id order by create_timestamp asc) as rank from employee a)
    Select emp_id, max(create_timestamp)
    From cte
    Group by emp_id;

  • @rajat1999gupta
    @rajat1999gupta 4 месяца назад

    Hello Ankit, is there a way to delete pure duplicate record without taking backup of original table. Please help me with this. Thanks in advance.

  • @chennaivibes8428
    @chennaivibes8428 2 года назад

    Hi can u provide the database you use so that it will be useful for practicing

    • @ankitbansal6
      @ankitbansal6  2 года назад

      You can practice on any database. Just create the table and insert records

  • @paragkelkar8538
    @paragkelkar8538 2 года назад

    I am getting this error "An expression of non-boolean type specified in a context where a condition is expected, near ','." after running query
    delete from EMP2 where (Emp_id,create_date) in (select Emp_id, min(create_Date) as time_c from EMP2 group by Emp_ID having count(1)>1)
    Please comment what needs to be update in this

    • @ankitbansal6
      @ankitbansal6  2 года назад

      Is it SQL server?

    • @paragkelkar8538
      @paragkelkar8538 2 года назад

      @@ankitbansal6 Yes

    • @ankitbansal6
      @ankitbansal6  2 года назад

      @@paragkelkar8538 ok so in SQL server you can not filter on 2 columns as tuple. Use this syntax using inner join
      DELETE w
      FROM WorkRecord2 w INNER JOIN Employee e ON EmployeeRun=EmployeeNo
      Do join on the 2 columns

  • @kundankumar5520
    @kundankumar5520 Год назад

    Hi Sir, thanks again for wonderful video, I tried deleting in workbench platform but it is not allowing me to delete. I have uncheck the safe mode as well. Error: You can't specify target table 'employee_new' for update in FROM clause. Same query I am using it:
    delete from employee_new where (emp_id, create_timestamp) = (select emp_id, min(create_timestamp) as create_timestamp from employee_new
    group by 1
    having count(1) >1);

  • @sreemalapal
    @sreemalapal 9 месяцев назад

    Hi..Can you share the name of the application you are using..This is not MS SQL Server

  • @patilnaveenkumarreddy4059
    @patilnaveenkumarreddy4059 Год назад +1

    thankyou

  • @mohdmahtab8991
    @mohdmahtab8991 2 года назад

    Please sir strat SQL courses from basic to Advance.

    • @ankitbansal6
      @ankitbansal6  2 года назад

      Join here www.namastesql.com

    • @mohdmahtab8991
      @mohdmahtab8991 2 года назад

      @@ankitbansal6 i checked it out but you don't list Normalization topic in your courses

  • @reachrishav
    @reachrishav 2 года назад +1

    Please continue the tutorials in SSMS, this editor does not seem to go well

    • @ankitbansal6
      @ankitbansal6  2 года назад +1

      Ok. Thanks for the feedback 😊

  • @mohitupadhayay1439
    @mohitupadhayay1439 2 года назад +1

    Not the best video in terms of video production but well explained a few things.

    • @ankitbansal6
      @ankitbansal6  2 года назад

      Agree with you. Was trying a new IDE but it did not go well.

  • @harineralla5366
    @harineralla5366 10 дней назад

    Why your used this tool Ankit....It was bit late for execution process

  • @chriskeo392
    @chriskeo392 2 года назад +1

    Accent too heavy , thanks for content tho

  • @girdhar3224
    @girdhar3224 Год назад

    somewhat confusing...

  • @iamneeljadhav
    @iamneeljadhav Год назад

    Good SQL problems but explanations not that good. Will love to watch if you speak in language you are comfartable to explain

  • @addhyasumitra90
    @addhyasumitra90 5 месяцев назад +1

    create table employee (
    emp_id int,
    emp_name varchar(20),
    salary int,
    create_timestamp time);
    INsert into employee values
    (1, 'Ankit', 20000, GETDATE()),
    (2, 'Rahul', 20000, GETDATE()),
    (3, 'Agam', 30000, GETDATE()),
    (1, 'Ankit', 15000, GETDATE())

  • @RitamGanguly-e9m
    @RitamGanguly-e9m 2 месяца назад

    INSERT INTO duplicate_entries (emp_id, emp_name, salary, create_timestamp) VALUES
    (1, 'Ankit', 10000, NOW()),
    (2, 'Rahul', 20000, NOW()),
    (3, 'Agam', 30000, NOW()),
    (4, 'Ankit', 15000, NOW());
    WITH RankedSalaries AS (
    SELECT emp_id,
    DENSE_RANK() OVER(PARTITION BY emp_name ORDER BY salary DESC) AS salary_rank
    FROM duplicate_entries
    )
    DELETE FROM duplicate_entries
    WHERE emp_id IN (SELECT emp_id
    FROM RankedSalaries
    WHERE salary_rank > 1);
    select * from duplicate_entries;
    This is one of the method which i used