=IF(COUNTIFS(A7:A11,A7:A11,B7:B11,B7:B11)>1, "Duplicate", "") for Excel 365 and 2021. Power Query definitely is my favorite if updates will be expected. Set up once and use again and again. I have learned a lot from your videos. Thank you and wait for your next video.
another brilliant video, Mynda! Thank you so much. You make these videos easy to understand -even if some of us aren't as advanced as others (or need to be ie: retired). I love learning from you and always share your videos.
One of the things I like about Excel is that there are multiple ways to accomplish essentially the same task. Within the scope of one workbook personally I've been using the dynamic array functions (UNIQUE in this case) more and more. Power Query is also great (and I guess might be less resource-intensive with bigger data sets?) but I really like everything just updating real-time, without having to refresh or re-do anything.
this video is really helpful, THX a lot. And I want to share another simple way to highlight and remove duplicates by using WPS Office. 1. select the range of your values 2. click “highlight duplicates”< “set” or if you want to remove duplicates 1. select the range of your values 2. click “highlight duplicates”< “remove duplicates” very simple and quick.
The duplicate rows example you did in the highlight duplicates with conditional formatting section is tricky. The only reason they were both highlighted is because the values appeared multiple times on the respective lists. If you did a series of 1,2,3,4,5 and the column beside 2,3,4,5,6 and did that step all of the numbers 2,3,4,5 on each list would be highlighted (1 and 6 would not be). That specific function looks at the entire selection of cells for anything that appears more than once, NOT the contents of an entire row with the contents of other rows. I'm not sure if that is how you originally intended to explain that part, but the video you showed made it look like it was looking for entire duplicated rows. Which you explain how to do in the next section haha.
Hi Mynda! This is yet another informative video that is really helpful for me. You may already have a video on this, but I was hoping to better understand the 'Replace Duplicates' function for a table. For example, if I have .csv files saved every week. Is the best way to incorporate new data and remove duplicates by using Power Query? I know you can remove duplicates from the data once you've copied/pasted the new data into the table? That's what I'm doing right now, manually...baby steps. I have a dataset that increases by at least 5,000 rows each month but I want to make sure I get rid of duplicate data by ensuring that the most recent data is in the table. I currently copy/paste the new data into Excel and then sort by the number col. (a-z) and then by updated col. (z-a) and then I remove duplicates from the table. In other words I want to ensure that I have the latest updated data (i.e. something that was new last month could be closed this month). I'm sure there is an easier way. Thanks for your help.
Yes, absolutely use Power Query for this. You won't know yourself when all you need to do is click a button to update the data. You can learn Power Query in my course here (and if you get stuck implementing the techniques in your own work as a course member you can email me and I'll help you): www.myonlinetraininghub.com/excel-power-query-course
Hi Mynda All good examples, and as you point out it depend on the use, which one to choose. In my job I prefer to use the power editor, and since most of the data I’m working with contains dates, I will need to sort them before removing duplicates. Absolutely not to criticize the always good videos you make, but maybe you can cover the “table.buffer” function, at least it took me some time to figure out how to keep the one that I want to use.
Thanks for yet another great video! I have string values in a column (let’s call it column1) which contains string values in LARGE CAPS and in Mixed caps. I want to split it into two new columns with one holding all string values in LARGE CAPS and the other holding the Mixed caps. How do I do this? It seems like an easy problem but I have tried everything I can think of and asked chatGPT and it does not come up with anything that works. Please help!!
Thank you 🙏You can probably use List.ContainsAny to look for the capitalised alphabet and extract that text into a separate column. The remaining words can go in another column. If you get stuck, please post your question and sample Excel file on our forum where we can help you further: www.myonlinetraininghub.com/excel-forum
EDIT: I've just seen a much easier way =EXACT(A2,UPPER(A2)) will return TRUE for all upper and you can work out a suitable way of splitting from there. I have just been working with a long IFS formula to do something quite different but I believe it is the same principle. No doubt Mynda would know an easier way but it works! When you say large and mixed caps do you mean All Caps and Mixed Case? If you so you can set up a long IFS(ISNUMBER(FIND formula that looks for each lower case letter of the alphabet (FIND is case specific). Let's say your values are in Col A. Mixed Case formula In Col B you have your =IFS(ISNUMBER(FIND formula to find Mixed Case =IFS(ISNUMBER(FIND("a",A2)),A2,ISNUMBER(FIND("b",A2)),A2,ISNUMBER(FIND("c",A2)),A2, ... etc All Caps will return #N/A which you can avoid with ISERROR-type element if you wish All Caps formula In Col C you put a formula to say if Col B value #N/A, Col A value for All Caps. For my long IFS formula I created it by separating all the elements like so: Col 1: ISNUMBER(FIND(" Col 2: a Col 3: ",A2)),A2, And then copied down Cols 1 and 3 for all the values I needed to search on adding initial "=IFS(" and putting an ending parenthesis instead of comma, changing the letter of the alphabet in Col 2 in each row and then concatenating all the bits and making text of the formula before pasting it in as a formula. Hope this helps if Mynda doesn't help you with something easier.
@@petraliverani1641 Sure. Second argument of FILTER does the same :) but in a more versatile "dynamic" manner for the range/array named "column1": row-by-row, cell-by-cell...
Not sure what you mean. If there are blanks, why do you need to use text from a row? Note: I won't see your response here due to the volume of comments. Please post your question and sample Excel file on our forum where someone can help you further: www.myonlinetraininghub.com/excel-forum
It's difficult to visualise what you mean. Please post your question and sample Excel file on our forum where someone can help you further: www.myonlinetraininghub.com/excel-forum
Hi Mynda - lovely presentation, as usual! I am trying to link to the hi-res version of the PDF, but it only loops me back to the article. Can you update the link? Thanks!
Thanks so much! The cheat sheet PDF is available to download from the article page just above the video. See the heading on that page "Download Example Workbook & Cheat Sheet". If you have any problems downloading it, please reach out via email: website at MyOnlineTrainingHub.com because I won't see follow up replies to this thread due to the volume of comments.
With countif there's also a way to find not only the duplicates but also which of the 2 was the second occurrence, in case this matters. Countif is quite powerful when used correctly.
Mynda hi, thank you. After determining duplicate values then I want to filter those values with their own different cell ccolors How can I filter with two different colors at the same time because excel filter button allows me to select just one color?
You'd have to find another way. Either select multiple items in the Text filters or write a conditional formatting rule with a formula that applies the same colour to all cells you want to filter using the OR function.
Nice work. This might be helpful. I used to get very odd behaviour using COUNTIF embedded in a lambda function that was called from within another LAMBDA (something to do with it not accepting the variable which although was a vector, was strictly not a RANGE, which COUNTIF must take as its first argument). It did my head in! This formula avoids using COUNTIF, and IMO is more robust (when doing complex and nested lambdas) . Hope its useful COLUMN_DUPLICATES = LAMBDA(col_vector, LET( count, SCAN(0, col_vector, LAMBDA(a, r, SUM(--(r = col_vector)))), duplicates, IF( IFERROR(ROWS(UNIQUE(FILTER(col_vector, count > 1))), 0) = 0, "No Duplicates", UNIQUE(FILTER(col_vector, count > 1)) ), duplicates ) );
Not sure what you mean. Please post your question and sample Excel file on our forum where someone can help you further: www.myonlinetraininghub.com/excel-forum
Just for example if you have number 10,20,30,40,50 in column A and have double number in coloum B like 20,42,36,80,102 and you have to separate them or show them they are duplicate
Dear Mynda, In the "Exctracting with Formulas" worksheet, the extraction is of the unique values, but you can extract, by formula, the duplicate value(s): =UNIQUE(VSTACK(UNIQUE(A7:B11),UNIQUE(A7:B11,,1)),,1) - (duplicate line) OR =UNIQUE(VSTACK(UNIQUE(B7:B11),UNIQUE(B7:B11,,1)),,1) - (duplicate cell) 🤗
Good ideas. Or with FILTER: =FILTER(A7:B11,COUNTIFS(A7:A11,A7:A11,B7:B11,B7:B11)>1) - duplicates =FILTER(A7:B11,COUNTIFS(A7:A11,A7:A11,B7:B11,B7:B11)=1) - unique/distinct
@@MyOnlineTrainingHub Dear Mynda, Your formula for displaying duplicates with the FILTER function is better, because it shows the amount of repetitions. ❤🤗
Great video. I've been struggling with the problem of duplicates today and we haven't solved it, so any ideas would be welcome. I have a large data table from which I need to remove duplicates based on one column (essentially a case ref). However, I want to choose which of the duplicates to remove based on the entry in another column To simplify, let's say I have two rows with the same case reference but in the second column one row is "A" and one is "B". I want to keep the Bs but the As and Bs can appear in any order. Rows with the same case ref can have several As and Bs. I want to keep only one of the rows with a "B" Make sense? Thoughts?
Yes, you can do this, but it's difficult to explain here. Please post your question and sample Excel file on our forum where we can help you further: www.myonlinetraininghub.com/excel-forum
@@MyOnlineTrainingHub Thank you. I will post the file Monday as not working today. I did have an idea in my sleep - to custom sort the table by case ref with secondary sort by the second column Z to A. That would bring the Bs to the top of each case. I then remove duplicates leaving those cases that do have Bs with that row. Would that work? If there is more than one B on a case I don't think it matters which I leave and which I remove. If that solution works then it shows the power of the subconscious!
If I have a big set of numbers in a matrix form, with no headers (examples: 10 rows, and 10 columns, hence 100 numbers). How can I remove duplicates from this set of data?
You're best to use the UNIQUE function to extract a list of distinct values. If you want them converted into a column of values, you can use the TOCOL function e.g. =TOCOL(UNIQUE(...))
Wrong - I spent 2 hours until I understood where to activate duplicates - to disable the option to find duplicates - Home - Conditional Formatting - Manage Rules - Select Rules - Delete Rules.@@MyOnlineTrainingHub
My bad. I thought you were asking me how to disable something that is on by default. It wasn't clear that you had set up a Conditional Formatting rule and simply wanted to delete it. That's as easy as Home tab > Conditional Formatting > Clear Rules.
Hope you found a solution. If you're having trouble, you're welcome to post your question and sample Excel file on our forum where someone can help you further: www.myonlinetraininghub.com/excel-forum
=IF(COUNTIFS(A7:A11,A7:A11,B7:B11,B7:B11)>1, "Duplicate", "") for Excel 365 and 2021. Power Query definitely is my favorite if updates will be expected. Set up once and use again and again. I have learned a lot from your videos. Thank you and wait for your next video.
Great to hear 🙏
You are GENIUOS! In comparison to other tutorials (despite of topic), you are like superhero! Thanks for all, what you show us for free.
Thank you for your kind words! You're very welcome. 😊
Thank you, Mynda, from Canada. The workbook wonderfully complements the otherwise excellent overview presented in your tutorial.
Thanks for sharing that, I'm happy you like it!
Nice, Mynda! It is beneficial for the work I am planning to do tomorrow. Thanks
Glad you found it helpful!
another brilliant video, Mynda! Thank you so much. You make these videos easy to understand -even if some of us aren't as advanced as others (or need to be ie: retired). I love learning from you and always share your videos.
Thanks so much for your kind words and support 🥰
One of the things I like about Excel is that there are multiple ways to accomplish essentially the same task. Within the scope of one workbook personally I've been using the dynamic array functions (UNIQUE in this case) more and more. Power Query is also great (and I guess might be less resource-intensive with bigger data sets?) but I really like everything just updating real-time, without having to refresh or re-do anything.
Good points for when to use the different options 🙏
All known.
Good material, quite complete.
Thank you very much Minda.
Cheers, Ivan!
this video is really helpful, THX a lot.
And I want to share another simple way to highlight and remove duplicates by using WPS Office.
1. select the range of your values
2. click “highlight duplicates”< “set”
or if you want to remove duplicates
1. select the range of your values
2. click “highlight duplicates”< “remove duplicates”
very simple and quick.
Even with seemingly simple topics, you manage to point out features and capabilities I did not know. Thank you!
Glad it was worth your time to watch, Chris!
Very useful! I regularly use Unique and conditional formatting, but got quite other tips today
Great to hear!
Thanks, Mynda, for the valuable overview! Kisses from Ukraine
My pleasure, Alexander! Thinking of you over there.
you always come with useful videos which are helpful in daily work. Thanks
Glad to hear that 🙏
Loving the new backdrop to go along with the great tips!
Thanks so much!
Thank you so much for your tutorials they helped me with my capstone project. Awesome job!!
Great to hear!
Thanks Mynda, this was great!
Thanks so much, Chris!
Very useful. Love how you implemented PQ as well.
Thanks so much!
The duplicate rows example you did in the highlight duplicates with conditional formatting section is tricky. The only reason they were both highlighted is because the values appeared multiple times on the respective lists. If you did a series of 1,2,3,4,5 and the column beside 2,3,4,5,6 and did that step all of the numbers 2,3,4,5 on each list would be highlighted (1 and 6 would not be). That specific function looks at the entire selection of cells for anything that appears more than once, NOT the contents of an entire row with the contents of other rows. I'm not sure if that is how you originally intended to explain that part, but the video you showed made it look like it was looking for entire duplicated rows. Which you explain how to do in the next section haha.
Yes, good point 👍
Thank you for sharing Knowledge!!
Appreciate it.!!
You're welcome 😊
Hi Mynda! This is yet another informative video that is really helpful for me. You may already have a video on this, but I was hoping to better understand the 'Replace Duplicates' function for a table. For example, if I have .csv files saved every week. Is the best way to incorporate new data and remove duplicates by using Power Query? I know you can remove duplicates from the data once you've copied/pasted the new data into the table? That's what I'm doing right now, manually...baby steps. I have a dataset that increases by at least 5,000 rows each month but I want to make sure I get rid of duplicate data by ensuring that the most recent data is in the table. I currently copy/paste the new data into Excel and then sort by the number col. (a-z) and then by updated col. (z-a) and then I remove duplicates from the table. In other words I want to ensure that I have the latest updated data (i.e. something that was new last month could be closed this month). I'm sure there is an easier way. Thanks for your help.
Yes, absolutely use Power Query for this. You won't know yourself when all you need to do is click a button to update the data. You can learn Power Query in my course here (and if you get stuck implementing the techniques in your own work as a course member you can email me and I'll help you): www.myonlinetraininghub.com/excel-power-query-course
Excelentes opciones, útiles y bien explicadas y graficadas. graciass.!!
Gracias!
Hi Mynda
All good examples, and as you point out it depend on the use, which one to choose.
In my job I prefer to use the power editor, and since most of the data I’m working with contains dates, I will need to sort them before removing duplicates.
Absolutely not to criticize the always good videos you make, but maybe you can cover the “table.buffer” function, at least it took me some time to figure out how to keep the one that I want to use.
Thanks, Ivan! Yes, specifying which duplicate to keep with Power Query is tricky. A video for another day, perhaps 😊
Thanks for yet another great video!
I have string values in a column (let’s call it column1) which contains string values in LARGE CAPS and in Mixed caps. I want to split it into two new columns with one holding all string values in LARGE CAPS and the other holding the Mixed caps. How do I do this? It seems like an easy problem but I have tried everything I can think of and asked chatGPT and it does not come up with anything that works. Please help!!
Thank you 🙏You can probably use List.ContainsAny to look for the capitalised alphabet and extract that text into a separate column. The remaining words can go in another column. If you get stuck, please post your question and sample Excel file on our forum where we can help you further: www.myonlinetraininghub.com/excel-forum
Not sure - quick idea
=FILTER(column1, EXACT(column1, UPPER(column1)))
=FILTER(column1, NOT(EXACT(column1, UPPER(column1))))
EDIT: I've just seen a much easier way =EXACT(A2,UPPER(A2)) will return TRUE for all upper and you can work out a suitable way of splitting from there.
I have just been working with a long IFS formula to do something quite different but I believe it is the same principle. No doubt Mynda would know an easier way but it works!
When you say large and mixed caps do you mean All Caps and Mixed Case?
If you so you can set up a long IFS(ISNUMBER(FIND formula that looks for each lower case letter of the alphabet (FIND is case specific).
Let's say your values are in Col A.
Mixed Case formula
In Col B you have your =IFS(ISNUMBER(FIND formula to find Mixed Case
=IFS(ISNUMBER(FIND("a",A2)),A2,ISNUMBER(FIND("b",A2)),A2,ISNUMBER(FIND("c",A2)),A2, ... etc
All Caps will return #N/A which you can avoid with ISERROR-type element if you wish
All Caps formula
In Col C you put a formula to say if Col B value #N/A, Col A value for All Caps.
For my long IFS formula I created it by separating all the elements like so:
Col 1: ISNUMBER(FIND("
Col 2: a
Col 3: ",A2)),A2,
And then copied down Cols 1 and 3 for all the values I needed to search on adding initial "=IFS(" and putting an ending parenthesis instead of comma, changing the letter of the alphabet in Col 2 in each row and then concatenating all the bits and making text of the formula before pasting it in as a formula. Hope this helps if Mynda doesn't help you with something easier.
@@viktorasgolubevas Alternatively, the formula =EXACT(A2,UPPER(A2)) will return TRUE for all upper case and you can work it out from there.
@@petraliverani1641
Sure.
Second argument of FILTER does the same :) but in a more versatile "dynamic" manner for the range/array named "column1": row-by-row, cell-by-cell...
Thank you for the video. Could you create a video on how to remove blanks using text from a row?
Not sure what you mean. If there are blanks, why do you need to use text from a row? Note: I won't see your response here due to the volume of comments. Please post your question and sample Excel file on our forum where someone can help you further: www.myonlinetraininghub.com/excel-forum
How to use formulas if data is a table (instead of range)?
I need the table format to also do other data stuff.
Learning from you :)
It's difficult to visualise what you mean. Please post your question and sample Excel file on our forum where someone can help you further: www.myonlinetraininghub.com/excel-forum
Hi Mynda - lovely presentation, as usual! I am trying to link to the hi-res version of the PDF, but it only loops me back to the article. Can you update the link? Thanks!
Thanks so much! The cheat sheet PDF is available to download from the article page just above the video. See the heading on that page "Download Example Workbook & Cheat Sheet". If you have any problems downloading it, please reach out via email: website at MyOnlineTrainingHub.com because I won't see follow up replies to this thread due to the volume of comments.
Nice video Thank u for ur hard work
Thank you 😊
Really Useful
Many thanks.
Glad to hear that 😊
With countif there's also a way to find not only the duplicates but also which of the 2 was the second occurrence, in case this matters. Countif is quite powerful when used correctly.
Yes, great alternative to the =IF(COUNTIF(... example in the video to just use =COUNTIF(...
Mynda hi, thank you.
After determining duplicate values then I want to filter those values with their own different cell ccolors How can I filter with two different colors at the same time because excel filter button allows me to select just one color?
You'd have to find another way. Either select multiple items in the Text filters or write a conditional formatting rule with a formula that applies the same colour to all cells you want to filter using the OR function.
Nice work. This might be helpful. I used to get very odd behaviour using COUNTIF embedded in a lambda function that was called from within another LAMBDA (something to do with it not accepting the variable which although was a vector, was strictly not a RANGE, which COUNTIF must take as its first argument). It did my head in! This formula avoids using COUNTIF, and IMO is more robust (when doing complex and nested lambdas) . Hope its useful
COLUMN_DUPLICATES = LAMBDA(col_vector,
LET(
count, SCAN(0, col_vector, LAMBDA(a, r, SUM(--(r = col_vector)))),
duplicates, IF(
IFERROR(ROWS(UNIQUE(FILTER(col_vector, count > 1))), 0) = 0,
"No Duplicates",
UNIQUE(FILTER(col_vector, count > 1))
),
duplicates
)
);
Thanks for sharing, @sachin.tandon!
4:45 Doesn't the UNIQUE() function remove duplicates - because it would only list a duplicated value once!
Isn't that what I said?
Can we use pivot tables to show duplicates in two columns or more
Yes, just add the other column to the row labels.
Is there a way to highlight duplicates between 2 workbooks?
With identical sheets layout, where values might differ.
I'd use Power Query to do this, as explained here: www.myonlinetraininghub.com/excel-compare-two-lists
@@MyOnlineTrainingHub Thanks.
Thanks 👍
Pleasure 😊
Thank you!
You're welcome!
What if you have double amount in another coloum like coloum A1 having 50 and column B having 100?
Not sure what you mean. Please post your question and sample Excel file on our forum where someone can help you further: www.myonlinetraininghub.com/excel-forum
Just for example if you have number 10,20,30,40,50 in column A and have double number in coloum B like 20,42,36,80,102 and you have to separate them or show them they are duplicate
Gracias... guardaré el video
Thank you!
Dear Mynda,
In the "Exctracting with Formulas" worksheet, the extraction is of the unique values, but you can extract, by formula, the duplicate value(s):
=UNIQUE(VSTACK(UNIQUE(A7:B11),UNIQUE(A7:B11,,1)),,1) - (duplicate line) OR
=UNIQUE(VSTACK(UNIQUE(B7:B11),UNIQUE(B7:B11,,1)),,1) - (duplicate cell) 🤗
Good ideas. Or with FILTER:
=FILTER(A7:B11,COUNTIFS(A7:A11,A7:A11,B7:B11,B7:B11)>1) - duplicates
=FILTER(A7:B11,COUNTIFS(A7:A11,A7:A11,B7:B11,B7:B11)=1) - unique/distinct
@@MyOnlineTrainingHub Dear Mynda,
Your formula for displaying duplicates with the FILTER function is better, because it shows the amount of repetitions. ❤🤗
thanks great help
You're welcome!
Great video. I've been struggling with the problem of duplicates today and we haven't solved it, so any ideas would be welcome.
I have a large data table from which I need to remove duplicates based on one column (essentially a case ref). However, I want to choose which of the duplicates to remove based on the entry in another column
To simplify, let's say I have two rows with the same case reference but in the second column one row is "A" and one is "B". I want to keep the Bs but the As and Bs can appear in any order.
Rows with the same case ref can have several As and Bs. I want to keep only one of the rows with a "B"
Make sense? Thoughts?
Yes, you can do this, but it's difficult to explain here. Please post your question and sample Excel file on our forum where we can help you further: www.myonlinetraininghub.com/excel-forum
@@MyOnlineTrainingHub Thank you. I will post the file Monday as not working today. I did have an idea in my sleep - to custom sort the table by case ref with secondary sort by the second column Z to A. That would bring the Bs to the top of each case. I then remove duplicates leaving those cases that do have Bs with that row. Would that work? If there is more than one B on a case I don't think it matters which I leave and which I remove.
If that solution works then it shows the power of the subconscious!
That will work for some of the methods, but it won't work for Power Query.
@@MyOnlineTrainingHub That did work for my particular problem so thank you for helping my brain cells work it through.
If I have a big set of numbers in a matrix form, with no headers (examples: 10 rows, and 10 columns, hence 100 numbers). How can I remove duplicates from this set of data?
You're best to use the UNIQUE function to extract a list of distinct values. If you want them converted into a column of values, you can use the TOCOL function e.g. =TOCOL(UNIQUE(...))
Great👍
Thank you!
How can I remove the duplicates and remain position also?
Don't think you can, sorry.
@@MyOnlineTrainingHub I used another method where it highlights it and makes it red, so I can manually go back and change the words. That helped. :)
What if, we only have one column and need to highlight those with the first 5 matching characters on the same column?
Add a column with this formula: =LEFT(cell containing the value, 5) then find duplicates on the new column.
@@MyOnlineTrainingHub Much appreciate the prompt response. Thanks
Much appreciated! @@MyOnlineTrainingHub
Seven.. yikes! 👍😎✊
😁
how do I disable the option to detect duplicate values in excel - how do I disable this option - I want to have duplicate values????
Excel doesn't tell you there are duplicates unless you ask it to, so there's nothing to disable.
Wrong - I spent 2 hours until I understood where to activate duplicates - to disable the option to find duplicates - Home - Conditional Formatting - Manage Rules - Select Rules - Delete Rules.@@MyOnlineTrainingHub
My bad. I thought you were asking me how to disable something that is on by default. It wasn't clear that you had set up a Conditional Formatting rule and simply wanted to delete it. That's as easy as Home tab > Conditional Formatting > Clear Rules.
you are right, but it is too late.@@MyOnlineTrainingHub
If I could learn & remember just 10% of what you know about Excel I'd be happy.
😊 keep practicing and you'll get there.
👍
Cheers 😉
Super desperate!! I have 3 data that are the same and I need to remove 2 so leaves one unique value.
Hope you found a solution. If you're having trouble, you're welcome to post your question and sample Excel file on our forum where someone can help you further: www.myonlinetraininghub.com/excel-forum
👍
Cheers!