Thank you for your video - especially the last part where you reveal that most Array Functions that are placed in aggregate functions don't require Ctrl + Shift + Enter, except for that stubborn TRANSPOSE Array Function!! You have created some great cool formulas!!! I also like your use of TRANSPOSE(COLUMN()) without the ^0 (in this formula: =SUM(--(MMULT((B19:D68=P2)+(B19:D68=P3)+(B19:D68=P4),TRANSPOSE(COLUMN(B18:D18)))>0)) ) because it does not matter for the resultant array of numbers because anything greater than zero will work as a count. I think I like this one the best: =SUM(--(MMULT((B19:D68=P2)+(B19:D68=P3)+(B19:D68=P4),ROW(A16:INDEX(A16:A22,COLUMNS(B18:D18))))>0)) because it has a clever, INDEX as a cell reference. Love this formula Element: ROW(A16:INDEX(A16:A22,COLUMNS(B18:D18))) : ) In the end, I often think that because they are array formulas, I usually like the one that will calculate most quickly. I did not time them on a big data set, so I am not sure… I have a question: How is this a Unique Count Problem? Isn't this an Or Logical Test Problem, where the question is "how many records contain at least one of the listed companies?" The data set does not appear to have any duplicate records. If you add a duplicate, like Company B Company R Company K as the third record, the count goes from 29 to 30, even though there is now a duplicate and the unique count would be 28. Amazing work, Leila Gharani!
Thanks for the great feedback Mike. The formulas here are based on mixing and matching the tricks I've seen you use in your tutorials. I like the one with INDEX too, it seems cleaner than the rest. I agree with you that the best formula would be the one that calculates faster on larger data sets. I'll put them to the test. I know you shared a good tool from Charles Williams for testing these. You are right about the phrasing. Your phrasing makes more sense than mine :-) I don't know why the word "unique count" got stuck in my head. It was basically supposed to imply each row gets counted once - but the more I think about it, the more I realize the word "unique" doesn't fit here at all. Thanks for pointing that out. I'll update the descriptions at least :-) Your input is very much appreciated. Thanks for the taking the time.
The funny thing is that formula elements like FREQUENCY()>0 are used in Unique Count count formulas, so it looks like a unique count formula, but the end result does not seem like unique count. This video is truly great because you showed some many great ways to do it! Thanks for the Excel fun! Go Online Excel Team!!! : )
Thank you. Very happy to be a part of it all. I ran the timing test on 30000 rows of data and averaged the three tests. The results were as follows: 1. MMULT formula with fixed array (the one using {1;1;1} ) - it came to 0.0018 2. FREQUENCY - 0.0019 3. INDEX and Transpose functions - both coming in at 0.0020 4. Last were the two indirect ones at 0.0356 Using helper cells came to around 0.0030 - faster than the indirect method, but not than the other array formulas.
Nice Timing Work! So the first three are about the same. That is really useful information. So maybe my favorite is: =SUM(--(FREQUENCY(((B19:D68=P2)+(B19:D68=P3)+(B19:D68=P4))*(ROW(B19:B68)-ROW(B19)+1),ROW(B19:B68)-ROW(B19))>0))-1 Because it is: 1) fast calculating and 2) information about number of columns comes from data rather than be hard coded in and 3) it does not require Ctrl + Shift + Enter. Your contributes positively to the Array Formula World : ) Go Online Excel Team!!
Here are two other ways to do this, using COUNTIFS and OR Criteria in the criteria argument: =SUMPRODUCT(--(MMULT(COUNTIFS(D2:D4,B19:D68),{1;1;1})>0)) or =SUMPRODUCT(--(MMULT(COUNTIFS(D2:D4,B19:D68),TRANSPOSE(COLUMN(B18:D18)))>0)) (Ctrl + Shift + Enter) The one potential downside is that sometimes COUNTIFS function take a LOOOOOOOOONG time to calculate. I did not time them, so I am not sure. But as you read in my book, most of the times, COUNTIFS is much too slow to risk using one big data sets. Maybe you can time these two?
I'm glad you like it. You should also check out the new Dynamic Arrays in Office 365 (currently still in Office Insider). They are a game changer: ruclips.net/video/2USJsIyIzvo/видео.html
Hi Leila.. this is an excellent 3-part series for anyone wanting to understand array formulas and matrix manipulations. I learned a lot. I particularly liked your trick with A1:INDEX() to build a dynamic array2 for MMULT and solve the problem of future column inserts in the data. Also, the insights on how and why CSE is not needed for some of the solutions was very interesting. Lots to think about. Thanks for Part 3 of 3. I'm sure I'll revisit all 3 parts again in the future. Triple thumbs up!!!
Hi Leila. Thanks for your videos. The quicker way to solve this problem is Power Query. You just need to unpivot the last 3 columns and then filter out unwanted companies, finally remove duplicates from the Workshops column and you get the final list.
Hi Vida. Absolutely. Power Query is a great tool. The good thing with the Power Query method you mentioned is that you can easily get the list of workshops that matched as well. The good thing with the formula is, that's it's more immediate once you have it set up. That's what I love about Excel, there are just so many ways of getting to the answer and the more we become aware of all the possibilities the better we can select what fits the situation best. Thank you for the great feedback.
Here's VBA code that does the the same thing. Option Explicit Sub RemoveDuplicatesInRow() Dim lastRow As Long Dim lastCol As Long Dim R As Long 'row index Dim c As Long 'column index Dim i As Long, Cell As Range Dim Target As Range, rng As Range Dim F1 As String, F2 As String, R1 As String F1 = Worksheets("UniqueCountRows").Range("P4").Value F2 = Worksheets("UniqueCountRows").Range("P5").Value R1 = Worksheets("UniqueCountRows").Range("P3").Value
Application.ScreenUpdating = False Application.DisplayAlerts = False Range("Table1[[#Headers],[Workshops]]").Select Application.Goto Reference:="Table1" Selection.Copy Sheets.Add.Name = "Temp Sheet" Range("B4").Select ActiveSheet.Paste With Sheet2 'AcitiveSheet.UsedRange Set Target = Sheets("Temp Sheet").Range(Range("C4"), Range("E4").End(xlDown)) For Each Cell In Target If Cell.Value = F1 Then Cell.Value = R1 Next Cell For Each Cell In Target If Cell.Value = F2 Then Cell.Value = R1 Next Cell With ActiveSheet.UsedRange lastRow = .Row + .Rows.Count - 1 lastCol = .Column + .Columns.Count - 1 End With For R = 1 To lastRow For c = 1 To lastCol For i = c + 1 To lastCol 'change lastCol to c+2 will remove adjacent duplicates only If Cells(R, i) "" And Cells(R, i) = Cells(R, c) Then Cells(R, i) = "" End If Next i Next c Next R Worksheets("Temp Sheet").Range("H4").Formula = "=COUNTIF($C$4:$E$100,UniqueCountRows!P3)+COUNTIF($C$4:$E$100,UniqueCountRows!P4)+COUNTIF($C$4:$E$100,UniqueCountRows!P5)" ThisWorkbook.Worksheets("UniqueCountRows").Range("Q6").Value = Range("H4") Sheets("UniqueCountRows").Select Range("Q6").Select Sheets("Temp Sheet").Delete Application.DisplayAlerts = True Application.ScreenUpdating = True End With End Sub
Thank you for your video - especially the last part where you reveal that most Array Functions that are placed in aggregate functions don't require Ctrl + Shift + Enter, except for that stubborn TRANSPOSE Array Function!!
You have created some great cool formulas!!!
I also like your use of TRANSPOSE(COLUMN()) without the ^0 (in this formula: =SUM(--(MMULT((B19:D68=P2)+(B19:D68=P3)+(B19:D68=P4),TRANSPOSE(COLUMN(B18:D18)))>0)) ) because it does not matter for the resultant array of numbers because anything greater than zero will work as a count.
I think I like this one the best: =SUM(--(MMULT((B19:D68=P2)+(B19:D68=P3)+(B19:D68=P4),ROW(A16:INDEX(A16:A22,COLUMNS(B18:D18))))>0)) because it has a clever, INDEX as a cell reference. Love this formula Element: ROW(A16:INDEX(A16:A22,COLUMNS(B18:D18))) : )
In the end, I often think that because they are array formulas, I usually like the one that will calculate most quickly. I did not time them on a big data set, so I am not sure…
I have a question: How is this a Unique Count Problem? Isn't this an Or Logical Test Problem, where the question is "how many records contain at least one of the listed companies?" The data set does not appear to have any duplicate records. If you add a duplicate, like Company B Company R Company K as the third record, the count goes from 29 to 30, even though there is now a duplicate and the unique count would be 28.
Amazing work, Leila Gharani!
Thanks for the great feedback Mike. The formulas here are based on mixing and matching the tricks I've seen you use in your tutorials. I like the one with INDEX too, it seems cleaner than the rest. I agree with you that the best formula would be the one that calculates faster on larger data sets. I'll put them to the test. I know you shared a good tool from Charles Williams for testing these.
You are right about the phrasing. Your phrasing makes more sense than mine :-) I don't know why the word "unique count" got stuck in my head. It was basically supposed to imply each row gets counted once - but the more I think about it, the more I realize the word "unique" doesn't fit here at all. Thanks for pointing that out. I'll update the descriptions at least :-)
Your input is very much appreciated. Thanks for the taking the time.
The funny thing is that formula elements like FREQUENCY()>0 are used in Unique Count count formulas, so it looks like a unique count formula, but the end result does not seem like unique count.
This video is truly great because you showed some many great ways to do it! Thanks for the Excel fun!
Go Online Excel Team!!! : )
Thank you. Very happy to be a part of it all.
I ran the timing test on 30000 rows of data and averaged the three tests. The results were as follows:
1. MMULT formula with fixed array (the one using {1;1;1} ) - it came to 0.0018
2. FREQUENCY - 0.0019
3. INDEX and Transpose functions - both coming in at 0.0020
4. Last were the two indirect ones at 0.0356
Using helper cells came to around 0.0030 - faster than the indirect method, but not than the other array formulas.
Nice Timing Work! So the first three are about the same. That is really useful information. So maybe my favorite is:
=SUM(--(FREQUENCY(((B19:D68=P2)+(B19:D68=P3)+(B19:D68=P4))*(ROW(B19:B68)-ROW(B19)+1),ROW(B19:B68)-ROW(B19))>0))-1
Because it is: 1) fast calculating and 2) information about number of columns comes from data rather than be hard coded in and 3) it does not require Ctrl + Shift + Enter.
Your contributes positively to the Array Formula World : )
Go Online Excel Team!!
Here are two other ways to do this, using COUNTIFS and OR Criteria in the criteria argument:
=SUMPRODUCT(--(MMULT(COUNTIFS(D2:D4,B19:D68),{1;1;1})>0))
or
=SUMPRODUCT(--(MMULT(COUNTIFS(D2:D4,B19:D68),TRANSPOSE(COLUMN(B18:D18)))>0)) (Ctrl + Shift + Enter)
The one potential downside is that sometimes COUNTIFS function take a LOOOOOOOOONG time to calculate. I did not time them, so I am not sure. But as you read in my book, most of the times, COUNTIFS is much too slow to risk using one big data sets. Maybe you can time these two?
I'm learning so much out of these 3 videos. They've opened up a whole world of possibilities, definitely getting bookmarked for regular review.
I'm glad you like it. You should also check out the new Dynamic Arrays in Office 365 (currently still in Office Insider). They are a game changer: ruclips.net/video/2USJsIyIzvo/видео.html
Hi Leila.. this is an excellent 3-part series for anyone wanting to understand array formulas and matrix manipulations. I learned a lot. I particularly liked your trick with A1:INDEX() to build a dynamic array2 for MMULT and solve the problem of future column inserts in the data. Also, the insights on how and why CSE is not needed for some of the solutions was very interesting. Lots to think about. Thanks for Part 3 of 3. I'm sure I'll revisit all 3 parts again in the future. Triple thumbs up!!!
Great tutorial !!! You have explained different ways of solutions. Most amazing part was the maze of the Array functions!!!!
You're welcome Mahbub. The maze can be frustrating but also a lot of fun... Specially once you manage to find the way out :)
Hi Leila. Thanks for your videos. The quicker way to solve this problem is Power Query. You just need to unpivot the last 3 columns and then filter out unwanted companies, finally remove duplicates from the Workshops column and you get the final list.
Hi Vida. Absolutely. Power Query is a great tool. The good thing with the Power Query method you mentioned is that you can easily get the list of workshops that matched as well. The good thing with the formula is, that's it's more immediate once you have it set up. That's what I love about Excel, there are just so many ways of getting to the answer and the more we become aware of all the possibilities the better we can select what fits the situation best. Thank you for the great feedback.
Thanks a lot ! Je vais devenir carrément expert grâce à toi ;)
Superb.... just to know..are you an Indian?
سپاس
👍🇮🇳
Here's VBA code that does the the same thing.
Option Explicit
Sub RemoveDuplicatesInRow()
Dim lastRow As Long
Dim lastCol As Long
Dim R As Long 'row index
Dim c As Long 'column index
Dim i As Long, Cell As Range
Dim Target As Range, rng As Range
Dim F1 As String, F2 As String, R1 As String
F1 = Worksheets("UniqueCountRows").Range("P4").Value
F2 = Worksheets("UniqueCountRows").Range("P5").Value
R1 = Worksheets("UniqueCountRows").Range("P3").Value
Application.ScreenUpdating = False
Application.DisplayAlerts = False
Range("Table1[[#Headers],[Workshops]]").Select
Application.Goto Reference:="Table1"
Selection.Copy
Sheets.Add.Name = "Temp Sheet"
Range("B4").Select
ActiveSheet.Paste
With Sheet2
'AcitiveSheet.UsedRange
Set Target = Sheets("Temp Sheet").Range(Range("C4"), Range("E4").End(xlDown))
For Each Cell In Target
If Cell.Value = F1 Then Cell.Value = R1
Next Cell
For Each Cell In Target
If Cell.Value = F2 Then Cell.Value = R1
Next Cell
With ActiveSheet.UsedRange
lastRow = .Row + .Rows.Count - 1
lastCol = .Column + .Columns.Count - 1
End With
For R = 1 To lastRow
For c = 1 To lastCol
For i = c + 1 To lastCol 'change lastCol to c+2 will remove adjacent duplicates only
If Cells(R, i) "" And Cells(R, i) = Cells(R, c) Then
Cells(R, i) = ""
End If
Next i
Next c
Next R
Worksheets("Temp Sheet").Range("H4").Formula = "=COUNTIF($C$4:$E$100,UniqueCountRows!P3)+COUNTIF($C$4:$E$100,UniqueCountRows!P4)+COUNTIF($C$4:$E$100,UniqueCountRows!P5)"
ThisWorkbook.Worksheets("UniqueCountRows").Range("Q6").Value = Range("H4")
Sheets("UniqueCountRows").Select
Range("Q6").Select
Sheets("Temp Sheet").Delete
Application.DisplayAlerts = True
Application.ScreenUpdating = True
End With
End Sub
Wow! For this one too. Very impressive!
Thank you for your kind words.