If you're planning to Sum data using Excel VBA then I think you will find the comparison of the different methods very interesting.
Enjoy the video and please add any comments or questions below.
Just an important note: if you're using the Scripting.Dictionary, it isn't available on the Mac. There are some VBA alternatives available: Tim Hall's version at github.com/VBA-tools/VBA-Dictionary, and I've done one using interfaces so you can easily switch Dictionary implementations: github.com/MarkJohnstoneGitHub/VBA-IDictionary
Now, I know which one I prefer, though I would be biased. :)
After examining Tim Hall's Dictionary version I discovered numerous areas for performance improvements. Note I haven't tested it on a Mac, though it should be compatible.
I'll have to have a more in-depth look at what you're attempting to do. I was working on creating indexes for a dictionary with GroupBy functionality, then got distracted doing the Dictionary project. Doing a GroupBy Count, Sum, Average etc. would be handy functionality using a Dictionary.
Awesome video again Paul thank you
Great videos, great channel, you are a reference. Few channels approach these advanced issues about vba and efficiency. Congratulations.
Thanks Victor.
Excellent explanation. Everything is clear and easy to follow. The visuals, arrows, and transitions are high quality! 👍
Hi Paul, I thought that the fastest was the PivotTable. I discovered the use of the VBA Dictionary with you. I will try to work on it to become more comfortable. Thank you very much.
You're welcome.
Hi Paul.. great video. So very interesting when you compare different methods.. love that way of illustrating the point. It also gives various options to use in different situations. Thanks for sharing it. Thumbs up!
Glad you like it Wayne.
Thank you so much for this very informative tutorial today sir EMM. Highly appreciated this one
No problem 😀
Another awesome video showing multiple ways to approach a task and revealing which is the most efficient. Thanks for posting.
I noticed that in the Dictionary approach the code tests for the existence of a key before adding an object (record). I am not sure but it seems that testing for the existence of a key can take extra time. My solution is to pass through the array twice, the first time collecting the keys of the dictionary (e.g. "Apple", "Orange", "Pear" etc) and the second time through I accumulate the totals for each key, never having to check for the existence of a given key. On a data set of 100K rows my code seems to work pretty fast.
Dim rngData As Range, arData As Variant
Dim lngSize As Long, i As Long
Dim vntKey As Variant, vntTotal As Variant, vntAmount As Variant
Dim dictSums As New Scripting.Dictionary

Set rngData = ActiveSheet.Range("A1").CurrentRegion
Set rngData = rngData.Offset(RowOffset:=1).Resize(RowSize:=rngData.Rows.Count - 1)
arData = rngData.Value
lngSize = UBound(arData, 1)

'First pass: collect the dictionary keys
For i = 1 To lngSize
    dictSums.Item(arData(i, 1)) = 0
Next i

'Second pass: accumulate the total for each key
For i = 1 To lngSize
    vntKey = arData(i, 1)
    vntTotal = dictSums.Item(vntKey)
    vntAmount = arData(i, 2)
    dictSums.Item(vntKey) = vntTotal + vntAmount
Next i

For Each vntKey In dictSums.Keys
    Debug.Print vntKey & " Total: " & FormatCurrency(dictSums.Item(vntKey))
Next vntKey
Output:
Peach Total: $10,065,915.00
Grape Total: $9,840,145.00
Orange Total: $9,998,053.00
Pear Total: $9,932,674.00
Apple Total: $10,000,821.00
Thank you kindly.
Thanks for the suggestion.
If you are only summing one column then you can actually do it like this:
dictSums(vntKey) = dictSums(vntKey) + vntAmount
It will automatically add the key if it doesn't exist.
If you are adding an object then it is more complicated, because you have to add one object per key.
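A minimal sketch of that single-pass pattern (assuming a reference to Microsoft Scripting Runtime; the range and column positions are illustrative). Reading a key that doesn't exist yet returns Empty and adds the key, and Empty behaves as 0 in the addition:

```vba
Sub SumSinglePass()
    Dim dictSums As New Scripting.Dictionary
    Dim arData As Variant, i As Long, vntKey As Variant

    arData = ActiveSheet.Range("A1").CurrentRegion.Value

    'One pass: reading a missing key adds it with Empty, which sums as 0
    For i = 2 To UBound(arData, 1)
        dictSums(arData(i, 1)) = dictSums(arData(i, 1)) + arData(i, 2)
    Next i

    For Each vntKey In dictSums.Keys
        Debug.Print vntKey & " Total: " & FormatCurrency(dictSums(vntKey))
    Next vntKey
End Sub
```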
Thanks Paul, new to your channel and the more advanced VBA. This is exactly what I am looking for, so I have adapted it for my own use. However, "Dim oABC As clsABC" throws up an error but clearly works with what you have. Now trying to Google why!
Sir, would be possible to get a copy of the workbook that you used in this video. I like to go over the vba code. Thanks
it is really thrilling to follow your experiment. thank you for sharing your expertise and passion!
Cảm ơn. Glad you liked it.
@@Excelmacromastery seriously, you can type Vietnamese? you rock^^
I’ve never used Dictionary Arrays before, I’ll give them a go.
Excellent content comparing methods for solving frequently-encountered problems. Some of your methods are new to me, or previously mysterious; your explanations give me a clear recognition of additional methods and functionality I need to add to my personal "tool kit."
Thanks Sam.
I prefer ADO, since I can do a join and a sum together. It is very handy if you have a data table and a lookup table in the same workbook for extra grouping.
The relatively new SUMIFS is missing. This new version of SUMIF is far more powerful. It would be interesting to see how fast it is in comparison.
As always: a very informative Video! Thanks!
Great sir,, Thank you for this
Another excellent tutorial! Thank you for the clear explanations & the measured comparisons. I'm a believer! Since watching this video shortly after it came out, I have been applying these techniques to all new code, and it's great how the code really flies. Thank you! I've also started going back thru old code to replace what I knew then with what I know now. (Oh, how I wish I could get my younger self back to do the same thing!)
A question ... In a new procedure under development, I copy data from 1 data table (ListObject) to another table (ListObject). One column/field stores numbers formatted as text (e.g., 002, 005, 010 .. each on a separate row). When I copy the data (loFrom.DataBodyRange.Copy Destination:=loTo.etc.), the command preserves the formatting so the data land as 002, 005, 010. But, when I assign the data (loTo.DataBodyRange.Value = loFrom.DataBodyRange.Value), the data land as numbers (e.g., 2, 5, 10). With my new Excel Macro Mastery knowledge, I would prefer to assign the values, but I need them to retain their formatting. Is there a property or other technique that would accomplish that?
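One thing worth trying (a sketch only — the column name "Code" is an assumption, adjust to your tables): format the destination column as Text before assigning, so strings like "002" are not coerced to numbers.

```vba
'Hypothetical column name "Code"; set the Text format first, then assign
With loTo.ListColumns("Code").DataBodyRange
    .NumberFormat = "@"   'Text format, so "002" stays "002"
    .Value = loFrom.ListColumns("Code").DataBodyRange.Value
End With
```

This only helps if the source cells actually hold text; if they hold numbers with a custom number format such as "000", you would copy the NumberFormat across instead.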
Another great video, thanks! I haven't used dictionaries yet, only arrays, I need to try them 😀
Thanks Daniel
Thanks a lot for the video, is there a way to download the Excel Example?
Enjoyed that, Thank you for sharing
Thanks Mark
Do you have any vedio explaining the methods? The codes ?
Great video! I would like to know more about ADO. I have doubts about using it, like: can I get values from an unopened workbook which are formula-dependent? etc.
Would be glad if you make a separate video about this one.
Cheers!
I plan to do a video on ADO in the near future.
I'm pretty sure if you are reading from a closed workbook, the values that you read from it would be the values that were in those cells at the time the workbook was last saved. (ie. formulas will not update the values)
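For reference, a hedged sketch of reading a closed workbook with ADO (the file path, sheet name, and column names are placeholders). The values returned are whatever was last saved — ADO does not recalculate formulas:

```vba
Dim cn As Object, rs As Object
Set cn = CreateObject("ADODB.Connection")

'Connect to the closed workbook via the ACE OLEDB provider
cn.Open "Provider=Microsoft.ACE.OLEDB.12.0;" & _
        "Data Source=C:\Data\Book1.xlsx;" & _
        "Extended Properties=""Excel 12.0 Xml;HDR=YES"";"

'Group and sum directly in the SQL, no loop needed in VBA
Set rs = cn.Execute( _
    "SELECT Category, SUM(Amount) FROM [Sheet1$] GROUP BY Category")

Do While Not rs.EOF
    Debug.Print rs.Fields(0).Value, rs.Fields(1).Value
    rs.MoveNext
Loop
cn.Close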
Need help with creating Visual Basic code for Excel to sum a column every 1 minute and output the summation to another column.
Hello! Probably a lame question, but at the dictionary method, you've defined oCustomer as clsCustomer. Looks like it's a custom variable that were defined somewhere else. If yes where did you define it and how?
It's a class module
very nice video.
great videos thanks for sharing!
Thank you Zachary.
Hello, is there a link to the workbook used in the video? Would be very useful.
Please refer to the Sumifs formula based on the date range
Hi Paul, is there a way to store multiple values in a dictionary — as values, not as keys? Also, can calculated fields (number fields) be used as row labels in a pivot? I have to optimize a workbook with around 80 columns and multiple VLOOKUPs and SUMIFs in the Excel UI, on 30k records, taking almost an hour to refresh.
Hi, thanks a lot for all of your efforts teaching the best ways of using VBA. Btw: what code is behind GetMethod? If you don't mind, I would be glad to see that code :)
Could you please share the excel example file?
Very good
Would SUMIF not be quicker if you copied the range & removed duplicates instead of using a dictionary?
In general the more worksheet operations(i.e. each SumIf) the slower the code is compared to memory operations(i.e. Dictionary).
With 500,000 rows to sum, the pivot table will win over all, just because all the other methods have a For loop which becomes heavier and heavier with more data. The pivot table helped me the most there, because the number of rows doesn't affect it too much.
At one point you type the function right into the worksheet to show it before coding it. It would have been nice if you had included that in your speed test for comparison. The code just writes the function to the cell and clocks the time it takes for the worksheet to do the math.
Can anyone tell me what the blue box on the sidebar is at 8:46?
That is a "bookmark". The "Edit" toolbar has buttons that allow you to jump forward and backward to each bookmark you have defined in the code.
I’m crediting you with teaching me VBA. I think I’ve watched just about every video in this playlist over the past couple months.
I do have a question. I regularly have to sum data with multiple conditions, frequently with Or criteria as well. Using Excel formulas, I'd go for SUMPRODUCT. It doesn't seem like the dictionary array would work for that, because the key would have to be a combination of several attributes. Could you use the class module as the Key instead of the item?
If you have multiple criteria then you can use multiple fields as the key: concatenate them together as a string. You wouldn't use the class module as a key, as this would mean the key was a memory address.
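Inside the read loop, the composite key could be built like this (a sketch — the column positions and the vbNullChar delimiter are assumptions; any delimiter that cannot occur in the data will do):

```vba
Dim sKey As String
'Join the criteria fields with a delimiter that can't appear in the data
sKey = arData(i, 1) & vbNullChar & arData(i, 2) & vbNullChar & arData(i, 3)
'Accumulate the amount column against the combined key
dictSums(sKey) = dictSums(sKey) + arData(i, 4)
```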
@@Excelmacromastery that makes sense! Thank you!
How about sumproduct
Sir, may I have the copy of the VBA code for my deep understanding? Thanks
Dictionary is faster, but speed is not always the determining factor. If you have many grouping levels, with grouping not only by rows but also by columns, using dictionaries can be quite tricky, and the time difference might not be worth it.
That's true. Speed is not always the most important factor. But it is interesting to compare the speed of different methods.
Thanks. Very useful information. What makes the 4th option faster is to write to an array (all cells at once)?
I think you're right. Besides Excel only needing to work in RAM, all the memory locations are side by side in an array, so it is really fast. Image processing relies a lot on arrays just because of that, I think...
Because the arrays are simply data in memory and all the work is being done here. It's much slower to do it by referencing the worksheet.
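A tiny illustration of the point (sketch only; the range is illustrative): read the whole range into a Variant array in one worksheet operation, do all the work in memory, and only touch the sheet again at the end.

```vba
Dim arr As Variant, i As Long, total As Double

arr = Sheet1.Range("B2:B100001").Value   'one worksheet read

For i = 1 To UBound(arr, 1)
    total = total + arr(i, 1)            'pure in-memory operations
Next i

Debug.Print total
```

Looping over the cells of the range directly would instead make one slow worksheet call per row.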
The Array Dictionary method is by far the fastest method.
However sorting the report sheet is a little faster than sorting the dictionary. I have 0.25 s for the array dictionary method for 200,000 records
I am trying to build a Sudoku solver in VBA. I have been busy more than 3 days full time, but VBA really sucks at this, throwing one error after the other. I think I will use another language to program it in — too bad.
I'm pretty sure you could get the SUMIF method to run many times faster if you copied the column of categories into the destination location and ran RemoveDuplicates on that column instead of reading through the data to build a list of unique categories.
That would be only a few lines of code with no loops.
Your SUMIF method also seems to be unnecessarily creating a new clsCustomer object for every row in the data.
How can you run SumIf on each category without a loop?
@@Excelmacromastery 1) Copy categories, 2) Remove Duplicates, 3) Create SUMIF on first category. 4) Fill-Down.
I am not suggesting this would be faster than your fastest method, but I suspect it would be faster than your example SUMIF method.
Also, am I wrong about the unnecessary clsCustomer objects for every row of data? It should only be for every new category.
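The idea above sketched out (the sheet names, ranges, and row counts are illustrative, not taken from the video):

```vba
'Assumes the data is on a sheet named "Data": categories in A, amounts in B
With wsReport
    '1-2) Copy the category column (incl. header) and remove duplicates
    ThisWorkbook.Worksheets("Data").Range("A1:A100001").Copy .Range("A1")
    .Range("A1:A100001").RemoveDuplicates Columns:=1, Header:=xlYes

    '3-4) One SUMIF per unique category, filled down in a single assignment
    Dim lLast As Long
    lLast = .Cells(.Rows.Count, "A").End(xlUp).Row
    .Range("B2:B" & lLast).Formula = _
        "=SUMIF(Data!$A$2:$A$100001,A2,Data!$B$2:$B$100001)"
End With
```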
Great!!!
Thanks Sergio.
What about an advanced filter for the list of unique names (=option) in column X, and e.g. the array formula {=SUM((Table4[Full Name]=X2)*Table4[Orders])} for total Orders, copied down, with a similar formula for total Amount: {=SUM((Table4[Full Name]=X2)*Table4[Amount])}? Use Range.Calculate to test, and include the time of the AF and of copying the formulae.
That's an interesting solution. I'll give it a try when time permits.
Update: AdvancedFilter is actually very slow when you try to get unique records for the dataset I used.
Paul, I'd be interested to see how a DSum function compares
Thanks Hui. That's an interesting one. I will test it out and see.
Update:
In my scenario I am reading records and creating a new report of summed totals without knowing the categories in advance.
For this scenario DSum is not efficient, because it has to be called for each category that is summed. This means it will be slow like SumIf, as it requires multiple calls to the worksheet.
Regarding the Pivot Table, I suspect that most of that time is taken up in actually setting up the Pivot Table.
I suspect that if you changed the PT macro to reset the PT source range on a PT that is already in place, that would force a recalc and be more comparable.
@@ihuitson I haven't timed the milliseconds of this method, but in practice this is exactly what I would do normally. That is, set up the pivot layout and calculations beforehand, then in future reassign and refresh the data source programmatically when necessary, without ever tampering with the underlying pivot structure again.
Would you mind to upload the code file?
I see storing data in an array is the fastest approach; however, won't this put a big burden on RAM? Especially when 32-bit Excel can only make use of 2GB of RAM.
Yes, working with arrays is always faster, but it uses more memory. It's the eternal compromise between memory and speed. But... how much memory? 1MB of data? 10MB? I think that's still little.
It is only a problem if you are dealing with a huge amount of data. If this is the case, then you have to come up with ways to optimize the code for the particular task.
please upload a file with examples
I'm a newbie to VBA things, but I found your tips and videos help me to understand.
At 02:18, can you share the GetMethod sub? I wrote out all the code I could, but when I run it I get an error on the GetMethod part...
If you can share the file, that would also be helpful. Any help appreciated, thanks.
GetMethod returns an enum which I created. You can create them yourself.
Without surprise, the dictionary wins this round 😀.
😀
Data type SumMethod not defined?
Sorry, I don't understand?
@@Excelmacromastery Thanks for your response. When I compile, the declaration "Dim method As SumMethod" gives the error: user-defined type not defined.
SumMethod is an enum that I have defined:

Enum SumMethod
    mAdo = 0
    mArrayDictionary = 1
    mPivot = 2
    mSumIf = 3
End Enum
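Several comments above asked about GetMethod. A hypothetical sketch of how such a helper could pair with the enum (the cell address and the Sum* procedure names are assumptions, not the video's actual code):

```vba
Function GetMethod() As SumMethod
    'Read the chosen method from a worksheet cell (address is illustrative)
    GetMethod = Sheet1.Range("H1").Value
End Function

Sub RunSum()
    'Dispatch to the matching sum routine based on the selected method
    Select Case GetMethod()
        Case mAdo:             SumWithAdo
        Case mArrayDictionary: SumWithArrayDictionary
        Case mPivot:           SumWithPivot
        Case mSumIf:           SumWithSumIf
    End Select
End Sub
```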