Thanks to everybody who was there for the Premiere, it was great chatting with you!
Great video! I think we need some more similar videos on sets and maps. Keep up the good work.
Thank you!
Thank you for making the "measure, don't guess" principle clear once more and telling everyone to use JMH
You are welcome!
He drinks his coffee with such good timing that it's almost synced with the video!! #Coffee-drinking-skill
And of course, with great content.
😄Maybe it's the opposite?
Great video, very clear and concise delivery. I like these small breaks giving audience time to digest the information.
This is what Java community needs, much appreciated!
Thank you, I really appreciate it!
Nice video! I think ArrayDeque would have deserved a mention. It's basically a variant of ArrayList that outperforms LinkedList in queue operations like adding to the front. So even where LinkedList would be better than ArrayList, there is no reason to use it, because you can use ArrayDeque instead.
More like this please. Thanks for this video
Thank you, glad you liked it!
I love the JEP Café and I especially love your French-styled English! Keep on doing this series, it's so great and informative! Thumbs up!
Thank you! I guess my French accent is incidental though... 😄
This video is so great. Thanks for making this, Mr. Paumard. I like watching your videos.
Thank you!
Great video and what an awesome professor! I have read Core Java by Horstmann, which is a great book btw, but Mr. Paumard's analysis in this video goes even further than what is presented in the book chapter comparing ArrayList and LinkedList. I always learn something new from Mr. Paumard's videos (especially the Stream API videos). This JEP was really good. Thanks, José.
Thank you for your kind words, I really appreciate them!
Same here.
The Stream API course on Pluralsight helped me really understand the map, filter, and reduce operations better.
This was enlightening. Thank you
In terms of memory usage for typical usage examples (like getting data from a DB, processing it and sending it to, let's say, Kafka), I think it's worth mentioning that ArrayList will hold references to the data objects until the iteration is over, making the GC unable to free the memory of already-processed elements. And it can be the case that one data element holds more memory than the list itself. Probably the solution is to iterate from the end and call remove, or ... to use LinkedList's queue capabilities.
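A minimal sketch of the draining pattern mentioned above, assuming a placeholder sendToKafka() method: polling removes the element from the collection, so each processed record becomes eligible for GC even while the rest of the batch is still being consumed.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class DrainExample {

    // Placeholder for the real producer call (hypothetical helper, just for the sketch)
    static void sendToKafka(String record) { }

    public static void main(String[] args) {
        Deque<String> batch = new ArrayDeque<>();
        batch.add("record-1");
        batch.add("record-2");

        String record;
        while ((record = batch.pollFirst()) != null) { // removes the element from the deque
            sendToKafka(record);                       // the processed element is now GC-eligible
        }
    }
}
```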
Honestly, when I want a stack, I typically use ArrayDeque and not LinkedList.
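For reference, a tiny usage sketch of ArrayDeque as a stack via the Deque API (push/pop/peek):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class StackExample {
    public static void main(String[] args) {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(1);
        stack.push(2);
        stack.push(3);
        System.out.println(stack.pop());  // 3 - last in, first out
        System.out.println(stack.peek()); // 2 - top of the stack, not removed
    }
}
```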
Because of the way processors work (i.e. how pages are cached in the processor), you can't really go wrong with an ArrayList, but let's see what we learn from the video.
There are many things to consider regarding the usage of ArrayList. ArrayList is an API and not just an array of objects: you have to consider memory consumption (and CPU capabilities), time complexity of lookups (contains and remove by value), sorted cases, and concurrency. Those must be in our minds before choosing a data structure.
One famous scenario: you can go wrong with an ArrayList by structurally modifying the list from another thread while iterating over its elements. This is surfaced as a ConcurrentModificationException, the fail-fast mechanism that detects such race conditions.
@@pavlg3944 The same would happen with LinkedList AFAIR, and the title of the video is "Choosing between ArrayList and LinkedList(...)", and I was referring to choosing between one of those.
@@pavlg3944 In the case of Java it's quite obvious: we're talking about a specific implementation, java.util.ArrayList. No one is implementing anything; it's just about choosing the implementation, not the API, which is quite similar to (if not the same as) LinkedList's, since both implement the List interface. There may be some cases where LinkedList is better, but it's quite a small subset of cases IMHO (e.g. if you remove elements from the list very often and don't iterate through it too often). That's why I'd say that in the vast majority of cases defaulting to ArrayList will do the trick, and even if LinkedList is a better solution for a particular case, the benefits won't be that big.
@@pavlg3944 Hmm not quite. You can trigger the ConcurrentModificationException with only one thread.
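A minimal single-threaded example that shows this, for readers who want to try it: the list is structurally modified while its own iterator (the for-each loop) is active, so the next iterator step fails fast.

```java
import java.util.ArrayList;
import java.util.List;

public class CmeExample {
    public static void main(String[] args) {
        List<String> list = new ArrayList<>(List.of("a", "b", "c"));
        for (String s : list) {              // the for-each loop uses the list's iterator
            if (s.equals("a")) {
                list.remove(s);              // structural modification behind the iterator's back
            }
        }                                    // -> ConcurrentModificationException on the next step
    }
}
```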
@José Paumard
To continue the argument here.
The point I am making is that a HashMap/Set could benefit from a resize/prune function:
- Yes, a putAll function exists, but it requires you to know almost exactly how many elements you have, or at least which power of two you roughly fall into, so as not to trigger a second rehash. With a HashMap that has an ensureCapacity function, you only need a rough estimate and you are way less likely to hit a rehash, because you are already dealing with higher powers of two with larger gaps.
- If you have a lot of elements that have a lot of hash collisions, you can actually force a treeify/untreeify with trim/ensureSize, because you can decide how tightly packed the map is.
Also, just to answer your question about whether that solves my "problem":
I don't have a problem with the Java implementations; a problem would mean I don't have a "fix" for it that I am actively using.
I am suggesting something that could increase quality of life, and could be moved into a dedicated interface.
I mean "Sequenced Collections" is also just a Quality of life interface of the same nature.
I rarely use actually java collection implementations anyways, because FastUtils/PrimitiveCollections (my version of FastUtil) have just better implementations and are all i need anyways and are in general faster.
Which i benchmarked, though it is biased since it tests primitive speeds.
I am not sure if i am allowed to post links so unless you ask i won't.
End point was: It was a suggestion, because i would like to see no need for my implementations xD
Thanks for your contribution, and when I said "problem" I was thinking more of something that would be bothering you than of a real problem.
I guess you are aware of the newHashMap(int numMappings) factory method on HashMap, which allows you to create a hash map with the right capacity to accommodate a given number of entries. But indeed you can't "shrink" it in case you need it. All you can do is putAll(), which creates another map. Something a shrinking method would have to do anyway.
I think your implementation is great! Why would you see no need for it? :)
And I guess that the next big update of the Collections Framework will be when Valhalla, value types, and non-nullable value types are there. At that point, having maps with value-type keys should basically enable int-key-based map implementations, with objects.
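A quick illustration of that factory method (HashMap.newHashMap(int) was added in JDK 19); the map name and sizes are just for the example:

```java
import java.util.HashMap;
import java.util.Map;

public class NewHashMapExample {
    public static void main(String[] args) {
        // Sized up front for 10_000 mappings: no intermediate rehash while filling it
        Map<Integer, String> map = HashMap.newHashMap(10_000);
        for (int i = 0; i < 10_000; i++) {
            map.put(i, "value-" + i);
        }
        System.out.println(map.size()); // 10000
    }
}
```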
@@JosePaumard
> I guess you are aware of the newHashMap(int numMappings)
Yes, I am aware of that, but it doesn't really solve the underlying issue with the lack of control for the user.
And putAll only takes the inputted HashMap's size into account for resizing, not the content already in the target HashMap itself, making it technically not even optimized for already filled HashMaps... (merging 2 HashMaps, to make it clear)
(Love when you have like a map of 2k elements and you want to insert another 2k elements, => SHITTY SLOW xD)
Edit: Isn't that a bug?
> I think your implementation is great! Why would you see no need for it? :)
Well, even when Valhalla shows up, I am pretty sure the need for custom framework implementations will still be there, though they will massively shrink, by a factor of 72...
Though it would be nice if Java could do it on its own and not require such large custom implementations.
Also thank you :)
@@Speiger > Edit: Isn't that a bug?
Let me ask!
@@JosePaumard To be fair, that is a tiny bug, but the moment you have to insert multiple maps into each other it can get painful... because you can't batch the resizes.
@@JosePaumard Oh yeah, I found a way/exploit to implement an expand method for HashMaps/Sets without actually having to override the existing implementations.
Simply implement an AbstractMap that reports a wrong size and has an empty iterator.
Thanks to that, the putAll/addAll method will simply grow the array and not add any elements or crash.
I love exploits!
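For readers following along, here is a rough sketch of the trick described above (class and method names are mine). It relies on the current HashMap.putAll behaviour of pre-sizing from the argument's size() before iterating its entries, which is an implementation detail rather than a spec guarantee, so treat it as a curiosity.

```java
import java.util.AbstractMap;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class PresizeHack {

    // A fake map that claims to hold 'fakeSize' entries but iterates nothing
    static <K, V> Map<K, V> capacityHint(int fakeSize) {
        return new AbstractMap<K, V>() {
            @Override public int size() { return fakeSize; }
            @Override public Set<Entry<K, V>> entrySet() { return Collections.emptySet(); }
        };
    }

    public static void main(String[] args) {
        Map<Integer, String> map = new HashMap<>();
        // Asks the map to prepare room as if 100_000 entries were coming,
        // without actually adding anything
        map.putAll(capacityHint(100_000));
        System.out.println(map.size()); // still 0
    }
}
```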
Awesome explanation, thank you ☺️
Great explanation 😊
14:58 Creating the iterator contributes to a huge allocation increase in my application. That's why I had to hardcode ArrayList and use get(index) instead.
EDIT:
16:18 And yes, the JVM creates all those iterators. I can see them in a JFR recording. It also results in many additional GC cycles.
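A minimal sketch of the get(index) pattern described above, assuming a plain ArrayList: the index-based loop does not allocate an Iterator, unlike the for-each loop. (Whether that allocation actually matters depends on the workload and on escape analysis, as the thread suggests: measure, don't guess.)

```java
import java.util.ArrayList;
import java.util.List;

public class IndexLoopExample {

    static long sum(List<Integer> list) {
        long total = 0;
        for (int i = 0, n = list.size(); i < n; i++) {
            total += list.get(i); // random access, O(1) on an ArrayList, no Iterator allocated
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> ints = new ArrayList<>(List.of(1, 2, 3, 4, 5));
        System.out.println(sum(ints)); // 15
    }
}
```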
Where did you get that coffee mug from? :)
I got it years ago at a booth at JavaOne. Not sure you can find it anywhere...
Thanks for the video! Minor correction: At 20:15, you said that adding an element to an ArrayList is O(log(n)). That's not correct. The worst case complexity is O(n), but the amortized complexity is still O(1).
How do you compute O(n)?
Example for the worst-case complexity: when the ArrayList currently has capacity 100 and size 100 and I want to add another element, all 100 elements have to be copied to a new array. In general: when the array is full, adding an element is O(n) because all n elements have to be copied.
@@jcsahnwaldt Yes, but it happens only once in a while, and the number of times it happens decreases with n. Thus the O(log(n)) complexity.
@@JosePaumard Yeah, O(log(n)) feels like it should be right, but it's actually O(1).
Let's do the math... Let's say g is the growth factor for each resizing. If we add g^k elements one after the other, we have to resize/copy the array k times, and the sizes of the copies are 1+g+g^2+...+g^k = (g^(k+1)-1)/(g-1), which we can approximate as g^(k+1)/(g-1) = g^k*g/(g-1), which is the final size g^k multiplied by the constant factor g/(g-1). So to add an element we have to copy g/(g-1) elements on average. That's a constant, i.e. O(1).
I'm not sure I explained this very well, but the Javadoc for ArrayList also says: "The add operation runs in amortized constant time, that is, adding n elements requires O(n) time." docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/ArrayList.html
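A small simulation of that argument, assuming a growth factor of 2 and a starting capacity of 10 (the real ArrayList grows by roughly 1.5x, but the reasoning is the same for any constant factor): the total number of element copies stays proportional to the number of adds, so the amortized cost per add is constant.

```java
public class AmortizedCopies {
    public static void main(String[] args) {
        int n = 10_000_000;   // number of add() calls to simulate
        long capacity = 10;   // assumed initial capacity
        long size = 0;
        long copies = 0;      // total elements copied during all resizes
        for (int i = 0; i < n; i++) {
            if (size == capacity) {
                copies += size;   // growing means copying every existing element
                capacity *= 2;    // assumed growth factor g = 2
            }
            size++;
        }
        System.out.printf("adds=%d, copies=%d, copies per add=%.2f%n",
                (long) n, copies, (double) copies / n);
        // copies per add stays below g/(g-1) = 2, no matter how large n gets
    }
}
```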
@@jcsahnwaldt I'm sorry to disagree. The growth factor is 2, that is, every time you grow the internal array, you double its capacity, up to a certain point of course. So if you need to add N elements, to determine the number of times you grew the array, you need to find the smallest k that satisfies N ≤ 2^k, which is about log2(N).
Such a nice video!
15:04 Shouldn't it be for(var v : ints) { (without the = 0)?
That one case where LinkedList performs better than ArrayList is exactly the case where you'd pick ArrayDeque instead, then.
It seems to me that head/tail recursion is a good use case for linked lists... but the lack of tail recursion in Java is sad...
The coffee doesn't go down in your cup between sections 🤣
At 2:51, shouldn't it be O(n) for Array and O(1) for LinkedList for "Inserting middle"? And "Inserting first" for Array should be O(n)?
> O(1) for LinkedList for "Inserting middle"
Unless you use a ListIterator that is already positioned at the specified index, add(int index, E element) first needs to reach that index, which is O(n).
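A short illustration of that point, using a ListIterator already positioned where the insertion should happen: the insertion itself is then O(1) on a LinkedList, whereas list.add(index, element) has to walk to the index first, which is O(n).

```java
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

public class MiddleInsertExample {
    public static void main(String[] args) {
        List<String> list = new LinkedList<>(List.of("a", "b", "d"));

        ListIterator<String> it = list.listIterator();
        while (it.hasNext()) {
            if (it.next().equals("b")) {
                it.add("c"); // O(1): links a new node right at the current position
            }
        }
        System.out.println(list); // [a, b, c, d]
    }
}
```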
Yes to the first part -- inserting an additional element into the ArrayList anywhere before the end is O(n), where n is the number of elements after the insertion point.
Sorry, I didn't understand why, if a = 10, the error is < 1%
Thanks
For all the folks saying "ArrayDeque would have performed better", I ran a benchmark of adding 10,000 elements at the end of the list without specifying any initial capacity (consider it a stack), and here are the results:
Benchmark Mode Cnt Score Error Units
ArrayList avgt 5 40593.366 ± 267.998 ns/op
ArrayDeque avgt 5 45781.272 ± 1182.283 ns/op
LinkedList avgt 5 29307.404 ± 32.218 ns/op
LinkedList performs much better than anything else
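For anyone who wants to reproduce something similar, here is a rough JMH sketch of that kind of benchmark (class and method names are mine, and the numbers will of course differ per machine): appending 10,000 elements without presizing, for each of the three implementations.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedList;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class AppendBenchmark {

    private static final int SIZE = 10_000;

    private static Collection<Integer> fill(Collection<Integer> target) {
        for (int i = 0; i < SIZE; i++) {
            target.add(i); // append at the end, no initial capacity given
        }
        return target;     // returned so JMH does not dead-code eliminate the work
    }

    @Benchmark public Collection<Integer> arrayList()  { return fill(new ArrayList<>()); }
    @Benchmark public Collection<Integer> arrayDeque() { return fill(new ArrayDeque<>()); }
    @Benchmark public Collection<Integer> linkedList() { return fill(new LinkedList<>()); }
}
```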
Ummm... So essentially nothing has changed in 25 years... because all your results were also true in Java 1.2....
So, what if the LinkedList not only had the previous and next elements stored in each node, but also the elements +/- 10 positions away and +/- 100 positions away and so on? That way it could take shortcuts to wherever you need to access. Each node would be less space efficient, but much more time efficient in theory.
Actually, after writing this, I did a bit of research and discovered the concept of Skip Lists, which seems to be more or less what I just proposed.
And there he goes mentioning it at 8:16 lol
Skip lists turn the O(n) access time into O(log(n)), which is better! But they consume more memory.
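For the curious: the JDK ships a skip-list-based implementation in the concurrent package. A tiny usage example of ConcurrentSkipListSet, which keeps its elements sorted and offers expected O(log n) add/contains/remove:

```java
import java.util.NavigableSet;
import java.util.concurrent.ConcurrentSkipListSet;

public class SkipListExample {
    public static void main(String[] args) {
        NavigableSet<Integer> set = new ConcurrentSkipListSet<>();
        set.add(42);
        set.add(7);
        set.add(19);
        System.out.println(set);             // [7, 19, 42] - kept sorted
        System.out.println(set.ceiling(10)); // 19 - navigation queries use the skip list levels
    }
}
```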