You are becoming one of my favourite tech youtubers really quickly. If I might make a small suggestion for your production, it would look better if you could get your camera a little further away so the wide lens does not distort things so much. This is awesome!
I dunno, I kinda like the distortion! It gives me nostalgia for those The Dog calendars, or when MTV was good. It's kind of a channel signature that this guy will show up with a honey-smooth voice and a head the size of a paper shredder, just to drop some knowledge on us. On the other hand, the doctors diagnosed me with space madness, so my opinion may not be typical.
Haha. You are an excellent teacher. Well done. (I taught CS at the college level for quite a few years.) You might, though, add a caveat to "hash maps are quite simple to implement" to say "simple hash maps are quite simple." You no doubt know that all but cryptographically secure hash functions admit construction of large data sets that all end up in the same hash bucket, turning a naively implemented table into a list. Many war stories of DOS attacks boil out to this fact. Current libraries seem to use balanced trees as closed hash buckets as an antidote.
This was great! The way you teach from first principles is super rewarding because, as a viewer, having the opportunity to anticipate problems and even consider solutions to them ahead of time really makes the concepts stick when you ultimately address them. I've really enjoyed every video of yours I've watched so far!
That is so wonderful to hear. For me personally, the best teachers I've had would always have us work through problems together, so I could see their thought process around solving each step, rather than just telling me the answer. It's even better when it's something the teacher doesn't know the answer to either, and they're not afraid for you to see them make mistakes and struggle a little bit 😁
Sharing the general consensus that you are a fantastic teacher and are becoming one of my favorite programming RUclipsrs for learning interesting concepts. This was fascinating!
Omg the first video that mentions the relation to hash's and arrays, as well as memory. The questions I always ask after watching any other video on hashmaps
That's amazing. It's anecdotal but a lot of the great programmers I know had parents who were either in CS or CS adjacent (like electrical eng) and taught their kids core concepts from a young age 🙂
@@nicbarkeragain yes 30 years in IT and the core concepts are what I want to impart to my son. You channel is excellent and a go to resource which I shall recommend.
Your educational videos are incredible - I came across your work (as many I'm presuming) through Clay, but these mini lectures are gold! Very clear style, super easy to follow and very well explained.
10:26 you forgot to increase the division result ;) That aside, this channel has been such a wonderful find. When you talk about a topic you go from start to finish. Even if you already know about hashmaps, it's still fun to get a byte-sized explanation of everything. Thanks for keeping programming content awesome!
I went over this video like 5 times looking for tiny visual errors, and one always manages to sneak through 😅 And thank you so much for the kind words, when I was studying comp sci at university I always felt like there was too much jargon and the lessons had too many assumptions about prior knowledge, so I try to design these videos as to not leave anyone behind.
I did CS in university and still found it hard to grasp these concepts at first, it wasn't until I had actually built a hashmap myself in C that it really "clicked" for me 🙂
Amazing video! I'm massively impressed at how clear and concise you made this. In twenty minutes you managed to give the simplest and most complete explanation of how HashMaps work that I've ever seen. My knowledge on the internal workings of HashMaps has always been spotty with bits and pieces here and there, and this video filled in the gaps I think. Would have been nice to have this video a decade ago when I was in university!
Thanks so much Lars, I really appreciate that. I totally feel you re: data structures at uni, I try to make these videos from the perspective of what would have helped me the most back then 🙂
great content, just recently I am thinking how dictionary works, I did some study and learn that there is collision, bucket, clustering...etc but still doesn't quite get how it works. but this single video makes everything crystal clear and easy to understand. thanks a lot.
That's great to hear! I personally find programming topics that are a bit "academic" tend to use lots of jargon (data structures are a great example). I personally prefer simple terms and visual examples if possible, I'm not good at memorising words 😅
I don't know if you have any experience, but Ive been trying for a long time now to create a programming language, and Im stuck on backend codegen, either using LLVM or custom IR but I have no idea how to approach this as resources are scarce. You are really good at breaking things down, so it would be a blessing! Thanks for the amazing content.
I will absolutely do some videos on compilers, interpreters and general lower level concepts 🙂 In the meantime, I have heard that the pikuma compiler course is really great!
Very well put together! I might not have learned anything about hash maps in this video, but I did learn a lot about how to teach someone how they work 😄 This brings me back to me teaching myself how hashmaps work and implementing one in C because I needed it for my C project during my masters (mind you, I was studying physics, not CS, so barely any formal training in data structures and algorithms 😅). That must have been 10 years ago. Since then I left physics to become a software dev.
No worries at all, glad it could clear some things up for you! I still remember the first time I implemented a hash map in C myself and thought "wait, it's only 10 lines of code?" 😁
Before this I've seen j_blow explain linear probing to solve collisions but I still don't understand how you look up the item, this video is very helpful.
Honestly i too never thought about linear probing concretely since i attended uni a decade ago, this was a fantastic refresher and i think gave me an even better intuition for it than when i initially learned it. Thanks 👏
The thing about hashtables is the trade offs are very situational so what you need in one situation might be very different than an other, which is why you might be inclined to write your own hashmaps in the first place, aside from as a learning exercise. I always bound my open addressing. For example if you're using linear probing then don't just probe the whole buffer. Again this is situational and depends on the size of the buffer and the size of the hash map entries etc etc but let's say you're storing some u32s that's too say 32 bit unsigned integers. Maybe scan forward a couple cache lines worth, stopping when you see an unpopulated cell. Then fall back to close addressing, maybe another hashmap or an array or a series of arrays. Ideally you've got a good hash function with pretty uniform distribution and you won't get so much clustering but even a pretty crappy hash function can be powerful and making a good hash function is difficult. So assuming some clustering. But you really need to measure these things and if you're getting a lot of clustering and a lot of collisions you probably want to do some work on your hash function. Aside from that the table size is important. The size you choose can lead to clustering, it's typical to use prime numbers for table lengths. There's lots of strategies and I'd be lying if I'd measured the merits of all the ones I've tried let alone anything else, but you probably want to go with something cache friendly or rehash using the data from the cell collision as salt. These things are easy to over engineer.
Actually, it's probably easier now to make a good hash function on modern pcs if you have access to the aes instructions. The performance characteristics are pretty good from my understanding.
Ho god! The title of this one is an automatic thumb up! I wanted to understand (and not learn) HashMaps for a long time :D ie I wondered about collision, and search speed Thnaks a lot!
I built a hashmap for my scripting language's table type. I had watched a video about a new algorithm that reduces clustering called the robin hood method, IIRC. Basically, an element that is shifted due to collisions is "poorer" than those in the correct spots. So, when clustering occurs, the elements that are poorer than the old ones get placed, and thr richer ones continue with the shifting. In effect, the shifting is balanced between elements. What interests me most about your video was the load factor though... mine might be too high.
Hey, Nic, enjoying your videos and appreciate the work you put in and the knowledge you're sharing. Just wanted to ask: are there situations where it's actually a good idea to implement your own hashmap rather than using a language's built-in hashmap, or is the act of implementing one more of a learning exercise? Thanks
That is a fantastic question. It really depends on the context 🙂 The disadvantage that language designers have when implementing a standard hash map is that they need to pick something that can have decent performance and stability with any possible input and usage, which limits them to making much more conservative and defensive choices about the design. It's possible to design a simple hashmap that works amazingly well for your particular use case if you know lots of information about how it's going to be used (frequency of lookup vs insert vs delete, range of expected inputs, tolerance for worst-case performance etc). It also really helps knowing which design was chosen for the standard hash map in your language (java uses closed addressing / multi array whereas python dictionaries use open / one array)
15:00 problem with the linear probing explanation (and the lookup code): imagine the array is 100 elements long, the last 5 elements are taken, and a user signs up with a name whose hash mod 100 = 94. they're gonna linear probe until the end of the array, and then fail even through there's still 95 slots left in the array. instead of looping from index to len(array), they should loop from 0 to len(array), and look up indices at (index + i) % len(array), so that once it reaches the top of the range the linear search will wrap around to the bottom.
A "dictionary" is the more abstract concept. Where a Tree can be a dictionary a Hash table can be a dictionary, even a list can be a dictionary (not a particularly good one). Basically a "dictionary" has to satisfy some lookup operations and a key/value type interaction. A Hash map is basically Java's way of saying Hashtable after they already used the term Hashtable in their earlier version. But a hash table is a specific way to implement dictionaries that consists of computing an index based on the key. Tree algorithms such as AVL, Splay, Red-Black and most especially B-Trees are other implementations for dictionaries that can satisfy the key/value relationship but also give you other features such as order. Thought I'd clarify these terms for those that may be confused by different terminology across languages.
Databases like postgres use binary tree indexes by default, but they do have hash as an option 🙂 see postgresql.org/docs/current/indexes-types.html for more info, you'll note you can choose HASH as an index type there.
I know how they work... but I've never looked into how they actually work (implementation wise,) as I'm a bit intimidated by all the hash functions, etc., that are involved, and the stuff you would need to do, to prevent hash collisions, etc. Also, I'm a bit confused at how you would be able to ensure that you can store every single key, as the numbers you get back from the hash function are very large, and will vary wildly with the smallest change to the key. Thank you for making this video.
Another great video! Hear me out on this particular example tho. Could you not "just" assign an index to the user data by the order in which they're created?
That is a clever way to make sure that the users are all packed in tightly, but funnily enough, what you're describing is an array! The trick is then, when someone random tries to log in with a username, how do we figure out the index that we originally assigned to them with no additional information? The nice part about a hash function is that it always generates the same number based on the same input 🙂
Would it be inefficient to have a small array in each hash slot? You'd always have to iterate over those to find an item. I don't know if that could still be better than linked lists, unless those arrays tended to be too big.
You definitely could do that 🙂 the tradeoffs would be: the arrays would have to be allocated individually somewhere on the heap rather than in 1 seperate array like a linkedlist, because you can't predict how many items will end up in a bucket ahead of time. You could also run out of space in these arrays too so you'd have to account for resizing, which could be tricky because resizing changes the pointer to where the array is.
Question, If users is an array and newUser is an object and were setting newUser.username which is the property of the object to a new object newUser then did newUser have to exist in the array of objects to begin with or does it create a new object in that array? I just don't quite understand what is going on there. Or am i overthinking this and were just setting a property and value at the same time, and since we have access the the object from some point prior we can set both at the same time?
I'm assuming you're talking about the line at 1:57. users is not an array, it is a hashmap: an object that, like @nicbarkeragain said, has its own syntactic sugar for adding elements to it in this language. Internally, the hashmap does use an array to store the references to User objects, but additional metadata about the hashmap will be stored in the hashmap object, like the array size and how many elements are set in the array so the hashmap can know when to grow the array to avoid collisions (see 16:07).
They may seem distinct on the surface, but where are the items in the linked list stored? A good way to start thinking through the problem is: what are the base memory primitives provided to you by the computer, and by extension, the OS? It's very easy to fall into the mindset of thinking that APIs like `new Node()` or `malloc(sizeof(node))` return you a singular block of memory that is somehow distinct and disconnected from other blocks, but all malloc & co are doing underneath is mapping the memory you asked it for into an array provided by the OS. Even pointers themselves are really just indexes into the array that is your program's virtual address space 🙂
Hashmaps oddly resembles Javascript's Literal Object data type. However, Javascript has its own dedicated Map object type which raises the question what are the differences between Literal Object and Map objet in JS?
You'll notice I strategically left out mod handling of negatives, it's a bit of a can of worms and not really relevant to what we're using mod for here, but definitely good to know the rules in the language you're using! 🙂
Wow didn't knew hashmaps use mod under the hood. could you make a video on set associative cpu caches sometime in the future?? i am sure there i have seen some mods there too
This video makes me even more annoyed that all my university classes were taught in Java, since it obscures all of the pointers under the hood… storing the linked list nodes in a separate array is a very natural solution, but Java just makes instantiating new objects feel so easy. I gotta try re-doing that data structures class in zig…
2:39 umh no, linked list is most decidedly NOT build over the array in ANY sense. linkedlist is exact OPPOSITE of an array. stack: okay, binary tree: okay, queue: okay linked list: HECK NEVER
You are becoming one of my favourite tech youtubers really quickly. If I might make a small suggestion for your production, it would look better if you could get your camera a little further away so the wide lens does not distort things so much. This is awesome!
Thanks 🙂 That's a good point, I will see if I can get the camera sitting a little further back!
@@nicbarkeragain Also, when moving the camera further away, you can still zoom in to preserve the size on screen, but then not have the distortion.
I dunno, I kinda like the distortion! It gives me nostalgia for those The Dog calendars, or when MTV was good. It's kind of a channel signature that this guy will show up with a honey-smooth voice and a head the size of a paper shredder, just to drop some knowledge on us.
On the other hand, the doctors diagnosed me with space madness, so my opinion may not be typical.
This is the best channel I had luck finding early. From data driven design video, to clay and this. Thank you so much, Nic
You're very welcome, and thank you for being so supportive! 😁
Haha. You are an excellent teacher. Well done. (I taught CS at the college level for quite a few years.) You might, though, add a caveat to "hash maps are quite simple to implement" to say "simple hash maps are quite simple." You no doubt know that all but cryptographically secure hash functions admit construction of large data sets that all end up in the same hash bucket, turning a naively implemented table into a list. Many war stories of DOS attacks boil out to this fact. Current libraries seem to use balanced trees as closed hash buckets as an antidote.
Haha yes, absolutely. Everything is simple until you add untrusted input, then all hell breaks loose 😁
At this point I am just binge watching your videos... your content is out of this world
This was great! The way you teach from first principles is super rewarding because, as a viewer, having the opportunity to anticipate problems and even consider solutions to them ahead of time really makes the concepts stick when you ultimately address them. I've really enjoyed every video of yours I've watched so far!
That is so wonderful to hear. For me personally, the best teachers I've had would always have us work through problems together, so I could see their thought process around solving each step, rather than just telling me the answer.
It's even better when it's something the teacher doesn't know the answer to either, and they're not afraid for you to see them make mistakes and struggle a little bit 😁
Wow…you sir are such a great teacher. Explanations and illustrations on point.
Thank you! I try to emulate the best teachers I've had in the past, I'm glad it comes across 🙂
Sharing the general consensus that you are a fantastic teacher and are becoming one of my favorite programming RUclipsrs for learning interesting concepts. This was fascinating!
Omg the first video that mentions the relation to hash's and arrays, as well as memory. The questions I always ask after watching any other video on hashmaps
A few weeks back, I went through the theory and built a hashmap with my son. This video is right on point and very clear. Excellent work.
what an excellent parent you are, I wish my parents had done this with me. How old is your son?
That's amazing. It's anecdotal but a lot of the great programmers I know had parents who were either in CS or CS adjacent (like electrical eng) and taught their kids core concepts from a young age 🙂
@@fejiberglibstein thank you. He's 17 and started when he was 16.
@@nicbarkeragain yes 30 years in IT and the core concepts are what I want to impart to my son. You channel is excellent and a go to resource which I shall recommend.
You’re genuinely one of the best places for tech explanation and philosophy I’ve seen on RUclips. Love your videos so much, keep up the good work!
That is so kind of you to say, I really appreciate it. I feel like I'm just starting to find my feet with it which is exciting!
Your educational videos are incredible - I came across your work (as many I'm presuming) through Clay, but these mini lectures are gold! Very clear style, super easy to follow and very well explained.
10:26 you forgot to increase the division result ;)
That aside, this channel has been such a wonderful find. When you talk about a topic you go from start to finish. Even if you already know about hashmaps, it's still fun to get a byte-sized explanation of everything. Thanks for keeping programming content awesome!
I went over this video like 5 times looking for tiny visual errors, and one always manages to sneak through 😅
And thank you so much for the kind words, when I was studying comp sci at university I always felt like there was too much jargon and the lessons had too many assumptions about prior knowledge, so I try to design these videos as to not leave anyone behind.
Everything you explained was clear to me before and still I feel you taught me something. Its just so nice how you present these things
First time here, really appriciate the way you explain things. Good stuff.
Thank you, it's tricky to balance assumed knowledge vs explaining everything, I'm trying to get better at it each attempt!
Subscribed. I could watch an entire DAS playlist from you Nic. Very nice explanation!
Absolutely brilliant explanation of hash map. Thank you
format + content + delivery = awesome fricken channel! Keep it coming!
Thank you, I will 😁
You have a very good way of explaining these concepts without being overly technical or jargon heavy. Looking forward to more videos.
Yep man, it’s one of the best explanations I ever heard
Great video! As someone who's a hobbyist who didn't do cs in college, I liked this and found it easier to catch on.
I did CS in university and still found it hard to grasp these concepts at first, it wasn't until I had actually built a hashmap myself in C that it really "clicked" for me 🙂
Dude you are killing it with these videos. Amazing explanations and visualizations. Keep it up!
Thanks man I appreciate that, I'm really enjoying making them so lots more on the horizon 😁
While I understand HashMap's, seeing their implementations visually represented really well, adds another dimension to understanding. Thanks!
This was an amazingly didactic video, congratulations, way better than other educational formats.
Thank you, I put a lot of effort into it so I'm glad it shows 😁
Amazing video! I'm massively impressed at how clear and concise you made this. In twenty minutes you managed to give the simplest and most complete explanation of how HashMaps work that I've ever seen. My knowledge on the internal workings of HashMaps has always been spotty with bits and pieces here and there, and this video filled in the gaps I think. Would have been nice to have this video a decade ago when I was in university!
Thanks so much Lars, I really appreciate that. I totally feel you re: data structures at uni, I try to make these videos from the perspective of what would have helped me the most back then 🙂
Nic, your channel is incredible!
Your way of teaching is so good that made me try web development again
It's nice to see your uploads becoming more regular, great content!
I'm hoping to have a much more frequent schedule this coming year 🙂
This video is so well done. I'm glad I found this chanel
Now I understand *why* hashmaps are so much faster. Thank you.
Definitely one of the best explanations I've found. Great video!
You explain things so well man, excited for more vids
You are about to rise to the top my friend. This is excellent.
great content, just recently I am thinking how dictionary works, I did some study and learn that there is collision, bucket, clustering...etc but still doesn't quite get how it works. but this single video makes everything crystal clear and easy to understand. thanks a lot.
That's great to hear! I personally find programming topics that are a bit "academic" tend to use lots of jargon (data structures are a great example). I personally prefer simple terms and visual examples if possible, I'm not good at memorising words 😅
Thanks for the explaining of hash map collision. Great work!
I don't know if you have any experience, but Ive been trying for a long time now to create a programming language, and Im stuck on backend codegen, either using LLVM or custom IR but I have no idea how to approach this as resources are scarce. You are really good at breaking things down, so it would be a blessing! Thanks for the amazing content.
I will absolutely do some videos on compilers, interpreters and general lower level concepts 🙂
In the meantime, I have heard that the pikuma compiler course is really great!
Very well put together! I might not have learned anything about hash maps in this video, but I did learn a lot about how to teach someone how they work 😄
This brings me back to me teaching myself how hashmaps work and implementing one in C because I needed it for my C project during my masters (mind you, I was studying physics, not CS, so barely any formal training in data structures and algorithms 😅). That must have been 10 years ago. Since then I left physics to become a software dev.
Man your content and explanation are great, keep up the good work!
Thank you, I really appreciate that! I'm enjoying making these videos so will definitely continue 🙂
This was an awesome video! I'd been wondering lately how HashMaps work under the hood, it was like you read my mind!
Thanks for the video :)
No worries at all, glad it could clear some things up for you! I still remember the first time I implemented a hash map in C myself and thought "wait, it's only 10 lines of code?" 😁
I'm gonna have my data struct final exam in 2 hours. Great timing!
Good luck! I know you'll do great!
but in your example code at 14:30: inside the for loop you keep looking up array entries at index, when the counter variable is named i
Yeah I thought that looked a little strange
Another great video ! I hope to see more from you Nic, you have got a knack for explaining things
Before this I've seen j_blow explain linear probing to solve collisions but I still don't understand how you look up the item, this video is very helpful.
That's great to hear! I often need to see things explained in a few different ways before it finally clicks for me 🙂
Honestly i too never thought about linear probing concretely since i attended uni a decade ago, this was a fantastic refresher and i think gave me an even better intuition for it than when i initially learned it. Thanks 👏
The thing about hashtables is the trade offs are very situational so what you need in one situation might be very different than an other, which is why you might be inclined to write your own hashmaps in the first place, aside from as a learning exercise.
I always bound my open addressing. For example if you're using linear probing then don't just probe the whole buffer. Again this is situational and depends on the size of the buffer and the size of the hash map entries etc etc but let's say you're storing some u32s that's too say 32 bit unsigned integers. Maybe scan forward a couple cache lines worth, stopping when you see an unpopulated cell. Then fall back to close addressing, maybe another hashmap or an array or a series of arrays. Ideally you've got a good hash function with pretty uniform distribution and you won't get so much clustering but even a pretty crappy hash function can be powerful and making a good hash function is difficult. So assuming some clustering. But you really need to measure these things and if you're getting a lot of clustering and a lot of collisions you probably want to do some work on your hash function.
Aside from that the table size is important. The size you choose can lead to clustering, it's typical to use prime numbers for table lengths.
There's lots of strategies and I'd be lying if I'd measured the merits of all the ones I've tried let alone anything else, but you probably want to go with something cache friendly or rehash using the data from the cell collision as salt. These things are easy to over engineer.
Actually, it's probably easier now to make a good hash function on modern pcs if you have access to the aes instructions. The performance characteristics are pretty good from my understanding.
Very easy to follow, thx for the knowledge!
Ho god! The title of this one is an automatic thumb up!
I wanted to understand (and not learn) HashMaps for a long time :D
ie I wondered about collision, and search speed
Thnaks a lot!
No worries, I'm so glad the video could help you understand!
I built a hashmap for my scripting language's table type. I had watched a video about a new algorithm that reduces clustering called the robin hood method, IIRC.
Basically, an element that is shifted due to collisions is "poorer" than those in the correct spots. So, when clustering occurs, the elements that are poorer than the old ones get placed, and thr richer ones continue with the shifting. In effect, the shifting is balanced between elements.
What interests me most about your video was the load factor though... mine might be too high.
enjoying your content Nic. Good stuff
Hey, Nic, enjoying your videos and appreciate the work you put in and the knowledge you're sharing. Just wanted to ask: are there situations where it's actually a good idea to implement your own hashmap rather than using a language's built-in hashmap, or is the act of implementing one more of a learning exercise? Thanks
That is a fantastic question. It really depends on the context 🙂
The disadvantage that language designers have when implementing a standard hash map is that they need to pick something that can have decent performance and stability with any possible input and usage, which limits them to making much more conservative and defensive choices about the design.
It's possible to design a simple hashmap that works amazingly well for your particular use case if you know lots of information about how it's going to be used (frequency of lookup vs insert vs delete, range of expected inputs, tolerance for worst-case performance etc). It also really helps knowing which design was chosen for the standard hash map in your language (java uses closed addressing / multi array whereas python dictionaries use open / one array)
Good to hear the Aussie pronunciation of data after living abroad for some time 😁
Love your code / videos. Sending support!!
what a great channel ❤❤❤
❤️
15:00 problem with the linear probing explanation (and the lookup code): imagine the array is 100 elements long, the last 5 elements are taken, and a user signs up with a name whose hash mod 100 = 94. they're gonna linear probe until the end of the array, and then fail even through there's still 95 slots left in the array. instead of looping from index to len(array), they should loop from 0 to len(array), and look up indices at (index + i) % len(array), so that once it reaches the top of the range the linear search will wrap around to the bottom.
Fantastic video! What do you use to create the animations?
Thank You so much for making this
You're very welcome!
A "dictionary" is the more abstract concept. Where a Tree can be a dictionary a Hash table can be a dictionary, even a list can be a dictionary (not a particularly good one). Basically a "dictionary" has to satisfy some lookup operations and a key/value type interaction.
A Hash map is basically Java's way of saying Hashtable after they already used the term Hashtable in their earlier version. But a hash table is a specific way to implement dictionaries that consists of computing an index based on the key.
Tree algorithms such as AVL, Splay, Red-Black and most especially B-Trees are other implementations for dictionaries that can satisfy the key/value relationship but also give you other features such as order.
Thought I'd clarify these terms for those that may be confused by different terminology across languages.
Other uses for hash tables that are NOT dictionaries include Sets, where there is only a key but no value.
Thank you! Very good explanation! Can you tell how do you do such good visualization? Is it via programming or some other tool?
I believe there may be an error in the code at 14:30. inside the for loop, i should be used to index array, not index
are these what databases use?
Databases like postgres use binary tree indexes by default, but they do have hash as an option 🙂 see postgresql.org/docs/current/indexes-types.html for more info, you'll note you can choose HASH as an index type there.
@nicbarkeragain thanks for the explainer!
Hey great content! Did you code the animation or did you use a software :))
I know how they work... but I've never looked into how they actually work (implementation wise,) as I'm a bit intimidated by all the hash functions, etc., that are involved, and the stuff you would need to do, to prevent hash collisions, etc. Also, I'm a bit confused at how you would be able to ensure that you can store every single key, as the numbers you get back from the hash function are very large, and will vary wildly with the smallest change to the key.
Thank you for making this video.
This is so good! How do you create your animations?
Great video!
This is my best find in 2025 on YT.
Another great video! Hear me out on this particular example tho. Could you not "just" assign an index to the user data by the order in which they're created?
That is a clever way to make sure that the users are all packed in tightly, but funnily enough, what you're describing is an array! The trick is then, when someone random tries to log in with a username, how do we figure out the index that we originally assigned to them with no additional information? The nice part about a hash function is that it always generates the same number based on the same input 🙂
Would it be inefficient to have a small array in each hash slot? You'd always have to iterate over those to find an item. I don't know if that could still be better than linked lists, unless those arrays tended to be too big.
You definitely could do that 🙂 the tradeoffs would be: the arrays would have to be allocated individually somewhere on the heap rather than in 1 seperate array like a linkedlist, because you can't predict how many items will end up in a bucket ahead of time. You could also run out of space in these arrays too so you'd have to account for resizing, which could be tricky because resizing changes the pointer to where the array is.
18:21 are we going to discuss fallback data?
Thumbs up, before playing the vid!
Question, If users is an array and newUser is an object and were setting newUser.username which is the property of the object to a new object newUser then did newUser have to exist in the array of objects to begin with or does it create a new object in that array? I just don't quite understand what is going on there. Or am i overthinking this and were just setting a property and value at the same time, and since we have access the the object from some point prior we can set both at the same time?
That's just a shorthand syntax used in some languages for accessing a hashmap, it's equivalent to something like map.set("key", value); 🙂
I'm assuming you're talking about the line at 1:57.
users is not an array, it is a hashmap: an object that, like @nicbarkeragain said, has its own syntactic sugar for adding elements to it in this language. Internally, the hashmap does use an array to store the references to User objects, but additional metadata about the hashmap will be stored in the hashmap object, like the array size and how many elements are set in the array so the hashmap can know when to grow the array to avoid collisions (see 16:07).
i've been wondering how hashmaps handle collisions. thanks
nice high quality content!
2:46 I don't understand how linked lists are built over top of arrays. They're completely distinct from arrays!
They may seem distinct on the surface, but where are the items in the linked list stored?
A good way to start thinking through the problem is: what are the base memory primitives provided to you by the computer, and by extension, the OS?
It's very easy to fall into the mindset of thinking that APIs like `new Node()` or `malloc(sizeof(node))` return you a singular block of memory that is somehow distinct and disconnected from other blocks, but all malloc & co are doing underneath is mapping the memory you asked it for into an array provided by the OS.
Even pointers themselves are really just indexes into the array that is your program's virtual address space 🙂
Hashmaps oddly resembles Javascript's Literal Object data type. However, Javascript has its own dedicated Map object type which raises the question what are the differences between Literal Object and Map objet in JS?
A Map in JavaScript remembers the insertion order of key-value pairs.
Thank you so much
So, so good
3:30 don't forget the signage: X mod Y has sign(Y), X rem X has sign(X)
Wrong.
This depends on the programming language used.
All of them have different rules for negative numbers.
You'll notice I strategically left out mod handling of negatives, it's a bit of a can of worms and not really relevant to what we're using mod for here, but definitely good to know the rules in the language you're using! 🙂
Wow didn't knew hashmaps use mod under the hood. could you make a video on set associative cpu caches sometime in the future?? i am sure there i have seen some mods there too
I have plans for some more videos that get down into the nuts and bolts of how the CPU works, so stay tuned 😁
@@nicbarkeragain hell yeah
1:40 wait, any reason for that particular color breakup of diction + ary? does it indicate to its arity being related to how things are dictated?
This video makes me even more annoyed that all my university classes were taught in Java, since it obscures all of the pointers under the hood… storing the linked list nodes in a separate array is a very natural solution, but Java just makes instantiating new objects feel so easy. I gotta try re-doing that data structures class in zig…
I think an array of collision list is good.
Hehe, "knull" in Swedish means "fuck". Great video regardless! Thanks
Oh no. TIL 😅
Explained Simply = Explained Simply who don't get boring because of math )
There's a little less magic in the world now... Maybe I should not have learned this 😢
oh hey im some_guy
8:54 too much blabbering, i am not ur target audience. have success with ur channel. cheers.
2:39 umh no, linked list is most decidedly NOT build over the array in ANY sense. linkedlist is exact OPPOSITE of an array.
stack: okay, binary tree: okay, queue: okay
linked list: HECK NEVER