“Throw a hash map at it and hope for the best”
I gave my friend that advice when we were both looking for a job our senior year of college.
He told me a few weeks later that he was in an interview and had no idea what to do, so he pulled out the hash map quote. The interviewer didn't even bother having him implement it; they just said, "Alright, you sound like you know the approach, we can continue."
Sounds about right 😂😂😂😂😂
I Also wowed my first job/boss by basically using a hashmap where it was vaguely applicable lol.
😂💀
Hashmaps saved my life more than once lol
(Furiously takes notes for interview)
Nice ... in my advanced graduate algorithms course now ... half-way to MS in CS at age 63. If you're in CS you will have to know all of those. Basic data structures. Great vids as always.
Wow, great. It's commendable. Do you feel anything different compared to, say, studying in your 20s? Keep it up, sir.
I am in my final year of my CS degree and I am going back to relearn a lot of the basics, because I half-assed my way to where I am but I have grown to like the program and want to be competent.
Tho don’t tell my boss
I tried in my 40's but I had too much going on to focus on 3D calculus.
What are MS and CS?
Lmao dude same @dalitsobanda7658
As a graduate of computer science 30 years ago, I wish I had a lecturer like this. You have a knack for educating people. Well done!
The funny thing is... first we learn about arrays, then we learn how inefficient sequential scans are, then we learn about all kinds of better data structures with lower classical complexity bounds. Then at some point we start learning about modern CPU architectures, cache hierarchies, out-of-order execution, and loop-carried dependencies... and eventually we come full circle and realize that the flat array tends to win anyway, simply because modern CPUs make dumb sequential scans through flat arrays very efficient, while all that smarter pointer-chasing stalls on cache misses just to figure out where to find the next cache miss... and if the flat array doesn't quite cut it, then most of the time the answer isn't a better data structure, but a better algorithm to utilize the flat array. :)
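For the curious, here is a rough sketch of that effect in Python. Interpreter overhead blunts true cache behavior, so treat this only as an illustration of the trend, not a microarchitecture benchmark:

```python
# Sum a flat list vs. walk an equivalent linked list of nodes.
import time

N = 1_000_000
flat = list(range(N))

# Equivalent singly linked list built from (value, next) tuples.
head = None
for v in reversed(flat):
    head = (v, head)

t0 = time.perf_counter()
total = 0
for v in flat:                 # sequential scan
    total += v
t1 = time.perf_counter()

total2, node = 0, head
while node is not None:        # pointer-chasing traversal
    total2 += node[0]
    node = node[1]
t2 = time.perf_counter()

assert total == total2
print(f"flat scan: {t1 - t0:.3f}s  linked walk: {t2 - t1:.3f}s")
```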
Consider these actionable insights from the video:
1. Learn about arrays to store collections of data of the same type.
2. Explore linked lists when you need to efficiently insert or delete elements in a list.
3. Understand stacks if your project requires managing data in a last-in-first-out manner.
4. Utilize queues to process data in a first-in-first-out order, like managing tasks.
5. Implement hash tables to store and retrieve data using key-value pairs for fast access.
6. Master trees for representing hierarchical data and conducting efficient searches.
7. Dive into heaps when you need to find the minimum or maximum value efficiently.
8. Study graphs to represent and analyze relationships between objects or entities.
Linked lists are rarely the best solution. "Array lists" (lists implemented on top of fixed-size arrays) perform much better with the cache, use a lot less RAM, and are in general faster for almost all use cases.
Great summary!
I use stacks for parsing structured, nested data. It's about the only time I use them, but then they're invaluable.
@epokaixyz Arrays are great if you're not going to be altering their size often. Once you start altering the array size (especially increasing it), you want to consider other options.
There are a few hidden actionable items in here when you consider the parameters and limitations he points out as well.
When implementing hash tables, consider ways to handle overflow conditions when using open addressing (not chaining). Also consider ways to increase utilization (that is, efficiency of storage in terms of the number of slots in use vs. the capacity of the hash table), and their limitations. What are the implications if you know all of the necessary keys and values in advance (i.e., after the initial creation, only the search function is ever used)?
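For illustration, a minimal open-addressing table with linear probing; overflow is handled by resizing when the load factor crosses a threshold. The class name and the 0.7 threshold are illustrative choices, not a standard API, and deletion (tombstones) is omitted:

```python
class OpenAddressingTable:
    def __init__(self, capacity=8):
        self.slots = [None] * capacity  # each slot: None or (key, value)
        self.used = 0

    def _probe(self, key):
        # Walk from the hashed slot until we find the key or an empty slot.
        i = hash(key) % len(self.slots)
        while self.slots[i] is not None and self.slots[i][0] != key:
            i = (i + 1) % len(self.slots)  # linear probing
        return i

    def put(self, key, value):
        if (self.used + 1) / len(self.slots) > 0.7:  # keep probes short
            self._resize(2 * len(self.slots))
        i = self._probe(key)
        if self.slots[i] is None:
            self.used += 1
        self.slots[i] = (key, value)

    def get(self, key):
        slot = self.slots[self._probe(key)]
        if slot is None:
            raise KeyError(key)
        return slot[1]

    def _resize(self, capacity):
        old = [s for s in self.slots if s is not None]
        self.slots, self.used = [None] * capacity, 0
        for k, v in old:
            self.put(k, v)
```

On the closing question: if all keys are known in advance and only lookups happen afterward, a perfect hash function can be built for exactly those keys, allowing 100% utilization with no collision handling or resize logic at all.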
This is an amazing tutorial!! you break the concepts down so simply and lay them out in a very digestible manner
I remember watching 1 or 2 of your vids 8 months ago, and I lost interest and couldn't find the value. Now, after 8 hardcore months of studying and practicing programming, system management, and recently math, I think this is the most valuable YT channel I've seen. Obviously I'm not comparing it to freeCodeCamp, but you're not in the same niche anyway. Just went through clearing out the non-valuable YT channels I've subscribed to, and you're one of the few I left there. Super cool, inspiring, relatable, and informative of course!! :)
Having context makes all the difference for videos like this!
You sound and intonate almost exactly like Dave from Dave's Garage - it's amazing. You explained the data structures very clearly and concisely.
This is the most digestible explanation of these concepts i’ve ever seen. Hats off!
A little clarification: the time complexity on stack, queue, tree, hash table and so on is amortized time complexity. Let's say we implement a stack as a (persistent) singly linked list: push is truly O(1), since we only add a head at the front of the list. But if we implement it as an array, then when the array is full (or its item count falls below some fraction of the array's length) we need to resize the array. This is an O(n) operation, but done cleverly we can make sure it averages out over k operations. Let the length of the array be n, where n is a power of 2, and let's say we resize when the array is full by doubling its internal size. Then k pushes will still be amortized O(1).
If k < n, then
the stack does not need to resize, and we have only added items, each an O(1) operation.
If k = n, then
the last pushed item triggers a resize that costs n moves of items to the bigger array. Since k = n, the total cost is 2*k, hence O(2*k/k) = O(1) (big O notation doesn't care about constants).
If k > n, then
we hit the k = n case about k/n times; since each is O(1) times a constant, it is still O(1) amortized.
(The argument is that log2(k)/k < 1 => O(1).)
Let t = k mod n; since t < n, the last t pushes are in the first case, and for those n pushes we have an amortized runtime of O(1).
Now what about shrinking?
The stack resizes downward when the count of items < n/4 (a quarter of the length of the array): we shrink it to length n/2.
If k < n/4, then
the k pop operations never trigger a resize, and each is constant time.
If k = n/4, then
the first operation moves k items in the resize, and the remaining k-1 operations run in O(1) time.
So we have O((2*k + k - 1)/k) = O((3*k - 1)/k); since constants and constant factors are discarded, this is O(k/k), which is O(1) amortized.
For k > n/4 we have roughly the same argument as for the push operation.
But the truly smart thing here is that we never end up in a place where it resizes up and down between two consecutive operations, independent of the sequence of operations. If we have just pushed and it resized, the item count is n/2, since we doubled the size, and if we follow with a pop, we have n/2 > n/4, so no shrink. And after a resize triggered by a pop, the count of items is n/2, so a following push will not find the array full, nor will it lead to another resize after a pop.
In the corner case n
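For the curious, a minimal sketch of the scheme just described (the class and names are illustrative; Python lists already do this internally, so the fixed-length list here merely stands in for a raw array):

```python
# Array-backed stack: double when full, halve when a quarter is used,
# giving amortized O(1) push and pop with no resize thrashing.
class ArrayStack:
    def __init__(self):
        self.data = [None] * 2  # backing array, length a power of 2
        self.count = 0

    def _resize(self, capacity):
        new = [None] * capacity
        new[:self.count] = self.data[:self.count]  # O(n) copy, amortized away
        self.data = new

    def push(self, item):
        if self.count == len(self.data):           # full: double
            self._resize(2 * len(self.data))
        self.data[self.count] = item
        self.count += 1

    def pop(self):
        if self.count == 0:
            raise IndexError("pop from empty stack")
        self.count -= 1
        item = self.data[self.count]
        self.data[self.count] = None
        if 0 < self.count <= len(self.data) // 4:  # quarter used: halve
            self._resize(len(self.data) // 2)
        return item
```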
Much of the time, queues and stacks are given a fixed size, and they return an overflow or underflow condition when their limits are reached. Or they can be implemented as linked lists, in which case they're never resized and their performance characteristics take on those of the underlying memory pool (which is often a heap).
@paulsander5433 Implementing them as linked lists can be a bad decision performance-wise: it can lead to fragmented memory patterns where items of the list are placed far apart in memory, which leads to constantly flushing the cache. Memory lookups are still among the most expensive operations for the computer, counted in cycles. On the other hand, moving a contiguous portion of memory from one place to another (resizing) has specialized instructions dedicated to it. When I assume the size of the underlying array to be a power of 2, it is not just to simplify the code or the reasoning, but because the instructions that move data around work in powers of 2 too.
Yes, for most applications this is not a bottleneck, but for some real-life applications where the queue, stack, or other structures are heavily used, it can be a big one. If you run a server on Google/Amazon or any other third-party data center, it will add to cost, since it can be the difference between a single instance of the server application handling 10k or 100k active users, and it is way cheaper to rent memory storage (memory mapping) than to rent CPU power.
And simple data structures like stacks, queues, and heaps are not that hard to implement. It would take me around 10-15 minutes, depending on the language used, of course 😅
It is why functional languages like Haskell, Standard ML, OCaml, and F# have imperative or OO features.
Persistent data structures, while easy to reason about, are costly in performance in most cases. In most languages the resize would be implemented as the Array.Resize(stack, size) method, which takes advantage of those instructions I talked about. By using a power of 2 you simply write Array.Resize(stack, stack.size << 1) or Array.Resize(stack, stack.size >> 1) (a bit shift), depending on whether it is resized up or down.
@@KamillaMirabelle Yes, that is all correct. Like I said, they take on the performance characteristics of the underlying memory pool. 🙂
Dude, no way I wasn't going to subscribe! What I should've learned years ago about programming I just learned in 17 minutes! Thanks good sir!
Great comprehensive video that is very understandable even if you are a beginner. This should be a must-watch for beginner or even intermediate software developers to make sure they know the fundamentals.
It should also be helpful for CS students trying to get started and wanting a good overview. The examples are nice, although sometimes brief, which lies in the nature of the video, so no bad aspects.
Thanks!
Are you planning on making a "The Making of: Git"?
I'm kind of curious, and your documentaries are... honestly the only kind that can keep my interest. The history, the visuals, especially the speech pattern. Can't explain it; it's not just the enunciation of the words, or the steady pace in between, not too slow and not too fast, but like you know what you are talking about.
Not just a fact blurb, so you can logically follow the line of thought behind the product.
Love it man.
Moved from website admin (mostly database maintenance with CSV uploads) to learning C++. Amazing how often I solved issues by applying a hash table (room allocation, contact details, availability and scheduling) without realizing what it was called.
🤯 the best “array” explanation ever. You make code logic feel… logical; thinking of it as “storage” makes total sense to me. Thanks!
I like how you mention variations of some of the data structures. But you didn't mention the double-ended variety of the queue, the "deque" (pronounced like "deck"). It works like both a stack and a queue, where you can insert, delete, and peek at both ends.
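With Python's built-in collections.deque, that behavior looks like this:

```python
from collections import deque

d = deque([2, 3, 4])
d.appendleft(1)      # push onto the front
d.append(5)          # push onto the back
print(d[0], d[-1])   # peek at both ends -> 1 5
d.popleft()          # remove from the front, O(1)
d.pop()              # remove from the back, O(1)
print(d)             # deque([2, 3, 4])
```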
Two things to check with your solder:
1. Is it rosin core? If not, then you're going to have a hard time forever, so I would recommend upgrading.
2. Lead free solder is harder to work with as well. I use leaded solder and if you're concerned about safety then I would recommend using a mask. But I don't like lead free because it gets cold too quick
Arrays are not always fixed length. You can dynamically declare the length and resize the array. In C, this would be malloc, realloc, free. In C++, this would be the vector container.
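Since the video uses Python, the same resizing idea can be observed there: a list is a dynamic array, and sys.getsizeof exposes the reallocation jumps that realloc would perform in C. Exact sizes vary by CPython version:

```python
import sys

xs = []
last = sys.getsizeof(xs)
for i in range(32):
    xs.append(i)
    size = sys.getsizeof(xs)
    if size != last:  # capacity grew: a reallocation happened here
        print(f"len={len(xs):2d} -> {size} bytes")
        last = size
```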
I chuckled a few times at this, especially the vertex/node joke. Subscribed.
Wow, I wasn't expecting the YouTube algorithm to suggest I learn about data structures from a gas station clerk.
😂
I'm a web developer who delivers Amazon packages. One of the station supervisors at my sorting station used to build software as a service. I've met doctors who drove cabs, and chefs who used to be web developers, too.
People who stay in one job all their lives are missing out on life.
You're a savage 😂😂😂
@PhilLesh69 you missed the joke buddy 😂
@PhilLesh69 -- I didn't say he wasn't knowledgeable, it was a good video. But, come on, the dude gives vibes like he regularly goes ATV'n.
You can make a resizable array queue with amortized O(1) enqueue complexity; you just have to increase the size enough every time you resize. For example, increasing to double the previous size will be O(1) on average, since you are not increasing it most of the time.
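A hedged sketch of such a queue: a ring buffer that doubles when full, so enqueue is amortized O(1) and dequeue never shifts elements (class and names are illustrative):

```python
class RingQueue:
    def __init__(self):
        self.data = [None] * 4
        self.head = 0            # index of the front element
        self.count = 0

    def enqueue(self, item):
        if self.count == len(self.data):   # full: double the buffer
            old, n = self.data, len(self.data)
            # Unwrap the old elements into the front of the new buffer.
            self.data = [old[(self.head + i) % n] for i in range(n)] + [None] * n
            self.head = 0
        self.data[(self.head + self.count) % len(self.data)] = item
        self.count += 1

    def dequeue(self):
        if self.count == 0:
            raise IndexError("dequeue from empty queue")
        item = self.data[self.head]
        self.data[self.head] = None
        self.head = (self.head + 1) % len(self.data)  # just advance the head
        self.count -= 1
        return item
```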
You can have arrays of structures/derived types which contain multiple data types. I use that quite a lot. You can also resize arrays when needed. There are ways to minimize efficiency issues caused by resizing (i.e. do it sparingly). And you can mimic some of the behaviors of other structures assuming a fixed maximum size, for MOST problems.
On queues, my favorite data structure:
What governs traffic on the internet? Queues!
What governs process management in a computer? Queues!
What governs your check-out and pay trivia at the supermarket? Queues!
Who invented queuing theory? A Dane named Erlang - An Erlang is the international standard unit of traffic - and traffic in general is modeled using? queues!
I love queues! /A very happy Dane
He died in 1929 before modern computers. Very interesting, I never knew this.
I've got a very visual mind: a tree is a specific type of graph, and if you think about it, so is a linked list. I love graph theory; I'm not very good at it, but I do love it. I enjoyed your comment about naming the same thing umpteen different ways just to be confusing 😀
I really enjoyed this video, thanks. It reminded me how much I know and also connected some vertices in my mind..
One thing never mentioned with hash maps is their performance: hashing is a heavy operation, and a hash map will most likely be slower than an array for looking up a value. One advantage arrays have is the ability to use multiple threads to divide the array into smaller blocks to search through, whereas with a hash map you can't escape the time needed to calculate the hash, and hash maps usually don't scale as well and can't take advantage of multithreading. But these are essential tools for all of computer science.
It may be worth mentioning that arrays could form a columnar set of data which itself could function as a queue or stack, and each row of a tabular data structure could be a set of arrays of different (or the same) data types; each row can also function as a queue or stack (if there is some specific pattern for some specific processing need)... so there could be various combinations of those data structures. And besides the first-in-line-gets-served-first operation, the first in line can also quit and not get served, as can the last in line; in fact, anybody in the line can quit anytime, and the next person advances one position up. I am also guessing police or emergency dispatch may have priority ratings, so the first caller may not get served first... so maybe, like graphs with a weighted option, queues may have a weighted option? Could we also consider a totally random pool of data an unstructured data structure, or is that an oxymoron?
I just started learning Python the other day, and I feel like this is something I'm going to have to learn depending on what I want to do. I really appreciate this video; this is good information!
This was kind of fascinating for me, I need to find some example labs for each in a couple of languages to satisfy my own curiosity now.
Great video. Thanks for sharing.
My guy finished a day in the coal mines and came home to teach data structures! (can I get a link to that flannel? :D )
3:15 why are you using Python for the array examples when Python uses lists?
He explains there's a difference between arrays and lists literally seconds before that lmao
Nice video. It might be that experience with data structures and algorithms is one of the key differentiators between more and less productive programmers.
I majored half in computer science and half in statistics, and I've been on and off programming for 7 years because it's not my main job. Having to relearn everything from the beginning because somehow you don't remember any of this feels weird.
Thanks for sharing this
In .NET, all these structures are implemented via an Array by the base class library authors. Queue, Stack, List, LinkedList, etc., are just constructs to make their usage easier for the average .NET developer. Of course, BCL authors also put optimizations in the BCL constructs that make their usage faster than it would be for you to implement just an array, like via pointers and other methods not appropriate for most to use.
This is so amazing, Thank you.
IMO you can reduce this down to hashtable, array, and graph. You can derive the others from these three.
Thumbs up from me though. I certainly use all these at my job!
Imagine saying data instead of data.
data is just easier to say than data.
Anyone who says data is just mad that data is the more common pronunciation
I hate you
There's an asterisk on Linked Lists nowadays because most processors REALLY love their locality, a Linked List is almost always going to lose to a regular array, even if you relocate the whole thing. Exceptions are VERY VERY large lists and the Producer/Consumer pattern, where they still make sense.
"Generic" trees and many other data structures have the same drawback. Just learn how to manage memory efficiently, and locality will increase significantly.
Excellent presentation. You have a new subscriber (me).
Nice video!
You should have done the animation for the Stack segment in PostScript... A stack based language. :)
Thanks for the informative video! Btw, what's that liquid cooler in your rig?
I do learn quite a lot from your great tutorials 📖💻 Thank you for providing such great content 😊
You look like you smell like wood
His name is ForrestKnight, seems plausible
🤣🤣🤣🤣🤣🤣
My father was a lumberjack. Gasoline and balsam. Its a magical smell.😂
I'm getting deep oak vibes, with hints of moss.
I get it, a smoky wood too
You are a good teacher. I'm a noob, and I understood everything
JavaScript does have standard arrays: Uint8Array, Float32Array, etc.
Be careful making statements about performance (big O) on modern processors: true arrays are super performant in many use cases due to CPU cache use. Data locality may trump big-O analysis.
Data Structures by Matthew McConaughey. Alright, alright, alright. Great video.
Man this is a great video
Ladies and gentlemen, I think my brain committed suicide, but it's the exact information I need to solve a problem, so it's well worth it.
Stacks and Queues and Graphs aren't "data structures"; they are Abstract Data Types. You can implement a stack with a linked list, an array, or something else. Same with queues. Graphs you can implement as a collection of nodes, an adjacency matrix, or adjacency lists.
ADTs are definitions of operations, such as Stacks and Queues; Data Structures are the ways we implement them.
They are indeed data structures as well: they are a way to structure data. Graphs live more on the edge of that definition, but they are a way to organize data. As I said for graphs, you're right that there are different implementations of them, which is why they live on the edge.
@hawks3109 They aren't "ways to structure data"; they are a definition of an API that is LIFO or FIFO. The way to structure the data is via an array, a linked list, or a tree implementation. Those are the data structures.
There's a lot of confusion between the Data Structure (the way to structure data, the implementation) and the Abstract Data Type (the list of operations that define the interface).
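The distinction is easy to see in code: one abstract data type (a LIFO interface), two different data structures behind it. A hypothetical sketch:

```python
class ArrayBackedStack:
    def __init__(self):
        self._items = []           # data structure: dynamic array

    def push(self, x): self._items.append(x)
    def pop(self): return self._items.pop()

class LinkedStack:
    def __init__(self):
        self._head = None          # data structure: singly linked nodes

    def push(self, x): self._head = (x, self._head)
    def pop(self):
        x, self._head = self._head
        return x

# Same ADT, so callers can't tell the implementations apart:
for stack in (ArrayBackedStack(), LinkedStack()):
    stack.push(1); stack.push(2)
    assert stack.pop() == 2
```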
where did you buy that keyboard?
First, arrays: the example creates a Python list of Python ints, each of which is itself an object, a whole tree of functionality that has a hash map pointing to objects (in this case wrapping a Python int).
In C an array would be the sum of the sizes of its elements; in Python it's not :). Not dunking on the whole video, I actually like it. Just saying that while Python might look like pseudocode and makes it easy to explain stuff, be careful not to give beginners a wrong idea or assumption that has to be unlearned at some point. In this case I would have gone with plain old C, as for these things C is actually really easy to understand; it also would have forced you to do what you said in the definition: set the size to a fixed number of elements.
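The commenter's point can be made measurable in Python itself; the standard array module gives a C-style packed array to compare against (the sizes printed are CPython-specific):

```python
import sys
from array import array

c_style = array('i', range(1000))   # 1000 packed 32-bit ints
py_list = list(range(1000))         # 1000 pointers to int objects

print(sys.getsizeof(c_style))       # roughly 1000 * 4 bytes + header
print(sys.getsizeof(py_list))       # the pointers only...
print(sys.getsizeof(py_list[0]))    # ...each element is its own object
```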
Trees can also be used for creating file systems
Btw, the collision rate in a hash map depends on the hash function and the range of buckets compared to the size of the hash key. Maybe an advanced point to make in this video 😅
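A quick experiment along those lines, using Python's built-in hash with a modulus standing in for the bucket count (the key set and bucket counts are illustrative):

```python
from collections import Counter

keys = [f"user{i}" for i in range(10_000)]
for buckets in (1_000, 10_000, 100_000):
    counts = Counter(hash(k) % buckets for k in keys)
    # Keys landing in an already-occupied bucket are collisions.
    collisions = sum(c - 1 for c in counts.values() if c > 1)
    print(f"{buckets:>7} buckets: {collisions} colliding keys")
```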
Python " arrays" are actually lists. And they can contain a mixture of data types.
I love your style :D
Great video.
I'm fairly sure the heap in the JVM has nothing to do with the heap data structure.
Could you create a video with corresponding questions to each data structure?
What questions do you have? Or what do you mean by corresponding questions? I have a master's in CS and work in a FAANG company so I may be able to answer your questions
Thx!
Hey Forest! I’ve been following your videos for a while now, and I really love your content. I would greatly appreciate it if you could make a video about Spring Boot.
Been diving into Elixir recently, and that language will have you question how you've thought about data structures before. I know the pop
yo i have a mustache and my name is forrest. i subbed. did you also start with “hey my neighbors” 😂
Wasn’t that Python array example actually a list?
Python list is not an array. But Numpy array is.
Jesus Christ, you just explained 3-5 chapters of our verbose DSA book to me. Thank you.
Could you do a video about the job market situation currently?
Why do you need him to tell you that the job market is ass?
No visual for heaps?
The thumbnail had a graphic that I assume is a reference to state machines, a sub-study of graphs, but there was no mention of it in the video?
I think ur talking about hash map
Name and phone number is literally an obfuscation of the idea of a hash map. The telephone number is not a value; it's a hash of a person's identity in its most basic sense.
I wonder how indexing with imaginary/complex numbers would work for an array... Has anybody specializing in hash function research done statistical probability research graphing, on w-x-y-z axes, the correlations between hash function vs. size of data set vs. growth rate of data set vs. rate of access?
Why do you call the Python list array?
So a queue is essentially an array where you only access the first or last element. Then if we make an array and only use it in this exact way, why wouldn't it be equally efficient with insertion and deletion? Also, a hash table is essentially a double array: instead of accessing an element by index, you now have to access it by a key referencing an index referencing an element. How is this more efficient exactly? How come the array isn't more efficient than, or at least equal to, both queues and hash tables?
Adding an element at the end of an array is a bit costly because you need to maintain the contiguity of your elements. If the next memory location after your current last element is free, then that's fine, but if it's not, you'll have to allocate a new contiguous block of memory long enough for all your elements and copy all the old elements plus the new one into this new location.
That makes "adding an element at the end of an array" a very inefficient operation, O(n). There are ways around that, see dynamic arrays for extensible arrays where the addition of an element is amortised O(1), but at this point, you're not just using a plain old array!
And deleting the first element repeatedly raises a similar problem of reclaiming the freed memory from time to time if you want to avoid a memory leak, which you probably do, given the tendency to use queues in long-running processes.
For hash tables, they're essentially arrays (probably a kind of dynamic array, customised for this usage) underneath. The essential distinction is how you access them, the primitives you're using to manipulate them, so that finding the value associated with a key is constant time (in relation to the number of keys in the hash table).
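A rough measurement of that front-deletion cost, comparing a Python list with collections.deque, a structure actually built for queue use (n is kept modest because the list version is quadratic):

```python
import time
from collections import deque

n = 50_000
lst, dq = list(range(n)), deque(range(n))

t0 = time.perf_counter()
while lst:
    lst.pop(0)           # O(n) each time: shifts the whole remaining tail
t1 = time.perf_counter()
while dq:
    dq.popleft()         # O(1)
t2 = time.perf_counter()

print(f"list.pop(0): {t1 - t0:.2f}s  deque.popleft: {t2 - t1:.2f}s")
```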
@@chaddaifouche536 Thanks for your answer! Does that mean that dynamic arrays have both the benefits of arrays and queues then? So you would really only need to care about the differences between them if you are working on legacy code. Otherwise you would just always use dynamic arrays. Did I get that right?
And about hash tables, does that mean that if you need to sort or somehow rearrange the data, you don't actually sort or rearrange the values, only the keys? I can see how that would be useful over arrays if I was working with large data elements.
@MrPellenisse Well, you're kinda right in certain contexts where you don't care that much about performance, and kinda wrong in scenarios where performance is critical…
In general, if you want peak performance, you should use the data structure that fulfils your needs exactly, and no more, because any extra facilities are probably paid for with worse performance in some operations (even if it's not reflected in the big O) or with more memory usage. For instance, dynamic arrays need to reserve more memory than they're actually using, and keep track of the length of the reserved memory and of the parts of the underlying array that are really used.
In many (most) cases, you don't really care about the small difference in performance that would entail. For instance, Python took exactly your stance when it decided to provide "list" as its standard "array" data structure, since Python's "lists" are in fact heterogeneous dynamic arrays with extra bells and whistles! On the other hand, when someone *needs* to do numeric computing in Python, using the standard lists is a terrible idea and the right decision is probably to use something like numpy arrays (which are real typed arrays, not dynamically typed dynamic arrays like list). You may get orders of magnitude better performance this way.
--
About hash tables, I don't think you've grasped their main advantage yet: the main "advantage" of a hash table compared to arrays is that instead of numeric indices that *have* to be consecutive, you can use values of any immutable type as keys (most frequently strings) and access the corresponding value in O(1). They're basically arrays where the "indices" are the names of your customers, for example.
You may maintain an array of keys separately and order it as you wish, but that's independent of the fact that you're using a hash table (you could do the same thing with the indices of an array, after all). Hash tables are sometimes called dictionaries, or associative arrays, reflecting their functionality rather than their implementation (dictionaries can also be implemented with binary search trees, with O(log n) lookup and insertion instead).
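In Python terms that's just the built-in dict; a tiny sketch of the "array indexed by names" idea (names and numbers are made up):

```python
phone_book = {"Ada": "555-0100", "Grace": "555-0199"}
phone_book["Linus"] = "555-0142"   # insert by key, O(1) on average
print(phone_book["Grace"])         # lookup by key, O(1) on average
print(sorted(phone_book))          # ordering the keys is a separate concern
```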
@@chaddaifouche536 Thanks for your explanation! It does clarify a few things.
So then you are saying that a hash table can't be accessed by an index value at all, but ONLY by its key? Is that correct? Otherwise the hash table would have both indexes AND keys pointing to the same value, which is how I thought they worked.
It seems to me that hash tables would still only be useful for large datasets where you need to access values out of order. Usually when working with arrays, you use them exactly because you actually want to iterate through every single element in some loop anyways, making index values mandatory and keys simply redundant.
A queue is a special type of list, not an array.
Please keep the pictures of examples and descriptions on screen longer.
Every developer should know what sets & bags are.
And Union-Find "reverse trees".
It’s worth noting that Python also allows storing different data types in a single list.
In fact, this capability is common to all dynamically typed languages.
And many statically typed languages with a large runtime (like Go, for instance) have the same ability; in Go you can just do this: var some [4]interface{}
This is not actually true. Python lists (and other containers) store pointers to any Python type. To the user it seems as though a list can store multiple types simultaneously, but in reality the type it stores is always a pointer to a Python object.
Storing fixed-size data is very important because you can easily calculate an element's memory location from the index and the list start. If you cannot guarantee the size, it's a linear search; for example, see why you cannot just index into UTF-8 strings.
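A small demonstration of the "always pointers" point: a Python list's slots are references, and two slots can even reference the same object:

```python
x = [1, 2, 3]
mixed = [x, "hello", 3.14, x]   # heterogeneous to the user...
print(mixed[0] is mixed[3])     # ...but slots 0 and 3 hold the same pointer
x.append(4)
print(mixed[3])                 # [1, 2, 3, 4]: one object, two references
```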
@soothsayer1 Interfaces in the golang runtime are always pointers, for the same reason noted above. If you want an array with multiple types of elements, the way to do it without using a pointer is a union type; the Rust enum implementation is one such example.
Why does that sound like Unions in C?
@simpletongeek A union is a block of memory that can be interpreted as multiple types at the same time, so you can have a 4-byte int and a 4-byte array simultaneously. An interface is a pointer to a block of memory; the interface is also an agreement about the methods that can be run on it. This way you can call the same methods without knowing which type it is. But the important bit is that an interface is only one of its types, while a union is simultaneously all of its types.
This video is great and makes these topics easy to understand, thank you so much Forrest!
Couldn't you just argue that an array of different objects is probably an array of pointers to those objects? So it's still a single data type: pointers.
This isn't really a beginner's guide but an "I'm already familiar with all of these terms and many other adjacent concepts" guide.
I like this guy 😅
I don’t want to escape Java
This is what Matthew McConaughey would sound and look like teaching CS.
Great video! Thank you sir !
Writing linked lists is a good programming exercise but you should NOT use them in production code because they have terrible cache locality. Having your working set in the CPU cache is critical for performance.
I liked this video a lot, really does remind me of my first year in CS
I wonder what we can do with a data structure of data structures ?
For example... what kind of data structure optimizes research into gigantic objects like black holes, vs. research into subatomic particles or even the massless photon, vs. biochemistry, vs. fluid dynamics? Etc., etc.
elements. elements is the word you were looking for when discussing array size.
Arrays have a fixed number of elements, defined when the array is created.
That's biased toward particular programming languages: arrays aren't always fixed-width and don't always store only one data type.
Anyhow, everything is a linked list with a feature or two added.
Unlike arrays, linked lists don't have any real-world usage.
Great explanation
It's BS that an array in particular only contains one type of data. That depends on the language. I'd go so far as to say that every data structure has the same property in that regard: in a strongly typed language all data structures store only one type of data, and in a dynamically typed one (like your Python example), they can usually hold every type.
Maybe Watch the video again
@hansjhansen2217 I certainly won't. It's not informative for anyone who got past page 30 of the tutorial for whatever programming language they started with, and it's even misleading in some regards; see my comment above. What do you think is wrong about my statement? In Python an "array" (which, by the way, is not an array) can hold every kind of object. In C++ any data structure can only contain the type with which the template was instantiated.
@gehirndoper He is stating the principle of an array. That different programming languages "expand" the definition doesn't change that what he is saying is true.
Python only has lists built in.
Bro sounds like programming's Matthew McConaughey. Subscribed until he's trying to sell me a Lincoln.
Looks more like a dude that's been in the garage working on cars.
Vectors/dynamic arrays kind of make linked lists obsolete.
Linked lists were made for what arrays, dynamic or not, are not good at.
No, data structures are not the fundamentals of programming, nor of problem solving. They are simply ways to structure data.
Ah, yes, a nightmare for most computer science students 😂
True golden nugget. In 17 minutes, you simplified what my professor couldn't in hours of lectures!!!