Another great vid. But, I also feel using Year as key to lookup schools is an odd example, and causes confusion for many... Not only for collision potential. Also, it's not a real world use case to say "I know Harvard was founded in 1636, so look it up by 1636 and see if Harvard is there". I know - it's just an example with 4 digits for simplicity... It's just easier to comprehend examples with real-world use cases.
exactly! as i went thru even i was like what if 1701 was used again for say standford but then the key is exhausted so how do we handle collision? plus this would fail over large data sets according to the example.
I was thining of something like a social security number as a key so that the demonstration made more sense to me. Entering it would search retrieve a possible name for example.
actually the example David uses in the lecture is more clear than this one, I understood the concept, but where is the code to do so? i still have to look for other resources to find out
Guys I have a question, because I don't quite understand the purpose of these videos. While an intro to "tries" seems really interesting, there isn't any parctice with them. In addition, there aren't any suggestions about freeing the nodes of a try... Am I being reasonable here?
I’m afraid this video, as with some other videos, contains assumptions that are subtle and may be not known to CS beginners. As some comments have pointed out, what happens if there is a collision of year? Maybe a reemphasis on the original assumption is able to make viewers clear or give a solution to how to solve a collision.
Normally such data structure is assumed to have bijective relationships, namely if x != y, then f(x) != f(y). However in the cases where you really want to make a tries such that it contains multiple values, you can always call realloc() at the last step to reallocate the string array size.
@@jaaan2914 The "onto" condition is implicitly assumed here, as the the x and f(x) come in pairs in the example of the data structure in the video (x = year, f(x)=university name), hence each f(x) is defined to be mapped upon.
I've seen this before, but as file structures and using a 2-3-4 method. Your first level has two choices, second level you have three choices; and third level you have four. Same concept, just different amounts of oaths. One thing that popped out at me is your use of ten digits perceisly describes the arm of the Strowger dialer. The only thing not depicted with this example is the relays to find the next available Strowger device, to signal the last digit in the number, and a few other cases. This also shows how using a digit for a single use (like 0 for Opertor or 1 for long distance) takes (10^x)-2 posable lines away. Man, I love computer science. Oh, and how would you handle collisions. Say two schools were founded in 1836? Simular tk a has table?
If the key is the same, then essentially you're replacing an item in your trie. The same would happen in a hash table. In a hash table, a collision is not referring to when you repeat a key. Repeating a key is simply replacing something. A collision in a hash table occurs when, even though the key is different, the hashing function maps it to the same storage location as another existing item. On a trie, if you search for a key like 1750 and 1890, they're never going to collide because you're going first to 1, then branching to 7 then branching to 5 then branching to 0.
@@AllanPichardo I think the question is referring to what happens if you need to store University A and University B in a trie if they were both founded in 1750? What solutions resolve that collision in a trie data structure?
@@goatdwarfs Yes, I see. In that case, I would suggest either not using a simple year as the key. Or perhaps, every node can contain, rather than one university name, a collection of universities. You can customize this to fit your specific use case.
I'm not CS guy, but I think its about having a hash-function that cannot collide. In this video the year represent the output of that hashfunction. I am thinking for example of using the ASCII int representation of each char in the string and user that as the key. (use separator character between each int or use fixed length with null padding so you know how the ints are grouped together in order to construct the 'roadmap' correctly) My approach would ofcourse use a lot more memory. Each node would hold an array of at least sizeof(int)*26 if we ignore casing.
This is not the way that Professor Malan described tries in the lecture, so I am confused. In the lecture, he never mentioned anything about key value pairs, but rather that a trie just had a array of the values that needed to be stored, and each index in the array was a letter plus a pointer to the another array.
@marjanp Practically, how are you going to guarantee uniqueness...If you were going to index all the universities by their date founded you will find collisions. University of Pittsburgh - 1787 Franklin & Marshall College - 1787 The larger the data set the more likely a collision. I just wanted to know the common way to deal with situations like this.
Because that number is constant. No matter how many new elements you insert into the trie, a particular element will always take the same amount of time to be looked up. This is different from something like a linked list where with each added element you need to iterate over an additional node (in the worst case) so the lookup time grows with the list size.
In essence, I feel this video makes easy things complicated and complicated things easy. For instance, tries is just a data structure that allows us to store multiplayer of information as long as we make sure that each element has a unique representation. It can have more branches than hash tables and arrays.
@@vaibhav3874 my main question was the university name and university year. University founding years might be the same, but in this video, it was assumed to be different. So that kind of compels to ask what happens when university founding years are different, how can you store the information in tries?
Are you going to regularly spam literally 50 videos? Deciding if I should ubsub. This isn't a petulant inquiry/threat, I genuinely need to know if this is going to be a regular occurrence and disruption for me. Thank you in advance.
How do you expect someone to understand the rules that you are saying in the beginning of the video if you haven't even mentioned a single example. This video is more appropriate for people who have a basic understanding of the trie data structure. Otherwise the first part of the video will only sound gibbersih.
god bless cs50
No kidding
Doug is great. CS50 is an excellent resource.
Another great vid. But, I also feel using Year as key to lookup schools is an odd example, and causes confusion for many...
Not only for collision potential. Also, it's not a real world use case to say "I know Harvard was founded in 1636, so look it up by 1636 and see if Harvard is there".
I know - it's just an example with 4 digits for simplicity... It's just easier to comprehend examples with real-world use cases.
exactly! as i went thru even i was like what if 1701 was used again for say standford but then the key is exhausted so how do we handle collision? plus this would fail over large data sets according to the example.
I was thining of something like a social security number as a key so that the demonstration made more sense to me. Entering it would search retrieve a possible name for example.
@@SlowDancer I similarly thought of the letters of alphabets!
actually the example David uses in the lecture is more clear than this one, I understood the concept, but where is the code to do so? i still have to look for other resources to find out
sir, what if there are two schools founded at the same year? Or should i use hash table instead
you can use letters as keys
and years as values
it will work
Guys I have a question, because I don't quite understand the purpose of these videos. While an intro to "tries" seems really interesting, there isn't any parctice with them. In addition, there aren't any suggestions about freeing the nodes of a try... Am I being reasonable here?
I’m afraid this video, as with some other videos, contains assumptions that are subtle and may be not known to CS beginners. As some comments have pointed out, what happens if there is a collision of year? Maybe a reemphasis on the original assumption is able to make viewers clear or give a solution to how to solve a collision.
Normally such data structure is assumed to have bijective relationships, namely if x != y, then f(x) != f(y). However in the cases where you really want to make a tries such that it contains multiple values, you can always call realloc() at the last step to reallocate the string array size.
@@fdc4810 If x != y, then f(x) != f(y) is necessary but not sufficient for bijectivity. What you're describing is injectivity.
@@jaaan2914 The "onto" condition is implicitly assumed here, as the the x and f(x) come in pairs in the example of the data structure in the video (x = year, f(x)=university name), hence each f(x) is defined to be mapped upon.
Thank u so much sir for this really insightful lecture.
I've seen this before, but as file structures and using a 2-3-4 method. Your first level has two choices, second level you have three choices; and third level you have four. Same concept, just different amounts of oaths.
One thing that popped out at me is your use of ten digits perceisly describes the arm of the Strowger dialer. The only thing not depicted with this example is the relays to find the next available Strowger device, to signal the last digit in the number, and a few other cases.
This also shows how using a digit for a single use (like 0 for Opertor or 1 for long distance) takes (10^x)-2 posable lines away. Man, I love computer science.
Oh, and how would you handle collisions. Say two schools were founded in 1836? Simular tk a has table?
Man Doug is such a great teacher. But why does the intro have to be so loud and his voice so soft. RIP Headphone users.
I'm not sure I understand. Wouldn't you get collisions using the year as a key? If so why use that as an example.
Can someone clarify for me?
If the key is the same, then essentially you're replacing an item in your trie. The same would happen in a hash table. In a hash table, a collision is not referring to when you repeat a key. Repeating a key is simply replacing something. A collision in a hash table occurs when, even though the key is different, the hashing function maps it to the same storage location as another existing item. On a trie, if you search for a key like 1750 and 1890, they're never going to collide because you're going first to 1, then branching to 7 then branching to 5 then branching to 0.
@@AllanPichardo I think the question is referring to what happens if you need to store University A and University B in a trie if they were both founded in 1750? What solutions resolve that collision in a trie data structure?
@@goatdwarfs Yes, I see. In that case, I would suggest either not using a simple year as the key. Or perhaps, every node can contain, rather than one university name, a collection of universities. You can customize this to fit your specific use case.
I'm not CS guy, but I think its about having a hash-function that cannot collide. In this video the year represent the output of that hashfunction. I am thinking for example of using the ASCII int representation of each char in the string and user that as the key. (use separator character between each int or use fixed length with null padding so you know how the ints are grouped together in order to construct the 'roadmap' correctly) My approach would ofcourse use a lot more memory. Each node would hold an array of at least sizeof(int)*26 if we ignore casing.
Your arrows and colorful boxes are pretty but without some related code it does nothing to help me understand how to utilize it.
How about if I want to insert another university that founded in 1636? Do I have to insert it in 0 after I reach Harvard, so it make a list?
This is not the way that Professor Malan described tries in the lecture, so I am confused. In the lecture, he never mentioned anything about key value pairs, but rather that a trie just had a array of the values that needed to be stored, and each index in the array was a letter plus a pointer to the another array.
Because, the array's index was the key present there
What happens if you have a different school founded in 1636 after you've inserted Harvard?
He says the key (year) is guaranteed to be unique. Otherwise it would overwrite previous value with the new one.
marjanp okay so in his example he gaurentees uniqueness but in the real world what happens?
The key has to be unique.
@marjanp Practically, how are you going to guarantee uniqueness...If you were going to index all the universities by their date founded you will find collisions.
University of Pittsburgh - 1787
Franklin & Marshall College - 1787
The larger the data set the more likely a collision.
I just wanted to know the common way to deal with situations like this.
It's a bad example. founding year isn't guaranteed unique. You have guaranteed unique keys in real life like SSN, ISBN etc.
I wish the intro music wasn't so loud.
Cs50 is best channel on utube.for that student's who r studying.
Explained brilliantly ! Thanks for sharing
thanks for CS50 Team
thanks doug ily!
Why does he say chaining is more preferable to probing as a collision strategy?
If I want the number 123 but I dont want to store 12? How do that?
Thank you. Very helpful.
What happens if you have collision?
Why is he saying trie is constant time if it depends on the size of the number?
Because that number is constant. No matter how many new elements you insert into the trie, a particular element will always take the same amount of time to be looked up. This is different from something like a linked list where with each added element you need to iterate over an additional node (in the worst case) so the lookup time grows with the list size.
@@lew162 thank you!
These CS50 are great videos but you should number them so we know what order to watch them....duh
THANK YOUR Doug Lloyd
A subbox spam that I am actually happy to see, thanks for all those videos
Is this related to Bloom filters?
They share a use case, though tries are deterministic as an indicator of set membership and bloom filters are not.
Thanks
In essence, I feel this video makes easy things complicated and complicated things easy. For instance, tries is just a data structure that allows us to store multiplayer of information as long as we make sure that each element has a unique representation. It can have more branches than hash tables and arrays.
What are you on about? I'm new to CS and this was perfectly clear to me.
@@vaibhav3874 my main question was the university name and university year. University founding years might be the same, but in this video, it was assumed to be different. So that kind of compels to ask what happens when university founding years are different, how can you store the information in tries?
@@vaibhav3874 I don't mean this video series is not good. In fact, they are very helpful. I'm just saying it's confusing for me sometimes.
@@jingtao1181 It's just a simple example to help understand. In the real world you wouldn't use keys which enable collisions so easily
You should lookup the replies on the comment Michael Neas has putted in this video.
It's over-complicated. Trie is a combination of multi-rooted trees.
View this video for easier visualization: ruclips.net/video/-urNrIAQnNo/видео.html
You miss one word, you wont be able to get through the rest of the video.
thanks sir
i thought tree and tries were two different data structures😂😂
they are actually
Unfortunately, this was not a good example for tries 😕
Maut in cs50 lul...
The explanations were maddeningly verbose and repetitive in this video.
Are you going to regularly spam literally 50 videos? Deciding if I should ubsub. This isn't a petulant inquiry/threat, I genuinely need to know if this is going to be a regular occurrence and disruption for me. Thank you in advance.
Omg, the first world horror!
@@zerosandones701 lol
How do you expect someone to understand the rules that you are saying in the beginning of the video if you haven't even mentioned a single example. This video is more appropriate for people who have a basic understanding of the trie data structure. Otherwise the first part of the video will only sound gibbersih.
this is for edX platform where you're supposed to watch the videos in a specific order
and the lecture that proceeds this touches on tries as well