Understanding and implementing a Hash Table (in C)

Поделиться
HTML-код
  • Опубликовано: 30 сен 2024

Комментарии • 505

  • @leokiller123able
    @leokiller123able 3 года назад +350

    The fastest guy to solve a segfault on earth x)

    • @ramprasad7
      @ramprasad7 2 года назад +5

      Absolutely.

    • @dimi5862
      @dimi5862 2 года назад +11

      It's not hard lol, you just get used to it after some time in c

    • @leokiller123able
      @leokiller123able 2 года назад +23

      @@dimi5862 yeah now I ve reached this level its been 10 month since the comment 😂

    • @julengirard8611
      @julengirard8611 2 года назад +4

      I'm dead 💀

    • @Ryan-xq3kl
      @Ryan-xq3kl Год назад +7

      there is no segfault, only bad code

  • @RustedCroaker
    @RustedCroaker Год назад +276

    For those who might get the idea of storing age as a data field literally, never do that! Store the date of birth instead, and calculate the age at the time of the output, otherwise your data will be invalid in less than a year.

    • @Amy-mo9ki
      @Amy-mo9ki Год назад +7

      thank you for reminding me

    • @urisinger3412
      @urisinger3412 Год назад +11

      Its better to check every frame if the year changed

    • @RustedCroaker
      @RustedCroaker Год назад +9

      @@urisinger3412 I'm sure you know that not everyone was born on January 1, right?

    • @urisinger3412
      @urisinger3412 Год назад +2

      @@RustedCroaker then ask for age in years and days

    • @RustedCroaker
      @RustedCroaker Год назад +15

      @@urisinger3412... or just date of birth as I said in the first comment.

  • @ayoubmentag9883
    @ayoubmentag9883 2 года назад +40

    In 11:44 in strncmp() function You need to put MAX_NAME instead of TABLE_SIZE ;
    Thanks a lot Jacob that was super useful :)

    • @LinuxJediMaster
      @LinuxJediMaster Год назад +3

      I instantly saw this as well lol.

    • @blankspace1959
      @blankspace1959 Год назад

      yeah was confused for a bit there as to why put the table size

  • @mario7501
    @mario7501 4 года назад +152

    Man, I’m surprised you don’t have a million subscribers yet. This is the best channel on programming out there. I am currently reading “the Linux programming interface” which is a massive book and I always find myself coming to your channel if I don’t quite understand a certain topic! I hope a lot more people will recognize the value you provide here!

    • @yuvaraj9268
      @yuvaraj9268 3 года назад +2

      Good move to go through Linux Programming Interface, which you can always refer to, for Programming on the Linux Platform.

    • @Daniel95221
      @Daniel95221 2 года назад +3

      He is held back by his loud keyboard ;)

    • @edwardmacnab354
      @edwardmacnab354 2 года назад

      @@Daniel95221 mario or jacob ? or both !

  • @otterowen2867
    @otterowen2867 3 года назад +36

    deym! You code really fast. I'm getting movie hacker vibes whenever I hear you typing the code

    • @soroushmasoodian
      @soroushmasoodian 3 года назад +3

      It’s fastforwarded

    • @Mythic_Echoes
      @Mythic_Echoes 3 года назад

      @@soroushmasoodian Nope, he utilized the text editor well and has fast hands.

    • @pasy_g
      @pasy_g 3 года назад +3

      he definitely fastforwards at some points

  • @ileanagheorghisor
    @ileanagheorghisor 3 года назад +91

    It's amazing how nicely you explained this! You didn't just dump some code on us, you explained the whole thinking process, making adjustments on the way, teaching us how to think the hashtable, not just how to copy it. I am so looking forward to watch your other videos, I really hope they will help me improve my data structures implementing abilities. Online college classes weren't too favorable for me and I am having a hard time doing my assignments in time.

    • @JacobSorber
      @JacobSorber  3 года назад +4

      Glad it was helpful!

    • @velim6597
      @velim6597 9 месяцев назад

      He did exactly the opposite. Started off good but messed it throughput the way. Not the best at teaching I guess, however it was a valiant effort to show people how to be shit at teaching.

  • @karimkohel3240
    @karimkohel3240 4 года назад +67

    i love that you started on data structures thank you so much this is helping me in my courses a lot

    • @JacobSorber
      @JacobSorber  4 года назад +8

      Glad I could help.

    • @cadenjosiah9515
      @cadenjosiah9515 3 года назад

      i dont mean to be off topic but does anyone know of a way to get back into an Instagram account??
      I was dumb forgot my password. I would love any assistance you can offer me

  • @smrtfasizmu6161
    @smrtfasizmu6161 2 года назад +18

    These videos make me fall in love more and more into programming. Although programming was my love at first sight. Great, clear and fun explanation. It was a pleasure to code along. As a beginner it was a miracle to me that you got only 1 segmentation fault (as I am not used to that lol).

  • @AshleyBaker75
    @AshleyBaker75 Год назад +1

    Everything is going so quickly I had to slow it down to 0.75x playtime, so then you sounded drunk so I thought I'd have a beer too. Now I'm watching Star Wars drunk.

  • @gowrimajayaramu1243
    @gowrimajayaramu1243 4 года назад +5

    Should the size in strncmp be 'MAX_NAME' instead of 'TABLE_SIZE', since we are limiting the strcmp to MAX_NAME?
    strncmp(hash_table[try]->name, name, MAX_NAME) instead of strncmp(hash_table[try]->name, name, TABLE_SIZE)?

    • @ChrisBNisbet
      @ChrisBNisbet 3 года назад +1

      Yes.

    • @midrangeshot8203
      @midrangeshot8203 2 года назад +1

      Mistakes like that can happen to anybody, as far as I understand yes, it should be MAX_NAME, since MAX_NAME is the limit to the number of characters of name, while TABLE_SIZE is the maximum number of elements in the hash table, it has nothing to do with the size of the name.

  • @funkykong9001
    @funkykong9001 4 года назад +89

    Thank for this great tutorial. For future videos, please give an additional second or two after writing a function to allow the viewer time to pause to see the code. It's extremely distracting with all the Visual Studio Code popups that cover the code you're writing as well and it's sometimes tough to find a split second to pause the video to see the code before you jump to something else.

    • @midrangeshot8203
      @midrangeshot8203 2 года назад +2

      I was coding in parallel with him and I had no problem pausing videos. I was suprirsed I wasn't getting segmentation faults all the time, but that's because every time I would wrote code which ended up being different than his, I would rewrite that part of the code lol

    • @sharontzu5
      @sharontzu5 2 года назад +1

      You can run the video slower!

    • @futuza
      @futuza 2 года назад +1

      Why though? You can just pause the video? Or the play speed.

    • @funkykong9001
      @funkykong9001 2 года назад +3

      @@futuza I usually do pause videos, but the cut happens quickly from when the last line of code is written and the next scene begins, so it's more challenging than should be to pause just in time

    • @RyanMartinRAM
      @RyanMartinRAM 2 года назад +1

      He probably wants you to pay for his Patreon to view the code.

  • @ayu12641
    @ayu12641 4 года назад +13

    The best tutorial so far on hash function in C. Thank you. How do we come up with an optimal hash function for the data structure? Is maximum randomness the target?

    • @JacobSorber
      @JacobSorber  4 года назад +2

      Usually. If you're trying to minimize collisions, then maximum randomness (and fast) is usually the goal.

    • @joseville
      @joseville 4 года назад +1

      Is there a procedure one can follow to find a good/optimal hash function? I usually use something like Effective Java's hashcode impl like the one mentioned in [1] and assume it's good enough.
      [1] stackoverflow.com/questions/113511/best-implementation-for-hashcode-method-for-a-collection

    • @austinhastings8793
      @austinhastings8793 3 года назад +2

      FYI: There is a thing called a "perfect hash." This is a hash that is tailored for a specific set of inputs, and will produce distinct values for each of them. If you know *all* the possible inputs in advance -- like when you are parsing language keywords -- then perfect hashes can be useful. (Ask the Duck about "gperf" for a software package that provides these functions.)
      For any other scenario, you are looking at doing the best you can with what you've got. Some "cryptographic hash" functions are really good at providing seemingly random output for small changes in input, but they are slow. Now you're trading off speed vs. behavior. Most commonly-used hash functions opt for some simple rules: use 2**n buckets, use prime numbers to multiply, etc.
      If you really want to find good hash functions, look at the source code for programming languages that provide "associative array" or "hash" or "dictionary" data types: awk, perl, ruby, python, java, c++. You can find hash functions that have been looked at and tweaked by a lot of people over many years, which hopefully provides a good performance vs. behavior.

    • @ingiford175
      @ingiford175 2 года назад +1

      @@austinhastings8793 Just one note. you do not use simply 'prime numbers', to multiply, you need a number that is relatively prime to the table size. If you have a size 100 table, you do not want to use 5 as your multiplier as it will map 100 objects into 20 spaces which will give more collisions. Now 3 or 7 in that case will be better.

  • @HansBezemer
    @HansBezemer Год назад +5

    My favorite "hash" structure is either an array (fixed size) or (when allocating) a binary tree. If I use an array I add a binary search. Sure, you keep on moving memory when inserting or deleting, but [a] searching is O(log n) and [b] the code is pretty straight forward. Binary trees also feature a O(log n) lookup, but due to the pointers, it requires about twice the memory. Both structures hardly suffer from performance degradation.
    My favorite hash is FNV-1a. It's got both 32bit and 64bit versions, easy to implement, very fast and collisions are very rare. E.g. costarring collides with liquid, declinate collides with macallums, altarage collides with zinke. I think you catch my drift. ;-)
    I did "classical" hash tables, but buckets are simply too much of a hassle - so I can't bother anymore. I find myself using the binary search/hashtable cross-over most of the time. Very often "good is good enough".

  • @atharvakulkarni2024
    @atharvakulkarni2024 Год назад +1

    Lol seems like you are typing and explaining for youself rather than the audience. Go a bit slow rather than showing your typing speed that won't help a viewer

  • @foobarbecue
    @foobarbecue 10 месяцев назад +1

    Darn, this seemed good, but I could not deal with the loud typing noises. Set off my misphonia and I had to stop watching.

  • @EhabSmH0
    @EhabSmH0 6 месяцев назад +1

    less amount of likes because of your fast speech, is someone running after you? 😃
    Thanks for your effort btw!

  • @joaogabrielonofre7297
    @joaogabrielonofre7297 4 года назад +8

    Thanks for the video, man! I'm doing Harvad's CS50 course, we have to use a hashtable in one of the problems and your explanation is much clearer and more in depth than the one provided in the course =) I have a silly question that is not relationed to hashtables itself, but I think will help me understand more of C in general: why in this video you used strnlen/strncmp and not the usual strlen and strcmp? What do the 'n' in the strnlen method stands for? I've read a bit of it online but couldn't understand it well.

    • @JacobSorber
      @JacobSorber  4 года назад +4

      Thanks, Joao. Glad it helped. For the question, check out this video about strings and security concerns. ruclips.net/video/7mKfWrNQcj0/видео.html

    • @demirmahir
      @demirmahir 2 года назад +1

      Good job noticing the "n" in strnlen! I was also wondering why MAX_NAME was passed to the function and came here to post the question in the comments. Glad I went trough them first.
      I am also doing the CS50 and try to go line by line with Jacob's hash first to better undestand the implementation of the data structures. His examples were quite useful troughout the course.
      The stnlen() function though cannot be used by the makefile CS50x implemented as default since strnlen seems not to be a part of the standard C99. You can compile the code differently of course, but for the sake of playing around in CS50x I'm not sure that is necessary. In real world implementation strnlen is quite important as Jacob explains in the video.

  • @elpatotengu31
    @elpatotengu31 4 года назад +14

    Love the video, but i absolutely love watching you code with that keyboard sound. Its so satisfying

  • @ssisispei
    @ssisispei 4 года назад +6

    Your thoughts is super fast! lol It's helpful for me to do cs50 problem set 5, thanks!

    • @megaraph5551
      @megaraph5551 3 года назад +1

      lmao, I'm also doing it for pset5. What do you think was the hardest pset you've done in cs50 (except for tideman)?

  • @RPBiohazard
    @RPBiohazard 2 года назад +7

    Wow, this is an excellent tutorial. Trying to brush up on my C and this content is exactly what I wanted!

  • @adityavikramsinha408
    @adityavikramsinha408 2 года назад +1

    it takes time, but someday ALOT of people will start subscribing to u

  • @drcl7429
    @drcl7429 Год назад +13

    best typing sound of all tutorials on youtube

  • @drimodramo6567
    @drimodramo6567 Год назад +1

    0:20 why would you mention a hash table to a non programmer person?

  • @Rohith_E
    @Rohith_E 4 года назад +3

    @12:24 in strncmp, third argument should be MAX_NAME

  • @ericxue3244
    @ericxue3244 Год назад +1

    it's not a constant time operation... it is linear with the size of the key

  • @mashaido
    @mashaido 4 года назад +8

    did I just hear "a hash function sounds like a party w/ a whole lot of weed being smoked?" hahaha

  • @rustycherkas8229
    @rustycherkas8229 2 года назад +1

    Not mentioned is that clients using this facility to retrieve & modify any 'person record' must NOT change the value(s) in the field(s) used by the hash function. Correcting the spelling of "Jane" to "June" would likely leave that person's record filed in the wrong location, never to be found again...
    Because delete() returns the ptr to the record, to modify safely is to delete, {modify} and re-insert.
    The "Open" version mistakenly used "TABLE_SIZE" for strncmp().
    Corrected to "MAX_NAME" in "linear probing" version.

  • @frozenfar7231
    @frozenfar7231 4 года назад +2

    this was a good video and thanks for the effort, but it was by far one of the hardest videos to follow for me, most of it comes from me but imo there are things that can improve the quality of videos in the future, first please close the file browser tab if it is not needed and capture scene is small so more of the code is visible, second please scroll a bit slow or all and all slow the explanation process so noobs like me can follow, and last but not least programming videos without source code imo are half useful and of course, you can opt to put that as a feature for a paid option but I think that would reduce the impact of your work.
    all and all thank you so much and keep up the good work.

  • @elgisss
    @elgisss 3 года назад +1

    Damn dude... You type fast... Wild west kinda fast... Anyways, havent checked your channel (but i will), if you havent done a video on how you learned to code that fast and what shortcuts you use, please do one on this.

  • @SokarEntertainment
    @SokarEntertainment Год назад

    Very lovely video, but the sound of that keyboard man... Its like a nails on a blackboard :O

  • @dj_mk_crazy
    @dj_mk_crazy Год назад +2

    As a C++ programmer I use the std::map and std::unordered_map for this purpose but this video will be useful for me if in some situations I would not be allowed to use them. Thanks!

  • @austecon6818
    @austecon6818 3 года назад +5

    This is so freaking great! Thank you for this! I'm a python programmer but I am always taking a peek at C because I have unaddressed urges to dabble in low-level programming sometime (maybe for C extensions to optimize my python projects). What a great way to learn C - making hash tables.

    • @JacobSorber
      @JacobSorber  3 года назад +2

      You're welcome. Let me know if there are other topics you would like to see on here.

    • @drkwrk5229
      @drkwrk5229 2 года назад

      C is amazing!

    • @BonelishOfficial
      @BonelishOfficial Год назад +1

      Scripting language programmers, including the unix shell, are a bit spoiled because the interpreters ships with excellent built-in hash-like data structures, like Bash associative arrays, Perl hashes, and Ruby and Python dictionaries. Even Windows Powershell ships with a fast associative array implementation. It's very useful, and practically mandatory, for certain algorithms in C to implement hashtables, and I wonder how the gcrypt library "stacks" up against simpler home-brewed hashing functions. You can't memoize a function without a implementing a hashtable in your program.

    • @gruchenstein9163
      @gruchenstein9163 Год назад +2

      the more i have experience in c and c++ the less i think about it as low level programming. There should be another term for that. Most of time when programming in C you are disconnected from cpu architecture.

  • @WarriorAjk
    @WarriorAjk 3 года назад +1

    2:40 HAHAHAHAHAHHAHAHAHAHAH xDDDDDDDDDDD

  • @48_subhambanerjee22
    @48_subhambanerjee22 2 года назад +1

    Thanks bro... Love from india

  • @rosiearasa9818
    @rosiearasa9818 3 года назад +2

    probably the only channel I watch at speed of .75! Thanks for the great tutorial!

  • @OneBoyIs2Big
    @OneBoyIs2Big 4 года назад +7

    Man I love the way you teach

  • @plankalkulcompiler9468
    @plankalkulcompiler9468 Год назад +1

    Nice video, but you go too much fast! Please slow down!

  • @TheCocoaDaddy
    @TheCocoaDaddy 4 года назад +3

    Great video! I have a question: in the calls to strncmp(), why do you pass TABLE_SIZE as the number of characters to compare? Thanks for posting!

    • @JacobSorber
      @JacobSorber  4 года назад +6

      Typo. Should be MAX_NAME. Sorry about that.

    • @TheCocoaDaddy
      @TheCocoaDaddy 4 года назад +1

      @@JacobSorber It's all good. I thought it was a buglet but wanted to make sure I wasn't missing anything. :) I've started watching more of your videos and am really enjoying them. :)

  • @rpfifer
    @rpfifer 4 года назад +1

    I noticed you're not malloc() ing any space for the nodes. Is it not necessary to do so or will that depend on the size of the table? I have to create a has table for a spell checker function that will hold 143,000 words.

    • @JacobSorber
      @JacobSorber  4 года назад +2

      Hi Reagan. The way I wrote this one, you're just passing in a pointer and the table is storing the pointer. That leaves allocation and freeing (however you decide to do it) up to the code that is using the hash table. So, in my example, all of the nodes were allocated as local variables in main. So, they were created when main was called, and they will disappear (go out of scope) when main returns. This is fine for a little example, since the hash table is only used in main, and I don't have very many nodes. If you're holding 145k nodes, you probably need to use malloc to allocate some memory. Otherwise, that could be a lot of typing. 🙂

    • @joseville
      @joseville 4 года назад

      Sounds like you're also on week 5 of cs50x lol

    • @rpfifer
      @rpfifer 4 года назад

      @@joseville Sure am! And I am drowning. Haha

  • @rtzstvn
    @rtzstvn 4 года назад +3

    great explanation, of a hash table by starting out with a simple example and building on it.!!!!

  • @sunstrumsharam5388
    @sunstrumsharam5388 3 месяца назад

    The 0xFFFFFFFFFFFFFFFFUL value in the macro is -1 right? I don't understand UL at the end, and why it's an hexadecimal value, instead of -1, I'm a bit confuse.

  • @MrMikeydrum
    @MrMikeydrum 3 года назад +2

    Been a while since i done hash tables, used a few before in work though, i do prefer external chaining think they were called buckets, quadratic is the way to go though, in large structures you can get bad clustering so what i used to do was have a variable quadratic based on the amount of objects with room to add more then just rehash when it filled up to much, the rehashing was inefficient but it did keep the structure performant for lookups and clustering to a minimum. Nice video

    • @JacobSorber
      @JacobSorber  3 года назад

      Thanks. Glad you enjoyed it. And, thanks for the added perspective.

  • @natecaine7473
    @natecaine7473 8 месяцев назад

    I'd rather you spent *more* time offline preparing this tutorial rather than watching you endless typing and rapid scrolling thru code. Better to show code side-by-side and discuss changes and improvement rather than clickity-clacky typing. I'm guessing 50% of the view time is watching you edit stuff. Then you could spend *more* time explaining the benefits of the various approaches.

  • @papasmurf9146
    @papasmurf9146 2 года назад

    Please do not do an April Fool's joke -- they are really hard to spot in November when I'm watching the video and they don't add anything of value to the watcher's knowledge base. One November I was watching a history RUclips channel and kept saying "there is no way they could know that" -- and it was an April Fool's joke. I no longer follow that channel like I used to -- not worth my time.
    I did a quick search for "goto" in the comments. In 35+ years of programming C/C++ I've never said to myself "I could use a 'goto' here." -- and that is coming from a FORTRAN and BASIC environment where goto-statements were liberally sprinkled throughout the code base. Yes, I've seen some clever code that uses goto-statements, but unless the goal is to highlight the use of goto, you can usually design around it.

  • @ΝίκοςΓλαβίνας
    @ΝίκοςΓλαβίνας 2 года назад

    Hey you information is onpoint BUT for begginers we cant really follow what you say and do. IF YOU ARE IN HURRY DONT MAKE VIDEOS . thank you

  • @thomasipad7719
    @thomasipad7719 4 года назад +1

    Couldn't you use another hash table with another hash function instead of a linked list at hash collisions?

  • @bytefu
    @bytefu Год назад

    Lol why the source code is only available for Patreon supporters? Don't you want more people to find bugs in your code? And, given it's non-trivial C code, there definitely are bugs, possibly sprinkled with UB. Can't be bothered to watch the video without the working code to back it up.

  • @CogoGaming
    @CogoGaming 2 года назад

    Really sad that you only explain 1/10 of the code and that the source code is paywalled. Makes this video one of many fails to thoroughly explain a difficult subject. This video only brings the viewer 1/3 the way.

  • @integral1191
    @integral1191 3 года назад +1

    weed? i have been thinking about hashbrowns this entire time.

    • @JacobSorber
      @JacobSorber  3 года назад +2

      Well, each mind goes its own way.

  • @Mohamed.U3
    @Mohamed.U3 8 месяцев назад

    Not gonna lie I feel so overwhelmed but I think that's because I am still a noob.
    Great video you explained everything so well I just have a skill issue, hope I'll get better.

  • @rayxr
    @rayxr 4 года назад +4

    Good sir, can you do a vid on Trie and comparing it to a hash table.

    • @JacobSorber
      @JacobSorber  4 года назад +1

      Yeah, sure. Tries are fun. I'll add it to the list and get to it when I can.

    • @AhmetKartal
      @AhmetKartal 3 года назад

      @@JacobSorber great video, thanks! but we are still waiting for tries.

  • @samplesandtests
    @samplesandtests Год назад

    instead of going through the list from the start, would searching for a free nearby spot but faster? eventually has spots fill up it would not be so lets say START_POS is the the position of the and we made the for loop look something like for(i=START_POS, j=START_POS; i

  • @don.hinton
    @don.hinton Год назад

    So, I guess I'm a bit old and stuffy, but I've never associated "hash" in "hash functions" with anything other than programming.

  • @ANTGPRO
    @ANTGPRO 2 года назад

    11:43 not TABLE_SIZE, but MAX_NAME

  • @shekharmaela2308
    @shekharmaela2308 4 года назад +2

    Hey Jacob, can you make a video explaining Canonical Huffman coding?
    I'm struggling to understand how the BITS and HUFFVAL tables (that's what they're called in JPEG anyway, aka BitLengthCount a 16 byte table and the Value table who's size is the summation of the BITS table) is created.
    Specifically what I'm failing to understand is how do you create each Huffman code from those tables?

    • @JacobSorber
      @JacobSorber  4 года назад +1

      Sounds fun. I'll add it to the list.

  • @TMerica199
    @TMerica199 Год назад

    Should your lookup function strncmp() be MAX_NAME instead of TABLE_SIZE?

  • @tricky2014
    @tricky2014 Год назад

    Sorry but the explanation of the hash function is technically incorrect. The hash function simply gives us a hash. We have to take it modulo the size of the backing array to get the slot number.

  • @cloudlee1027
    @cloudlee1027 2 года назад

    At 12:20 the third parameter of strncmp should be MAX_NAME not TABLE_SIZE right?

  • @mx-trp1307
    @mx-trp1307 2 года назад

    I would like to posit game respects game, but I will be honest and posit: "noob respects game." Thanks for the great video.

  • @vhpupuyoh
    @vhpupuyoh 4 года назад +2

    Very good and clear explanations. Thanks. One small note: please talk more slowly. For not native English speakers it is way too fast. Thanks again

    • @austinhastings8793
      @austinhastings8793 3 года назад

      If you click on the "Gear" icon, there is a "Playback Speed" option. Set it to 0.75 and try listening to it that way.

    • @vhpupuyoh
      @vhpupuyoh 3 года назад

      @@austinhastings8793 Thanks man! I know that option but it is not clear as the original

  • @charactername263
    @charactername263 2 года назад

    I really dislike your hash_table_insert function. You're accepting pointers, and doing nothing to ensure that they do not point to stack memory. You even give pointers to stack memory in your example, where you just create some person structs in main(). It doesn't break in your example because program execution halts after main, but it has the obvious potential for UB.
    Your hash_table_insert function should allocate memory for a person struct via calloc and then copy the pointed to data into the new heap memory, thus guaranteeing it is heap memory and not scoped lifetime stack memory like a variable defined inside a function.
    If you allocate memory per entry you'll have to free memory each time you delete an entry, so usually it's better to allocate blocks and just null out the value for empty keys or after removing. Since you're no longer storing a pointer to some memory as the value, you won't have the issue you had in the video where you cause a seg fault either.
    For more complex hash tables such as external chains etc then writing an allocator and/or memory manager into a preallocated block of heap becomes a good idea.
    I had written a longer comment that got into a bit of a ramble about considerations for the metal to reduce cache misses and allow for optimal algorithms, but it was completely overkill for the point I wanted to make. Especially since it extends way beyond the scope of your video. Honestly it's a good high level introduction to tables, but for a video going over the high level of a subject, C is just not the language to demonstrate with. Good C code requires a thought out design with respect for the machine and attention to the detail and nuances of what is being implemented. That - of course- extends beyond a high level overview of hash maps.

  • @tesses50
    @tesses50 2 года назад

    i thought this was a hash
    feaf351b9607fc67c5b277510c4eeb00 md5 hash for otavol imed spelt backwards

  • @edgeeffect
    @edgeeffect Год назад

    Technically, hash is a resin extracted from weed, so they wouldn't be smoking actual weed at a "hash function".
    But seriously, it's useful to see the low-level details of stuff we use "off the peg" in higher level languages... I haven't done hashing like this since college.

  • @4115steve
    @4115steve 7 месяцев назад

    why does the big o complexity not change? you had mentioned that hash tables are faster , then you had mentioned that it might not always be fast? I'm confused to why it would be slow

  • @bishopoftroy
    @bishopoftroy 2 года назад

    I think that if someone makes tutorials for basic things like these one would assume that the audience is at a beginner level so then why would you treat this as if you`re talking to experts. Way too fast, too few explanations and the code is a mess. Sorry.

  • @marcotroster8247
    @marcotroster8247 Год назад +1

    I'm really impressed how casually you're hacking this code. I was always afraid of implementing my own dictionary. Maybe I'll give it a try next time I need a dict. Well done!
    Of course you've still got lots of issues:
    - dynamic table size
    - table memory management (create / destroy / grow / shrink)
    - abstract linked list node as struct for carrying arbitrary data
    - create hash function for arbitrary data
    - ...

  • @Stephanie-rk7yg
    @Stephanie-rk7yg 4 года назад +1

    Im crying, "it sounds like it can be illegal in some states" LOL i paused to write this but I'm sure this video will have a great explanation

  • @zbubby1202
    @zbubby1202 Год назад

    Am I retarded? How is the modulus of 3516823649 not 9? Referring to "Mpho".

  • @silvermonad1
    @silvermonad1 4 года назад +2

    The other day I read the chapter in CLRS about hash tables and it left me quite confused at some points now everything is clearer thanks alot !

  • @jacobschmidt
    @jacobschmidt 3 года назад +1

    20:33 lol @ "when I'm concerned about performance I use linked lists". worse is better when it comes to caches

  • @FreemonSandlewould
    @FreemonSandlewould Год назад

    Woot! Was that sped up a bit? Strangely enough it seems like it makes it easier to understand. ( I'll be watching it again )

  • @k.c.sunshine1934
    @k.c.sunshine1934 Год назад

    Also going against hash tables is the fact that they go against the "locality of reference" assumption that modern computers rely upon.

  • @Rafa-tt8ee
    @Rafa-tt8ee Год назад +2

    You r a BIG TOP G.

  • @tinalukenda7040
    @tinalukenda7040 2 года назад

    in those 25 minutes i have learned more than in 2 weeks at uni! amazing, you have a new big fan spreading propaganda :D

  • @jimnoeth3040
    @jimnoeth3040 Год назад

    One comment, I don't know if someone else mentioned it, but for best randomization, the table size should be a prime number.

  • @Andrew-eg2pc
    @Andrew-eg2pc 4 года назад +1

    What if the value of hash table is another array instead if "int" or "char"? How to implement and avoid memory collision?

  • @walternyc
    @walternyc 4 года назад +1

    Thanks, very well explained. I am taking CS50 and new in C. In CS50 they allocate memory for each new node, so yours looks like a different implementation. What am i missing? The for loop setting the pointers to NULL avoids using malloc?

    • @hetsmiecht1029
      @hetsmiecht1029 3 года назад

      The memory allocation happens in main(), not in insert() I think.

  • @chacho1155
    @chacho1155 3 года назад

    This video would be better if you slowed down. There is no additional time after you type out a function to review it. This makes me not want to subscribe to your patreon.

  • @juliasolheim683
    @juliasolheim683 4 месяца назад

    Why would he mention a hash table in a conversation with someone who isn't a programmer XD

  • @sameerplaynicals8790
    @sameerplaynicals8790 2 года назад

    Why can't deleted nodes also be NULL? Why was another constant #defined for it? The insert or the lookup person shouldn't care about the node's status.

  • @alefalfa
    @alefalfa 8 месяцев назад

    Great video. Whats up with your typing speed dood? You the flash or something?

  • @hstivggfghyhgfg8359
    @hstivggfghyhgfg8359 Год назад

    If you make a remap function that check if the table is about to run out of spaces and remap everything to new bigger table

  • @SerjCrimea
    @SerjCrimea 10 месяцев назад

    In my understanding, the hash must be unique and point to the data ;) If it's not, then it's wrong :)

  • @demoosaad8579
    @demoosaad8579 4 года назад +8

    Initially, I was looking for a simple explanation for hash tables for my CS50 class; they barely touched on it and it felt a bit ambiguous; yet, you are making it almost water-clear simple, thank you good sir for this great content and deep knowledge.
    I actually took an excessive tour in your very informative channel that I completely forgot about the problem set.
    Love you

    • @JacobSorber
      @JacobSorber  4 года назад +1

      Thanks. I'm glad it helped.

  • @simonemicucci9222
    @simonemicucci9222 Год назад

    So, the best solution is bucketing to handle conflict and to not increase excessively time of search, cool i guess

  • @christopherknee5756
    @christopherknee5756 9 месяцев назад

    I hate it when you are trying to follow the logic BUT you here ... if you do this it might work ...

  • @drumman22
    @drumman22 3 года назад

    It is a bit fast paced and you kinda hardly give us time to digest the code or pause to look, but otherwise great video

  • @eliodecolli
    @eliodecolli Год назад +1

    What keyboard are you using? Great video btw 👍🏻

    • @fruud
      @fruud Год назад

      +1 Also wants to know

    • @JacobSorber
      @JacobSorber  Год назад

      Just my laptop keyboard

  • @zehaia
    @zehaia 2 года назад

    when you insert a new string in a deleted place, how can you know it doesnt have a duplicate in a further key?

  • @minneelyyyy
    @minneelyyyy 2 года назад

    this is a poor implementation of a hash function and really weird code in general. follow a damn style.

  • @tranminhtri7078
    @tranminhtri7078 2 года назад

    it's would be nice if the sound of keyboard typing more quiet in the video, sir

  • @odissey2
    @odissey2 6 месяцев назад

    If you need a hash function in C, then you need to use another programming language.

  • @youthfull3616
    @youthfull3616 Год назад +1

    Hello Jacob, I am already subscriber of your channel and I find it very informative everytime I watch your videos. Though i have been working in C from last 16 years, but still I learn something new every time I watch your videos. Keep up your good work and keep adding good stuffs as you have been doing, for your fans like me.

  • @MaxCoplan
    @MaxCoplan 2 года назад

    > "Every time I'm in a conversation with a non-programmer and mention hash tables"
    🤔

  • @mhalton
    @mhalton 2 года назад

    So, in short, hash tables are more work than linked lists or array with no performance gain.

  • @therandomchannel9226
    @therandomchannel9226 2 года назад +1

    hard to follow

  • @karem_darwesh
    @karem_darwesh Год назад

    I am in love with your Bald azz head man you are a genius u save my sorry azz a lot thanks boss

  • @chrischoir3594
    @chrischoir3594 Год назад

    tells "us" to focus yet he is the one making bad hash jokes. Stay on topic

  • @Fuzzworth2000
    @Fuzzworth2000 Год назад

    Mpho le Tebogo? lmao finally my people can shine... in English even

  • @yevheniibatiievskyi
    @yevheniibatiievskyi 3 года назад +1

    what is strnlen() function and what header I have to unclude for that? And whats the difference between strlen and strnlen? Thanks in advance.

    • @xrafter
      @xrafter 3 года назад +1

      String.h header file .
      The difference is that you can specify the n max size for the function even if the function didn't find the null terminator for some reason it will stop at n max
      Usually u use this function if you know the maximum size of a string.
      And there Is a difference between the length of the string and the maximum size