Should you avoid linked lists? (linked list vs arrays)

Поделиться
HTML-код
  • Опубликовано: 30 сен 2024

Комментарии • 89

  • @JacobSorber
    @JacobSorber  4 года назад +17

    I hope you all liked this comparison video. If there are other comparisons you would like to see explored, let me know.
    Also, one commenter noted a bug in the linked list. Lines 30 and 31 in linkedlist.c need to be swapped (oops). Sorry about that. I've fixed the posted code. Unfortunately, RUclips doesn't allow me to update existing videos. So, those of you getting the code from the video, just make that change before you try to use the second insert function (it currently isn't called).

    • @LinucNerd
      @LinucNerd 4 года назад

      Oohh.. I thought I had gone insane or something xD
      I probably spent an hour trying to figure out why your code was that way, as I thought it should be the opposite.
      Usually, if a more experienced programmer disagrees with you, you're in the wrong...
      So I tried sooo hard to figure out why I was wrong (even tho my code works).

    • @zxuiji
      @zxuiji 2 года назад

      Thought of another approach, keep 2 arrays, one for pointers/objects & another for deleted pointer/object indices, keep a count & total for both (that way can share the allocation behaviour via some functions that don't care what the arrays hold) and then just check if the deletion count is greater than 1 to do something like:
      objects[indices[deleted_count - 1]] = object

    • @fandibataineh4586
      @fandibataineh4586 25 дней назад

      hi jacob, thanks for all the great vids you post, your effort is much appreciated
      i know this video is more than 4 years old now, but i noticed that in main.c line 164 you wrote:
      long double traverse_time = tv_to_seconds(&myusage.ru_usage);
      i think you should have wrote:
      long double traverse_time = tv_to_seconds(&myusage.ru_usage) - last_time;
      also line 155 i think should have been:
      long double add_delete_time = last_time - start_time;
      am i wrong?
      can you please explain why computed the values this way?
      thanks again

  • @Brettyoke49
    @Brettyoke49 4 года назад +32

    Graduated from college last May, and was in your OS class about a year before that. Looks like your video making skills have improved! Hope you're doing well, thank you for your willingness and love for teaching, it made a lasting impact on me. Best wishes.

  • @lordadamson
    @lordadamson 4 года назад +28

    haha the ending was nice :D

    • @JacobSorber
      @JacobSorber  4 года назад +9

      Thanks. We had fun making it.

  • @brockdaniel8845
    @brockdaniel8845 8 месяцев назад +1

    Yes, you should avoid linked lists according to Bjarne Stroustrup: ruclips.net/video/SfkMiGFVhZo/видео.html

  • @TheArtikae
    @TheArtikae 11 месяцев назад +1

    It’s a bit silly to compare speed without optimizations. Especially for sequential access on an array. Compilers just love that kind of thing.

  • @tekniktdr
    @tekniktdr 3 года назад +4

    6:51 Is it me or insert_after_node() should have the statements in reverse order? I haven't programmed in C for a while, but I would say that there's a bug in this function.

    • @narnia1462
      @narnia1462 2 года назад +3

      i was going through the comment to see if someone already said it aha. newNode is just pointing to itself in this order i'm pretty sure

    • @rustycherkas8229
      @rustycherkas8229 2 года назад +3

      This function is "dead code" in this video.
      You're right about the bug, but this function is not used by the program, so the bug is hiding to bite the unsuspecting...

  • @jynx0riZ0r
    @jynx0riZ0r 2 года назад +2

    Thanks for the video. Thinking about your data structures is a good idea in general and I really like arrays (contrary to some folks especially in modern languages). However, I think the comparison is not really fair, particularly for linked lists because you added quite a bit of memory logic to the array functions. Similarly, you could also request 100 linked items in advance and flag them used, which would counter the memory locality argument a bit. I guess, if you'd pick an example for arrays, where you don't want to have unused gaps in your memory, it'd have performed worse in comparison to your implementation, not necessarily compared to the linked list, but maybe. I also suppose that realloc sometimes has to copy the old memory block if your memory space is fragmented, but I did not check on that. In the end, it always depends on the use case and it is good when you know what performs how under which circumstances, so thanks again.

  • @boxedowl
    @boxedowl 3 года назад +2

    Never gonna have a lot of cache...
    You say that but when I found out my wife's CAD workstation had as much Lvl3 cache as my first PC had main memory... I'm glad I was wearing my brown pants that day.

  • @MatthisF
    @MatthisF 4 года назад +6

    I was literally asking myself this question a week ago

  • @islandcave8738
    @islandcave8738 3 года назад +2

    Linked lists are better for updating a lot and mostly accessing sequentially. Dynamic arrays are better for random access but can be expensive to add to, but that is mitigated by having a bigger capacity then its size.

  • @enestastan7147
    @enestastan7147 4 года назад +4

    One of the best explanatory performance comparison video I have ever seen. Thank you so much for the delecate job.

    • @JacobSorber
      @JacobSorber  4 года назад +2

      You're very welcome! Glad it was helpful.

  • @sameerplaynicals8790
    @sameerplaynicals8790 2 года назад +1

    Yes, it doesn't matter if one links a node that is in the stack to a node in the heap, but one usually shouldn't because then things tend to get complicated. I am not saying it's not plausible but should be avoided in large projects. Please correct me if I'm wrong! Love your videos, by the way.

  • @benjaminshinar9509
    @benjaminshinar9509 4 года назад +3

    I crush the button like my code crushes the machine!

    • @JacobSorber
      @JacobSorber  4 года назад +3

      I bet sparks shot out when you did. :) Thanks!

  • @hayfahvytsen
    @hayfahvytsen 3 года назад +1

    Great video once again. One comment though, I know it's not the point of the video but it looks like insert_after_node() at 6:45 is not correct. Am I missing something?

  • @michaelkotthaus7120
    @michaelkotthaus7120 4 года назад +2

    A really good comparison. It encourages me to always consider different approaches for one task. The "guest appearence" surrounds this episode prodigiously. :-)
    For me - as a non-native speaker - this series helps me to improve my English skills.

  • @qwertyuiop-cu2ve
    @qwertyuiop-cu2ve Год назад

    I default to using a (resizable) array unless my usecase can take advantage of another datastructure's characteristics. With respect to your evaluation here I do have to point out that the remove function isn't really fair. Remove and insert are a pair performing opposite actions. If you don't do a search in the insert, the remove shouldn't either.
    What the linked list is really good at, and the array is terrible at, is inserting or removing elements somewhere in the middle of a long list/array (when you already know the index/pointer).
    I also notice you intermix terms remove/delete. The rule of thumb for naming I like to use, is to call it 'delete' when the operation includes freeing the memory associated with the element, and to call it 'remove' if you're just removing the element from the datastructure.

  • @aabdev
    @aabdev 4 года назад +1

    Dear Jacob Sorber,
    Does there exist a platform independent way to evaluate system Cache Size during run time according to indirect symptoms of program behavior?
    Regards,
    AB

  • @bastawa
    @bastawa 3 года назад +2

    I hope we are all on the same page...

  • @7th_CAV_Trooper
    @7th_CAV_Trooper Год назад

    Should test insertions on the array and then talk about which is slow. Arrays are only useful if you know the capacity up front, only append, and have read-heavy operations. Otherwise trees and linked lists are your friends.

  • @psionl0
    @psionl0 2 года назад +1

    One method of old was to maintain a parallel array of "next" pointers (integers). Simply chain the elements of the next array together and have a "freep" pointer to the first element. That way, allocating and deallocating a value element is simply a matter of adjusting the freep pointer. Assuming that the value and next arrays are in the same cache, this should be quite fast.

  • @ChocolateMilkCultLeader
    @ChocolateMilkCultLeader 4 года назад +3

    Something about this is so much fun to watch

  • @fabianmelo9672
    @fabianmelo9672 Год назад

    Ok, I have to ask... are you Matthew McConaughey brother?

  • @LegendOfMurray
    @LegendOfMurray 3 года назад +1

    great video!
    note that keeping things contiguous not only prevents page faults, but also improves performance by taking advantage of spatial locality in the caches, as usually the caching system stores the data at the requested address AND some of the neighboring data.

  • @orocimarosay1447
    @orocimarosay1447 4 года назад +1

    Nice video. I like that you have stressed its drawback well (I don't consider the fact that you have to hold one more pointer a drawback because memory is cheap but memory coherence is very important). I haven't used linked lists in my life tho not necessarily because they are slow but 99% of the time they are not useful over a vector (At least in the context of game development because I'm most accustomed to it than other areas).

    • @JacobSorber
      @JacobSorber  4 года назад +2

      Right, depending on what you're doing, arrays (or vectors, which are just wrappers around arrays) might be all you need.

  • @SuperCape
    @SuperCape 4 года назад +1

    What about: a red/black tree stored in a heap allocated dynamic array (malloc/realloc magic and such)?
    Is this ^ data structure end-game?

    • @SimGunther
      @SimGunther 4 года назад

      No, Priority R-Trees are the data structure endgame

  • @naughti_penguin2340
    @naughti_penguin2340 4 года назад +1

    linked list gang

  • @robertturner7090
    @robertturner7090 2 года назад

    You've got a beautiful little girl there mate. Loved the fun family ending. Have you gotten her to share that same passion for coding/CS? I've got a 4 and 6 yr old, but so far they are only interested in the end results...they are NOT impressed by my time to delivery or debugging skills.

  • @cnasarre
    @cnasarre 2 года назад

    Hi Jacob, I do know that this video is not that new but I just stumbled upon it today.
    First I did not get through all of the comments and maybe my comment will be redundant but one way to improve LL efficiency could be to keep each 'deleted' block for future use hence not 'wasting' time to deallocate to reallocate later on. Not fancy, already know but still IMO worth mentioning.

  • @zoranjuras6777
    @zoranjuras6777 2 года назад

    Wow! I adore your wonderful offspring!
    Love both of you, keep going!

  • @ctobi707
    @ctobi707 4 года назад +1

    hahaha that ending. your daughter is adorable.

    • @JacobSorber
      @JacobSorber  4 года назад

      Yeah, she's a lot of fun. Thanks.

  • @yjc149
    @yjc149 2 года назад

    great!

  • @anon_y_mousse
    @anon_y_mousse 2 года назад

    How I think of it, if you don't care about the ordering and don't need to maintain the ordering, then use a hash table with the data stored as arrays internally. An array of chains, each chain is an array, and deletions are a simple swap to the end and decrement. If you need the data in a sorted order, and obviously when that's not often, just call the sort function at that time. If it needs to be sorted all the time, either use a tree or a hybrid hash tree. I hardly ever, if ever at all anymore, use a plain array or a plain linked list to store any significant amount of data. I hope you've taught your daughter C because that would be awesome.

    • @JacobSorber
      @JacobSorber  2 года назад +1

      What kind of father would I be, if I didn't teacher my kids C?

  • @amrtcpp6203
    @amrtcpp6203 4 года назад

    Thanks a lot for this wonderful explanation, please if you make some corrections to the code, could you put a link to the git repository if possible?

    • @JacobSorber
      @JacobSorber  4 года назад +1

      You're welcome. Video code (including corrections) is available through Patreon.

  • @mahoneytechnologies657
    @mahoneytechnologies657 2 года назад

    Link List, the Basis of Forth!

    • @psionl0
      @psionl0 2 года назад

      They are the basis of Lisp. The basis of Forth is the stack.

  • @overclucker
    @overclucker 4 года назад

    How about pairing an array of pointers and an array of values? Would it ever be cheaper to move around addresses instead of values and still keep the data blocked together?

  • @trichomaxxx
    @trichomaxxx 4 года назад

    I wonder if people tried to tackle this page fault problem for other structures with graphs/trees that also have a lot of indirections.

    • @JacobSorber
      @JacobSorber  4 года назад +1

      Yeah, it's an issue for any linked data structure.

  • @Adrian-Carstea
    @Adrian-Carstea 4 года назад +4

    at 6:50, inside the function at line 29, you should first do line 31 than 30

    • @JacobSorber
      @JacobSorber  4 года назад +1

      You're right. Good catch. Fortunately, I didn't use that code in main. I'll get it fixed in the posted code. Thanks.

    • @Adrian-Carstea
      @Adrian-Carstea 4 года назад +2

      @@JacobSorber Keep up the good job, thanks for what you're doing.

  • @abhaypratapsingh4225
    @abhaypratapsingh4225 4 года назад

    Thanks a ton ! You are doing a really great job.

  • @ABaumstumpf
    @ABaumstumpf 4 года назад

    I have so far not ever felt the need to use linked-lists.
    They COULD be useful in scenarios where the data really represents a list, but normally you are better of using different data-structures.
    For larger sets of data it is basically not really usable as it does not benefit from sorting, does not benefit from hashing or cache locality either.
    And for your array implementation - why set the values individually instead of using the available operations? You could use memset for example.
    Using c++ and the library functions with a make that also takes advantage of the hardware you have and it might even change the array to vectorised operations - checking several numbers concurrently.

    • @sevakantonyan9833
      @sevakantonyan9833 3 года назад

      You will never need to use linked list.

    • @marc-andrebrun8942
      @marc-andrebrun8942 10 месяцев назад

      @@sevakantonyan9833 then maybe one day you will dive in functional programming, like in LISP or scheme, and you will understand how to use linked list.

  • @thomasthomas8049
    @thomasthomas8049 4 года назад

    You could try defining the elements of your linked list as an array, and have each element's pointer point to the next one in the array. That way, you still have the flexibility of a linked list, whilst most elements are close to each other in memory. Or so I hope.

    • @JacobSorber
      @JacobSorber  4 года назад +2

      True, but keep in mind that that only ensures that they start out with good locality. If you are removing nodes from the list and adding them back in, or moving nodes around, then things can still get messed up. You're also still storing the next pointers, and dereferencing them as you go through the list.

    • @rustycherkas8229
      @rustycherkas8229 2 года назад

      You're right, but by having 'next' pointers when using an array for storage suggests that you're concerned about the sequence of the data (for some reason). This example shown doesn't care about the sequence of the data stored, just that it is stored and it can be found again.
      The 'sequence' of the linked list, here, is just the sequence of each node being added as needed, but the data is meaningless.

  • @jg9193
    @jg9193 4 года назад

    Excellent video, as always. But man, this would have been good to know sooner...

    • @JacobSorber
      @JacobSorber  4 года назад +2

      I know. Sorry. I make them as fast I can find the time.

    • @jg9193
      @jg9193 4 года назад +2

      @@JacobSorber Don't apologize; your videos are very good quality. I've learned so much about C from you in the past few days.

  • @jgcooper
    @jgcooper 3 года назад

    lol that outro

  • @joemacdonnagh6750
    @joemacdonnagh6750 4 года назад

    Can you save a linked list to a file if the member data is pointer to an object on the heap?

    • @JacobSorber
      @JacobSorber  4 года назад +1

      You could write a linked-list, as is, to a file (you would need to save both the data and the links), but it's probably not what you want to do. At least, when you load it back into memory, you would have to make some adjustments because the heap won't look the same next run. So, you can't use those old pointers you saved, at least not as is.

    • @joemacdonnagh6750
      @joemacdonnagh6750 4 года назад

      @@JacobSorber Thanks , I follow your videos they are brilliant!

  • @SuperCape
    @SuperCape 4 года назад

    ies
    /thread

  • @ronjeremy1232
    @ronjeremy1232 4 года назад

    what if we use it for background processes in a shell

    • @JacobSorber
      @JacobSorber  4 года назад

      I'm not understanding the question. Running it as a background process shouldn't change the performance of the program, just how it interacts with the shell.

    • @ronjeremy1232
      @ronjeremy1232 4 года назад

      @@JacobSorber like if you were implementing your own shell program, would a linked list be a good way to keep track of them (background processes)?

    • @zeektm1762
      @zeektm1762 4 года назад

      @@ronjeremy1232 A linked list COULD be used as a way to handle what processes are runningg but would only be used as a list of running processes and not their context state or queue.

  • @IndellableHatesHandles
    @IndellableHatesHandles Год назад

    In short:
    Need to add elements a lot? Consider a linked list.
    Otherwise, always use contiguous lists.

  • @robertkiestov3734
    @robertkiestov3734 4 года назад

    3 strikes within 14 seconds. Instant-dislike.

  • @NoOne-uz4vs
    @NoOne-uz4vs 4 года назад

    6:35 - Line 17. If you run out of memory, malloc will return NULL and in line 19 you'll get a segmentation fault for dereferencing a null pointer. Therefore, shouldn't you check whether malloc returns NULL?

    • @oguz6869
      @oguz6869 4 года назад

      not in POC code

    • @JacobSorber
      @JacobSorber  4 года назад +2

      Yeah, in production code you should check to make sure malloc didn't return NULL. This code exists simply to provide a performance comparison.

    • @NoOne-uz4vs
      @NoOne-uz4vs 4 года назад

      @@JacobSorber Oh ok. Btw I'm a computer science student and your videos are an excellent complement. Thank you so much. Anyway, cheers from Brazil xD

    • @ABaumstumpf
      @ABaumstumpf 4 года назад

      Welll..... yeah no. Malloc does not actually return NULL just cause you are out of memory - that depends on the OS and linux - by default - uses an optimistic approach and only really allocates the memory when it is used.

    • @NoOne-uz4vs
      @NoOne-uz4vs 4 года назад

      @@ABaumstumpf What do you suggest we do then? I mean, how do we use malloc safely without checking for null?

  • @jarvenpaajani8105
    @jarvenpaajani8105 4 года назад +1

    Just allocate whole bunch of memory at once, no one is going to notice it. After all nowadays we are using text editors that take half of a gigabyte of ram...

  • @thomasipad7719
    @thomasipad7719 4 года назад

    Your daughter is gorgeous!!!