Getting Sorted & Big O Notation - Computerphile

  • Published: 23 Nov 2024

Comments • 541

  • @bumblinggimp
    @bumblinggimp 10 years ago +597

    "you'd have to be quite a good programmer to work out how to do something that runs that badly..."
    Fantastic quote!

  • @ShootMyMonkey
    @ShootMyMonkey 11 years ago +208

    An interesting consideration is the mechanism of storage. At one defense contractor I worked at, we had machines with no real RAM to speak of, and even transient data would be stored on tape media. Because of that, a ping-pong bubble sort was always the fastest method in practice because it dealt with adjacent elements (didn't have to deal with the seek times). Because of the linear seek times, it saved time by actually doing work as the tape wound one way or the other.

  • @Computerphile
    @Computerphile  11 years ago +54

    Really glad you are liking it - there are PLENTY more to come!
    PS: Tell your mates about us!
    >Brady

  • @tapiok
    @tapiok 10 years ago +108

    Nice that you included bogo sort!

  • @Computerphile
    @Computerphile  11 years ago +13

    Indeed it was not me... Most of the computerphile videos are made by Sean Riley (we always say who made the film in the full video description if you want to double check)
    I'll still be making a few myself and come along for the odd interview (just because I like being involved!) - but Sean is the main man and you can usually assume he is the man behind the camera/edit!
    >Brady

  • @tobywong2657
    @tobywong2657 8 years ago +241

    To this day, he is still random sorting...

  • @jayvkman
    @jayvkman 5 years ago +31

    Good video, but I'd like to challenge Alex's theory at 4:50 that smaller lists are "nearly sorted and therefore BubbleSort is faster" when he's using uniform random data. The data rather suggests a tipping point between bubble and merge somewhere around 20~40 elements, which is about when you'd outgrow the fastest levels of the CPU cache. That most likely explains why bubble sort is initially much faster: doing 1000 swaps on 32 elements in L1 cache is less CPU work than a handful of recursive method calls in the JVM, each allocating 2 arrays and copying data.
    It should remind everyone that Big-O is just theory ignoring how computers really work. These days cache misses dominate performance, unlike in the 90's when multiplications were slow and memory was fast.
    I'd also like to point out that his approach to micro-benchmarking Java code is prone to producing incorrect data. For small input sizes he is running in interpreted mode (the JIT requires ~10k iterations), and you don't want to call `nanoTime()` in a tight loop: it's an expensive call and will dominate the cost for smaller input sizes. Look into JMH next time you want to benchmark in Java.

  • @Mike10131994
    @Mike10131994 11 years ago +1

    Quick sort does this:
    1) Choose a random pivot; the pivot is used to compare against the other numbers in the list
    2) Create 2 lists, "greater" and "less"
    3) Go through each number (except the pivot) in the list. If it's greater than the pivot, add it to "greater". If it's less than or equal, add it to "less"
    4) Recursion! Basically, repeat steps 1-3 on "less" and on "greater", then join them up: sorted "less", then the pivot, then sorted "greater". It's hard to explain, but relatively easy to code
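    The steps above can be sketched in Python; this is a minimal, non-in-place version (the function name is just for illustration):

```python
import random

def quicksort(items):
    # Base case: lists of 0 or 1 elements are already sorted.
    if len(items) <= 1:
        return items
    # 1) Choose a random pivot.
    pivot = items[random.randrange(len(items))]
    rest = list(items)
    rest.remove(pivot)  # take one copy of the pivot out of the list
    # 2) + 3) Partition the remaining numbers around the pivot.
    less = [x for x in rest if x <= pivot]
    greater = [x for x in rest if x > pivot]
    # 4) Recurse on both halves, then glue: sorted less + pivot + sorted greater.
    return quicksort(less) + [pivot] + quicksort(greater)

print(quicksort([3, 1, 4, 1, 5, 9, 2, 6]))  # → [1, 1, 2, 3, 4, 5, 6, 9]
```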

  • @diedie5
    @diedie5 11 years ago +8

    This video would have helped me a lot back when I was in simple algorithms. You should do more videos on algorithms. Maybe Quicksort, I always thought it was beautiful how well it worked.

  • @govindsharma9871
    @govindsharma9871 5 years ago +30

    It's astounding to see how common it is for these professors or engineers to always have a Rubik's cube on their desk.

  • @Computerphile
    @Computerphile  11 years ago +16

    Thanks... Sean, unlike me, knows how to use After Effects...
    >Brady

  • @Skeluz
    @Skeluz 11 years ago +1

    I can't wait for more on sorting. I know it very well but it is very nice to hear someone explain it so well in just 10 minutes.

  • @JPlatGaming
    @JPlatGaming 11 years ago +6

    So far I've liked all the Computerphile videos, but I think this is one of the better ones. I'm a 16 year old self taught programmer and I think this channel is a great way of introducing new people to computing. Great job Brady!

  • @Mike10131994
    @Mike10131994 11 years ago

    If he were to explain it with pictures like he did these, it would be incredibly easy to understand. Sure, the concept of a pivot and recursion might be slightly more difficult than these concepts, but that's why we're here!

  • @orbital1337
    @orbital1337 11 years ago

    Just a few months ago we were doing sorting algorithms in my CS course and I was bored enough to implement 20 different sorting algorithms and benchmark them. Here's my top 8 algorithms for 1 mil elements: 1. Bit-adaptive radix sort (can do 200 mil numbers in 2 seconds on my PC) 2. Flashsort 3. Introsort 4. Mergesort 5. Quicksort 6. Shellsort (very simple to implement) 7. Heapsort 8. Smoothsort (look it up, it's really cool).

  • @WannabeTheAwesomest
    @WannabeTheAwesomest 11 years ago

    That is the most mindblowing computer-related thing I've heard in a long time.

  • @severynkozak9131
    @severynkozak9131 11 years ago

    Bubble sort doesn't rely on a list's last element being the largest. Every time you parse the list (pass over every element), the largest element within the unsorted portion (the left) is moved into the sorted portion (the right). Thus the list is built up from the right, with the largest values placed first.

  • @thresholdofdave
    @thresholdofdave 11 years ago

    That paper they write on takes me back to my primary school days. Nice touch. Well done

  • @thecassman
    @thecassman 11 years ago

    Makes a huge huge difference in real programming. The software i am coding has an "industry standard" to meet which is to process 1 million records of 200 fields inside 15 minutes.
    15mins = 900 seconds to do 1 million, which is less than 1ms per record.
    Shaving fractions of ms off of processing times is incredibly important at times.

  • @Computerphile
    @Computerphile  11 years ago +1

    It will be covered soon! >Sean

  • @XenoxHawk
    @XenoxHawk 11 years ago

    Algorithm complexity is something I have wanted to learn about for a while, and this video has given me a basic look. Keep up the good work!

  • @lunafoxfire
    @lunafoxfire 11 years ago

    I really liked how long and in-depth this episode was. I noticed that most of the Computerphile episodes were pretty short.

  • @Niosus
    @Niosus 11 years ago

    You can look it up yourself if you want. The idea behind the algorithm is what matters; the actual implementation is basically turning what he said into code. An algorithm shouldn't be seen as a piece of code, but as a general strategy to solve a problem.

  • @Rpleech1
    @Rpleech1 11 years ago

    Please don't stop putting videos up on Computerphile!
    This is my favorite channel Brady!

  • @MasterHigure
    @MasterHigure 11 years ago

    Bubble sort is the simplest one to implement, and a good introduction to algorithms in general. It also serves as an excellent example of how two algorithms that might both seem more or less equally efficient to untrained eyes at first glance can actually perform very differently.

  • @MySchari
    @MySchari 11 years ago

    Thank you for making this video. This shows that the audience is really important to you guys.

  • @rageagainstthebath
    @rageagainstthebath 11 years ago

    Gnome sort, as far as I can recall, is about stepping one position back after each swap, and that seems to make more sense than starting over from the beginning.
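    That back-step idea fits in a few lines of Python (a minimal sketch, names mine):

```python
def gnome_sort(a):
    """Sort list a in place by swapping out-of-order neighbours
    and stepping back one position after each swap."""
    i = 0
    while i < len(a):
        if i == 0 or a[i - 1] <= a[i]:
            i += 1          # in order: step forward
        else:
            a[i - 1], a[i] = a[i], a[i - 1]
            i -= 1          # after a swap, step back (not to the start)
    return a
```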

  • @iabervon
    @iabervon 11 years ago +1

    I want to see radix sort, and a discussion of how you can beat the theoretical limits if you're willing to break the rules. (If you have a limited number of values, you can sort really fast by putting the cards in piles by number and never comparing them with each other.) A lot of the biggest improvements in CS come from solving the problem you actually have to solve, rather than the general case.

  • @GuilhermeTeixeira
    @GuilhermeTeixeira 3 years ago

    This channel is so precious

  • @phyyl
    @phyyl 11 years ago

    we could compute the time to compute for each method, take the best one before sorting!! what an awesome sorting algorithm!

  • @kenkku
    @kenkku 11 years ago

    So to answer your question, in terms of how algorithms are usually compared, no. Algorithms are analyzed with a slightly simplified way to calculate the number of actions to take - an algorithm that does 2*n actions and an algorithm that does 20*n actions are considered to have the same running time (n operations). The other one is 10 times slower, but it is ALWAYS 10 times slower - unlike in the case of bubble sort and merge sort, where bubble sort just gets worse and worse.

  • @kenkku
    @kenkku 11 years ago

    The speed is affected by the number of operations done. Questions and actions both take a constant amount of time. The time taken by different actions and questions might not be the same, but the same kind of action always takes the same time. Algorithms can be analyzed to find out the minimum, maximum and average number of actions needed. Even though merge sort might make fewer actions than bubble sort, it's initially slower due to its actions taking more time, but with large inputs it still wins.

  • @MrAskolein
    @MrAskolein 11 years ago

    The best sorting algorithms use recursion. The video was excellent. Your channel is getting great

  • @xtieburn
    @xtieburn 11 years ago

    For those wanting a lovely visual way of seeing sorting algorithms, the appropriately named sorting-algorithms site has them all for comparison. Its problem sizes can't get too large, but it's one of the best references for the various cases I know of.

  • @DFYX
    @DFYX 11 years ago

    Btw: thumbs up for including BogoSort. There's probably only one sorting algorithm that's even crazier: Intelligent Design Sort. It says something like "Look at your data. Some higher being has decided that this is the order you need. Therefore consider it sorted" Intelligent Design Sort uses O(1) time (constant time) in all cases.

  • @CanOzmaden
    @CanOzmaden 11 years ago +1

    I would like to see more on programming languages, their history, pros and cons, basic abilities of each, thank you!

  • @yannickbernaerts2325
    @yannickbernaerts2325 11 years ago

    Indeed, one extra statement isn't going to affect the performance in a negative manner. There are, by the way, a lot of little tweaks you could use to speed up bubble sort (on some occasions). You could go from left to right, then switch and sort from right to left, then back from left to right, and so on.

  • @alcesmir
    @alcesmir 11 years ago

    Recursive thinking doesn't always require recursive functions. Mergesort for example can easily be rewritten using loops, even though the algorithm uses recursive concepts (as in reducing a problem to simpler subproblems).

  • @iabervon
    @iabervon 11 years ago

    In fact, "Timsort", which is the built-in library sort in a bunch of languages, is a combination of merge sort and insertion sort (which is kind of a better-organized bubble sort that does the same number of comparisons, but only O(n) swaps). It's particularly worthwhile for lists that are either already in order or nearly in order, where it has great performance.

  • @IceMetalPunk
    @IceMetalPunk 11 years ago

    "n" is the number of items in the list. The overall efficiency is the function of "n" that tells you the total number of steps in the algorithm. For Bubble Sort, worst-case it has to swap every pair of items on every pass. Whenever it does, it needs to do another pass. So that's n passes and n swaps each--n^2. Merge sort splits the list in half each time, so the total number of splits is log-base-2 of n. Then it compares each element together for each pair, so that's n comparisons. So n x log n.

  • @ManojGudi
    @ManojGudi 11 years ago

    I never understood merge sort till today. Thanks!

  • @Uejji
    @Uejji 11 years ago

    Bubble sort will always be relevant, because it's easy to remember, easy to write and relatively quick on extremely small datasets, along with having extremely low overhead.

  • @Computerphile
    @Computerphile  11 years ago +5

    Hi, we will put the links in the description when they are available - at the time of writing this, the two videos 'teased' at the end are not yet available and the annotation simply says 'coming soon' - hope that helps >Sean

  • @DFYX
    @DFYX 11 years ago

    That could take a while. They use a lot of different techniques combined and a lot of them need a quite solid understanding of how data structures, parallel computing and networks work. Have a look at lectures like "Advanced Data Structures" and "Parallel Algorithms" if you're still interested.

  • @SkyeWintrest
    @SkyeWintrest 11 years ago

    I rate radix sort above the sorting algorithms brought up in this video. It works for pretty much all numbers that a computer can use, and also for characters (since those can be cast as integers).
    Of course, it won't work when you can only use a method that compares two objects at a time, since it doesn't work that way, but still.

  • @Computerphile
    @Computerphile  11 years ago

    Quicksort will be covered soon! >Sean

  • @MrMagic1163
    @MrMagic1163 11 years ago

    Every one of these algorithms should be taught, together with its advantages and disadvantages. And yes, bubble sort has its advantages, but not for most developers.
    - it is a stable sort
    - uses no extra memory
    - the algorithm is very small and easy
    And it surely should not be about whether an algorithm is more understandable. This is about the hard facts about different solutions to a problem: you need to understand them, and then you should use the best solution for your situation.

  • @crazedvidmaker
    @crazedvidmaker 11 years ago

    I think it's important to note that in merge sort, the lists aren't split in the way that he split them: the first half is sorted before the second half moves at all.

  • @papalevies
    @papalevies 11 years ago

    Nicely explained video. Computing science concepts like Big O notation and sorting are really interesting.

  • @Garrison86
    @Garrison86 2 years ago

    Wow! Great demonstration illustrating Big O notation

  • @meanbark1976
    @meanbark1976 11 years ago

    It would be awesome if someone came up with English subtitles for these videos (just like Numberphile!), because English is not my mother tongue and you all have funny accents. Thanks for the channel!!

  • @gorgolyt
    @gorgolyt 11 years ago +1

    Rumour has it he is still there to this day.

  • @landonkryger
    @landonkryger 11 years ago

    You should do another one that shows why n*lg(n) is the best you can hope for in a sorting algorithm. Then show why/how counting sort can beat it with O(n) but isn't usable on many sorting problems.
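    Counting sort beats the comparison bound by exploiting a known, small key range instead of comparing elements; a minimal sketch in Python (the function name and `max_key` parameter are mine):

```python
def counting_sort(a, max_key):
    """Sort non-negative integers <= max_key without comparing elements:
    tally how many times each key occurs, then replay the tallies in order.
    Runs in O(n + k) time, where k = max_key."""
    counts = [0] * (max_key + 1)
    for x in a:
        counts[x] += 1
    out = []
    for key, c in enumerate(counts):
        out.extend([key] * c)  # emit each key as many times as it occurred
    return out
```

    This only works when keys are (or map to) a bounded range of integers, which is exactly why it isn't usable on many sorting problems.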

  • @MrShysterme
    @MrShysterme 11 years ago

    People don't watch these videos to learn sorting algorithms or serious computer science. Please look at the general flow of the channel. Perhaps they want to show the most intuitive, yet rather slow, algorithm and then compare it to others. It is so intuitive, it is the one I reinvented when I first started programming as a kid and needed to sort a list, but no one had told me yet that you could reuse the code of others.

  • @gellarieb88
    @gellarieb88 11 years ago

    Algorithms and data structures exam next Friday. These sorting videos are pretty good for understanding. Thanks, Brady :)

  • @AbhishekSingh-dt6br
    @AbhishekSingh-dt6br 3 years ago

    I appreciate the effort you are putting to explain this.

  • @Charlie2531games
    @Charlie2531games 11 years ago

    What about a 2^b^n sort? It checks what the range of values for the data type is, generates a list of random numbers equal to the size of the list, and checks if it both has the same values as the original list, and makes sure that it's sorted. It would probably be called the Coincidence Sort.

  • @Niosus
    @Niosus 11 years ago

    Yes! Many databases use techniques very similar to what's used in heapsort. If you keep your heap structure you can search, add and remove in log(n) (worst case) time. Well, it's not exactly the same way, but it is extremely similar and uses the same idea behind it. As a sorting algorithm it loses out against mergesort in the real world, but the idea of using a tree structure gets used all over the place.
    Look up red-black trees, they're quite ingenious ;)

  • @akowalz
    @akowalz 11 years ago

    Oh man, you did sorting without going through Quicksort! That's the most famous algorithm in computer science!!! Great video regardless, too bad this wasn't here last semester when I was taking Data Structures and Algorithms.

  • @PixelOutlaw
    @PixelOutlaw 9 years ago +3

    I don't understand why merge sort is always given an initial recursive decomposition step. You can form the initial base lists by simply collecting the elements 2 at a time, with the last element left as a singleton should the number of elements be odd.
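    That bottom-up approach can be sketched in Python: start from trivially sorted runs and repeatedly merge adjacent pairs, with no recursive splitting phase (names are mine):

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

def mergesort_bottom_up(a):
    # Start with each element as its own (trivially sorted) run...
    runs = [[x] for x in a]
    # ...then repeatedly merge adjacent pairs of runs; an odd run
    # left at the end simply carries over to the next round.
    while len(runs) > 1:
        runs = [merge(runs[i], runs[i + 1]) if i + 1 < len(runs) else runs[i]
                for i in range(0, len(runs), 2)]
    return runs[0] if runs else []
```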

  • @autom4
    @autom4 11 years ago

    also, they probably wanted to explain some other sorting algorithms first, because quicksort uses one of these O(n²) algorithms when its current part is under a certain size, because as you saw, n² can be quicker than others when the size is low enough... at least that's what I remember from my algorithms and datastructures classes :)

  • @seigeengine
    @seigeengine 11 years ago

    yes, you can do something like if (#elements < someNumber) {bubblesort;} else {mergesort;} if that's what you mean, whether initially, or at some small enough list size to speed up the lower collections being sorted. Hybrid sorts are a good way of speeding the process up, but are generally harder to implement, require more knowledge, and aren't necessarily worth it unless you're doing a lot of sorting.
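    A sketch of that hybrid idea in Python; the cutoff of 32 is an arbitrary illustration (real libraries tune this empirically), and the names are mine:

```python
CUTOFF = 32  # illustrative threshold, not a tuned value

def bubble_sort(a):
    """Simple O(n^2) sort; fine for very small lists."""
    a = list(a)
    for end in range(len(a) - 1, 0, -1):
        for i in range(end):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a

def hybrid_sort(a):
    """Merge sort that hands small sublists to bubble sort."""
    if len(a) < CUTOFF:
        return bubble_sort(a)
    mid = len(a) // 2
    left, right = hybrid_sort(a[:mid]), hybrid_sort(a[mid:])
    # Standard merge of the two sorted halves.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]
```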

  • @championkoppie
    @championkoppie 11 years ago

    There is also parallel sorting
    Shuffle the stack at random. Then look if the stack is sorted.
    If it isn't sorted, you go to a Parallel Universe where it is sorted.

  • @Rubberbandman1991
    @Rubberbandman1991 11 years ago

    Yes, it's easy enough to tell on its own, but you can make an algorithm that determines when to use bubble or merge sort, then passes the list to the right one.

  • @mcvoid1
    @mcvoid1 11 years ago

    It's good to see some actual computer science on Computerphile. Also, quicksort rules, merge sort drools!

  • @terrahyde217
    @terrahyde217 9 years ago +1

    Sorting algorithms are tough to make O(n!), but there are algorithms for other tasks that are worse than O(n!) by a long shot. Things like O(n^n) have come up in my own mathematical research (case generation and resolution for a problem).

  • @stormthehouse
    @stormthehouse 11 years ago

    Xkcd actually talked about that a bit, how Google tends to fudge the numbers.
    To anyone interested in reading about it, the article is posted in the xkcd blog archives under February 2011

  • @Vulcapyro
    @Vulcapyro 11 years ago

    There are many many algorithms that have steps that seem spread like a tree. If an algorithm runs in some sort of logarithmic time, guessing it uses a divide-and-conquer approach somewhere is probably accurate.

  • @gulllars
    @gulllars 11 years ago

    A bit later on, after other videos, a video on the for loop and its OO sibling foreach, how it leads to spatial and temporal locality in memory, and how it can be unwound to be executed in parallel depending on branching with speculative execution could be interesting. There's easily 5 minutes there, and if you take your time to cover it in some depth with simple explanation you could get 15 minutes that would be interesting ;)

  • @IstasPumaNevada
    @IstasPumaNevada 11 years ago

    That is correct, because it will always do the same number of steps for the same sized list (illustrated by the best and worst case scenarios both being n × log n).

  • @DjVortex-w
    @DjVortex-w 11 years ago +5

    Bubble sort is a zombie. It will never die, no matter how many times you try to kill it. It always comes back.
    At the end of the universe, when the heat death is almost complete, and there's almost nothing left in the entire universe... someone will be there teaching bubble sort. It will never, ever, ever die. Even though it's the most horrible sorting algorithm that anybody has ever devised.

  • @TASDeign
    @TASDeign 11 years ago

    That's entirely up to the implementation details. If you implement it with a DFS, then you're right. But you could just as easily implement a BFS that would sort in the manner he described.

  • @godwinikape8888
    @godwinikape8888 2 years ago

    Thank you so much for this video. It really helped me understand the concepts of these algorithms.

  • @AndreiMuntean0
    @AndreiMuntean0 11 years ago

    This was awesome! Do one on quick sort too! Also do one where you answer the question "How does Google return search results from billions and billions of websites almost instantly?"

  • @XiremaXesirin
    @XiremaXesirin 11 years ago

    It's not precisely the same thing, but there is an algorithm called Timsort which searches for "runs" of the list that are already semi-sorted, and cuts down on the number of steps in performing the sort. In the best case it runs in O(n) time, and in the average case it gets close to that but is still O(n log n). It's worth noting that information theory has proven that no comparison-based sorting algorithm can beat O(n log n) in the average case.

  • @SapphireCrook
    @SapphireCrook 11 years ago

    This was nice. Would've been nicer to have had this before my workshop about collections and sorting, but whatever.
    Shuffle sorts are the best kind of sorts, because at least you had fun!

  • @PJSimply
    @PJSimply 11 years ago

    If you want to try a fast and easy-to-implement sorting algorithm, take a look at comb sort. It is a little-known algorithm, basically BubbleSort but with a decreasing gap between the compared elements, with performance close to quicksort's!
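    A minimal comb sort sketch in Python (the 1.3 shrink factor is the commonly quoted value; names are mine):

```python
def comb_sort(a):
    """Bubble sort with a gap between compared elements that
    shrinks by a factor of ~1.3 each pass, finishing as a
    plain bubble sort once the gap reaches 1."""
    a = list(a)
    gap, swapped = len(a), True
    while gap > 1 or swapped:
        gap = max(1, int(gap / 1.3))  # shrink the gap each pass
        swapped = False
        for i in range(len(a) - gap):
            if a[i] > a[i + gap]:
                a[i], a[i + gap] = a[i + gap], a[i]
                swapped = True
    return a
```

    The large early gaps quickly move small values away from the end of the list ("turtles"), which is exactly what makes plain bubble sort slow.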

  • @Viniter
    @Viniter 11 years ago

    Actually from a complexity standpoint it doesn't really matter whether you write an algorithm in recursive or iterative form. You can have Merge Sort in recursive form and it will still have complexity O(n*log n). Iterative implementation usually performs better, but the difference doesn't scale.

  • @Friedeggonheadchan
    @Friedeggonheadchan 11 years ago

    Excellent video! Please continue on this route here.

  • @Turcian
    @Turcian 11 years ago

    Actually the best running times in "real life" are obtained by algorithms that combine two(or possibly more) sorting algorithms like quick-sort and heap-sort(the combination is called intro-sort), or merge-sort and insertion sort(that's called Tim sort).

  • @AlexanderTrefz
    @AlexanderTrefz 11 years ago

    You still have to query with incredible speed. Even with indices you need multiple systems in place to make that search fast; search is still one of the hardest problems in the industry because it all revolves around cache invalidation, which is super hard.

  • @Artificial-Insanity
    @Artificial-Insanity 11 years ago

    Wow, this is exactly what I thought you should do next.

  • @comochinganconesto
    @comochinganconesto 11 years ago

    [100, 100, 3, 1]

  • @bidule22222
    @bidule22222 11 years ago

    The kraft paper has been replaced by the classic and very appropriate dot-matrix printing paper. Excellent!

  • @MalliMania
    @MalliMania 11 years ago

    Sorting Algorithms are a thing of beauty!

  • @AngusPearson
    @AngusPearson 11 years ago

    I was just about to comment about it being called bogo sort, but he got it in! I'm happy now.

  • @MorRobots
    @MorRobots 11 years ago

    You should talk about linked list sorting using bidirectional lists.
    It's a super fast sort for large sets of data.
    You push your values to the right or left of existing values by link swapping, and you keep track of both ends of the list and the halfway and quarter-way marks depending on how big the list gets; using those points you can quickly reduce the number of comparisons.

  • @awirstam
    @awirstam 8 years ago +54

    How does a computer "know" that 3 is a lower number than 4?

  • @MrWazzup987
    @MrWazzup987 11 years ago

    I'm new to programming, so looking at the end code with knowledge of purpose and process is very helpful

  • @JerehmiaBoaz
    @JerehmiaBoaz 11 years ago

    Basically (without the algebra): with bubble sort you have to run through a list of n items n times in the worst case, and on every run you have to compare n items, so the cost is n * n, i.e. O(n^2).
    With merge sort you also compare all n items on every run, but because you split the list in two each time there are only log2(n) runs, so the cost is n * log2(n). In big O notation you're not interested in the base of the logarithm, so that's O(n log n).

  • @ZackNagaich
    @ZackNagaich 11 years ago

    You are correct, it is generally implemented with recursion.

  • @ashwith
    @ashwith 11 years ago

    Amazing work with the animations! Makes merge sort so much easier to understand!

  • @Beer_Dad1975
    @Beer_Dad1975 11 years ago

    Exactly - it has to be a pretty specific reason. The systems I work on use them to generate formula based pricing, where once price can be based on another price, based on another price etc - the recursion moves down through the price hierarchy - it would have been very hard to achieve the same result without it. Sure can be confusing trying to work out pricing errors though.

  • @autom4
    @autom4 11 years ago

    oh no no, don't get me wrong, I completely agree that they could explain those too. Also agreeing that they are useful! I was just not sure if you realised that they are only for known compositions as you said :)
    Didn't mean to break down your comment, just trying to support, let others learn maybe some more new things :)
    Sorry if my reaction seemed a little bit aggressive or something... :P

  • @ChitrankDixit
    @ChitrankDixit 10 years ago

    nice video and explanation for those who are new to Algorithm Analysis

  • @Mike10131994
    @Mike10131994 11 years ago

    But programming is one of the greatest and most significant portions of the study, so there should very well be something about it. Not necessarily tutorials, but maybe how compilers/assemblers work to translate what we say into what the computer understands.

  • @MatejNanut
    @MatejNanut 11 years ago

    Did you know there's a logarithmic-time algorithm for getting a given Fibonacci number? I was amazed when I found out; I hadn't even considered that it could be done in better than linear time.
    Also, you can write it recursively in certain languages and still get linear time, by using memoization.
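    The logarithmic-time trick is usually done with the "fast doubling" identities F(2k) = F(k)·(2F(k+1) − F(k)) and F(2k+1) = F(k)² + F(k+1)²; a sketch in Python (names are mine):

```python
def fib(n):
    """Fast-doubling Fibonacci: O(log n) arithmetic operations,
    since n is halved at each recursive step."""
    def doubling(k):
        # Returns the pair (F(k), F(k+1)).
        if k == 0:
            return (0, 1)
        a, b = doubling(k // 2)   # a = F(m), b = F(m+1), m = k // 2
        c = a * (2 * b - a)       # F(2m)
        d = a * a + b * b         # F(2m+1)
        return (d, c + d) if k % 2 else (c, d)
    return doubling(n)[0]

print([fib(i) for i in range(10)])  # → [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```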

  • @ianfoote
    @ianfoote 11 years ago

    That general idea can be a good strategy, though I don't know if that specific combination is used. I know that a combination of mergesort for large n, and insertion sort for small n is used by some programming languages (e.g. python) along with some other optimisations. Look up timsort if you're interested in the details.

  • @corsairrus
    @corsairrus 11 years ago +1

    Actually, there is a slightly better version of bubble sort: on each pass, you can stop at the position where the last swap of the previous pass happened.
    Still, algorithms like heap sort and quicksort are better (both go in O(n*log(n))),
    but in the worst case (rare) quicksort is in O(n^2).
    BTW the last algorithm (with random permutation of objects) is also called "shotgun sort".
    Are you also going to talk about insertion sort, enumeration sort and gnome sort?
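    That optimisation in Python (a minimal sketch; shrink each pass's scan to where the previous pass made its last swap):

```python
def bubble_sort_opt(a):
    """Bubble sort that only scans up to the last swap of the
    previous pass: everything beyond that point is already in
    its final place, so it never needs to be compared again."""
    a = list(a)
    n = len(a)
    while n > 1:
        last_swap = 0
        for i in range(n - 1):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                last_swap = i + 1
        n = last_swap  # next pass stops where the last swap happened
    return a
```

    On an already-sorted list this makes a single pass and stops, giving the O(n) best case.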

  • @Vospi
    @Vospi 9 years ago +12

    wow, that last algorithm is so amazing :)