Quick Sort - Computerphile
HTML-код
- Опубликовано: 24 июн 2013
- Quick Sort is a popular sorting algorithm, but how does it work? Alex continues our exploration of sorting algorithms with a quick look at quick sort.
Original 'get sorted' film: • Getting Sorted & Big O...
Cookies: • Follow the Cookie Trai...
Alex's code that generated the data for the tests on the original 'get sorted' video:
github.com/apinkney97/Sorts
Alex's graph of all the results from his tests on the original 'get sorted' video: eprg.org/allplots.pdf
/ computerphile
/ computer_phile
This video was filmed and edited by Sean Riley.
Computer Science at the University of Nottingham: bit.ly/nottscomputer
Computerphile is a sister project to Brady Haran's Numberphile. See the full list of Brady's video projects at:periodicvideos.blogspot.co.uk/...
Quicksort explained quickly.
You made it so simple compared to my professor !
God he's an angel. Why do professors make everything so needlessly complex. Thank u very much
OMG! Finally someone that simplifies this algorithm. THANK YOU!
3 minutes and it fixed me... you're awesome
gotta love the beauty of recursion
This does go some way to explaining how quick sort works, but doesn't really show why it's quick. At uni, what they did, was put together a little graphical demonstration, with roughly a hundred items, and had them bubble sort, and a couple of other sorts, then quick sort. Or rather, they ran it once with about ten items per list, and there was no real difference, in fact, quick sort wasn't all that quick. But once the number of items became non trivial, quick sort very quickly demonstrated it was orders of magnitude faster. It was quite striking how much faster it was than anything else.
Quicksort is also a beautiful recursive algorithm :)
While you can understand bubblesort after finishing intro to programming, not even necessarily object oriented programming, you need something like algorithms and data structures plus a lot of thinking to understand the beauty of the quicksort algorithm. Bubblesort is also beautiful in it's simplicity, and can still be nice with simple optimizations :)
This video explains quicksort very well and simply at a top level.
So I'm doing a uni project on sorting algorithms and asymptomatic notation, and I've been looking at code so long I was starting to forget what quicksort actually does, only that my recursive code worked. This video helped refresh me on quicksort so thank you!
Loving this new channel. Thank you Brady (and all those involved)!
Best explanation of the recursive nature behind the quicksort algorithm that I've found so far on RUclips. Thx well done.
this video explanation is so simple and has clear visualization than any other quicksort algorithm video i have ever watch
Fantastic fast, clear, visual explanation. I feel like so many overcomplicate these concepts. Thank you so much.
This is just brilliant.
Thank you guys for that, I was having a hard time with it before I check your video.
This is why I love this channel. Thank you!
Once, when I was young and designing sorting for a biology professor, to sort out his research data (ant populations) I took the most acclaimed sorting algorithm, quicksort at the time and wrote it into the program. And it promptly crashed due to stack overflow. Ok, it was back in the days of PC XT:s when both memory and stack were available in very limited supply. I switched to an non-recursive algorithm shellsort, which was quite sufficient. Problem solved.
Anyway, even later when the computer memory and stack were available in much larger chunks, I found out that the theoretically superior (recursive! Ooh how sexy) algorithm is not necessarily the fastest one in real world situations. I clocked both algorithms on real data. I was writing programs for real people doing real work, not for a theoretical situation. So I mostly shunned quicksort and used shellsort or mergesort instead.
Yes, the real-life situations are often like this: you have tens or hundreds of thousands of records (or millions) that are already sorted, and you want to add some additional elements (in the order of tens or hundreds) to these.
This is the best explanation on quick sort that I've found so far in the internet
been looking for quick sort tutorial for my mid test, and this is what I definitely need!
You should always choose the center-most element as the pivot for the off-chance the list is already (somewhat) sorted, and if the list is random then the center element is still as good of a choice as any other.
i feel like this is the most logical and well mannered argument on youtube.
well done, honestly.
youtube could use more of this.
BEST quick sort video EVER. FInally understand it 100% and MY MIND HAS BEEN BLOWN.
Thank you so much, this was a great help for my CS revision :)
This is one of the best tutorial on quick sort. Thank you.
Brilliant illustrated!
Best quicksort explanation I've seen.
Best explanation of quicksort I have found.
The best explanation on RUclips, thanks a alot !!
Well, YT less so. Most TV, defiantly. This channel, and it's relatives, actually take us step by step. Which is nice. As all levels can enjoy, just moving in and starting from the step they are happy with and progressing from there. :)
Computerphile? :) Cool!
I just discovered this channel. I think it's right down my alley.
The best Brady channel besides sixtysymbols !
This helped me out. Thanks!
Such a simple explanation! Thanks!
This Alex guy is really good at this.
Well done Alex!
that was an awesome tutorial so easy to understand and so simple damn!
I know how to code it, what I mean is: this channel is called computerphile, the really interesting part about the sorting is how the computer does it (for the people that does not know how it's done).
The way I learned Quicksort, you choose the center element or (if the number of elements is even) the one to the left of the center as the pivot element. Sure, in the worst case, that is always the highest element in the list, but that is much more unlikely than being fed a sorted list in which case choosing the center element gives you the best results.
Bubblesort and selection sort can be done in place, with a constant amount of memory used to keep up with the parameters of the algorithm. They are O(1) in space complexity.
Merge sort uses O(n) memory. You would think it would be more, but you can free up memory you aren't using as you go along.
Quick sort is also O(n) in space complexity, but there are some in-place implementations that reduce this to O(log n) (you still have to keep up with information about partitions).
Another optimization (for very large lists) is to use quicksort first to some depth (usually log(n)), and then use another sort such as merge sort, or one (that hasn't been talked about yet) such as heap sort or insertion sort (this one bad on big unsorted lists but is really good on partially sorted lists).
in cases where I have implemented quicksort, typically I have used the value from the middle of the list, mostly because it tends to behave better in cases where the list is already (or is nearly) sorted.
picking the first or last item will tend to result in the worst-case with nearly sorted lists (and/or trigger falling back to a slower sorting algorithm).
IIRC, once looking at a C library qsort, it grabbed the first/last/middle values, and picked the middle value of these.
Once again, excellent work with the animations! Please do the spanning tree algorithm! :-)
So does that mean one can use bubble sort until a certain number n, before switching over to merge sort? For an algorithm that accounts for both case advantages?
Beautiful explanation.
they did, there is a plot in the video description, the two quick sort algorithms are fastest, then comes merge sort, then heap sort and then all the other algorithms
Computers generally are built so you have to know the address of the memory you need to access before you can find out what it contains. It is like having to go to a person's address before you know who they are. Memory that can be accessed based on what it contains is more expensive, but it is used in some parts of the computer that need to be very fast (like the cache or virtual memory tables)
And now onto Randomized Quicksort! You choose the pivot at random, which, in practical uses gives better results than the one on the start or the end, or even the approximate middle.
really nice and simple explanation! :D thanks
mind blowned. Best explanation
The best quicksort tutorial
Thank you, that was precisely what I looking for!
These videos have been about comparison based sort in memory that is accessed by address instead of by content. If you can access by content or you do something other than compare numbers then it is a different kind of sort and you get different algorithms.
If you modify your algorithm to find the lowest element (rather than just blindly using 0, 1, 2, 3 etc), put that at the start, then find the next lowest, put that after it, etc until you get to the end of the list, you have a selection sort which runs in n^2.
I'm really liking these videos so far. I'm hoping we'll get videos on all sorts of different types of algorithms in the future.
Thank you! Now I can sort my books more easily
Pick three random items from the and sort them. The middle one becomes the pivot item. This produces a 14% improvement over the first/last/single-random performance for Quicksort because of balancing.
If you detect a duplicate in your three items, you can change the normal comparison operation for the next pass to optimize for three partitions (lt, =, gt) rather than the normal two partitions. This way, your duplicates will be in the same (middle) partition and not need any further sorting.
I haven't had to write many sorting algorithms lately, though I have been doing plenty of hand-sorting. For that I use a variation on a merge-sort.
Of course it is. If you're sorting a big database or anything that doesn't fit in 8gb. But what he meant is that it's interesting to know if an algorithm needs a separate list to move things to or is able to sort things in-place. If I remember correctly merge sort requires a second list of keys when quick sort sorts in-place with only the program stack as extra memory used. When sorting big loads of data you want no extra memory used. The less data the less important the extra memory usage is.
The "List" exists in memory, it is computationally impossible using standard computers to do comparisons on things in memory, they have to be in a "register" or using some special port to something like a MathCoProcessor/GPU to do "direct" (as far as progam is concerned) comparisons. So, reason they do one-and-one compares is due to need to load data into TWO separate registers of appropriate type and then compare them. Some advanced architectures allow for memory pointer+size comparisons.
Is there any video planed about the Bit Coin? I would love to hear about that.
In computer science, log(n) typically means the base 2 logarithm of n, which would be computed as log(n) / log(2) on a typical calculator. When an algorithm is log(n), it means that to make a linear increase in running time, you need an exponential increase in problem size. A simple way of understanding it is, the more items there are, the more effective the algorithm becomes.
Your answer covered me completely. ;)
Is the pivot value always the right most digit in the initial and sub arrays? Or can it be chosen as the left most digit or any other initially?
I remember this from my Decision 1 exam. Good times.
Randomizing doesn't completely avoid o(n^2), just makes it less likely and prevents sabotage. But there are ways to avoid it altogether by smart selection of the pivot.
you have n operations for each level of the pyramid.Thus, the height of the pyramid is very important, ranging from log(n) to n. This gives a best case of n.log(n) and a worst case of n². The calculation of the average case is a lot more complex, but it also results in n.log(n)
Great video!
Can you do more algorithms?
Memory consumption (IIRC):
Merge Sort: O(n)
Quick Sort: O(1)
Quick sort can be implemented "in place". This means you don't even need to allocate a second array to copy the values to.
I've written a quick sort, so I know how it is implemented. I recall it took a couple hundred lines to do it. But that was many years ago. I might write a shorter one now. I'm sure there are examples on the web.
Once you've sorted a list, it makes it a lot easier to search it. I tried to use one to make it easier to find a cutoff in an integration. You can sort tasks based on priority and then do them in a meaningful order w/out a searching for the next one. I'm sure others will have more uses.
That wildly varies. It mostly depends on which structure you use to represent data (arrays are often sorted with quicksort, linked lists with mergesort, heaps with heapsort etc..).
Lots of programming languages also use hybrid approaches. For example, Java and Python use an algortihm called timsort for sorting arrays, which is a mix between merge- and insertionsort, and also checks for already sorted sublists in your list to save time.
tl;dr a lot of them.
I literally sat a computing exam a week ago. This video would have been SO USEFUL then.
I have no idea what you mean by "stop when the index of the itemFromLeft > index of itemFromRight". if we're using the index of the items after they are swapped as in the second swap wouldnt we stop after the first swap?
thanks for the clear explanation.
reminds me of the days of decision maths, the different types of quick sort, boubble sort and bin packing
Knowing all the different sort algorithms is definitely the key to being a great computer scientist.
Why else would people go on about them so much?
(...continued) Granted, modern processors do do more things in parallel. Pipelines, superscalar architecture, etc., are tools they use to parallelize what is written in a non-parallel way. But none of it changes the algorithms themselves, which are still written essentially in the sequential paradigm. Even programs with parallelism have mostly sequential parts; otherwise it becomes difficult to make them work correctly. Not to mention parallel processing facilities are usually quite limited.
Oh great you could have put this video up before our computing exam.
in what scenario would one prefer quicksort to merge sort? What is the purpose of quicksort if merge sort exists?
everytime you transvert the list in quicksort the pivot is saved on the stack (stack alocation are fast), where mergesort have to allocate memory on the heap for the temporary lists...
even though they are both O(n log(n)), you should think of them more like
k1 * n * log(k2 * n)) where k1 and k2 are constants related to the computations done each step, k1 and k2 are simply smaller for quick sort...
If you only slightly alter his "idea", you get what's called a radix sort or counting sort. The fastest possible sorting algorithm, O(n) in every case. It's widely used in computer graphics and physics simulations, as we speak. The drawback is that it works only with plain numbers of reasonable size. You can't sort say strings of text with it.
@dkraid - it means the opposite of exponential. Exponential gets very big very quickly: 2, 4, 8, 16, 32, 64, 128, 512, 1024, 2048 etc. log gets big very slowly: 1, 1.58, 2, 2.32, 2.58, 2.80, 3, 3.16, 3.23, 3.46 etc.
In big O notation the actual base of the log isn't important because the log of all bases are a constant multiplier to each other.
You keep using green as if it was your favorite color much more than any other!
Quick sort's worst case can be mitigated by choosing a better pivot. Instead of just picking the left or right-most items, you can randomly sample x items from the list, and take the median of all sampled values. Obviously if you sample too much you might as well do a different sort, but 3-5 values is almost always enough.
Also, you can check to see if the list is sorted before beginning. This adds (+ n) to the complexity, which is insignificant for large n.
I may be imagining things, but I'm sure a while ago Numberphile or some similar channel did a video on it and showed why the previous one will get full.
It's not the only reason. You can easily use your own stack-like heap allocator, which will work for O(1), no problem. The bottleneck of modern computers is memory access. For better performance, you should use CPU-cache effeciently. Each recursive call of Quick Sort is a single pass through linear array of data, which is good. Merge sort is similiar, yet you will get much more cache-misses and your CPU will have to wait data from slow RAM.
It would work for smaller lists of numbers but say you had a large list(lets say 100 or so) the time it took would have a significant impact as with most sorting algorithms. But if you want to use a sorting algorithm you obviously use the one thats going to be the most efficient to sort your list of numbers.
I think the best way to understand how computers do calculations is basically to look at how a mechanical computer works, because digital computers work, in principle, the same way just that they use electric currents instead of physical objects.
Video on Heap Sort would be great!
Quick Sort is usually... quicker on real data. In practice, we Quick Sort array, until recursion depth is less then log(n) and then shift to something more consistent, like Heap Sort or Merge Sort. This bring us the speed of QS in average and help to avoid worst case scenario. It's calles Timsort, afaik.
You're considering only moving a card as an operation, but reading the rank of the card is an operation too. You still need to read all the value of each card for each level of the pyramid: more levels => more reads.
if the ideal is picking a pivot that is roughly in the middle of the range of values, would it be worth it from a time perspective, to run through the list first and make a track of the lowest and the highest value? then go through the list to find the item with the closest value to the "ideal" that you could work out from there and make that the pivot? it would add a few more steps and probably make the best case a bit worse (2n? 3n?) but it would make the worst case ALOT better..
Yes it is, you still need to transverse it to perform a task and that has costs involved.
One good software design principal is to find balance in resource usage, on consoles for example you can't just throw more memory in the system you must do the "best" with what you got.
+1, superb explanation, I am no sorting expert so I tried to guess why to the best of my ability, your explanation makes perfect sense, thanks :)... I am a python programmer so I rarely think about efficiency :D
You have it right. What is making it appear faster is the fact that you have fewer unique things than actual things. If you have m unique things and n things to sort, you have worst case of m*n. If m=n, then your back to n^2. In this case, I could beat both by taking a census and reproducing the list sorted. But both of these algorthims would only work if you knew what might be in the list or if finding that out was not too costly.
Basically: Just google "[language] beginner tutorial". You have to choose what language to start with, of course. I feel like a lot of people will disagree with me, but I'd recommend vb.net, because it doesn't have a ton brackets and stuff to worry about, so I feel like it's more similar to a human language than others, making it easier to learn for a beginner. Another advantage is that you only have to download one (1) program to get started.
in the other video, he talks about how the complexity of these algorithms is polynomial. Quicksort is n*log(n) just like Mergesort, but the other factors are very small, which makes it very fast. If you want to read about algorithms which have nice complexity classes look at Smoothsort and Timsort =)
That's the great thing about quicksort. It's quite easy to do it "in-place" (with only constant memory other than the input). While there are solutions to do mergesort in-place, they're rather complicated and don't give you O(n*log(n)) time. Bubblesort is of course in-place. Wikipedia has a great list at Sorting_algorithm#Comparison_of_algorithms
but keep in mind that the software is getting more and more complex and does stuff you couldn't think about 20 years ago.
I'm not saying all that performance goes into "quality - tasks" that enhance the experience (probably mostly programming inefficiencies), but the overall result of the power is performance where it counts (scientifically at least)
Learned this in Data Structures at university :D
ok here is something i was wondering for a long time: "How do I shuffle a deck of cards, that is sorted, so it looks most random?"
A video on processors would be great!!!
can you do a video on heap sort and sorts used in parallel computing?