This does go some way to explaining how quick sort works, but doesn't really show why it's quick. At uni, what they did, was put together a little graphical demonstration, with roughly a hundred items, and had them bubble sort, and a couple of other sorts, then quick sort. Or rather, they ran it once with about ten items per list, and there was no real difference, in fact, quick sort wasn't all that quick. But once the number of items became non trivial, quick sort very quickly demonstrated it was orders of magnitude faster. It was quite striking how much faster it was than anything else.
So I'm doing a uni project on sorting algorithms and asymptotic notation, and I've been looking at code so long I was starting to forget what quicksort actually does, only that my recursive code worked. This video helped refresh me on quicksort so thank you!
Quicksort is also a beautiful recursive algorithm :) While you can understand bubblesort after finishing intro to programming, not even necessarily object oriented programming, you need something like algorithms and data structures plus a lot of thinking to understand the beauty of the quicksort algorithm. Bubblesort is also beautiful in its simplicity, and can still be nice with simple optimizations :) This video explains quicksort very well and simply at a top level.
The way I learned Quicksort, you choose the center element or (if the number of elements is even) the one to the left of the center as the pivot element. Sure, in the worst case, that is always the highest element in the list, but that is much more unlikely than being fed a sorted list in which case choosing the center element gives you the best results.
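A minimal sketch of the centre-pivot variant described here, assuming an in-place sort over a Python list. The function name `quicksort_middle` is made up for illustration, and the partition scheme (two indices walking inwards) is one common choice, not necessarily the one the commenter learned.

```python
def quicksort_middle(items, lo=0, hi=None):
    """Sort items[lo..hi] in place, using the centre element as the pivot."""
    if hi is None:
        hi = len(items) - 1
    if lo >= hi:
        return
    # For an even number of elements this picks the one left of the centre.
    pivot = items[(lo + hi) // 2]
    i, j = lo, hi
    while i <= j:
        # Skip elements that are already on the correct side of the pivot.
        while items[i] < pivot:
            i += 1
        while items[j] > pivot:
            j -= 1
        if i <= j:
            items[i], items[j] = items[j], items[i]
            i += 1
            j -= 1
    # Everything in lo..j is <= pivot, everything in i..hi is >= pivot.
    quicksort_middle(items, lo, j)
    quicksort_middle(items, i, hi)
```

On an already sorted list the centre element is the true median, so each split is as even as it can be.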
Bubblesort and selection sort can be done in place, with a constant amount of memory used to keep up with the parameters of the algorithm. They are O(1) in space complexity. Merge sort uses O(n) memory. You would think it would be more, but you can free up memory you aren't using as you go along. Quick sort is also O(n) in space complexity, but there are some in-place implementations that reduce this to O(log n) (you still have to keep up with information about partitions).
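One way to get the O(log n) stack bound mentioned above is to recurse only into the smaller partition and loop over the larger one. The sketch below assumes a simple last-element pivot; `partition` and `quicksort_small_first` are hypothetical names, not from any library.

```python
def partition(items, lo, hi):
    """Partition items[lo..hi] around items[hi]; return the pivot's final index."""
    pivot = items[hi]
    store = lo
    for k in range(lo, hi):
        if items[k] < pivot:
            items[k], items[store] = items[store], items[k]
            store += 1
    items[store], items[hi] = items[hi], items[store]
    return store


def quicksort_small_first(items, lo=0, hi=None):
    """In-place quicksort whose recursion depth stays around log2(n)."""
    if hi is None:
        hi = len(items) - 1
    while lo < hi:
        p = partition(items, lo, hi)
        if p - lo < hi - p:
            # Left side is smaller: recurse into it, keep looping on the right.
            quicksort_small_first(items, lo, p - 1)
            lo = p + 1
        else:
            # Right side is smaller (or equal): recurse into it, loop on the left.
            quicksort_small_first(items, p + 1, hi)
            hi = p - 1
```

The running time can still be quadratic with a bad pivot; only the extra memory for the call stack is bounded.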
Once, when I was young and designing sorting for a biology professor, to sort out his research data (ant populations), I took the most acclaimed sorting algorithm at the time, quicksort, and wrote it into the program. And it promptly crashed due to stack overflow. Ok, it was back in the days of PC XTs when both memory and stack were available in very limited supply. I switched to a non-recursive algorithm, shellsort, which was quite sufficient. Problem solved. Anyway, even later when the computer memory and stack were available in much larger chunks, I found out that the theoretically superior (recursive! Ooh how sexy) algorithm is not necessarily the fastest one in real world situations. I clocked both algorithms on real data. I was writing programs for real people doing real work, not for a theoretical situation. So I mostly shunned quicksort and used shellsort or mergesort instead. Yes, the real-life situations are often like this: you have tens or hundreds of thousands of records (or millions) that are already sorted, and you want to add some additional elements (in the order of tens or hundreds) to these.
I know how to code it, what I mean is: this channel is called computerphile, the really interesting part about the sorting is how the computer does it (for the people who do not know how it's done).
Well, YT less so. Most TV, definitely. This channel, and its relatives, actually take us step by step. Which is nice. As all levels can enjoy, just moving in and starting from the step they are happy with and progressing from there. :)
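For the "mostly sorted already, add a few more" situation, a full re-sort is often unnecessary. A small sketch using Python's standard `bisect` module; the helper name `add_to_sorted` is an assumption for illustration.

```python
import bisect


def add_to_sorted(sorted_records, new_items):
    """Insert each new item into an already sorted list, keeping it sorted."""
    for item in new_items:
        # Binary search finds the position; the insert shifts later elements.
        bisect.insort(sorted_records, item)
    return sorted_records
```

Each insert still has to shift elements, so this pays off when the number of new items is small compared with the existing list.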
You should always choose the center-most element as the pivot for the off-chance the list is already (somewhat) sorted, and if the list is random then the center element is still as good of a choice as any other.
I've never seen someone explain in the most efficient and effective manner. British - the most intelligent people in this universe. Teaching everyone in the world in a common language that most can understand.
Computers generally are built so you have to know the address of the memory you need to access before you can find out what it contains. It is like having to go to a person's address before you know who they are. Memory that can be accessed based on what it contains is more expensive, but it is used in some parts of the computer that need to be very fast (like the cache or virtual memory tables)
In computer science, log(n) typically means the base 2 logarithm of n, which would be computed as log(n) / log(2) on a typical calculator. When an algorithm is log(n), it means that to make a linear increase in running time, you need an exponential increase in problem size. A simple way of understanding it is, the more items there are, the more effective the algorithm becomes.
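The calculator trick above can be checked in a few lines. The function name `log2_via_calculator` is made up for this sketch.

```python
import math


def log2_via_calculator(n):
    """Base-2 logarithm computed as log(n) / log(2), as on a pocket calculator."""
    return math.log(n) / math.log(2)


# Multiplying the problem size by about 1000 adds only about 10 to the logarithm.
for n in (1_024, 1_048_576, 1_073_741_824):
    print(n, round(log2_via_calculator(n)))
```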
they did, there is a plot in the video description, the two quick sort algorithms are fastest, then comes merge sort, then heap sort and then all the other algorithms
That depends on the system you want to use to sort, and on the data you expect to get as input. Are you expecting totally random input? Will your input be ridiculously large, or will it be a rather small data set? Does your system have lots of memory and computing power? Do you have the opportunity to create some extra hardware, which will handle the sorting? Do you have access to a network that you could use for distributed sorting algorithms? >There is no "one size fits all" sorting in reality
Of course it is. If you're sorting a big database or anything that doesn't fit in 8gb. But what he meant is that it's interesting to know if an algorithm needs a separate list to move things to or is able to sort things in-place. If I remember correctly merge sort requires a second list of keys when quick sort sorts in-place with only the program stack as extra memory used. When sorting big loads of data you want no extra memory used. The less data the less important the extra memory usage is.
Another optimization (for very large lists) is to use quicksort first to some depth (usually log(n)), and then use another sort such as merge sort, or one (that hasn't been talked about yet) such as heap sort or insertion sort (this one bad on big unsorted lists but is really good on partially sorted lists).
If you modify your algorithm to find the lowest element (rather than just blindly using 0, 1, 2, 3 etc), put that at the start, then find the next lowest, put that after it, etc until you get to the end of the list, you have a selection sort which runs in n^2.
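The selection sort described here could look roughly like this (a sketch; the name `selection_sort` is just descriptive):

```python
def selection_sort(items):
    """Repeatedly find the lowest remaining element and move it to the front."""
    n = len(items)
    for start in range(n - 1):
        lowest = start
        # Scan the unsorted tail for the smallest element.
        for k in range(start + 1, n):
            if items[k] < items[lowest]:
                lowest = k
        items[start], items[lowest] = items[lowest], items[start]
    return items
```

The nested loops make about n*(n-1)/2 comparisons whatever the input, which is where the n^2 comes from.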
These videos have been about comparison based sort in memory that is accessed by address instead of by content. If you can access by content or you do something other than compare numbers then it is a different kind of sort and you get different algorithms.
And now onto Randomized Quicksort! You choose the pivot at random, which, in practical uses gives better results than the one on the start or the end, or even the approximate middle.
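A possible sketch of randomized quicksort, assuming an in-place sort where the randomly chosen pivot is first swapped to the end of the range. The name `randomized_quicksort` is illustrative only.

```python
import random


def randomized_quicksort(items, lo=0, hi=None):
    """In-place quicksort with a uniformly random pivot for each partition."""
    if hi is None:
        hi = len(items) - 1
    if lo >= hi:
        return
    # random.randint is inclusive at both ends.
    r = random.randint(lo, hi)
    items[r], items[hi] = items[hi], items[r]
    pivot = items[hi]
    store = lo
    for k in range(lo, hi):
        if items[k] < pivot:
            items[k], items[store] = items[store], items[k]
            store += 1
    # Put the pivot between the smaller and the larger elements.
    items[store], items[hi] = items[hi], items[store]
    randomized_quicksort(items, lo, store - 1)
    randomized_quicksort(items, store + 1, hi)
```

No fixed input order can reliably trigger the worst case here, although many equal elements would still unbalance this simple two-way partition.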
you have n operations for each level of the pyramid. Thus, the height of the pyramid is very important, ranging from log(n) to n. This gives a best case of n.log(n) and a worst case of n². The calculation of the average case is a lot more complex, but it also results in n.log(n)
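Written as recurrences, under the simplifying assumption that partitioning n items costs exactly n operations and that T(1) is negligible, the two extremes look like this:

```latex
% Balanced split: about log2(n) levels, each costing n operations
T_{\text{best}}(n) = 2\,T_{\text{best}}(n/2) + n
  \;\Rightarrow\; T_{\text{best}}(n) \approx n \log_2 n

% Worst split: n levels, costing n, n-1, ..., 1 operations
T_{\text{worst}}(n) = T_{\text{worst}}(n-1) + n
  \;\Rightarrow\; T_{\text{worst}}(n) = \sum_{k=1}^{n} k = \frac{n(n+1)}{2}
```

For n = 1024 that is roughly 10,240 operations in the balanced case against 524,800 in the worst case.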
Quick sort's worst case can be mitigated by choosing a better pivot. Instead of just picking the left or right-most items, you can randomly sample x items from the list, and take the median of all sampled values. Obviously if you sample too much you might as well do a different sort, but 3-5 values is almost always enough. Also, you can check to see if the list is sorted before beginning. This adds (+ n) to the complexity, which is insignificant for large n.
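Both ideas fit in a few lines. This is only a sketch: `is_sorted` and `sampled_pivot` are hypothetical helper names, and the default sample size of 5 is simply taken from the "3-5 values" suggestion above.

```python
import random
import statistics


def is_sorted(items):
    """One linear pass: True if no element is smaller than its predecessor."""
    return all(items[k] <= items[k + 1] for k in range(len(items) - 1))


def sampled_pivot(items, sample_size=5):
    """Pick a pivot as the median of a small random sample of the list."""
    k = min(sample_size, len(items))
    sample = random.sample(items, k)
    # median_low always returns an element that is actually in the sample.
    return statistics.median_low(sample)
```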
(...continued) Granted, modern processors do do more things in parallel. Pipelines, superscalar architecture, etc., are tools they use to parallelize what is written in a non-parallel way. But none of it changes the algorithms themselves, which are still written essentially in the sequential paradigm. Even programs with parallelism have mostly sequential parts; otherwise it becomes difficult to make them work correctly. Not to mention parallel processing facilities are usually quite limited.
Pick three random items from the list and sort them. The middle one becomes the pivot item. This produces a 14% improvement over the first/last/single-random performance for Quicksort because of balancing. If you detect a duplicate in your three items, you can change the normal comparison operation for the next pass to optimize for three partitions (lt, =, gt) rather than the normal two partitions. This way, your duplicates will be in the same (middle) partition and not need any further sorting.
The "List" exists in memory, it is computationally impossible using standard computers to do comparisons on things in memory, they have to be in a "register" or using some special port to something like a MathCoProcessor/GPU to do "direct" (as far as the program is concerned) comparisons. So, the reason they do one-and-one compares is the need to load data into TWO separate registers of appropriate type and then compare them. Some advanced architectures allow for memory pointer+size comparisons.
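A rough sketch combining the two ideas: median-of-three-random pivot selection and three partitions (lt, =, gt). To keep it short, it assumes building new lists is acceptable and always uses three partitions, instead of switching only when a duplicate is detected. The name `quicksort_3way` is illustrative.

```python
import random


def quicksort_3way(items):
    """Return a new sorted list, partitioning into less / equal / greater."""
    if len(items) <= 1:
        return list(items)
    # Pick three random items, sort them, and use the middle one as the pivot.
    picks = sorted(random.choice(items) for _ in range(3))
    pivot = picks[1]
    less = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    greater = [x for x in items if x > pivot]
    # Duplicates of the pivot sit in the middle and need no further sorting.
    return quicksort_3way(less) + equal + quicksort_3way(greater)
```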
Memory consumption (IIRC): Merge Sort: O(n) Quick Sort: O(1) Quick sort can be implemented "in place". This means you don't even need to allocate a second array to copy the values to.
It's not the only reason. You can easily use your own stack-like heap allocator, which will work for O(1), no problem. The bottleneck of modern computers is memory access. For better performance, you should use the CPU cache efficiently. Each recursive call of Quick Sort is a single pass through a linear array of data, which is good. Merge sort is similar, yet you will get many more cache misses and your CPU will have to wait for data from slow RAM.
in cases where I have implemented quicksort, typically I have used the value from the middle of the list, mostly because it tends to behave better in cases where the list is already (or is nearly) sorted. picking the first or last item will tend to result in the worst-case with nearly sorted lists (and/or trigger falling back to a slower sorting algorithm). IIRC, once looking at a C library qsort, it grabbed the first/last/middle values, and picked the middle value of these.
If you only slightly alter his "idea", you get what's called a radix sort or counting sort. The fastest possible sorting algorithm, O(n) in every case. It's widely used in computer graphics and physics simulations, as we speak. The drawback is that it works only with plain numbers of reasonable size. You can't sort say strings of text with it.
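A minimal counting sort sketch, assuming the inputs are non-negative integers no larger than a known `max_value` (both the function name and that parameter are assumptions for illustration). Its running time is proportional to the number of items plus the size of the value range, which is why the "reasonable size" restriction matters.

```python
def counting_sort(numbers, max_value):
    """Sort non-negative integers in the range 0..max_value by counting them."""
    counts = [0] * (max_value + 1)
    for n in numbers:
        counts[n] += 1
    result = []
    # Walk the possible values in order and emit each as often as it occurred.
    for value, count in enumerate(counts):
        result.extend([value] * count)
    return result
```

No two elements are ever compared with each other, which is how it sidesteps the n*log(n) limit of comparison sorts.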
It would work for smaller lists of numbers but say you had a large list (let's say 100 or so) the time it took would have a significant impact as with most sorting algorithms. But if you want to use a sorting algorithm you obviously use the one that's going to be the most efficient to sort your list of numbers.
Randomizing doesn't completely avoid O(n^2), just makes it less likely and prevents sabotage. But there are ways to avoid it altogether by smart selection of the pivot.
I've written a quick sort, so I know how it is implemented. I recall it took a couple hundred lines to do it. But that was many years ago. I might write a shorter one now. I'm sure there are examples on the web. Once you've sorted a list, it makes it a lot easier to search it. I tried to use one to make it easier to find a cutoff in an integration. You can sort tasks based on priority and then do them in a meaningful order w/out a searching for the next one. I'm sure others will have more uses.
You have it right. What is making it appear faster is the fact that you have fewer unique things than actual things. If you have m unique things and n things to sort, you have worst case of m*n. If m=n, then you're back to n^2. In this case, I could beat both by taking a census and reproducing the list sorted. But both of these algorithms would only work if you knew what might be in the list or if finding that out was not too costly.
@dkraid - it means the opposite of exponential. Exponential gets very big very quickly: 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048 etc. log gets big very slowly: 1, 1.58, 2, 2.32, 2.58, 2.81, 3, 3.17, 3.32, 3.46 etc. In big O notation the actual base of the log isn't important because logs in different bases differ only by a constant multiplier.
Yes it is, you still need to traverse it to perform a task and that has costs involved. One good software design principle is to find balance in resource usage, on consoles for example you can't just throw more memory in the system, you must do the "best" with what you've got.
Basically: Just google "[language] beginner tutorial". You have to choose what language to start with, of course. I feel like a lot of people will disagree with me, but I'd recommend vb.net, because it doesn't have a ton of brackets and stuff to worry about, so I feel like it's more similar to a human language than others, making it easier to learn for a beginner. Another advantage is that you only have to download one (1) program to get started.
Quick Sort is usually... quicker on real data. In practice, we Quick Sort the array until the recursion depth exceeds about log(n), and then switch to something more consistent, like Heap Sort or Merge Sort. This gives us the speed of QS on average and helps to avoid the worst-case scenario. It's called introsort, afaik (Timsort is a different hybrid, built on merge sort and insertion sort).
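A depth-limited quicksort with a heap sort fallback could be sketched like this. The limit of 2*log2(n) is a commonly used value and is an assumption here, the pivot choice is deliberately naive to show that the fallback still bounds the damage, and the names `heapsort` and `depth_limited_sort` are made up for the example.

```python
import heapq
import math


def heapsort(items):
    """Heap sort built on the standard heapq module; returns a new list."""
    heap = list(items)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]


def depth_limited_sort(items, depth_limit=None):
    """Quicksort that gives up and heap sorts once it recurses too deep."""
    if len(items) <= 1:
        return list(items)
    if depth_limit is None:
        # Assumed limit: twice the base-2 logarithm of the input size.
        depth_limit = 2 * max(1, int(math.log2(len(items))))
    if depth_limit == 0:
        return heapsort(items)
    pivot = items[0]  # naive first-element pivot, worst on sorted input
    less = [x for x in items[1:] if x < pivot]
    rest = [x for x in items[1:] if x >= pivot]
    return (depth_limited_sort(less, depth_limit - 1)
            + [pivot]
            + depth_limited_sort(rest, depth_limit - 1))
```

On an already sorted list the quicksort part makes almost no progress per level, so after a couple of dozen levels the remainder is handed to heap sort instead of recursing thousands of times.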
though your counter argument is a valid point, you gotta remember that we're only gonna get more populous, and as we start to live longer and prosper (had to say it), then we're gonna just build higher up, and eventually we'll hit the roof
My teacher tasked me to sort 1000000 random points of data using various sort algorithms. It took over 20 minutes to do it with selection sort. It's a matter of seconds with quick sort.
every time you traverse the list in quicksort the pivot is saved on the stack (stack allocations are fast), whereas mergesort has to allocate memory on the heap for the temporary lists... even though they are both O(n log(n)), you should think of them more like k1 * n * log(k2 * n), where k1 and k2 are constants related to the computations done each step, k1 and k2 are simply smaller for quick sort...
That's the great thing about quicksort. It's quite easy to do it "in-place" (with only constant memory other than the input). While there are solutions to do mergesort in-place, they're rather complicated and don't give you O(n*log(n)) time. Bubblesort is of course in-place. Wikipedia has a great list at Sorting_algorithm#Comparison_of_algorithms
Quicksort works in-place, which means you don't need extra memory. It also rarely ends up in the worst case. Most of the time it is as fast or faster than mergesort. Mergesort guarantees a certain time, quicksort doesn't but it saves you memory and is 99.9999% of the time at least as fast. Every algorithm has their advantages and disadvantages :)
That wildly varies. It mostly depends on which structure you use to represent data (arrays are often sorted with quicksort, linked lists with mergesort, heaps with heapsort etc..). Lots of programming languages also use hybrid approaches. For example, Java and Python use an algorithm called timsort for sorting arrays, which is a mix between merge- and insertionsort, and also checks for already sorted sublists in your list to save time. tl;dr a lot of them.
yes. the cost of a few comparisons is probably small relative to the cost of picking a bad pivot. it doesn't actually necessarily often cost that much, as often in these sorts of cases, the need for a (potentially expensive) conditional jump can be optimized away.
but keep in mind that the software is getting more and more complex and does stuff you couldn't think about 20 years ago. I'm not saying all that performance goes into "quality - tasks" that enhance the experience (probably mostly programming inefficiencies), but the overall result of the power is performance where it counts (scientifically at least)
in the other video, he talks about how the complexity of these algorithms is polynomial. Quicksort is n*log(n) just like Mergesort, but the other factors are very small, which makes it very fast. If you want to read about algorithms which have nice complexity classes look at Smoothsort and Timsort =)
So does that mean one can use bubble sort until a certain number n, before switching over to merge sort? For an algorithm that accounts for both case advantages?
Check Wikipedia. I think you will find there the precise number of comparisons and memory operations. For Quick Sort, you will see something like 3nlogn + 4n comparisons and 2nlogn copies. In Merge Sort, you will need, like, 6nlogn + 3n copies and nlogn + 2n comparisons. Even though these algorithms are the same in Big O notation, Quick Sort still requires 2 times less work to perform. I hope this will make things clearer.
I have no idea what you mean by "stop when the index of the itemFromLeft > index of itemFromRight". if we're using the index of the items after they are swapped as in the second swap wouldn't we stop after the first swap?
yes. for most apps which get much past "dead trivial" memory use starts getting a lot more important. even for fairly trivial apps, newbie programmers will often do something seemingly trivial and then wonder where all the memory has gone. a few arrays here, a few strings there, and suddenly the app is out of memory...
Interesting. I guess the thing that's hardest to wrap my head around is how computers handle numbers. It's easy to understand a computer's mindlessness in relation to real-world concepts. It's much tougher to recognize that that carries over to numbers as well, since numbers are viewed as cold, concrete, and divested from reality, much like the computers that use them.
Big O notation hides the difference between (e.g.) 2n*log(n) and 6n*log(n). The second algorithm would be 3 times slower than the first, but both would be O(n*log(n)). Also, in the case of Quick Sort the worst case is very rare, especially if the pivot isn't chosen as naively as in the video.
The one thing I think this video didn't mention is that, while quicksort's asymptotic complexity is similar to merge and bubble sort, its worst case is far less likely to occur than in merge or bubble. So average case runtime will generally be faster for quicksort than merge or bubble.
For many small scale userland applications, it's not very important at all. However, many services, especially online services such as social networking sites and online search engines, deal with no less than petabytes of information. In these types of scenarios, if you can shave off even one byte of memory usage for each entry in a table, there's a potential savings of many terabytes of memory.
I may be imagining things, but I'm sure a while ago Numberphile or some similar channel did a video on it and showed why the previous one will get full.
I recently learned to my astonishment that finding the median can be done in O(n). Search term is "median of medians algorithm". I doubt that it's done very often though, since just randomizing the pivot is much easier and should almost always be faster. Plus, you need to select a pivot for that selection algorithm, too, so maybe you'd just delegate the problem.
It's all about memory and what's being processed. A program only processes one or two numbers at a time. As far as an algorithm is concerned, whatever instruction the computer is executing at the moment, the numbers that instruction is looking at are the only ones it "knows about" directly; the rest are stored away "out of mind", so to speak. Contrast the human brain which processes massive amounts of information simultaneously (and the processor is also the memory, distributed throughout).
I think the best way to understand how computers do calculations is basically to look at how a mechanical computer works, because digital computers work, in principle, the same way just that they use electric currents instead of physical objects.
O(n) for shuffling a list doesn't matter overall, as the shuffle + quicksort would be O(n + n * log n), which works out to be just O(n * log n). And Quicksort is generally faster than other algorithms on average. Picking a random pivot is better or equal to shuffling in any case. If you're working with something like a linked list, it's equivalent. With actual arrays it's O(1). I can't think of a reason why it would be preferable to shuffle a list as opposed to just picking a random pivot.
You're considering only moving a card as an operation, but reading the rank of the card is an operation too. You still need to read all the value of each card for each level of the pyramid: more levels => more reads.
Merge sort is also the fastest sorting method in real life (that I'm aware of). Whenever I have a stack of papers in random order that needs to be sorted, I sort small sets of three or more papers, then merge every two sets into one until the entire stack is in order. I'm so nerdy. O_o
Well, your point was valid. It was like I was saying not to use square wheels but you have suggested using pentagons and pentagons aren't squares. Neither is a good idea :)
Quicksort explained quickly.
You made it so simple compared to my professor !
God he's an angel. Why do professors make everything so needlessly complex. Thank u very much
OMG! Finally someone that simplifies this algorithm. THANK YOU!
3 minutes and it fixed me... you're awesome
BEST quick sort video EVER. Finally understand it 100% and MY MIND HAS BEEN BLOWN.
gotta love the beauty of recursion
i feel like this is the most logical and well mannered argument on youtube.
well done, honestly.
youtube could use more of this.
Fantastic fast, clear, visual explanation. I feel like so many overcomplicate these concepts. Thank you so much.
this video explanation is so simple and has clearer visualization than any other quicksort algorithm video i have ever watched
This is the best explanation on quick sort that I've found so far in the internet
best explanation of quicksort hands down
Loving this new channel. Thank you Brady (and all those involved)!
Best explanation of the recursive nature behind the quicksort algorithm that I've found so far on YouTube. Thx well done.
The best Brady channel besides sixtysymbols !
This is why I love this channel. Thank you!
This is just brilliant.
Thank you guys for that, I was having a hard time with it before I check your video.
Best explanation of quicksort I have found.
This is one of the best tutorial on quick sort. Thank you.
This Alex guy is really good at this.
Well done Alex!
The best quicksort tutorial
Best quicksort explanation I've seen.
Pick three random items from the and sort them. The middle one becomes the pivot item. This produces a 14% improvement over the first/last/single-random performance for Quicksort because of balancing.
If you detect a duplicate in your three items, you can change the normal comparison operation for the next pass to optimize for three partitions (lt, =, gt) rather than the normal two partitions. This way, your duplicates will be in the same (middle) partition and not need any further sorting.
The "List" exists in memory, it is computationally impossible using standard computers to do comparisons on things in memory, they have to be in a "register" or using some special port to something like a MathCoProcessor/GPU to do "direct" (as far as progam is concerned) comparisons. So, reason they do one-and-one compares is due to need to load data into TWO separate registers of appropriate type and then compare them. Some advanced architectures allow for memory pointer+size comparisons.
Memory consumption (IIRC):
Merge Sort: O(n)
Quick Sort: O(1) for the data itself, plus O(log n) on average for the recursion stack
Quick sort can be implemented "in place". This means you don't even need to allocate a second array to copy the values to.
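A minimal sketch of what "in place" looks like, assuming the last element of each range is used as the pivot (a simple partition scheme picked for brevity, not necessarily the one from the video). The name `quicksort_in_place` is made up for this example. No second array is allocated, although the recursion itself still uses some stack space.

```python
def quicksort_in_place(items, lo=0, hi=None):
    """Sort items[lo..hi] in place by swapping elements inside the same list."""
    if hi is None:
        hi = len(items) - 1
    if lo >= hi:
        return
    pivot = items[hi]          # assumption: last element of the range is the pivot
    boundary = lo              # items[lo:boundary] are all smaller than the pivot
    for i in range(lo, hi):
        if items[i] < pivot:
            items[i], items[boundary] = items[boundary], items[i]
            boundary += 1
    # Put the pivot between the two partitions.
    items[boundary], items[hi] = items[hi], items[boundary]
    quicksort_in_place(items, lo, boundary - 1)
    quicksort_in_place(items, boundary + 1, hi)
```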
I'm really liking these videos so far. I'm hoping we'll get videos on all sorts of different types of algorithms in the future.
It's not the only reason. You can easily use your own stack-like heap allocator, which will allocate in O(1), no problem. The bottleneck of modern computers is memory access. For better performance, you should use the CPU cache efficiently. Each recursive call of Quick Sort is a single pass through a linear array of data, which is good. Merge sort is similar, yet you will get many more cache misses and your CPU will have to wait for data from slow RAM.
Brilliantly illustrated!
in cases where I have implemented quicksort, typically I have used the value from the middle of the list, mostly because it tends to behave better in cases where the list is already (or is nearly) sorted.
picking the first or last item will tend to result in the worst-case with nearly sorted lists (and/or trigger falling back to a slower sorting algorithm).
IIRC, once looking at a C library qsort, it grabbed the first/last/middle values, and picked the middle value of these.
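A small sketch of that first/middle/last selection. The helper name `median_of_three_index` is hypothetical, and this is not the code of any real C library, just the idea.

```python
def median_of_three_index(items, lo, hi):
    """Return the index (lo, middle or hi) that holds the median of the three values."""
    mid = (lo + hi) // 2
    a, b, c = items[lo], items[mid], items[hi]
    if a <= b <= c or c <= b <= a:
        return mid
    if b <= a <= c or c <= a <= b:
        return lo
    return hi
```

On an already sorted range this returns the middle index, so the split is balanced instead of degenerate.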
Knowing all the different sort algorithms is definitely the key to being a great computer scientist.
Why else would people go on about them so much?
If you only slightly alter his "idea", you get what's called a radix sort or counting sort. The fastest possible sorting algorithm, O(n) in every case. It's widely used in computer graphics and physics simulations, as we speak. The drawback is that it works only with plain numbers of reasonable size. You can't sort say strings of text with it.
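For the curious, a minimal counting sort sketch, assuming the input is a list of non-negative integers no larger than a known `max_value` (the function name and that parameter are assumptions for illustration).

```python
def counting_sort(values, max_value):
    """Sort non-negative integers in the range 0..max_value without comparing them."""
    counts = [0] * (max_value + 1)
    for v in values:
        counts[v] += 1                 # take a "census" of each value
    result = []
    for value, count in enumerate(counts):
        result.extend([value] * count) # reproduce the list in order
    return result
```

The work is proportional to the number of items plus `max_value`, which is why it only pays off when the numbers are of reasonable size.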
It would work for smaller lists of numbers, but say you had a large list (let's say 100 or so): the time it took would have a significant impact, as with most sorting algorithms. But if you want to use a sorting algorithm, you obviously use the one that's going to be the most efficient to sort your list of numbers.
Randomizing doesn't completely avoid O(n^2), just makes it less likely and prevents sabotage. But there are ways to avoid it altogether by smart selection of the pivot.
I've written a quick sort, so I know how it is implemented. I recall it took a couple hundred lines to do it. But that was many years ago. I might write a shorter one now. I'm sure there are examples on the web.
Once you've sorted a list, it makes it a lot easier to search it. I tried to use one to make it easier to find a cutoff in an integration. You can sort tasks based on priority and then do them in a meaningful order without searching for the next one. I'm sure others will have more uses.
explained in 3 minutes what my professor couldn't explain in 30 minutes
I literally sat a computing exam a week ago. This video would have been SO USEFUL then.
You have it right. What is making it appear faster is the fact that you have fewer unique things than actual things. If you have m unique things and n things to sort, you have a worst case of m*n. If m=n, then you're back to n^2. In this case, I could beat both by taking a census and reproducing the list sorted. But both of these algorithms would only work if you knew what might be in the list or if finding that out was not too costly.
@dkraid - it means the opposite of exponential. Exponential gets very big very quickly: 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048 etc. log (base 2, starting from 2) gets big very slowly: 1, 1.58, 2, 2.32, 2.58, 2.81, 3, 3.17, 3.32, 3.46 etc.
In big O notation the actual base of the log isn't important, because logs in different bases differ from each other only by a constant factor.
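To spell out why the base drops out, here is the change-of-base rule with one worked case (the choice of bases 2 and 10 is just an example):

```latex
\log_a n = \frac{\log_b n}{\log_b a}
\quad\Longrightarrow\quad
\log_2 n = \frac{\log_{10} n}{\log_{10} 2} \approx 3.32\,\log_{10} n
```

The factor 3.32 does not depend on n, so O(log_2 n) and O(log_10 n) describe the same growth.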
been looking for quick sort tutorial for my mid test, and this is what I definitely need!
Yes it is, you still need to traverse it to perform a task and that has costs involved.
One good software design principle is to find balance in resource usage. On consoles, for example, you can't just throw more memory in the system; you must do the "best" with what you've got.
Basically: Just google "[language] beginner tutorial". You have to choose what language to start with, of course. I feel like a lot of people will disagree with me, but I'd recommend vb.net, because it doesn't have a ton of brackets and stuff to worry about, so I feel like it's more similar to a human language than others, making it easier to learn for a beginner. Another advantage is that you only have to download one (1) program to get started.
The best explanation on RUclips, thanks a lot!!
Quick Sort is usually... quicker on real data. In practice, we Quick Sort the array until the recursion depth exceeds roughly log(n), and then shift to something more consistent, like Heap Sort or Merge Sort. This gives us the speed of QS on average and helps avoid the worst-case scenario. It's called introsort, afaik.
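A rough sketch of that depth-limited scheme, which is usually known as introsort. The limit of 2 * log2(n), the middle-element pivot, and the helper names are assumptions for illustration; the heap sort fallback here leans on Python's `heapq` and copies the slice, so it is not strictly in place.

```python
import heapq
import math

def heapsort_range(items, lo, hi):
    """Fallback: sort items[lo..hi] with a binary heap (copies the slice)."""
    heap = items[lo:hi + 1]
    heapq.heapify(heap)
    for i in range(lo, hi + 1):
        items[i] = heapq.heappop(heap)

def introsort(items):
    """Quicksort that switches to heap sort once recursion gets too deep."""
    if len(items) > 1:
        depth_limit = 2 * int(math.log2(len(items)))
        _introsort(items, 0, len(items) - 1, depth_limit)

def _introsort(items, lo, hi, depth_left):
    if lo >= hi:
        return
    if depth_left == 0:
        heapsort_range(items, lo, hi)   # too deep: likely a bad pivot sequence
        return
    pivot = items[(lo + hi) // 2]
    i, j = lo, hi
    while i <= j:
        while items[i] < pivot:
            i += 1
        while items[j] > pivot:
            j -= 1
        if i <= j:
            items[i], items[j] = items[j], items[i]
            i += 1
            j -= 1
    _introsort(items, lo, j, depth_left - 1)
    _introsort(items, i, hi, depth_left - 1)
```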
Computerphile? :) Cool!
I just discovered this channel. I think it's right down my alley.
though your counter argument is a valid point, you gotta remember that we're only gonna get more populous, and as we start to live longer and prosper (had to say it), then we're gonna just build higher up, and eventually we'll hit the roof
Thank you so much, this was a great help for my CS revision :)
My teacher tasked me to sort 1000000 random points of data using various sort algorithms. It took over 20 minutes to do it with selection sort. It's a matter of seconds with quick sort.
every time you traverse the list in quicksort, the pivot is saved on the stack (stack allocations are fast), whereas mergesort has to allocate memory on the heap for the temporary lists...
even though they are both O(n log(n)), you should think of them more like
k1 * n * log(k2 * n), where k1 and k2 are constants related to the computations done at each step; k1 and k2 are simply smaller for quick sort...
That's the great thing about quicksort. It's quite easy to do it "in-place" (with only constant memory other than the input). While there are solutions to do mergesort in-place, they're rather complicated and don't give you O(n*log(n)) time. Bubblesort is of course in-place. Wikipedia has a great list at Sorting_algorithm#Comparison_of_algorithms
Quicksort works in-place, which means you don't need extra memory. It also rarely ends up in the worst case. Most of the time it is as fast or faster than mergesort.
Mergesort guarantees a certain time, quicksort doesn't but it saves you memory and is 99.9999% of the time at least as fast.
Every algorithm has its advantages and disadvantages :)
That wildly varies. It mostly depends on which structure you use to represent data (arrays are often sorted with quicksort, linked lists with mergesort, heaps with heapsort etc..).
Lots of programming languages also use hybrid approaches. For example, Java (for arrays of objects) and Python use an algorithm called timsort, which is a mix between merge- and insertionsort, and also checks for already sorted sublists in your list to save time.
tl;dr a lot of them.
yes. the cost of a few comparisons is probably small relative to the cost of picking a bad pivot.
it often doesn't cost that much anyway, since in these sorts of cases the need for a (potentially expensive) conditional jump can often be optimized away.
Oh great you could have put this video up before our computing exam.
I haven't had to write many sorting algorithms lately, though I have been doing plenty of hand-sorting. For that I use a variation on a merge-sort.
Thank you! Now I can sort my books more easily
but keep in mind that the software is getting more and more complex and does stuff you couldn't think about 20 years ago.
I'm not saying all that performance goes into "quality - tasks" that enhance the experience (probably mostly programming inefficiencies), but the overall result of the power is performance where it counts (scientifically at least)
in the other video, he talks about how the complexity of these algorithms is polynomial. Quicksort is n*log(n) just like Mergesort, but the other factors are very small, which makes it very fast. If you want to read about algorithms which have nice complexity classes look at Smoothsort and Timsort =)
So does that mean one can use bubble sort until a certain number n, before switching over to merge sort? For an algorithm that accounts for both case advantages?
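A sketch of that kind of hybrid, with one swap: it uses insertion sort for the small pieces, not bubble sort, since that is the more common choice for this job. The cutoff of 16 is an arbitrary assumption (it must be at least 1), and the function names are made up for illustration.

```python
def insertion_sort(items):
    """Simple quadratic sort; cheap for very short lists. Sorts in place and returns items."""
    for i in range(1, len(items)):
        current = items[i]
        j = i - 1
        while j >= 0 and items[j] > current:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = current
    return items

def hybrid_merge_sort(items, cutoff=16):
    """Merge sort that hands short sublists (length <= cutoff) to insertion sort."""
    if len(items) <= cutoff:
        return insertion_sort(list(items))
    mid = len(items) // 2
    left = hybrid_merge_sort(items[:mid], cutoff)
    right = hybrid_merge_sort(items[mid:], cutoff)
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```

The best cutoff depends on the machine and the data, so it is worth measuring instead of guessing.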
Check Wikipedia. I think you will find precise numbers of comparisons and memory operations there. For Quick Sort, you will see something like 3nlogn + 4n comparisons and 2nlogn copies. In Merge Sort, you will need something like 6nlogn + 3n copies and nlogn + 2n comparisons. Even though these algorithms are the same in Big O notation, Quick Sort still requires about half as much work. I hope this makes things clearer.
I have no idea what you mean by "stop when the index of the itemFromLeft > index of itemFromRight". If we're using the index of the items after they are swapped, as in the second swap, wouldn't we stop after the first swap?
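One way to read that stopping rule, as a sketch: after each swap both scans keep moving, and the loop ends only once the left scan has met or passed the right scan. This assumes the pivot has already been moved to the right end of the range; the variable names `left` and `right` stand in for the positions of itemFromLeft and itemFromRight, and the details may differ from the video.

```python
def partition(items, lo, hi):
    """Partition items[lo..hi] around the pivot stored at items[hi]; return its final index."""
    pivot = items[hi]
    left, right = lo, hi - 1
    while True:
        # itemFromLeft: first item from the left that is not smaller than the pivot
        while left <= right and items[left] < pivot:
            left += 1
        # itemFromRight: first item from the right that is not larger than the pivot
        while right >= left and items[right] > pivot:
            right -= 1
        if left >= right:      # the scans have met or crossed: stop
            break
        items[left], items[right] = items[right], items[left]
        left += 1              # keep scanning from the next positions
        right -= 1
    # Move the pivot to its final place, where the left scan stopped.
    items[left], items[hi] = items[hi], items[left]
    return left
```

So the comparison is between the current scan positions, not the positions the two items end up in after a swap.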
yes. for most apps which get much past "dead trivial" memory use starts getting a lot more important.
even for fairly trivial apps, newbie programmers will often do something seemingly trivial and then wonder where all the memory has gone.
a few arrays here, a few strings there, and suddenly the app is out of memory...
Interesting. I guess the thing that's hardest to wrap my head around is how computers handle numbers. It's easy to understand a computer's mindlessness in relation to real-world concepts. It's much tougher to recognize that that carries over to numbers as well, since numbers are viewed as cold, concrete, and divested from reality, much like the computers that use them.
Big O notation hides the difference between (e.g.) 2n*log(n) and 6n*log(n). The second algorithm would be 3 times slower than the first, but both would be O(n*log(n)).
Also, in the case of Quick Sort the worst case is very rare, especially if the pivot isn't chosen as naively as in the video.
The one thing I think this video didn't mention is that, while quicksort's asymptotic complexity is similar to merge and bubble sort, its worst case is far less likely to occur than in merge or bubble. So average case runtime will generally be faster for quicksort than merge or bubble.
reminds me of the days of decision maths, the different types of quick sort, bubble sort and bin packing
For many small scale userland applications, it's not very important at all. However, many services, especially online services such as social networking sites and online search engines, deal with no less than petabytes of information. In these types of scenarios, if you can shave off even one byte of memory usage for each entry in a table, there's a potential savings of many terabytes of memory.
I may be imagining things, but I'm sure a while ago Numberphile or some similar channel did a video on it and showed why the previous one will get full.
I recently learned to my astonishment that finding the median can be done in O(n). Search term is "median of medians algorithm". I doubt that it's done very often though, since just randomizing the pivot is much easier and should almost always be faster. Plus, you need to select a pivot for that selection algorithm, too, so maybe you'd just delegate the problem.
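For comparison, a sketch of the simpler randomized selection (often called quickselect), which finds the median in expected O(n) time. This is not the median of medians algorithm, which guarantees O(n) in the worst case, and the function name is an assumption for illustration.

```python
import random

def quickselect(values, k):
    """Return the k-th smallest element of values (k is 0-based), using random pivots."""
    pool = list(values)
    while True:
        pivot = random.choice(pool)
        smaller = [v for v in pool if v < pivot]
        equal = [v for v in pool if v == pivot]
        if k < len(smaller):
            pool = smaller                      # answer is among the smaller items
        elif k < len(smaller) + len(equal):
            return pivot                        # answer is the pivot itself
        else:
            k -= len(smaller) + len(equal)      # skip everything up to the pivot
            pool = [v for v in pool if v > pivot]
```

Calling `quickselect(values, len(values) // 2)` gives the median (the upper one when the length is even).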
It's all about memory and what's being processed. A program only processes one or two numbers at a time. As far as an algorithm is concerned, whatever instruction the computer is executing at the moment, the numbers that instruction is looking at are the only ones it "knows about" directly; the rest are stored away "out of mind", so to speak. Contrast the human brain which processes massive amounts of information simultaneously (and the processor is also the memory, distributed throughout).
I think the best way to understand how computers do calculations is basically to look at how a mechanical computer works, because digital computers work, in principle, the same way just that they use electric currents instead of physical objects.
This helped me out. Thanks!
Is there any video planned about Bitcoin? I would love to hear about that.
O(n) for shuffling a list doesn't matter overall, as the shuffle + quicksort would be O(n + n * log n), which works out to be just O(n * log n). And Quicksort is generally faster than other algorithms on average.
Picking a random pivot is better or equal to shuffling in any case. If you're working with something like a linked list, it's equivalent. With actual arrays it's O(1). I can't think of a reason why it would be preferable to shuffle a list as opposed to just picking a random pivot.
You're considering only moving a card as an operation, but reading the rank of the card is an operation too. You still need to read the value of each card for each level of the pyramid: more levels => more reads.
Merge sort is also the fastest sorting method in real life (that I'm aware of). Whenever I have a stack of papers in random order that needs to be sorted, I sort small sets of three or more papers, then merge every two sets into one until the entire stack is in order.
I'm so nerdy. O_o
that was an awesome tutorial so easy to understand and so simple damn!
Well, your point was valid. It was like I was saying not to use square wheels but you have suggested using pentagons and pentagons aren't squares. Neither is a good idea :)
You keep using green as if it was your favorite color much more than any other!