I love teaching this to my students as a guessing game.
I will think of a number from 1 to 100 and give them 8 guesses, but after each guess tell them if their guess was correct or lower/higher.
After a few rounds of playing, most people figure out that the optimal strategy is guessing somewhere in the middle, and essentially discover the algorithm on their own.
After that, we implement the guessing game on a computer.
I find this a very fun way to learn about binary search.
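Roughly, the game in Python looks like this (a minimal sketch, assuming the 1-100 range and 8 guesses described above; none of this is from the video itself):

import random

def guessing_game(low=1, high=100, max_guesses=8):
    secret = random.randint(low, high)
    for attempt in range(1, max_guesses + 1):
        guess = int(input(f"Guess {attempt}: "))
        if guess == secret:
            print("Correct!")
            return True
        print("Higher!" if guess < secret else "Lower!")
    print(f"Out of guesses, it was {secret}")
    return False

Eight guesses is enough because ceil(log2(100)) = 7, so halving the range each time always gets there with a guess to spare.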
i'm imagining doing that in a classroom, and... yeah, sounds fun! Cool!
I'm totally stealing this idea!
imagine your teacher naming himself craftmechanics6483 with a minecraft pfp on youtube XD
I've been doing this exact same lesson for years too! Once you've introduced variables, conditionals and loops, you have the basic building blocks of algorithms. Binary search with the guessing game is a nice, easy way to introduce this fundamental algorithm.
I audited MIT's 6.0001 (Introduction to Computer Science and Programming in Python) course through OCW and the professor did the exact same demo to introduce binary/bisection search!
These men make me feel as if I am in a coffee house with my best friends talking
Dr. Pound and Dr. Grimes from the numberphile videos are my absolute favorite!
Dr Pound sounds and acts identically to someone who went to my secondary school in London, cheeky lad in IT class who aced all his assignments while getting his computer to run games we weren't supposed to have access to. So, yes.
You spelt pub wrong.
Anybody who sits around and talks like this has no friends in the first place and that's just science!
@@njd9828 you spelt pubic wrong
Dr. Pound is excellent and entertaining as always. I found myself smiling and giggling throughout the whole video due to the way he handles his small mistakes and jokes around. Yet he also imparts wisdom!
The point made at 14:30 is arguably the most important part of this video. Specifically, it is less important to know how to write a particular algorithm from scratch than it is to know that different algorithms have different tradeoffs. Knowing how to pick the best algorithm (and data structures) for a particular situation is more important than being able to implement an algorithm on a whiteboard. One of my interview questions explores whether an applicant understands that there are many sorting and searching algorithms, and that the nature of the data being manipulated, along with other constraints, is important in deciding which algorithm to use.
One thing to add: Once the data is sorted, and you need to add new data to it, binary search can actually tell you where you need to insert the new data in order to keep the data sorted. Therefore, the real issue is getting the data sorted in the first place (as mentioned in the video), but once it is sorted, binary search helps you find stuff fast and helps you keep your data sorted even when you modify it.
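In Python, for instance, the standard bisect module does exactly this (a small sketch; the numbers are just illustrative):

import bisect

data = [3, 8, 15, 16, 23, 42]
pos = bisect.bisect_left(data, 20)   # index where 20 belongs to keep the order
data.insert(pos, 20)                 # or in one call: bisect.insort(data, 20)
print(data)                          # [3, 8, 15, 16, 20, 23, 42]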
Adding a new piece of data to the middle of a giant array would require changing a lot of memory though. Which is why we might use something like a binary tree instead.
Depending on the problem the performance benefit of keeping the data adjacent in memory might offset the overhead of shifting it around.
In these times when AI is all the hype, these are the videos we need. The fundamental algos from the 60's are here to stay and run the world.
AI was used to help find an improvement on binary search recently. It's covered in a keynote talk from CppCon 2023.
@@thatchessguy7072 wHAT?
@@thatchessguy7072 🤞 all watched over by machines of loving grace
I'd love to see a video explaining the concepts behind how the different sorting algorithms work (Heap, Merge, Shell, etc)
I'd love a video on hash maps!
@@JanB1605 I loved my prof I had for computer science 2, which was lots of this. He was like this video, plus comedy, and simplified PowerPoint.
They exist here on YouTube. I've seen them.
10:25 Just want to note here, for anyone trying to implement this algorithm for very large arrays: be aware of integer overflow. (l + r) // 2 will first calculate l + r and then divide it by two. However, if your list is, say, two-thirds of the size of your integer type's maximum value and l and r are both quite large, then l + r will overflow. Then your m will be wrong and your binary search will be incorrect. And this is not a contrived scenario: there was exactly this kind of bug in the Java SDK for 9 years!
I was actually surprised that Mike didn't mention this :O
We try to teach the l+(r-l)//2 -method to avoid this.
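Side by side it looks like this (a sketch; in Python both give the same result because ints are arbitrary precision, but in a fixed-width language like C or Java the first can overflow):

l, r = 1_500_000_000, 2_000_000_000
m_risky = (l + r) // 2      # l + r is 3.5 billion, which overflows a signed 32-bit int
m_safe = l + (r - l) // 2   # r - l stays small, so the sum never exceeds r
assert m_risky == m_safe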
It was in Python, where there wouldn’t be overflow
@@solqr6521 He should still write the code the l + (r-l)//2 way, because the code will inevitably be found and copied to a platform that has 32-bit integers (e.g. C) and then break.
@@solqr6521 Why wouldn't there be overflow in Python? An integer has only a finite size, so if you add more than it can hold, it should wrap around (overflow).
@@raphaelfranzen9623 Integer overflows don't happen in plain Python because ints are arbitrary precision: if a number would have overflowed, Python just allocates more memory to store it.
Finding theft (or damage) on CCTV footage is a fantastic real-world use for binary search. As long as something visually changes (i.e. goes missing or gets broken), it turns finding the cause into a 5-minute task instead of hours of footage review that would otherwise be unjustifiable.
We need more videos explaining in low-level detail how various common algorithms work, such as file I/O, B-trees, image compression, file representations (PDF, audio, video, archive, ELF), Huffman coding, hashing algorithms, md5sum, dynamic programming. I mean walk through the pseudocode line by line explaining what it does, and draw the memory diagrams: what is happening to the data, how the bits or bytes are stored, loaded, and moved around.
Textbooks are your best bet for finding and understanding that kind of material (it's easy to quickly reread sentences).
@@ross951 Lectures are better because then we would be able to see exactly how all the bits and bytes are moved around in the memory.
They've done videos on many of the topics you've mentioned.
That's not what Computerphile is. Sounds like you should start your own channel.
@@chessthecat Who are you to say?
I can sit and hear Dr Mike Pound talking all day
Yeah, everyone on the corridor can sit and hear him talking all day
One of my favourite books from almost 50 years ago - Sorting and Searching by Donald E Knuth - tells you all you need to know!
It's an excellent point to remember that sorting is O(n log n), whereas a linear search is O(n). Linear search could definitely prevail for a small number of lookups.
That's *if* your data isn't already sorted, of course :)
@@IceMetalPunk No, for a small number of items (the exact number is hardware dependent), a linear search will beat everything apart from a direct index access. Linear searches on arrays are very cache friendly: as you walk the array, the CPU is already prefetching 64 bytes ahead of you. We do this all the time for fast lookups of small tables.
@@turdwarbler Cache lookups aren't considered in algorithmic runtime analysis.
@@IceMetalPunk Well, that's where theory and reality differ. In HF LL trading we use linear lookups all the time because they are faster, period, for small arrays. Even without a cache they would still be faster: no setup time, no complicated logic, simples.
I changed a linear search on a calibration step for a product on the production line to a binary search. This small change gave millions of minutes of production-line time back to the CM weekly (the test used to take minutes and we produced over 1 million units a week). The main point is that it doesn't have to be a large data set; it can be a small, known data set where each item takes time to search, or in this case to "test".
This was great! It really highlighted the need for the development of 'logical' learning, and not just 'procedural' or 'fact' learning. Actually a very 'key vlog', among all the excellent vlogs that have been made.
for most problems, I prefer a strict L
Thank you this channel is going to save me before GCSEs
More Dr Pound videos on search / sorting!
14:15 My analogy for going through computer science is that we add a bunch of tools to a toolkit and learn when to use which tools.
The point about knowing when to use an algorithm is the most important bit of knowledge to take away from this video. I've forgotten how to exactly write plenty of algorithms, but those things can always be looked up.
Don't forget you can also use binary search to implement a container that is sorted on insert. (although there are other approaches)
Constructing a list that way is really inefficient. Even if the search is O(log(n)), each insertion will be O(n). At that point, just use some kind of a tree data structure instead.
@@tiarkrezar It depends how your data is stored. For example, if you normally only store data in even-numbered positions, but insert into odd-numbered positions, then redo the entire list when you hit a collision, then insertion is only O(n^0.5).
But, yeah, a naive array implementation of a sorted list will have problems with insertion speed.
@@tiarkrezar I did say there are other approaches. Insertion sort, Self-balancing BSTs like AVL trees and Red-Black trees, etc...
Currently doing my second year in computer science and you made me realize when I'd want to use BST in projects.
Sort in morning, many lookups throughout the day.
Makes so much sense but I hadn't realized it, Well done !!!
Wow, I can't believe there's never been a binary search Computerphile video before now! 🤯
I love to show these examples in my programming courses (day job); one of the arguments sometimes held against more complicated algorithms is "yes, fewer steps, but each step is significantly more complicated"; but then you do the math, compare 100M steps vs. 23, and see that each step would have to be quite a bit more complicated to make the 23-step version perform worse than the linear search. And I also encounter people who would solve a "find in 10 items" problem with such an algorithm... Cost/benefit analysis gets quite complicated if you don't have an idea of the number of items involved, or the number of executions required.
I think it's always important to learn that there is more to performance than just big O. With webdev being so popular, people focus a lot on scalability. But there are lots of situations with very predictable dataset sizes, so optimizing the functions themselves becomes more important.
I already knew everything in this video, but I still watched it.
Binary searches over some state space with a non-trivial checking condition (like some monotonic equation) are also fascinatingly useful. Also if you rectify the state by changing basis or something. One of my favourite algorithms!
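A minimal sketch of that pattern: binary search over the answers themselves, using only a monotonic yes/no predicate (the predicate here is just a made-up example):

def first_true(lo, hi, pred):
    # pred is False...False True...True over [lo, hi); return the first index where it's True
    while lo < hi:
        mid = (lo + hi) // 2
        if pred(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo

# e.g. the smallest x with x*x >= 1_000_003
print(first_true(0, 10**6, lambda x: x * x >= 1_000_003))   # 1001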
I wish I had this guy as an instructor when I was going to college. I may have stood a chance with some of the data structures and algorithms classes I was taking at the time :)
There are 10 types of people, those who watch Binary Search videos, and those who don't.
There are 10 types of people in the world: people who understand binary, those who don't, and those who know this joke is trinary.
Clever
prof Pound is the Richard Feynman of Computer science
I love binary search. If look-up times need to be really low, binary search is as good as unbeatable because of its computational simplicity.
This is also how a type of analogue to digital converter (ADC) works, it is successive approximation (SAR). Interestingly when used for ADCs it corresponds directly to binary. The first step where you have a comparator with half the max voltage and the input voltage, the result from the comparison is the most significant bit (MSB) of the value, then when you perform the next comparison it becomes the next bit and that continues on until it reaches the closest that the ADC can get. This method relies on a way to create all the intermediate voltages for the comparison so SAR ADCs contain digital to analogue converters (DAC) too.
Also due to the nature of analogue voltages, similar to real numbers, the search will only very rarely be able to end early, since a comparison between two analogue voltages or between two real numbers is very unlikely to result in them being equal and may be impossible if the comparator only output less than or greater than. This does mean that these ADCs and real numbers will pretty much always take the same number of steps to complete.
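You can mimic the bit-by-bit idea in a few lines of Python, treating the comparator as a simple >= test (just an illustration, not a circuit model; the voltages are made up):

def sar_adc(v_in, v_ref, bits=8):
    code = 0
    for i in reversed(range(bits)):              # decide the MSB first
        trial = code | (1 << i)                  # tentatively set this bit
        if v_in >= trial * v_ref / (1 << bits):  # comparator: is the input above the DAC voltage?
            code = trial                         # keep the bit, otherwise drop it
    return code

print(sar_adc(1.7, 3.3))   # 8-bit code for 1.7 V against a 3.3 V reference -> 131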
Simply and elegantly illustrated! Thanks, folks 👍
Well that was a nice refresher of the basics ^^
In theory, yes, binary search is faster. In practice a linear search might actually be much, much faster. The one thing that makes a big difference is the cache. Computers read memory ahead, so it greatly helps if all entries are stored sequentially. If the binary search is implemented over a tree, the entries are probably scattered all over memory, not sequential, causing cache miss after cache miss. This will severely impact performance.
Obviously when searching billions of entries, binary is most likely to win. But in a lot of real life situations, the lists are much smaller. Small enough to fit in a cache line. Making the search almost instant.
So never just assume what you learn in computer theory is fact, use different setups for your problem and measure(!) to determine what solution works best.
Well, if someone thinks binary search is always faster, then they didn't really learn anything, did they? The algorithm clearly gets its speedup by halving the search space at every step, so obviously it pays off only for large numbers. One might even try to quantify this with: for n = 2^m, if m
Another reason why linear search can be faster for small lists is that modern cpus heavily rely on pipelining and branch prediction. Basically for every if-else, the cpu guesses which of the two paths will be executed next and already executes the first instructions of that path in the background. If the guess was correct, then the results are already there for free. But if the guess was wrong, it is much more expensive to take the other path.
In a linear search, if the cpu predicts "element not found, move to next item in list", then that will be correct in all but one case. But in a binary search, the expected results are 50% left and 50% right - so branch prediction fails 50% of the time instead of just once.
@Quarkly_ No need to be pedantic here. Log n is always smaller than n. Also for small numbers. Using binary search on a list of 10 items, worst case is 3 steps. Worst case linear search is 10 steps. Yet, as I pointed out, linear might be much faster in that case even though theory says binary search is at least 3 times faster.
So, at least to me, this isn’t trivial.
Interesting! So what would the break-even size be, between binary search and linear search? I had a look and the x64 cache line is typically 64 bytes, which wouldn't seem to allow for a very useful amount, but this is all new to me.
Well, the break-even size depends on a lot of things. From your code efficiency, the programming language, the compiler you use all the way to the specifics of the data set (some data types use more computation power to compare than others). But i've never seen a case where that is above 60, so I guess it won't ever get big.
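If anyone wants to measure it on their own machine, something like this is a starting point (a rough sketch using timeit; pure Python adds overhead that compiled code wouldn't have, so treat the numbers as indicative only):

import timeit
from bisect import bisect_left

data = list(range(32))    # small sorted list
target = 27               # assumed to be present, so the index below is valid

linear = timeit.timeit(lambda: target in data, number=100_000)                              # linear scan
binary = timeit.timeit(lambda: data[bisect_left(data, target)] == target, number=100_000)   # binary search
print(f"linear: {linear:.3f}s  binary: {binary:.3f}s")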
I used to watch you all the time when i was in uni around 3-4 years ago. Just stumbled upon this video and glad you're still going strong! keep up the videos! it's always a treat to watch, even if i'm already familiar with the topic
Pounder can make even this basic algorithm sound complicated.
This is the CS equivalent of asking a band to play the classics. Nice.
If you use the values at l and r (call them L and R), then you can do a weighted binary search which usually performs a little better than one that defines m = (l+r)//2.
So at 3:13 a weighted binary search would say a = 17, l = 4 (0-based), L = 16, r = 7, R = 9000 (unchecked in the video so I made it up).
Because a is very close to L and very far from R, you find m as usual but then bias it in favour of L over R, to the point it shakes out as 5 and so finds a immediately, and you don't have to make L = 10 :P
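A sketch of that biased midpoint (this is essentially interpolation search; it only pays off when the values are reasonably evenly spread):

def interpolation_search(lst, a):
    l, r = 0, len(lst) - 1
    while l <= r and lst[l] <= a <= lst[r]:
        if lst[r] == lst[l]:
            m = l                                                # avoid division by zero
        else:
            m = l + (a - lst[l]) * (r - l) // (lst[r] - lst[l])  # bias m towards where a should sit
        if lst[m] == a:
            return True
        elif lst[m] < a:
            l = m + 1
        else:
            r = m - 1
    return False

print(interpolation_search([2, 5, 9, 14, 16, 17, 19, 23], 17))   # True, found on the very first probe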
Built into COBOL as the SEARCH ALL command - often overlooked but fast if you know about it.
If the list is in order you can make it even simpler to decide whether to search further: open the first and last box, and if the target number isn't between those two values, it's not going to be in between those boxes either.
@7:00 l + r might overflow. Depending on your language, that might be undefined behaviour.
First of all, thank you for agreeing that the mere fact one is enhancing their skills makes it a career move. Most employers are notorious for despising early beginnings.
Love the algorithm explanation videos! Mooooore!!
In Python I prefer using l = 0, r = len(arr).
Now the right bound is exclusive and points one past the valid part.
The only +1 indexing happens in the main loop.
You only check that the end of the valid array is one position further, with:
„while l+1
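For context, a common way to write the half-open [l, r) version looks roughly like this (a sketch of the general idea; the comment above got cut off, so this may not match it exactly):

def binary_search_half_open(arr, a):
    l, r = 0, len(arr)          # r is one past the last valid index
    while l < r:
        m = (l + r) // 2
        if arr[m] < a:
            l = m + 1           # the only +1 in the loop
        else:
            r = m               # r stays exclusive
    return l < len(arr) and arr[l] == a

print(binary_search_half_open([2, 5, 9, 14, 16, 17], 14))   # True
print(binary_search_half_open([2, 5, 9, 14, 16, 17], 15))   # False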
I really like the way you explain everything ... making it so simple
Always happy to see a Mike video! Great refresher.
😀
I think you get the color on your hands from touching the paper before the ink has dried. How humid are the rooms you're in?
The phone book analogy is SO powerful
Interestingly enough, linear search is likely to be faster than binary search for a small array of integers, due to prefetching and branch prediction mechanisms of modern CPUs.
Fun fact: if you're searching through a list of unknown size (maybe a stream of data) that's ordered then you can do a binary search by checking powers of 2 in a similar way. If the value is greater than m, m *= 2.
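This is often called exponential (or galloping) search. A sketch, assuming we can at least detect when an index runs past the end:

from bisect import bisect_left

def exponential_search(arr, a):
    if not arr:
        return False
    bound = 1
    while bound < len(arr) and arr[bound] < a:   # keep doubling until we overshoot or hit the end
        bound *= 2
    lo, hi = bound // 2, min(bound + 1, len(arr))
    i = bisect_left(arr, a, lo, hi)              # ordinary binary search inside the bracketed window
    return i < len(arr) and arr[i] == a

print(exponential_search(list(range(0, 1000, 3)), 42))   # True
print(exponential_search(list(range(0, 1000, 3)), 43))   # False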
As a business intelligence professional I can conclusively state how important bisection of data is to resolving problems. You discard half the data from the issue each iteration.
Shouldn't you check the first and last item in a sorted list to make sure the number you're searching for is in range? It should be more efficient by potentially circumventing the need to run a search algorithm altogether.
Only if that is frequently the case.
Indeed. When it's out of range, the search just keeps collapsing towards one end until the bounds cross and it returns False, and the same code also handles the case where the target is in range but not in the list. However, that is a matter of implementation.
Here's the same function with your check added:
def binary_search(lst, a):
    l = 0
    r = len(lst) - 1
    if a > lst[r] or a < lst[l]:
        return print("Out of Range")
    while l <= r:
        m = (l + r) // 2
        if a > lst[m]:
            l = m + 1
        elif a < lst[m]:
            r = m - 1
        else:
            return True
    return False
It doesn't really impact performance as log time complexity is minuscule.
Nice explanation, thanks.
I am teaching myself programming and was literally practicing writing this kind of code yesterday, working on data sorting algorithms. So glad this is being written in Python. I started with C and this is a great window into how C maps on to Python.
You don't need to sort all of the time. Before you put new data into the list, you search first, then insert the data where it belongs. So you are using a binary search to insert an item into an already sorted list, which is much more efficient than adding data and then re-sorting.
Heck, I implemented this in BASIC on an Apple ][ back in the day, because sorting routines were just slow. Searching and copying to make a hole for the new data was faster.
You don't really want to be using a list if you have a large number of insertions, since what you're describing has a time complexity of O(q(n + logn)). A better approach is using a Red-Black Tree or Balanced BST, for O(log(n)) insertion and query.
Simple yet efficient way of searching; I've been doing it that way myself basically since I could think.
Would it be an improvement to first check the first and last index to see if the given number is even in range before looking through the boxes?
Also, would it be advisable to place the first lookup point based on where the number falls between the highest and lowest values?
So if it's closer to the lower bound, then maybe start dividing a little sooner, like at 40% of the dataset?
Would this add too much performance overhead or could it be an improvement?
I know that division is a comparatively costly operation. I would be super interested in someone else's input on this :)
Hell yes! Here comes another great Computerphile video 🎉
Make it faster by giving ranges to each of the boxes; then it's only one step to get to the right box, which can be sub-divided into another set of range boxes. That's better than the log(n) of binary search: having ranges is better than just having the numbers in order. If you have a different box for every different number, then the search is O(1): it's a bucket search. The essential part is that you pre-calculate the ranges rather than computing them during the search. It works more like a heap, which does the processing in the background and not during the searches/updates. If you have a unique box for each different number, at a known index you can jump to directly, then it works like radix sort does for sorting. A radix search over buckets might be super duper fast.
This is the first time an algorithm seemed useful
If you try it, don't convert to a list and then sort it, like it was done in this video. Instead, sort the array first using numpy (np.sort()). It is way faster because numpy has less overhead: it always knows the type of every element, so it doesn't do the type comparisons Python does in the background.
Or just make the random numbers smaller and run a cumulative sum.
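Both approaches in a couple of lines (a sketch; the sizes and ranges are just examples):

import numpy as np

n = 10_000_000
a = np.sort(np.random.randint(0, 2_000_000_000, size=n))    # sort in numpy, no list conversion
b = np.cumsum(np.random.randint(1, 100, size=n))             # sorted by construction, small gaps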
Would it be a good thing to check the last element of the array and see if it is bigger than the number we want to search? Because if it is smaller, then the list does not contain the number, so we can just return false.
love these discussions.
I taught this to my 1st year students less than a week ago.
Nice, was studying about it last monday... Thanks
If the search gets REALLY big, I assume the data will be read in a record of fixed length. Would checking the Last data in the record be useful? Like if you're in the library and you may have to switch rooms or floors, should you check the last book so you won't have to come back to that room later when you notice you were SO close?
Somewhat related, "How do you keep the list sorted?" Why, use a binary tree of course! Somewhat like picking up a hand of playing cards one at a time. Find the spot where it belongs using binary search, and insert the new item where it belongs.
Yes, I know there are more sophisticated trees (i.e. red-black, AVL, etc...), but the crux of the matter is, if you insert items in order in the first place, then you never have to sort the list.
If you don't just halve the array but instead assume an even distribution and do a linear interpolation between the item on the left, the item on the right, and the searched number... how much does it speed up the algorithm, and what is that called? Interpolation search?
This was so informative and captivating; surprisingly my attention span gets a bit longer when Prof Mike Pound speaks lol.
The box next to "16" could contain 16 again...
But now you've found your 16. If you need another one, you use a different algo.
I watched your video on steganography. Can you make a more detailed one showing how the algorithm works?
Calculating M can cause overflow. So a better version could be: M = floor(L + (R - L) / 2)
One way you could make a presorted list of random numbers is to use randint and increment a counter at the index equal to each generated number. Then, once you have generated the quantity of integers you want, you go through that counts array to build a new array, inserting x 0s where x is the value at index 0, x 1s where x is the value at index 1, etc. This would be slower than generating an unsorted list, but is significantly faster than generating an unsorted list and then sorting it, because technically it's still an O(n) algorithm, and no sorting algorithm is O(n).
Assuming an int size of 4 bytes and generating ints up to 2 billion, it looks like you're trying to create about a 7.5-gigabyte array. It's infeasible for large value ranges. It would run in O(m + n), where m is the size of the largest element, whereas generating and sorting would most likely run in O(n log n), where n is the number of elements.
It's important to recognize you're still essentially just sorting a random list, just with your own custom sorting algorithm.
Edit: I didn't know it had a name, it's called "counting sort", and it's apparently commonly used as part of Radix Sort. Both of those might provide you with an interesting read.
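A tiny counting-sort-style sketch of the idea, assuming the maximum value is small enough that the counts array fits comfortably in memory:

import random

def sorted_random(n, max_value):
    counts = [0] * (max_value + 1)
    for _ in range(n):
        counts[random.randint(0, max_value)] += 1   # tally each generated number
    out = []
    for value, c in enumerate(counts):
        out.extend([value] * c)                     # emit each value as many times as it was drawn
    return out

print(sorted_random(10, 5))   # e.g. [0, 1, 1, 2, 3, 3, 4, 5, 5, 5]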
My take would be to create a list starting with a random number between 0 and, say, 100, then extend it by adding a random number from 0 to 100 to the previous element. Runs in O(n); not great, but hey.
@@idonno87 add numbers one at a time in the correct order? Should be O(nlogn), binary search to find the insertion point over n elements.
No *comparison-based* sorting algorithm is O(n). Radix-Sort does not use comparisons and achieves O(n). It can't be used for every type of data, but it works perfectly on integers.
If you for some reason have knowledge of the distribution of the values in the collection, perhaps by calculating it while sorting it, would it be worth it to use the distribution to find the number faster by making more educated guesses?
Thanks Dr Mike and thanks to the Computerphile team for a good video. Would have loved to have had a lesson from Dr Mike.
Love a good Pound...ing 😁
Ideally, if a student was learning this, even though it seems that iteration is the most obvious way to implement it, would you encourage them to write a recursive function?
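For comparison, the recursive version is a near-direct translation of the iterative one (a sketch; recursion depth is only about log2(n), so Python's recursion limit is rarely an issue here):

def binary_search_recursive(lst, a, l=0, r=None):
    if r is None:
        r = len(lst) - 1
    if l > r:
        return False                # empty range: not found
    m = (l + r) // 2
    if lst[m] == a:
        return True
    elif lst[m] < a:
        return binary_search_recursive(lst, a, m + 1, r)
    else:
        return binary_search_recursive(lst, a, l, m - 1)

print(binary_search_recursive([2, 5, 9, 14, 16, 17], 9))   # True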
Not only did you swap X for A, but you also called your list variable "lst", which in that font looks like "1st", which makes it even more confusing :D
If you use a random function, the numbers will probably be evenly distributed once you order them, which means you could also just number them from 1 to 1,000,000. It would take more or less the same amount of time, I think.
So I built this with the explanation before I found out he had actual python code in there, and I did it slightly differently. Why use floor division when you can just do a binary shift on "m"? Wouldn't a binary shift be exponentially faster considering that computers aren't really optimized for division?
A typical table will have entries added in chronological order, and the primary key doesn't have a particular meaning. Does making presorted copies of the database for every column pay off in practical applications?
Thank you Dr. Pound!
We need to sort the array first, yes?
And obviously, if you know the type of distribution (random, bell curve, ...), you could estimate the location based on the first/last value in the range to check...
What about weighting the "middle" by adding two extra steps at the beginning to look up the boundary values? With those known, instead of taking the middle for the next check, we could check closer to the left or right, depending on which boundary the needed number is closest to.
Do you think it would be likely to go faster? If yes, by how much?
Second idea: to build the sorted random list, you could build it sequentially by pushing new items, where each new item is the last one plus a random number between 1 and, let's say, 3. You would get an ordered list without duplicates, spaced by 1 to 3.
The first idea would work as long as the numbers are uniformly distributed across the list. Which probably happens with how this list was produced (lots of random values = roughly uniform distribution).
But in any case where the list is not uniformly distributed it would get a lot slower. And lots of lists are not uniformly distributed. Take a look at Benford's Law for an example.
What about n-ary search and what is the optimal n?
O(log2(n))
I wonder how much this would change, on average, if you start by seeing if your number is smaller than the first entry or bigger than the last entry.
Great video! I understood it very well.
Did you watch it on 13x?
@@akiraincorrect4968 Yes
Binary search is a notoriously hard algorithm to get right on the first attempt. The key insight is that one of l or r must change at every iteration, but there are subtleties to do with things like integer overflow, caching and so on that make it an extremely bad idea to write your own version. You should almost always use a library version of the code (these days that's true for most well-known algorithms).
Yeah, you shouldn't write your own code to do something the CPU, OS, compiler, or an available library will do for you (unless you know what you're doing and have a compelling reason). At best, you create something marginally better optimised for your specific use case; at worst, you create a buggy mess that ends up fighting the system as both it and your code attempt to achieve the same things in different ways.
This looks easy on a sorted array, but what about unsorted data? Sorting it first in memory takes up computation cycles not considered here.
Anyone who has ever looked up a name in a phone book, or used a dictionary, intuitively knows what a binary search is. Keep dividing by half until you find the page you are looking for.
One use case of binary search is finding a square root or cube root.
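For example, an integer square root by bisection (a sketch; it's the same halving idea, just over a range of candidate answers rather than a list):

def isqrt_binary(n):
    l, r = 0, n
    while l < r:
        m = (l + r + 1) // 2      # round up so l can always advance
        if m * m <= n:
            l = m
        else:
            r = m - 1
    return l                      # largest integer whose square is <= n

print(isqrt_binary(17))   # 4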
I think you need to follow up with bubble sort. Recursion is maybe the most important thing to learn.
Could have made a fake list that just returns a value y(x) when accessing index x. That way you're not memory bound since the values can be calculated as they are accessed.
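That works in Python because bisect only needs indexing and a length (a sketch; y(x) here is just an arbitrary increasing function):

from bisect import bisect_left

class LazySorted:
    def __init__(self, n, f):
        self.n, self.f = n, f
    def __len__(self):
        return self.n
    def __getitem__(self, i):
        return self.f(i)          # value computed on access, nothing stored

squares = LazySorted(1_000_000, lambda i: i * i)
print(bisect_left(squares, 123_454_321))   # 11111, since 11111**2 == 123454321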
13:06 ;-; bit painful to immediately convert this lovely numpy array to a list, to sort it immediately after, instead of using the (much faster) np.sort first haha
So maybe reading r and doing a rough percentage estimation might be even faster, right?
As a programmer of 15 years, I've never once needed to know how sorts work internally. That's what libraries are for.
Libraries which are created by programmers...
It's useful to know even if you have no interest in implementing them yourself
There is definitely some truth to that. But sometimes you can't bring your data into a format that the libraries can interact with. Like if you have a huge file, you may not be able to fit the relevant parts into memory to hand off to the library, for example to do a search. I have encountered this myself. It's those kinds of situations where it's useful, but I admit it's not something most people have to deal with.
9:38 Gotta say, zero points for the length argument; it should be "r - l" (you put + for some reason; why would anyone search MORE length than they actually have?). The call is more likely to look like this: "++middle_index; return bsearch( base + middle_index, length - middle_index, ... );"
How fast is the algorithm in Julia, compared to Python?
What's a phone book?
Assuming this isn't an ironic question: pre-mobile phone, when hardwired dial phones running over copper were the only telecommunications infrastructure, the only way to find a number (without phoning the operator) was to look it up in a phone book. Which was a quarterly-printed physical book of everyone's phone numbers in the immediate area, organised by surname.
Imagine your “contacts” app, but with orders of magnitude more names and phone numbers, and printed in a big book (on really thin paper to make the book slightly less enormous).