8.2 - Binary search.mp4 - Copy
So the problem that we looked at in the previous video was this: given a list of n numbers,
with indices 0, 1, 2, and so on up to n minus 2 and n minus 1, right? This is a list of size n,
and we call this list l. And given a query point q, let's say 31, the problem was: is q present
in l or not?

For this we saw a very simple algorithm, which is order of n. What did we do there? We just
checked each of these elements: whether the zeroth element equals q, or the first element
equals q, and so on. So roughly, we did about n comparisons, and it took us order of n time,
right?

What about space? Did we use any additional space? No. My input for this algorithm was the
list itself and the query point, because without the list we can't do anything. Apart from
these two, we didn't use much space, right? So I would say that I used order of one space there.

Now, of course, the lower the time complexity, the better it is. Similarly, the lower the space
complexity, the better it is, right? For the algorithm that we wrote, we had order of n time
and order of one space. It's called a sequential search because we check sequentially whether
the element is present. Now the big question is, can I somehow reduce this? And remember,
oftentimes there is a trade-off between time and space, okay? So binary search is a brilliant
technique which says
that I can reduce this time from n to log n. And log of n is always less than n. So let's take
an example, right? Let's just take log base two for simplicity.

If n equals 2, what is log n base two? It is one, because two to the power one equals two. If
n equals 4, it is two, because two to the power two is four. Similarly, if n equals 8, two to
the power three equals eight, so it will be three. If n equals 16, two to the power four is 16,
so it is four. And if n equals 1, log n base two is zero, right? So in each case you'll notice
that log n is less than n.

So, since log n is less than n, if I can somehow get from order of n time to order of log n,
that is brilliant. Imagine my n equals 1024 elements, right? Using my sequential search, it
would take me order of n time, roughly n comparisons: 1024 comparisons. But what is log n
base two here? Two to the power ten is 1024, so it is ten. So order of n basically means I am
doing roughly 1024 comparisons, or operations. But if I can get away with log n, which is ten,
there's a huge difference between a thousand comparisons and ten comparisons, right? This is
the power of it. If I can somehow get away with roughly ten comparisons, then my algorithm is
obviously much more efficient.

Log n is much, much less than n if you think about it, okay? And you can look at these
numbers. When I have n equals 16, my log n is only four. When I have n equals 1024, my log n
is just ten, right? So is there some way for us to go from order of n time to order of log n
time? There is an algorithm called
binary search. Those of you who have studied computer science would have learned about this
in a data structures and algorithms course. Those of you who do not know it, don't worry,
we'll learn it right now. The idea of binary search is very elegant. The first thing that we
need for binary search is this: the list needs to be sorted.
Otherwise, binary search does not work. There are sorting algorithms like bubble sort, and
those of you who have studied computer science will know them, but I'll not cover that right
here. But imagine my list is sorted. How do I search for an element?

Let's take an example so that we understand what's happening. Let's assume I have ten
elements here, at indices 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9. Let me also put in the numbers so
that it's easy. It always has to be ascending, right? The sorted array is 23, 25, 28, 32, 37,
42, 48, 49, 56, and 59. Okay, this is my list l, and it is sorted in increasing order,
remember? Every number is greater than or equal to the previous number.

Now imagine my query point is 25, and I want to find where 25 is. The way I'll do it is called
binary search. I'll keep a variable called l, for left, and a variable called r, for right.
When I start, I set l to zero and r to nine, which is n minus one, because I have n elements
here.
Right? Now what I'll do is find the middle element of this array. How do I find the middle
element? My middle index can be written as left plus right, divided by two. So I get zero plus
nine by two, which is nine by two, right? This is 4.5, which is not an integer, so I can take
a floor or a ceil, whichever I want. Here I'll round it up to five. So I get a value of five.
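
The midpoint choices just described can be sketched in a few lines. This is only an illustration, assuming Python; the variable names `m_floor`, `m_ceil`, and `m_round` are made up for this example.

```python
import math

l, r = 0, 9                 # left and right indices for a ten-element list
exact = (l + r) / 2         # 4.5, which is not an integer
m_floor = (l + r) // 2      # floor gives 4
m_ceil = math.ceil(exact)   # ceil gives 5, the value used in the walkthrough
m_round = round(exact)      # Python's round() sends exact halves to the even neighbour: 4

print(exact, m_floor, m_ceil, m_round)  # 4.5 4 5 4
```

Either the floor or the ceil gives a valid middle index. Note that in Python the built-in `round` would give 4 here rather than 5, so `math.ceil` is the call that matches "rounding up".
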
Now I go directly to index five and check the value there. The value at index five is 42. So
what have I done? First, I computed the middle index, which is nothing but l plus r by two,
rounded, and it was five. Then I went directly to index five and checked the value.

Now, is 25 less than 42 or greater than 42? It's less than 42. And since I know that this
array is sorted in increasing order, 25 will be found below index five, because the value at
index five is 42. If 25 exists in this array, it has to be between index zero and index five,
because everything above index five is greater than or equal to 42. So if 25 exists in this
array, it has to be to the left of index five, right? So what I'll do now is change my r to
five.

Left and right basically tell me in which part of the array this element can lie, if it
exists. That's exactly what they mean: l is the left end of the subarray that I'm interested
in, and r is the right end of the subarray that I'm interested in. So now, since 42 is greater
than 25, I move my r to five. My l now is zero, and my r now is five. What is my middle
element now? Zero plus five by two is 2.5, and when I round it up, I get three. Right.
I'm rounding it up, right? So now I'll check index three. The value at index three is 32,
right? So I compare 25 with 32. Again, 32 is still greater than 25, which means that if 25
exists in this array, it has to be below index three. It has to be somewhere in this part of
the array. So I'll make my r equal to three, that is, I'll set r equal to m. Now what happens?
My l equals zero and my r equals three. Now, what will my m look like? My m will be zero plus
three by two, which is 1.5. When I round it, I get two. So now I go to index two. The value
there is 28, which is still greater than 25, right? So my r becomes two. What is the subarray
that I'm left with? I'm left with indices zero, one, and two. This is my l, this is my r, and
I have the values 23, 25, and 28. I know that if 25 exists, it should lie in this subarray.
Now I'll break this further. What is the central index here? It's zero plus two by two, which
is one, and the value there is 25. In general, I keep breaking the array up like this till the
time I find my query point. If the number doesn't exist, you'll just reach a stage where l
equals r equals m, because all of them will point to just one number.
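
The walkthrough above can be sketched as a short loop. This is a minimal sketch under stated assumptions, not the lecture's exact procedure: the function name `binary_search_trace` is made up, the list is taken to be the ten-element example with 25 at index 1 (as the subarray 23, 25, 28 implies), the midpoint uses floor instead of rounding up, and the midpoint itself is excluded after each comparison (`r = m - 1` or `l = m + 1`) so that the range is guaranteed to shrink. Because of these choices, the probed indices differ from the hand trace.

```python
def binary_search_trace(values, q):
    """Return (index of q or -1, list of probed indices) for a sorted list."""
    l, r = 0, len(values) - 1
    probes = []
    while l <= r:
        m = (l + r) // 2      # floor midpoint (assumption; the walkthrough rounds up)
        probes.append(m)
        if values[m] == q:
            return m, probes
        if values[m] > q:
            r = m - 1         # q can only lie to the left of m
        else:
            l = m + 1         # q can only lie to the right of m
    return -1, probes         # the range is empty, so q is not present


numbers = [23, 25, 28, 32, 37, 42, 48, 49, 56, 59]
print(binary_search_trace(numbers, 25))  # (1, [4, 1])
print(binary_search_trace(numbers, 30))  # (-1, [4, 1, 2, 3])
```

If q is absent, the loop stops once l has moved past r, so that no candidate positions remain.
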
That's called the termination case.

Now, the fun part is this: how do you analyze this algorithm? Let's look at it geometrically.
I have index zero, index one, index two, and so on, up to index n minus two and index n minus
one. At the first iteration, when I'm comparing, I'm breaking this array into two parts, and I
decide whether the element is in this part or in that part. At the start of the first
comparison, my array is of size n, right? Because my element could lie anywhere amongst these
n elements. But after the first comparison, I decide that it lies either in the left subarray
or in the right subarray. So how many elements am I left with? After the first comparison, I
have a space of only n by two elements that I need to search, because I'm breaking the array
into roughly equal halves.

What happens after my second comparison? Let me change the color here. Let's assume I decide
that it's here. I break this again into two equal parts, right? So now what is the size of
this subarray? It will be n by four. After the third comparison, it will be n by eight. So
after the first comparison, it is n by two. After the second comparison, it is n by four.
After the third comparison, it is n by eight. After the fourth comparison, it is n by 16, and
so on and so forth.

At some point, this will reach a size of one, because the size of the array keeps decreasing.
After how many comparisons will that be? Let's say n equals 16. Then at four comparisons, this
value becomes one. If n equals eight, it becomes one at three. If n equals, let's say, 32, it
becomes one at five. So: five for 32, four for 16, three for eight. Now you see the
logarithmic relationship here. What is log of 16 base two? It is four.
What is log of 32 base two? It is five. What is log of eight base two? It is three. So at log
n base two comparisons, you have reached exactly one element. So what is the total number of
comparisons that you have to do? In the worst case, you will do about log n comparisons,
which means this algorithm takes order of log n time, right?
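
The halving argument can be checked numerically. The following is a small sketch, assuming Python; the helper name `halvings_until_one` is made up for illustration. It counts how many times a size n can be halved before it reaches one and prints that next to log base two of n.

```python
import math


def halvings_until_one(n):
    """Count how many times n can be halved (integer division) before reaching 1."""
    count = 0
    while n > 1:
        n //= 2
        count += 1
    return count


for n in (8, 16, 32, 1024):
    # For powers of two, the number of halvings equals log base two of n.
    print(n, halvings_until_one(n), int(math.log2(n)))
```

For n equal to 8, 16, 32, and 1024, both columns give 3, 4, 5, and 10.
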
Very, very simple. So binary search basically breaks your array into subarrays of half the
size at every step. And remember, all of this will work if and only if your list is sorted.
Otherwise this whole thing breaks down. Okay, this is a very simplified explanation of
binary search. Now let's look at some simple code to see how that works. Here we use
recursion. We learned about recursion, right? So I have my left and right, l and r. Given any
array, and with x as the element that I want to find, first I check if r is greater than or
equal to l, right?
Because that's what we said, right? l is here, r is here. If r is not greater than or equal to
l, then you say, okay, I've reached an end state, and I'll stop here, right? This is called
the base case of recursion. We learned about the base case, also called the limiting case or
the boundary case of recursion. Right? Here, I'm computing a midpoint, and I'm checking
whether the value at the midpoint is equal to x, greater than x, or less than x. So the
algorithm runs like this. I have my l here, I have my r here, and this is my midpoint. Here, x
is nothing but your query point q, right? If the value at my midpoint is greater than x, then
my data should lie in the left part. What am I doing here? I'm saying my l will remain l, and
my right point becomes the midpoint minus one, right? And then I'm calling binary search
again within the binary search function. That's called recursion, right? Very simple. I ask
you to try this and run through it. This is a recursive algorithm, right? Because in the
binary search function we are calling binary search again, and that's how this algorithm
neatly fits in. You have a bigger array, and depending on whether the middle element is
greater than or less than x, you search either in this subarray or in that subarray, and you
recursively break the array into pieces, logically. For those of you who have studied binary
search, this will be very simple, and I recommend you go through this function. This function
has been taken from GeeksforGeeks with some small modifications. GeeksforGeeks is a great
place if you want to learn and practice algorithms and data structures. Okay, so this is the
order of log n algorithm, given a sorted array. If you look at the start, I'm taking my list l
and sorting it. Otherwise, this algorithm does not work.
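
The function shown in the video is not reproduced in this transcript. A minimal sketch of a recursive binary search along the lines described might look like the following, assuming Python; the names `binary_search`, `arr`, `l`, `r`, and `x` and the unsorted starting values are assumptions made for illustration, not the code from the video.

```python
def binary_search(arr, l, r, x):
    """Return an index of x in the sorted list arr[l..r], or -1 if x is absent."""
    if r < l:
        return -1                # base case: the search range is empty
    mid = l + (r - l) // 2       # floor midpoint of the current range
    if arr[mid] == x:
        return mid
    if arr[mid] > x:
        # x can only be in the left part: keep l, move r to mid - 1
        return binary_search(arr, l, mid - 1, x)
    # otherwise x can only be in the right part: move l to mid + 1, keep r
    return binary_search(arr, mid + 1, r, x)


# Sort the list first; binary search does not work on unsorted data.
numbers = sorted([42, 23, 59, 28, 49, 25, 37, 56, 32, 48])
print(binary_search(numbers, 0, len(numbers) - 1, 25))  # 1
print(binary_search(numbers, 0, len(numbers) - 1, 31))  # -1
```

Each call either returns or recurses on a range that excludes `mid`, so the range shrinks by about half every time and the recursion depth stays around log n.
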