[MUSIC] [SOUND] Hello. Today we will introduce an important bit of notation in discrete mathematics and theoretical computer science called the big O notation; because it is written with a big O, it's called big O notation. As a motivating example, I went to Wikipedia and took a screenshot of this pseudocode. If you have done some programming before, you will immediately see that this is a sorting algorithm. It's not the best sorting algorithm, it's actually a pretty bad one, but it's simple, so let's try to count how many steps it performs. Say the size of our array A is n. Then the outer loop will be performed n - 1 times, and the inner loop will be performed at most n - 1 times; we don't know exactly how many, since it depends on the number of iterations needed, but surely not more than n - 1. And then we have the swap operation, which is a little weird. How many steps does it take? That depends on how it is implemented on your computer. Maybe your processor has a swap instruction that takes two memory cells and simply exchanges their contents. But maybe it doesn't, and then either you as a programmer or the compiler has to do something like this: take a temporary variable and set temp := A[j], then A[j] := A[j - 1], and then A[j - 1] := temp. And how many steps does that take? Well, what is a step? Computing j - 1 is already a step, so this could be anything between 1 and, let's say, 20; it really depends on your computer. If we add everything up, we see the running time is (n - 1) squared times some constant that we don't really know, plus some additional terms with further unknown constants, and so on.
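The algorithm from the screenshot (later identified in the lecture as insertion sort) can be sketched as runnable Python, with the swap written out via a temporary variable exactly as discussed; the function name is mine, not from the slide:

```python
def insertion_sort(A):
    """Sort list A in place; the nested loops give the roughly (n-1)^2 step count."""
    n = len(A)
    for i in range(1, n):                    # outer loop: n - 1 iterations
        j = i
        while j > 0 and A[j - 1] > A[j]:     # inner loop: at most n - 1 iterations
            # swap A[j] and A[j-1] via a temporary variable, as in the lecture
            temp = A[j]
            A[j] = A[j - 1]
            A[j - 1] = temp
            j -= 1
    return A
```

Counting the exact cost of each swap is exactly the implementation-dependent question the lecture raises: in Python each assignment is itself several interpreter steps.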
So the running time of this algorithm has the form a(n - 1)^2 + bn + c, and we don't know a, b, c; they might depend on your programming language, your compiler, your computer, and so on. But the fact that there is an n^2 in the expression does not depend on your programming language; it only depends on the algorithm. And this is exactly the purpose of big O notation: we want to hide the unimportant information. We don't want to think about implementation details; we just want to focus on what really matters. Formally, the definition is a little technical, so let me walk you through it. If f is a function, then O(f) is formally a set of functions: the set of all functions g for which there exist a constant C and a threshold n0 such that g(n) <= C * f(n) for all n >= n0. Let me make a little picture to explain it. We have a function f, and then C times f, where C is some constant, so C * f is simply a scaled copy of f. The big O notation says that from a certain point n0 on, our function g must stay below that line. So g can be bigger than f, and it can even be bigger than C * f, but only for small values of n; past the threshold n0, it must stay below C * f. Let's see an example. Our running time was a(n - 1)^2 + bn + c. This is of course at most an^2 + bn + c, and I claim this is at most C * n^2 for some C. How should I choose C? For example, choose C = a + b + c; then the expression is at most an^2 + bn^2 + cn^2, which is C * n^2. So we can conclude that our function f(n), which describes the running time of insertion sort, is in O(n^2). Here are some rules for big O notation; take all the time you want to look at them.
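The choice C = a + b + c can be sanity-checked numerically; this is a hypothetical helper of my own, just confirming the inequality a(n-1)^2 + bn + c <= (a+b+c) * n^2 over a range of n:

```python
def bounded_by(a, b, c, n_max=1000):
    """Check a*(n-1)**2 + b*n + c <= (a+b+c) * n**2 for all n = 1..n_max."""
    C = a + b + c
    return all(a * (n - 1) ** 2 + b * n + c <= C * n ** 2
               for n in range(1, n_max + 1))
```

The inequality in fact holds term by term for any positive a, b, c: a(n-1)^2 <= an^2, bn <= bn^2, and c <= cn^2 whenever n >= 1.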
You can of course click on pause; I just want to quickly illustrate two of them. The second rule, for example, tells us that if g1 is in O(n) and g2 is in O(log n), then the product g1 * g2 is in O(n log n). Also very interesting is the last rule, a sort of domination rule: if g is in O(f), it means that f can simply swallow up any amount of g that you add to it. For example, n^2 + 1000n is also in O(n^2); at some point the n^2 simply eats up the 1000n term. Some rules of thumb to bear in mind: first, larger polynomials win, so n^10 is in O(n^11). Second, exponential functions beat polynomial functions, so n^1000 will at some point be overtaken by something like 2^n; it might take a while, but eventually it will. The last is a variation of the second: polynomials beat polylogarithms, so (log n)^1000 is in O of, for example, the 20th root of n. Okay, big O has a big brother called big Omega. What is big Omega? It's basically the same thing, except it says "at least": g is in Omega(f) if g(n) >= C * f(n) for some constant C and all n >= n0. For example, take n^2 - 1000n. For small n this is negative, but at some point it will be at least C * n^2 if n is large enough, right? You can try to figure out what you have to choose for C and n0; in any case, n^2 - 1000n is in Omega(n^2). This says that eventually our function grows proportionally to n^2. Finally, there is Theta, and Theta is just the intersection of O and Omega: g is in Theta(f) if it is in O(f) and in Omega(f). Here is a nice example. You remember from some sessions ago that I derived a formula for the sum from k = 1 to n of k^2. Now we have k^3 instead, but suppose we are not interested in an exact formula; we just want a rough estimate: how big is it, roughly?
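The lecture leaves finding C and n0 for the Omega example as an exercise; a small sketch (function name mine) searches for the threshold once we fix, say, C = 1/2:

```python
def omega_threshold(C=0.5):
    """Smallest n0 such that n**2 - 1000*n >= C * n**2 for all n >= n0.

    Assumes 0 < C < 1, so the loop is guaranteed to terminate: once the
    inequality holds it keeps holding, since (1 - C)*n**2 outgrows 1000*n.
    """
    n = 1
    while n * n - 1000 * n < C * n * n:
        n += 1
    return n
```

For C = 1/2 the inequality n^2 - 1000n >= n^2/2 rearranges to n >= 2000, which is exactly what the search finds.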
So what you can do is say: obviously this sum is at most the sum from k = 1 to n of n^3, because n^3 is always at least k^3, and that sum is n^4. So our sum f(n) is in particular in O(n^4). On the other hand, our sum is at least the following sum: I start not at k = 1 but at k = ceiling of n/2. When I do that, every term in my sum has k at least n/2, so every term is at least (n/2)^3. And since I run from ceiling of n/2 all the way up to n, I have at least n/2 summands, so what I get is at least (n/2) * (n/2)^3 = n^4/16. And this says that our sum is in Omega(n^4). So now we can conclude that our sum is Theta(n^4). Although we haven't derived an exact formula, we can say with confidence that it has the same growth rate as n^4; we just don't know the exact formula, but it's roughly n^4. I hope you really see the benefit of the big O, big Omega, big Theta notation here: it can save you a lot of work, and it helps you distill the really important information. Another notation that you often see: O of something used not as a set of functions, but as a kind of placeholder for a function itself. In the literature you often see things like f = g + O(r), and what it basically means is that the difference between f and g is bounded by O(r). Here again, let's take our k^3 example. What is k falling factorial 3? It is k(k - 1)(k - 2). So we can write k^3 = k falling factorial 3 + O(k^2), as you can check by expanding this product. And therefore our sum is the sum from k = 1 to n of k falling factorial 3, plus the sum from k = 1 to n of O(k^2). And we have a formula, an exact formula, for the first sum.
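The sandwich n^4/16 <= sum of k^3 <= n^4 derived above can be verified directly; these helper names are mine:

```python
def cube_sum(n):
    """Sum of k**3 for k = 1..n."""
    return sum(k ** 3 for k in range(1, n + 1))

def sandwich_holds(n):
    """Check the Theta(n^4) sandwich: n**4/16 <= sum k^3 <= n**4."""
    s = cube_sum(n)
    return n ** 4 / 16 <= s <= n ** 4
```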
The first sum is, if I remember correctly, (n + 1) falling factorial 4, divided by 4. And the second sum, you can easily see, is bounded by O(n^3). So you see, altogether this is n^4/4 plus O(n^3): the (n + 1) falling factorial 4 term also contains some garbage, but the garbage is of order n^3 plus terms of lower power, so we can just move it into the O(n^3) part. And this has the benefit that we now have a much more precise estimate for our sum. Before, we only knew that it grows like n^4; now we have also determined the leading coefficient, which is 1/4. And this has not been a lot of work, because by using the big O notation cleverly, you can save a lot of effort by saying: well, this is a term, or a part of one, that doesn't really interest me, so I can safely ignore it. And the rules I showed you today tell you how to safely ignore such terms. All right, that's it for today. Thank you. [MUSIC]
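Both claims from this last step check out numerically: the exact formula for the falling-factorial sum, and the leading coefficient 1/4 of the cube sum (helper names are mine):

```python
def falling3_sum(n):
    """Sum of k(k-1)(k-2), i.e. k falling factorial 3, for k = 1..n."""
    return sum(k * (k - 1) * (k - 2) for k in range(1, n + 1))

def closed_form(n):
    """(n+1) falling factorial 4, divided by 4: (n+1)*n*(n-1)*(n-2)/4.

    The product of four consecutive integers is divisible by 4,
    so integer division is exact here.
    """
    return (n + 1) * n * (n - 1) * (n - 2) // 4

def leading_coefficient(n):
    """Ratio of sum of k^3 to n^4; should approach 1/4 as n grows."""
    return sum(k ** 3 for k in range(1, n + 1)) / n ** 4
```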