Welcome. We will now start looking at distributed computing, beginning with the map-reduce paradigm. One of the key applications of distributed computing is working with big data, and the way map-reduce computing works is that you abstract your big data as key-value pairs. So, if you have a pair, let's say (KA,VA), you can perform a map function, f, on VA. This is called a map, and it results in a new set of key-value pairs; let me call them (KA1,VA1), (KA2,VA2), and so on. Key-value pairs are similar to your use of keys in hash maps, and the values are assumed to be immutable: they are objects whose values are not going to change. So the map function, in the spirit of functional programming, takes the input value VA and produces a number of output key-value pairs, where the keys KA1 and KA2 need not be the same as the original key KA. And of course you can do this on multiple key-value pairs, so you could have (KB,VB), and that may be mapped to (KB1,VB1).

The next thing that happens in the map-reduce paradigm is a grouping, where you group together values with the same key. For example, suppose KA1 and KB1 happen to be the same; let's say KA1 = KB1 = some key K. Now you have two values associated with K: VA1 and VB1. The grouping just goes through all the key-value pairs generated by the map function and puts together those that happen to have the same key. In this case, since KA1 and KB1 were the same, VA1 and VB1 are grouped together.

Then you get to do the reduce, which is to perform an operation, g, that can fold or reduce the set of values with the same key into a single value. This could be a sum operation, for example. If g were a sum function, then the output would have K as the key, and the value would be the sum of all the values generated by the map operation with that key. So, really, there are only two things the programmer needs to specify: the map function f and the reduce function g. The system takes care of everything else.
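The map, group, and reduce steps described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the API of any real map-reduce system; the names `map_reduce`, `mapper`, and `reducer` are my own, and everything here runs sequentially rather than on a cluster:

```python
from itertools import groupby
from operator import itemgetter

def map_reduce(pairs, mapper, reducer):
    """Apply mapper to each (key, value) pair, group the resulting
    pairs by key, then fold each group's values with reducer."""
    # Map: each input pair may produce zero or more output pairs,
    # whose keys need not match the input key.
    mapped = [out for k, v in pairs for out in mapper(k, v)]
    # Group: collect together all values that share the same key.
    mapped.sort(key=itemgetter(0))
    grouped = {k: [v for _, v in grp]
               for k, grp in groupby(mapped, key=itemgetter(0))}
    # Reduce: fold each group's list of values into a single value.
    return {k: reducer(vs) for k, vs in grouped.items()}
```

For instance, a mapper that emits its pair unchanged, combined with `sum` as the reducer, adds up all values that share a key: `map_reduce([("K", 1), ("K", 2)], lambda k, v: [(k, v)], sum)` yields `{"K": 3}`.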
As we'll see later, there are multiple systems that implement this paradigm, and you can have terabytes or even petabytes of data being operated upon by this simple approach. Let me take a very concrete example to illustrate the point. I'm at Rice University, where we have the undergraduate residential college system, kind of like the houses in Hogwarts. The first four colleges were Will Rice, Hanszen, Wiess, and Baker. I'm going to use these as keys and give them some values: say 10 for Will Rice, 11 for Hanszen, 12 for Wiess, and 13, somewhat arbitrarily, for Baker. This is the input data.

The first thing we want to do is a map operation. Let's say that the map operation enumerates the factors of the value, excluding 1 as a factor. The output of the map operation for (WR,10) would then be the three factors 2, 5, and 10. For Hanszen, 11 is a prime number, so all we have is (H,11). For Wiess, we have 2, 3, 4, 6, and 12. And for Baker, since 13 is also prime, we just have 13. So you've seen how we've taken a map function and, for each input key-value pair, generated one or more (in some cases it could even be zero) output key-value pairs. In this case the output key-value pairs have the same key as the input, but that need not be the case in general; moreover, the output from one input key-value pair could contain multiple different keys as well.

The grouping in this case is simple, because all the key-value pairs in the map output with the same key happen to have been generated by the same input. The grouping is already there, so I will just write "group" as a reminder that the step needs to be done. Then comes the reduce; let's say we're doing a sum. If we do a sum reduce for Will Rice, we get 2 plus 5 plus 10, which is 17. For Hanszen we get 11. For Wiess we get 2 plus 3 plus 4, that's 9, plus 6, that's 15, plus 12, that's 27.
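The college example above can be run end to end. Here is one way to write the factors map and the sum reduce as a small self-contained Python sketch (again, independent of any real map-reduce framework; the helper name `factors` and the abbreviated keys are my own choices):

```python
def factors(key, value):
    """Map: emit (key, d) for every factor d of value, excluding 1."""
    return [(key, d) for d in range(2, value + 1) if value % d == 0]

# Input data: one (college, value) pair per college.
colleges = [("WR", 10), ("H", 11), ("W", 12), ("B", 13)]

# Map phase: (WR,10) produces (WR,2), (WR,5), (WR,10), and so on.
mapped = [pair for k, v in colleges for pair in factors(k, v)]

# Group phase: collect the values that share each key.
grouped = {}
for k, v in mapped:
    grouped.setdefault(k, []).append(v)

# Reduce phase: sum the grouped values for each key.
totals = {k: sum(vs) for k, vs in grouped.items()}
# totals == {"WR": 17, "H": 11, "W": 27, "B": 13}
```

Note that the primes 11 and 13 each produce exactly one output pair (the number itself), so their reduced values equal their inputs, matching the sums worked out by hand above.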
And for Baker we get 13. This is the output of the reduce phase, and what we've seen is that we've been able to take some input data and generate the output just by specifying two functions: a map, which was factors, and a reduce, which was sum. Now, in this contest, Wiess came out ahead with 27, which I'm proud of because I'm a faculty associate of Wiess College. But on campus the competition can be very fierce, with contests like Beer Bike, and the outcomes could be quite different. So, what we've done here is learn two very simple operations, map and reduce, that the programmer can specify, and they provide the foundation for processing big data, as we'll see in the next few lectures.