We will now study the fundamental building blocks of distributed computing: processes and threads. You've already seen examples of distributed computing through MapReduce, client-server programming, and message passing, and the building block in all of those is a process. In the case of Java applications, a process is a Java Virtual Machine. A process owns a large footprint of resources, and within a process you can create multiple threads. Each thread has its own private resources, but also shares the process's resources. This maps very well onto current hardware: in a data center, we have compute nodes, and each node has local memory, a network interface connecting it to the network, and, as its main compute engines, multiple processor cores: P0, P1, P2, and so on. It's now very common for a compute node in a data center to have 16 cores or more. So when you run processes on nodes, you have a choice, because a process can have multiple threads, and the constraint is that a thread can run on only one core at a time. If a process has only one thread, it cannot use more than one core of the compute node. So if you did not use multiple threads in a process, you would be using only a small fraction, one-sixteenth or less, of your compute node. Let us study some of these trade-offs. We have a choice of creating multiple threads in a process, and there are some advantages to doing so. The first is memory and resource efficiency due to sharing: if you have multiple threads computing within a process, they share the same process resources, and, very importantly at the hardware level, the same local memory. Memory is a key cost constraint in today's data centers. The second is responsiveness, especially to network delays.
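The memory-sharing advantage described above can be sketched in Java. This is a minimal illustration, not part of the lecture's examples: two threads in one JVM process update a single counter that lives in shared process memory, while each thread keeps its own private stack.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SharedMemoryDemo {
    // Shared process resource: one counter visible to every thread in this JVM
    static final AtomicInteger counter = new AtomicInteger(0);

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 1000; i++) {
                counter.incrementAndGet(); // both threads update the same memory
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start(); t2.start();   // the two threads may run on different cores
        t1.join(); t2.join();     // wait for both to finish
        System.out.println(counter.get()); // prints 2000
    }
}
```

Because both threads see the same heap, no copying or message passing is needed to combine their work; that sharing is exactly the resource efficiency being described.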
If one thread in a process is blocked waiting on a network request, for example in client-server programming or message passing, other threads in the same process can continue working while the first thread waits for its communication to complete. This responsiveness can be very helpful for web servers. The third is performance: multiple threads allow you to use multiple processor cores at the same time and get more work done in the same amount of time. A typical chart for a server plots throughput, the number of requests handled per second, against the number of processor cores in use: 1, 2, 3, 4, and so on. By using multiple threads you can increase the throughput; depending on the workload, if you're lucky, you might get four times higher throughput by running four threads on four cores. These are good advantages of using multiple threads in a process. But that raises a question: if you have a compute node, why not have just one process in the entire node and use as many threads as you like? There are some workloads, especially scientific numerical computing workloads, where that can make sense, but in general there are also advantages to having multiple processes in a node. The first advantage relates to one we saw earlier, responsiveness, but here the delays come from the JVM itself. A classic example of a JVM delay is garbage collection. High-level languages like Java and Scala that run on a Java Virtual Machine rely on underlying services like a garbage collector, which can sometimes pause the entire process and cause all threads in the process to wait.
If you have multiple processes in a node, one process may be going through something like garbage collection while the other processes continue responding to requests for whatever service you're providing. Another advantage is scalability. In practice, even though you can use multiple threads to improve throughput, there is an inherent limitation: it's pretty common for throughput to flatten out after a while, and even go down, as you add more threads. So there's an optimal number of threads that maximizes throughput, and depending on your workload it might be a small number, say four, six, or eight. If your node has 16 cores, there's no point in creating 16 threads when you know that performance will degrade. Scalability, then, is the idea of using more processes to increase throughput, so that you can use all the cores within a node, and multiple nodes across the distributed system. The third, very important, advantage of having multiple processes in a node is availability, also referred to as resilience. When you have multiple threads in a process and one thread fails on its own, say by throwing an exception, it may be possible for the other threads to continue; but there are certain kinds of errors that bring the entire process down. With multiple processes in a node, the service can remain reliable even if one process fails: the performance might be lower to begin with, but at least the service can continue. So what we've seen is that there are some very interesting system trade-offs in how we use processes and threads. Processes are the basic unit of distribution: we can distribute processes across multiple nodes in a data center, and even create multiple processes, as we've seen, within a node. Threads are the basic unit of parallelism and concurrency.
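Running multiple processes on a node can be sketched from Java with `ProcessBuilder`. This is a minimal illustration under stated assumptions, not the lecture's code: `java -version` stands in for a hypothetical worker JVM, and each child is an independent operating-system process, so if one crashes or pauses for garbage collection, the others keep running.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class MultiProcessDemo {
    public static void main(String[] args) throws IOException, InterruptedException {
        List<Process> workers = new ArrayList<>();
        for (int i = 0; i < 2; i++) {
            // In a real service this command would launch a worker JVM;
            // "java -version" is a stand-in that exits immediately.
            ProcessBuilder pb = new ProcessBuilder("java", "-version");
            pb.inheritIO();          // reuse this process's stdout/stderr
            workers.add(pb.start()); // each start() launches a separate process
        }
        for (Process p : workers) {
            int exit = p.waitFor(); // a failure here affects only that child
            System.out.println("worker exited with " + exit);
        }
    }
}
```

Because each worker has its own address space and its own garbage collector, a GC pause or a fatal error in one worker cannot stall or kill the others, which is the availability argument above.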
Multiple threads in a process can share resources like memory, improve performance, improve responsiveness, and be more efficient through that sharing. But threads are limited in terms of scalability, because you cannot spread threads across multiple nodes, and even within a single node it may be helpful to have more than one process. Now that you've understood the trade-offs between processes and threads, we'll see examples in the next few lectures of how to use the two together.