Remember that term "factor" from the last module? Well, our main goal with this module is to build your confidence in running experiments with two factors and analyzing the data from them. We are going to use only pen and paper, and everything is going to be done by hand. It's actually a whole lot easier than you think. But, if you already understand the concept of factorial experiments in two factors, feel free to jump ahead; check out the last video, which is a 3-factor example, then try the quizzes for this module. If you do well, move ahead and start the material for module three. In that next module we are going to introduce computer software to analyze the experiments and visualize the data. But for now, get out that pen and paper and let's get started. -- So we are considering a basic example: an experiment with 2 factors. In the previous module we said that factors can be either numeric or categorical. In this example we will consider one factor of each type. So we're going to make popcorn! And in this experiment, the outcome is the number of popped kernels. Our objective might be to maximize that number of popped kernels. Most of you will be able to try this one at home, which is why this is such a great example to start with. We're going to apply the same amount of heat each time and use the same number of raw kernels to start with. From prior experience, I know that between three and four minutes are required, on medium heat, to pop most of the corn. So our first factor is going to be the time on the stove. And I'm going to use 160 seconds and 200 seconds. Notice that we use two levels, or two values, for this factor: just under 3 minutes, and just over 3 minutes. Figuring out these numeric values for your experiments takes some practice. You will make mistakes, but we give general advice in coming classes. One quick tip, though, is don't use extremes. For example, you wouldn't use 30 seconds and 10 minutes for this experiment.
You know that in the first case nothing happens in 30 seconds, and in 10 minutes you are going to burn it all. So let's recap: we use 160 seconds and 200 seconds. In later modules you'll learn how to either increase or decrease that cooking time, in order to improve our objective. The second factor we will consider is the type of popcorn. You could buy either white popcorn or yellow popcorn. Notice that this is a categorical variable, and there are two levels. We will assign the low level to white corn and the high level to yellow corn. So let's start planning the experiments next. We have two factors, cooking time and type of corn, and each factor has two levels. From this we know that we will have four combinations in total. This comes from the mathematical rule that, when every factor has two levels, two to the power of "k" tells us how many experiments we will have. Now "k" is the number of factors, and in this experiment we have 2 of them. So in other words, there will be two to the power of two, or four, experiments in total that we have to run. We will write them in a table first, as follows. Let's pick cooking time and call it factor A, then call the type of corn factor B. So there are two columns, one for A and one for B. We use a minus sign to indicate the low level of a factor, and a plus sign to indicate the high level. You will hear me say this a few times, but I hope you believe me: I promise it will be clearer by the 3rd module why we use minuses and pluses. The standard approach is to vary the signs for factor A the fastest, so put "minus", "plus", "minus", "plus", in the four rows for column A. These signs tell the experimenter what levels to operate that factor at. For factor A in this experiment that means we will have two experiments at 160 seconds, and the other two experiments will be run at 200 seconds. For numeric variables, the "minus" corresponds most naturally to the smaller numeric value, and the "plus" to the larger numeric value.
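For anyone who wants to check the hand-written table, here is a minimal Python sketch of the counting rule and the standard-order sign pattern. The function name `standard_order` is made up for this illustration and is not part of the course material; the course itself does everything on paper in this module. The sketch simply applies the same rule to every column, with the first factor changing fastest and each later factor changing one step slower.

```python
from itertools import product

def standard_order(k):
    """Return the 2**k sign combinations, with the first factor varying fastest.

    Hypothetical helper for illustration only.
    """
    # product() varies its LAST element fastest, so each tuple is reversed
    # to make the FIRST factor (A) the one that alternates on every row.
    return [tuple(reversed(signs)) for signs in product("-+", repeat=k)]

design = standard_order(2)  # k = 2 factors: A = cooking time, B = corn type

print("run  A  B")
for row_number, (a, b) in enumerate(design, start=1):
    print(f"{row_number:>3}  {a}  {b}")
```

With k = 2 this gives four rows, matching the two-to-the-power-of-two rule, and the A column alternates on every row.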
Now let's consider factor B. This is a categorical variable. There isn't a natural assignment for the "minus" or "plus" signs. In this case, we allocate the signs arbitrarily. For example, let's put white corn as "minus" and yellow corn as "plus". We could have flipped this allocation around. But, as you will prove to yourself in a quiz during this module, you will still get the same results. So, complete the table now by adding column B, and vary that one a step slower than you varied column A: "minus", "minus", "plus" and then "plus". Now we are ready to implement the experiments. Here's a bit of advice and, in this course, when we give some practical advice, we will show it with this icon. The most important thing that you should NOT do is run the experiments in the order shown in the table. You MUST run the experiments in random order. Now you can choose any method you like to pick that random order of experiments. The easiest, I find, is to write numbers on pieces of paper, as many pieces as you have experiments. Then randomly select these pieces until no more are left. A few other options are shown here on the screen; please take a look at them. So here the "standard order" column refers to the way in which we label the rows in standard order. The next column we add is the "actual order" column. This column represents the order in which the experiments were actually run. Now we can go start running our experiments and record the outcome variable. The first experiment I randomly picked was number 3; that experiment is run at the short cooking time and with yellow corn. After the experiment was run, I recorded an outcome of 62 popped kernels. Then I drew my next random number and found that I should run experiment 1, at the short cooking time with white corn. I recorded a value of 52 popped kernels. I then drew another random number, and suppose I got row 4 from the standard order table, the long cooking time with yellow corn. When I ran that experiment I got a value of 80 popped kernels.
My final experiment is number 2 from the standard order table, with the long cooking time and white corn; this led me to a result of 74 popped kernels. So once all the experiments are done we will have 4 entries of outcome values. Now where do we start with our analysis? The first thing a good statistical analysis will do is visualize the data. We start by drawing a cube plot for the system. Remember the cube plot from the first module? That plot shows us the effect of each factor. Start by drawing a square and then put the first variable along the horizontal axis, and the second variable along the vertical axis. Let's consider the horizontal axis first. We have the short cooking time on the left and the long cooking time on the right. In the vertical direction we have white corn at the bottom, and yellow corn at the top. Now you are ready to add the outcome variable to this plot. The number 52 goes over here in the bottom left, because that's the combination of short time and white corn. 74 goes here at the bottom right, for the long time with white corn. Up here at the top left, for the short time with yellow corn, we have 62. Our final value at the top right hand corner is 80, at the long cooking time with yellow corn. Start by considering the effect of time. As cooking time increases, and when using yellow corn, we go from 62 to 80. That's an increase of 18 units. For white corn, we see that we go from 52 to 74, an increase of 22 units. So, on average, we have a 20 unit increase when cooking time goes from 160 to 200 seconds. Let's consider the difference between the corn types next. This is the effect of changing from white corn to yellow corn. Similar to before, we work out this effect while keeping the other variable constant. In other words, let's fix time at the high value of 200 seconds and see what the effect of changing from white to yellow corn is. In this case, we go from 74 to 80 popped kernels. When we report and quantify this effect we say 80 minus 74, in other words an increase of six units.
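The cooking-time calculation above can be double-checked with a few lines of Python. This is only a sketch to confirm the hand arithmetic; the dictionary name `popped` and its layout are choices made for this illustration, not something from the course.

```python
# Popped kernels at each corner of the cube plot,
# keyed by (cooking time in seconds, corn type).
popped = {
    (160, "white"): 52,
    (200, "white"): 74,
    (160, "yellow"): 62,
    (200, "yellow"): 80,
}

# Effect of going from 160 s to 200 s, holding corn type fixed.
time_effect_yellow = popped[(200, "yellow")] - popped[(160, "yellow")]
time_effect_white = popped[(200, "white")] - popped[(160, "white")]

# The reported effect of time is the average of the two.
average_time_effect = (time_effect_yellow + time_effect_white) / 2

print(time_effect_yellow, time_effect_white, average_time_effect)
```

The same pattern, differences taken at a fixed level of the other factor and then averaged, applies to the corn type effect.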
What is the effect of corn colour at the short cooking time? I'd like you to pause the video and calculate that for yourself now. You should have found it to be a 10 unit change: from 52 to 62; in other words, 62 minus 52 is a 10 unit increase. So we can report the average: an 8 unit increase in the number of popped kernels when changing from white corn to yellow corn. Make sure your interpretation matches up with your cube plot. Those visualizations are so important for checking your analysis. Let's visualize this in a second way with a contour plot. I've redrawn the cube plot here for you. And now we're going to add contours to it. Start in any corner that is not a maximum and not a minimum. Then connect the lines as shown by this example. Notice that the value of 62 would appear approximately over here on this side of the square. So we draw a contour to connect these two points, because they should be at the same level. Look at the value of 74 over here. It would appear approximately on the opposite side of the square, at this point. And finally, we can guess that the rest of the contours are approximately linear. Don't worry, we will show in later classes how to verify that using computer software. This is a great way to visualize a set of experiments, because we can quickly see how to start moving towards improving our objective. For example, if our objective was to maximize the number of popped kernels, then we can see we should move in this direction, to the top right hand corner, to achieve that goal. In this specific case that means we must use yellow corn and longer cooking times. The result for longer cooking times is probably intuitive for this particular case study. The result for white versus yellow corn probably wasn't. I always tell my students, you must ask: "where should I run my next experiment?" And the contour plot gives us that answer. In summary: we have seen two ways to visualize our data.
One way is with a cube plot, with the values superimposed at the corners. The second method is to take the cube plot and add contour lines. I'm going to show you a third way before we end this class today. This plot is called an interaction plot, and you'll see why, especially in the next video. Put one of the variables, here cooking time, at the bottom, with its low value and its high value in the horizontal direction. For example, when we use white corn, our outcome variable is 52 and 74 at the two settings of time. Let's use a solid line to connect them. For yellow corn, the outcome values were 62 and 80, and I'll use a dashed line to connect those two. Notice that these two lines are roughly parallel. So there we have an interaction plot. We could have flipped our choice of variable to start with on the horizontal axis. Let me quickly show you how. Or maybe you'd like to pause the video and try it yourself first. Put white corn and yellow corn on the horizontal axis and then connect the points with two different line styles: one line for the short cooking time and one line for the long cooking time. Did you get the result that is shown here? I've used a solid line for the short cooking duration, where the outcomes were 52 and 62, and I've used a dashed line for the longer duration experiments, where the outcome values were 74 and 80. Note that the lines are roughly parallel again. I'm pointing out the parallel lines to you because the fact that these lines are roughly parallel means that the system has essentially no interaction. And that's a term you are going to hear about in the next few videos. One last point to wrap up: notice that the 4 visualization methods we've considered in this video do not require any computer software. We've drawn the table by hand, we've drawn a cube plot, and a contour plot, and now this interaction plot. You can apply these visualizations whether the factors are numeric or categorical.
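To put a number on "roughly parallel", here is a small Python sketch that compares the rise of each line in the two interaction plots described above. The variable names are invented for this illustration, and the sketch makes no claim about how large a difference has to be before we call it an interaction; it only shows that the two lines in each plot rise by similar amounts.

```python
# Popped kernels, keyed by (cooking time in seconds, corn type).
popped = {
    (160, "white"): 52,
    (200, "white"): 74,
    (160, "yellow"): 62,
    (200, "yellow"): 80,
}

# First plot: cooking time on the horizontal axis, one line per corn type.
rise_white = popped[(200, "white")] - popped[(160, "white")]     # solid line
rise_yellow = popped[(200, "yellow")] - popped[(160, "yellow")]  # dashed line

# Second plot: corn type on the horizontal axis, one line per cooking time.
rise_short = popped[(160, "yellow")] - popped[(160, "white")]    # solid line
rise_long = popped[(200, "yellow")] - popped[(200, "white")]     # dashed line

print("time on axis:", rise_white, rise_yellow, "gap", rise_white - rise_yellow)
print("corn on axis:", rise_short, rise_long, "gap", rise_short - rise_long)
```

In both plots the two lines rise by amounts that differ by only 4 kernels, which is why the lines look close to parallel when drawn by hand.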
All of this demonstrates a distinct advantage of these experiments: we can quickly understand the results using simple graphical tools and quick calculations on a piece of paper. The fact that they are so simple means that the results are easy to share with your colleagues and your managers at work. Now in the next class I'm going to show you how we can build mathematical models to predict the outcome. Making predictions is one of the most powerful aspects of running our experiments in this factorial manner. See you next time.