Welcome back. Last week we explored simple linear regression, and this week we will explore multiple linear regression. In a multiple regression, the dependent variable Y that we are attempting to model, and hence forecast, depends on two or more explanatory variables, the X variables. A multiple linear regression with two X variables is of the functional form Yt equals beta 0 plus beta 1 times X1t plus beta 2 times X2t, where X1 and X2 are different X variables. Of course, this equation will logically expand for 3, 4 or even more X variables.

In a multiple regression, the variable that we're trying to forecast, let's say sales, depends not on just one X variable but on two or more X variables. So let's say we're still trying to forecast our sales. Sales is on our vertical axis, and sales depends on advertising. That's one of our X variables, advertising. Our scatter plot shows a positive relationship between sales and advertising. Sales also depends on a second X variable, let's say the price of the product. We have another scatter plot of sales versus a different X variable. The same dependent variable, sales, depends on another X variable, price. Upon drawing that scatter plot, you may notice there's a negative relationship. As the price goes up, sales tend to go down.

This is where a multiple regression will be useful. Our dependent variable Y, sales, depends on X1, advertising, and X2, the price. When we ask Excel to estimate the linear regression, we have Y equals beta 0 plus beta 1 times X1, and X1 is our advertising, plus beta 2 times X2, and X2 is our price.

Now technically, we're looking at a three-dimensional graph with this equation. But the standard functionality of Excel, the default options, doesn't allow us to draw a three-dimensional graph. A solution to this is to draw what is called a bubble chart. A bubble chart in Excel will plot the two X variables, one on the vertical axis and one on the horizontal axis.
Let's say we have advertising on the vertical axis and price on the horizontal axis. The third variable, which is the dependent variable Y, is indicated by the size of each bubble that's being plotted. When we have high advertising and a low price, we'd expect to have a high value of sales. In this region of the bubble chart, where we have a lot of spending on advertising and a low price, you'll notice we have quite large bubbles. Then in this region of the chart, where we have a very high price and we're not spending much money on advertising, you'll notice that we have relatively small bubbles. The bubble chart is simply a way to visualize, in two dimensions, the three-dimensional relationship represented by this multiple regression equation.

Once we generate our regression output, how can we be confident about the regression coefficients we have estimated? Just as with simple regressions, to test individual slope coefficients, we conduct t-tests by examining the p-value of each slope coefficient. For the overall model significance, that is, whether the X variables are jointly significant, we conduct what is known as an F-test. For this, we look at another statistical result known as the F-statistic, which has its own p-value. If the p-value of the F-test is less than 0.05, which is five percent, then we can conclude that the X variables are jointly significant. The model has overall significance and we can use the model for forecasting. If the p-value of the F-test is more than 0.05, five percent, then we cannot conclude that the X variables are jointly significant. The model has not shown overall significance, and we should not use it for forecasting.

The R squared and standard error continue to have the same meaning as for simple regressions.
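If you would like to see the arithmetic behind the regression output described above, outside of Excel, here is a minimal Python sketch. It is only an illustration under stated assumptions: the advertising, price and sales numbers are made up for this example and are not from the course workbook, and the function names are hypothetical. It estimates beta 0, beta 1 and beta 2 by least squares and then computes the R squared and the F-statistic.

```python
# Illustrative data only (hypothetical, not taken from the course workbook).
advertising = [10, 12, 15, 11, 18, 20, 14, 16, 22, 13]
price = [5.0, 4.0, 5.5, 4.5, 4.0, 5.0, 6.0, 4.5, 5.5, 6.0]
sales = [41, 53, 52, 47, 71, 71, 44, 61, 73, 41]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_multiple_regression(y, x_columns):
    """Least-squares fit of y = b0 + b1*x1 + ... + bk*xk (normal equations)."""
    rows = [[1.0] + [col[i] for col in x_columns] for i in range(len(y))]
    p = len(rows[0])
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def model_summary(y, x_columns, coefficients):
    """Return (R squared, F-statistic) for a fitted model with an intercept."""
    n, k = len(y), len(x_columns)
    fitted = [coefficients[0]
              + sum(b * col[i] for b, col in zip(coefficients[1:], x_columns))
              for i in range(n)]
    mean_y = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained
    sst = sum((yi - mean_y) ** 2 for yi in y)               # total
    r_squared = 1 - sse / sst
    f_statistic = ((sst - sse) / k) / (sse / (n - k - 1))
    return r_squared, f_statistic


coefficients = fit_multiple_regression(sales, [advertising, price])
r_squared, f_statistic = model_summary(sales, [advertising, price], coefficients)
print("beta 0, beta 1, beta 2:", [round(b, 3) for b in coefficients])
print("R squared:", round(r_squared, 4), "F-statistic:", round(f_statistic, 2))
```

In this made-up data the slope on advertising comes out positive and the slope on price comes out negative, matching the two scatter plots described earlier. The sketch stops at the F-statistic; turning it into a p-value needs the F distribution, which Excel's regression output reports for you as "Significance F".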
When comparing different regression models for the same dependent Y variable, however, the adjusted R squared is a better guide than the R squared itself. The plain R squared never falls when we add another X variable, even one that explains very little. The calculation of the adjusted R squared takes the number of X variables we have used into account, and is thus better for comparing different regression models for the same dependent Y variable if the models have used different X variables. If you are interested in further details about the F-test or adjusted R squared, check out this week's toolbox. But otherwise, the videos cover what you need for business forecasting.

Once we're satisfied with our models based on the t-tests, F-test, R squared, and the standard error, we can use the regression equation for forecasting. Now a note for cross-section data. If our regression model is based on cross-section data, we may be interested not only in the out-of-sample forecasts but also in the within-sample forecasts. For example, if we are forecasting sales, rather than going out of the sample and extrapolating the variables, we may be interested to know what sales would be if prices were set around the average and advertising was around the average as well. Our regression models will allow us to do just that, and you will say, "Wow," when you see that this week.

Now, just as last week, over to the Excel screen flow videos. I encourage you to download the relevant Excel workbook and work alongside me as you watch the video. Attempt the quizzes to practice what you've learned, as well as the assessment, and receive some feedback on your learning. Check out the weekly discussion board, where you can discuss your thoughts, questions, and answers with your peers. Peer-to-peer learning is the greatest way to learn. Teaching anyone will reinforce your own understanding of the material, and in the end, everyone says, "Wow."
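As a small illustration of the adjusted R squared and the within-sample forecast discussed above, here is a minimal Python sketch. Everything numeric in it is an assumption made for the example: the R squared values, the sample size of 12, the coefficients 50, 3 and -8, and the average advertising and price are hypothetical, and the function names are not from the course materials. The adjusted R squared uses the standard formula, 1 minus (1 minus R squared) times (n minus 1) divided by (n minus k minus 1), where n is the number of observations and k is the number of X variables.

```python
def adjusted_r_squared(r_squared, n_observations, n_x_variables):
    """Adjusted R squared: penalizes R squared for each X variable used."""
    n, k = n_observations, n_x_variables
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


def forecast_sales(coefficients, advertising, price):
    """Forecast from Y = beta0 + beta1 * advertising + beta2 * price."""
    beta0, beta1, beta2 = coefficients
    return beta0 + beta1 * advertising + beta2 * price


# Hypothetical comparison: adding a second X variable nudges R squared up
# from 0.800 to 0.805 on the same 12 observations.
adj_one_x = adjusted_r_squared(0.800, n_observations=12, n_x_variables=1)
adj_two_x = adjusted_r_squared(0.805, n_observations=12, n_x_variables=2)
print("Adjusted R squared, one X variable: ", round(adj_one_x, 4))
print("Adjusted R squared, two X variables:", round(adj_two_x, 4))

# Hypothetical within-sample forecast: sales when advertising and price
# are both set at their (assumed) sample averages.
hypothetical_coefficients = (50.0, 3.0, -8.0)
average_advertising = 15.0
average_price = 5.0
forecast_at_average = forecast_sales(
    hypothetical_coefficients, average_advertising, average_price
)
print("Forecast sales at average advertising and price:", forecast_at_average)
```

In this hypothetical comparison the plain R squared rises slightly with the second X variable, but the adjusted R squared falls, which is why the adjusted figure is the one to compare across models with different numbers of X variables.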