Hello everyone, in this video you will learn the basics of statistical forecasting models, including the ground rules, typical demand patterns, and the concepts of regression analysis. The idea of statistical forecasting is to discover the underlying patterns in the historical data and use them to predict the future. Thus the key assumption is that historical patterns will continue into the future; in some sense, the idea is similar to driving a car by looking in the rearview mirror.

We have the following ground rules for statistical forecasting. Rule number one: no one can predict the future precisely, so forecasts are always wrong, and they should always come with an error measure. Rule number two: the further into the future you forecast, the lower the accuracy. This is quite straightforward. Rule number three: aggregated forecasts are more accurate than disaggregated forecasts.

To explain rule number three, let's look at an example. We did a project on demand forecasting for British Petroleum (BP) on their lubricants. We first made an aggregated forecast for the total sales of all of them, and the table shows that this forecast is very accurate, with the difference between forecast and actual sales being less than 2%. However, when we broke the aggregated forecast down to each lubricant, we found that the accuracy of the forecast for an individual item can be very poor. On the figure to your right, you can see that if a forecast is accurate, its bar should be at about 100%. The red circles show the items that we significantly under-forecasted, and the yellow stars show the items that we significantly over-forecasted.

Let's now discuss a few typical demand patterns that we often find in practice. The first pattern is stationary, completely random fluctuation. In this example, the figure shows the weekly changes of the S&P 500 closing index from 1998 to 2012, which are purely random and stationary.
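The "purely random" claim can be illustrated with a small sketch. The code below uses simulated data (a random walk with independent Gaussian steps, not actual S&P 500 prices) and computes the lag-1 autocorrelation of the changes; if past changes carried information about future changes, this value would be far from zero.

```python
import random

random.seed(0)

# Simulated stand-in for an index's weekly changes (not real S&P 500 data):
# the steps of a random walk are independent Gaussian noise.
changes = [random.gauss(0.0, 1.0) for _ in range(2000)]

# Lag-1 autocorrelation of the changes: if yesterday's change helped
# predict today's, this would be far from zero.
n = len(changes)
mean = sum(changes) / n
num = sum((changes[i] - mean) * (changes[i + 1] - mean) for i in range(n - 1))
den = sum((c - mean) ** 2 for c in changes)
autocorr = num / den

print(f"lag-1 autocorrelation of changes: {autocorr:.3f}")  # near zero
```

Because the autocorrelation is essentially zero, a statistical model fit on the history of changes alone has nothing to exploit.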
This means that no one can predict the changes of the S&P 500 index, and thus beat the market, using historical data alone. The second pattern is a growing or declining trend over time, either linear or nonlinear. The example here shows the increasing spending by the US Social Security program from 1962 to 2018. The third pattern is cycles that combine a trend and seasonality. In this example, the figure shows that the sales of beer are quite seasonal and follow an increasing trend over time. The fourth pattern captures the impact of exogenous factors, such as price and promotions, on demand. In this example of children's shoes, when the price, denoted by the blue dots, drops, demand, denoted by the red dots, may increase significantly in response.

Regression analysis is widely used for demand forecasting, especially for predicting trends. A regression analysis is a statistical model that predicts a dependent variable, or response variable, using one or more independent, or explanatory, variables. It can detect a causal relationship if one exists. We will start with simple linear regression and then introduce multiple regression. In a simple linear regression, we predict the dependent variable y by a single independent variable x, based on historical data such as (x1, y1), (x2, y2), and so forth. For example, we can predict the sales of jeans by the dollars spent on TV advertising, the sales of a retail business by its shelf space, or the sales of a newly launched product over time. In a multiple regression, we predict the dependent variable y by multiple independent variables x1, x2, and so on. For example, we can predict the sales of jeans by the dollars spent on TV advertising and the price discount, the sales of a retail business by its shelf space and location, or the sales of a newly launched product by time and price.

For a regression analysis, we use the following standard steps. First, we select one or more x variables to describe y.
Second, we do a scatter plot to visualize the potential relationship between x and y. Third, we select a model, and fourth, we estimate the model parameters from a sample of data. In step five, we test the significance of the model using R2 and inferences on the slope, to be explained later. In step six, we validate the model by analyzing its errors, which is called residual analysis. Finally, in step seven, after the model is tested and validated, we can use it to make forecasts.
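The steps above can be sketched in a few lines of code. This is a minimal illustration, not a full workflow: the data are made-up numbers standing in for weekly sales of a newly launched product, and the fit is an ordinary least-squares simple linear regression, followed by R2 (step five), residuals (step six), and a forecast (step seven).

```python
# Hypothetical data: weeks since launch (x) vs. weekly sales in units (y).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [52, 61, 70, 76, 88, 95, 101, 112]

n = len(x)
mx, my = sum(x) / n, sum(y) / n

# Steps 3-4: choose a linear model y = b0 + b1*x and estimate its
# parameters by least squares: b1 = Sxy / Sxx, b0 = ybar - b1*xbar.
sxx = sum((xi - mx) ** 2 for xi in x)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = my - b1 * mx

# Step 6: residuals (actual minus fitted), used for model validation.
fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Step 5: R^2 = 1 - SSE/SST, the share of variation in y the model explains.
sse = sum(e ** 2 for e in residuals)
sst = sum((yi - my) ** 2 for yi in y)
r2 = 1 - sse / sst

print(f"fitted line: y = {b0:.2f} + {b1:.2f}x")
print(f"R^2 = {r2:.3f}")  # close to 1 indicates a strong linear fit

# Step 7: use the validated model to forecast the next period.
print(f"forecast for week 9: {b0 + b1 * 9:.1f}")
```

In practice you would also plot the scatter of x against y (step two) and the residuals against the fitted values before trusting the forecast; a pattern in the residuals signals that the linear model is missing something.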