Next, we'll talk about building the predictive model. A predictive model is a function that maps the features of a patient x to a target outcome y. Depending on the type of y, the model is either a regression or a classification model. In the regression setting, the target variable is continuous; for example, the cost associated with a patient visit is a continuous target. Popular methods include linear regression and generalized additive models. If the target is categorical, for example a binary healthy-versus-unhealthy label, then we have a classification model. There are many different methods, including all the standard machine learning methods such as logistic regression, support vector machines, decision trees, random forests, and of course, for this course, neural networks.

Finally, we'll talk about how to evaluate those predictive models. The general strategy is to develop the model on some training examples, then test the model on other unseen examples, the test set. It's important to know that the training error, that is, the error you get when you test the model on the same examples you used to train it, is not a very useful measure of performance. Test error is the key metric: the performance on the unseen test set is the metric we should be looking at, because it is a better approximation of performance on future examples. The classical technique for doing this is cross-validation. The main idea behind cross-validation is to split the data into training and test sets iteratively, so that different portions of the data can each serve as the test set. Finally, we aggregate the results across the different iterations and report the aggregate, for example by taking an average. There are three common methods for cross-validation: leave-one-out cross-validation, K-fold cross-validation, and randomized cross-validation.
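The training-error-versus-test-error point above can be sketched in a few lines. This is a minimal illustration assuming scikit-learn is available; the dataset is synthetic and the classifier choice is just one of the standard methods mentioned, not a prescribed approach.

```python
# Sketch: why we report test-set performance, not training performance.
# Assumes scikit-learn; the data here is synthetic, not patient data.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic features x and a binary target y (e.g., healthy vs not).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Hold out unseen examples: the test set approximates future patients.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

train_acc = model.score(X_train, y_train)  # performance on seen data
test_acc = model.score(X_test, y_test)     # the metric we should report

print(f"train accuracy: {train_acc:.3f}, test accuracy: {test_acc:.3f}")
```

Training accuracy will typically look at least as good as test accuracy, which is exactly why it is a poor estimate of performance on future examples.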
In leave-one-out cross-validation, we take one example, for instance the record indicated by the red line, as the test set, and use the remaining records as the training set. We repeat this process, each time choosing a different example as the test set and using the rest as the training set. We thus build many models, test each on its held-out example, and finally take the average performance over all those test sets.

K-fold cross-validation is similar to leave-one-out, but instead of using just one example as the test set, we use a larger test set. In this case, we split the entire dataset into K equal, non-overlapping partitions, or folds. We then iteratively choose each fold as the test set, use the remaining folds as the training set to build the model, and test on that held-out fold. So with K folds, we will have trained K models.

You can also do randomized cross-validation. In this case, we randomly split the data into training and test sets. For each split, the model is fit to the training data and tested on the test set, and the results are averaged over all the splits. The advantage of this method over K-fold cross-validation is that the proportion of the training/test split does not depend on the number of iterations, so you can repeat this type of random split many more times than in K-fold cross-validation. The disadvantage is that some observations may never be selected into any test set, because the test sets are constructed at random.

When the dataset becomes really large, which is often the case when you try to build a deep learning model, and the model is very complex with many trainable parameters, we need to split the data into three parts: a training set, a validation set, and a test set.
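The cross-validation splits just described can be sketched in plain Python; leave-one-out is simply the special case K = n. The function below is an illustrative sketch of the index bookkeeping, not a library API.

```python
# Sketch of K-fold splitting: K non-overlapping test folds, and every
# record appears in exactly one test set. Leave-one-out is K = n.
def k_fold_splits(n, k):
    """Yield (train_indices, test_indices) for each of K folds."""
    indices = list(range(n))
    fold_size = n // k
    for i in range(k):
        start = i * fold_size
        # The last fold absorbs any remainder so all records are covered.
        end = n if i == k - 1 else start + fold_size
        test = indices[start:end]
        train = indices[:start] + indices[end:]
        yield train, test

# Example: 10 records, 5 folds -> 5 train/test pairs, and the union of
# the test folds covers indices 0..9 exactly once.
splits = list(k_fold_splits(10, 5))
all_test = sorted(i for _, test in splits for i in test)
print(len(splits), all_test)
```

Randomized cross-validation replaces the deterministic folds with repeated random train/test partitions, which is why a record can be missed: nothing forces every index into a test set.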
The training set is used to train the model, and the validation set is used to select the best hyper-parameters, like the number of layers or the number of neurons. More specifically, we iteratively train the model with some hyper-parameter setting and validate it on the validation set; then we pick the best hyper-parameter setting, train the final model, and test it on the test set. So once we have the final model, the test set is used to estimate the model's performance. As a general practice guideline, the validation and test sets can be small, but their data should be similar to each other and similar to the real-world use case. The training data can be more flexible: usually we look for a large volume of data, even when the samples are sometimes of lower quality.

In summary, for supervised learning, we talked about the predictive model pipeline. It has six different steps, and you can iterate through them to build a better and better predictive model.
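The train/validation/test workflow can be sketched as follows. This is an illustrative example assuming scikit-learn: the data is synthetic, and the candidate hidden-layer sizes are arbitrary stand-ins for the "number of layers, number of neurons" hyper-parameters mentioned above.

```python
# Sketch: select hyper-parameters on a validation set, then estimate
# final performance on a held-out test set. Assumes scikit-learn;
# data and candidate settings are illustrative.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=600, n_features=10, random_state=0)

# First carve off the test set, then split the rest into train/validation.
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=0)

# Try several hyper-parameter settings and keep the one that does best
# on the validation set (never on the test set).
candidates = [(8,), (32,), (32, 16)]
best_score, best_size = -1.0, None
for hidden in candidates:
    clf = MLPClassifier(hidden_layer_sizes=hidden, max_iter=500,
                        random_state=0)
    clf.fit(X_train, y_train)
    score = clf.score(X_val, y_val)
    if score > best_score:
        best_score, best_size = score, hidden

# Train the final model with the chosen setting and report the
# test-set estimate of its performance.
final = MLPClassifier(hidden_layer_sizes=best_size, max_iter=500,
                      random_state=0).fit(X_train, y_train)
test_acc = final.score(X_test, y_test)
print("chosen hidden sizes:", best_size, "test accuracy:", round(test_acc, 3))
```

The key design point is that the test set is touched exactly once, after the hyper-parameter search is finished, so the reported number is an unbiased estimate of performance on future examples.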