So we just finished talking about linear regression, right? And with linear regression, our response variable, our target variable, is some continuous variable. The example we dealt with was sales, but it could be anything: it could be a person's height, it could be income, anything that describes a continuous response. With classification, what we want to do is deal with a response variable that has classes, or, another way to say it, categories. For instance, the Default dataset, which has a 0/1 response: has the person defaulted on their loan or not? That's the general idea. You might have a variable like default on a loan, or, like we've used before, own a home, right? When we were dealing with linear regression, we used own a home, but that was a predictor. Now, if own a home were our response variable, then we would be dealing with classification methods, because you have a 0 and a 1, right? You have categories; you don't have a continuous scale. There's no continuous scale to owning a home: you either own it outright or you don't. Same thing with the Default dataset and defaulting. And again, the predictors don't need to be categorical. Income is continuous, the balance in your account is continuous, things like that. So cool, that's fine. But you might ask: okay, my response is categorical. Let's say my response variable is own a home, 0 if you don't own the home and 1 if you do. Why can't I just use linear regression? Well, think about what we've done with linear regression when it comes to sales, right? We want to predict sales given only TV budget, okay? So we put in some TV budget, we get some sales; we put in a different TV budget, we get different sales. Now, if my sales for the first month are $1,000 and my sales for the second month are $2,000, that result is very tangible.
My sales for the first month are half my sales for the second month. That's a fact, right? My sales for the second month are double my sales for the first month. That's an equivalent statement, also a fact. But own a home doesn't imply any such ordering. The step from 0 to 1 isn't tangible; it doesn't imply anything by itself. What does it mean? Sales being $2,000 versus $1,000 means my sales doubled; I have a factual statement. Own a home, 0 versus 1: how do you quantify that? How do you put it into a well-ordered statement where 0 and 1 can be compared to each other the same way sales could? It's very difficult, because linear regression implies some type of tangible ordering, some type of step that can be defined very, very well, and that doesn't happen here. Now, when it comes to binary responses like own a home, like we've been discussing, using linear regression on a binary categorical response is better than trying to use it on something like a response with levels 0, 1, 2, 3, where those steps are all not really well defined; it's really tough to use linear regression there. So binary is probably the best case of a categorical response on which you could use linear regression, but it's still not good. Next time we're going to show visually why linear regression really breaks down. There are two things. One is that the model itself is poor; it really doesn't fit the data well, and we'll see that visually next time. But the main thing is that not only does it fail to give a meaningful P(Y|X), which is what we're looking for here, but linear regression also dips into negative probabilities. And one might say, well, if you restrict the domain so that you never see those negative probabilities, then you kind of skate away from that and you're okay with it.
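To make the negative-probability problem concrete, here is a minimal sketch using made-up own-a-home data (not the textbook's Default dataset): fit ordinary least squares to a 0/1 response and look at the fitted line near the edges of the data.

```python
import numpy as np

# Made-up illustrative data (not the textbook's Default dataset):
# own_home is 0 below an income of $100k and 1 at or above it.
income = np.arange(10, 201, 10, dtype=float)   # income in $1000s
own_home = (income >= 100).astype(float)       # 0/1 categorical response

# Ordinary least squares: own_home ~ b0 + b1 * income
X = np.column_stack([np.ones_like(income), income])
b0, b1 = np.linalg.lstsq(X, own_home, rcond=None)[0]

# Reading the fitted line as P(Y = 1 | income) breaks down at the extremes:
print(b0 + b1 * 10)    # negative "probability" at the low end
print(b0 + b1 * 200)   # "probability" above 1 at the high end
```

The line does pass through sensible values in the middle of the data, which is exactly the "restrict the domain" escape mentioned above, but the fitted values leave [0, 1] at both ends.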
But at a foundational level, you're still using a model that allows something impossible, and that just doesn't sit well, right? It's just not good. Next time we're going to see visuals of what I'm really talking about, but hopefully this makes a little bit of sense of why linear regression breaks down here. And we're going to learn plenty of really amazing classification techniques that you can use instead, by the end of this chapter and in later chapters that deal with very specific models and why they're great. You won't want to use linear regression on a categorical response anymore; you'll say, I would never do that, it doesn't make any sense, I have these amazing models, and linear regression falls so short on so many of the key things I need in a model. Okay, so we'll talk plenty about classification and great models. Next time we'll cover logistic regression, a really popular model you might have heard of, and again, we'll see some visuals of why it's far superior to linear regression.
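As a tiny preview of why logistic regression avoids this problem, here's a hedged sketch: the logistic (sigmoid) function squashes any linear score into the open interval (0, 1), so its output can always be read as a probability P(Y=1|X).

```python
import math

def sigmoid(z):
    """Logistic function: maps any real score into (0, 1)."""
    return 1 / (1 + math.exp(-z))

print(sigmoid(-10))  # close to 0, but never negative
print(sigmoid(0))    # exactly 0.5
print(sigmoid(10))   # close to 1, but never above 1
```

Logistic regression fits a linear score b0 + b1 * x just like linear regression does, but passes it through this function, which is why its predicted probabilities can never be negative or exceed 1.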