Hello everyone, and welcome to our lecture, Introduction to Sample Data Set. In this video, I'm going to show you how to get important information from your data frame using the describe and info methods. To demonstrate this, I'm going to use the Titanic data frame. The Titanic is the infamous British passenger ship that sank in 1912 after hitting an iceberg. For this lecture, I'm going to load the data directly from Seaborn. I'm not going to go to Kaggle to download it.

To do so, I need to import Seaborn as sns, as well as the pandas library. Now I'm going to call the data frame titanic_df: titanic_df equals sns.load_dataset, and then I give the name of the dataset in Seaborn, which is titanic. Run it. No error. We can also look at the first few rows of the data frame with head to make sure the data frame is fine. We got the data frame directly from Seaborn.

Now let's get some more information about this data frame. The first method we can use is describe, which provides summary statistics for the numeric columns of the data frame. Make sure you call describe as a method, with opening and closing parentheses at the end. If you don't, for example if you just write describe without the parentheses, you will not get the statistical information you are looking for. That output is almost useless. So let's call it as a method, and there you have it: the count, the mean, the standard deviation, the minimum, the 25th, 50th, and 75th percentiles, and the maximum.

In case you don't like the layout, you can change it by taking the transpose of the result. By the way, the result is laid out like a matrix, so it can be transposed. Before we transpose, look at the columns: survived, pclass, age, and so on. There is some important information here. The mean of survived says that the survival rate is about 38 percent: out of every 100 people, only about 38 survived.
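The loading, head, describe, and transpose steps described above can be sketched roughly as follows. This is a minimal sketch, not the exact notebook from the video. The helper `load_titanic` and its small fallback table are assumptions added here so the snippet still runs when Seaborn or an internet connection is unavailable; the fallback rows are made up and are not the real Titanic data.

```python
import pandas as pd


def load_titanic():
    """Return Seaborn's Titanic data, or a tiny stand-in if it cannot be loaded."""
    try:
        import seaborn as sns
        # load_dataset fetches the file online the first time it is called
        return sns.load_dataset("titanic")
    except Exception:
        # ASSUMPTION: made-up rows for illustration only, not the real data
        return pd.DataFrame({
            "survived": [0, 1, 1, 0],
            "pclass": [3, 1, 3, 2],
            "age": [22.0, 38.0, None, 35.0],
            "fare": [7.25, 71.28, 7.92, 13.0],
        })


titanic_df = load_titanic()

# head() shows the first five rows by default
print(titanic_df.head())

# Note the parentheses: describe() calls the method and returns the statistics
summary = titanic_df.describe()
print(summary)

# .T transposes the result, so each original column becomes a row
print(summary.T)

# The mean of the 0/1 "survived" column is the share of passengers who survived
# (about 0.38 on the real dataset, as mentioned in the lecture)
print(round(titanic_df["survived"].mean(), 2))
```

Writing `titanic_df.describe` without the parentheses only refers to the method itself instead of calling it, which is why that output is not useful.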
Let me run it with the transpose down here, so you will see the difference. You see, the columns have become rows and the rows have become columns.

In addition to describe, we also have the info method, titanic_df.info(). This also provides important information about our data frame. It says here that we have 891 rows and 15 columns. Go through every column: any column whose count is less than 891 is missing some data. The age column is missing values, and the embarked column is also missing some data. As a data scientist, you need to clean up these missing values.

In addition to the missing data, you can also check the type of data that each column contains. Here we have integers, float numbers for decimals, and object, which is used for strings. You also have booleans. Boolean means one of two values, True or False. For adult_male, True means the passenger is an adult male, and False means the passenger is not an adult male, that is, a woman or a child. You also have a boolean in the alone column: if the passenger traveled alone, it is True; if not, it is False.

In this video, we learned how to load the Titanic dataset directly from Seaborn and how to get some insight into the dataset using the describe and info methods. Thanks, everybody. I'll see you in the next lecture.
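As a recap of the info step from this lecture, here is a minimal sketch of checking missing values and column types. The frame `sample_df` is a small made-up stand-in (an assumption for illustration, not the real Titanic data) so the snippet runs without a download; with the real data you would call the same methods on titanic_df.

```python
import pandas as pd

# ASSUMPTION: a tiny made-up frame standing in for titanic_df
sample_df = pd.DataFrame({
    "survived": [0, 1, 1, 0],
    "age": [22.0, None, 26.0, 35.0],
    "embarked": ["S", "C", None, "S"],
    "adult_male": [True, False, False, True],
    "alone": [False, False, True, True],
})

# info() prints the number of rows, the non-null count per column, and each dtype
sample_df.info()

# Count missing values per column and keep only the columns that have any
missing = sample_df.isna().sum()
print(missing[missing > 0])

# dtypes lists the type of each column (integers, floats, strings, booleans)
print(sample_df.dtypes)
```

Counting missing values with `isna().sum()` gives the same information as comparing each non-null count in the info output against the total number of rows, just as a single number per column.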