[MUSIC] In this video we will give a brief introduction to data mining and its application to urban systems. Specifically, we'll see why data mining is needed for the data that smart cities generate.

As we all know, smart cities are the future of conventional cities, and they are expected to bring about data-centric solutions to urban challenges. This has become possible due to accelerated development in new technologies such as 5G, AI, cloud, and edge computing. There has also been an exponential increase in the number of connected devices and the amount of data that they collect, enabled by high-speed connectivity through ubiquitous Wi-Fi hotspots and 5G networks. This figure shows several application areas of smart cities, such as smart retail, smart houses, smart environment, smart transportation, and so on.

Let's consider one application area, say urban transportation. At present there are several inefficiencies in the system, such as accidents, safety concerns, transit, congestion, and so on. However, in the future we expect that autonomous vehicles, which are being developed, will give us smart mobility that collects lots of big data and then analyses that data to provide better mobility solutions.

If we look at the future of urban transportation in terms of challenges, in the first stage we are likely to see electrification and automation, aided by reliable and secure hardware, and car sharing based on data analytics. In the second stage, based on all the data that has been collected, there will be no more need for personal vehicles: we will have fully automated operations with real-time data collection and knowledge extraction. This will provide efficient on-demand mobility services.

There has been explosive growth in the amount of data being collected. One major reason for that is smartphone growth.
We expect that by 2025 there will be 7.3 billion smartphones in the world. In India alone, 65% of our population will be using smartphones by then, and these are equipped with ubiquitous connectivity, sensing, and hardware. Smart sensors are another major reason: they provide real-time sensing, and the equipment has become cheap and ubiquitous. The platforms for engaging citizens and agencies are also increasing.

Another reason is behavioral change. Many drivers use their smartphones for turn-by-turn driving navigation, several commuters use their phones to get public transit information, and almost everyone uses their phone to receive train, flight, taxi, or car bookings. This figure shows several of these behavioral changes: people are using smartphones for driving instructions, and autonomous vehicles are also not too far in the future.

If we look at the types of data we are collecting, they include business transactions such as purchases, exchanges, banking, and stock, as well as in-house wares and assets. Lots of scientific data is being collected that was not collected earlier; for example, the CERN laboratory in Switzerland is counting millions of subatomic particles every second, and GPS readings from wildlife are being used to monitor their movements. For natural calamities such as hurricanes, we are collecting lots of survey data from the residents who are affected. We are also collecting lots of medical and personal data, such as government census data as well as personalized and customer files. CCTV cameras are ubiquitous nowadays, and they collect lots of surveillance videos and pictures. Satellite sensing is everywhere. We also have lots of data in the form of text reports and memos, as well as World Wide Web repositories.

Now let us look at data, information, and knowledge. Data is very basic, unrefined, and generally unfiltered information, recorded as symbols or signal readings.
Then we also have words, which may be written or spoken, as well as numbers, diagrams, and images. These are all forms of data. If we refine this data, we get information, which is data that has been refined to the point where it is useful for some kind of analysis. It aids in making decisions, solving problems, or realizing an opportunity, and it combines both current and historic data.

Knowledge, which is a mixture of organized experience, values, information, and insights, helps us evaluate new experience and information. It covers cognition or recognition, where we know what is happening; the capacity to act, which is knowing how to do something; and understanding, that is, knowing why we are doing it.

Data, information, and knowledge are all part of the DIKW pyramid. At the top of the pyramid is wisdom, which is the ability to increase effectiveness based on knowledge. It requires the mental function that we call judgment, and it is very personal in nature, varying from person to person.

Let's look at a very small example. Say we have data consisting of the GPS recording of a vehicle and the state of a traffic light that is about to turn green or red. From that we can extract the information: the traffic light at this junction, facing this direction, is about to go red. From that we can extract the knowledge: the traffic light in the direction I am going is about to go red. And from that we can extract the wisdom: I had better stop the car now so that I do not jump the red light.

So we see that a data explosion is happening, and lots of data is being collected. Hence there is a need for data mining. We have a tremendous amount of data accumulated in databases, data warehouses, and other information repositories.
But we are able to analyse only a small part of it, so something that may be important will be missed. In the future, as datasets grow larger and larger, it will become difficult to base decisions on the data. Essentially, we are drowning in data but starving for knowledge.

So, what is the necessity of data mining? We are collecting lots and lots of data, which constitutes big data, and we need to come up with novel algorithms that can analyse this data in a better way to extract actionable knowledge.

In this video we saw that smart cities are generating larger and larger amounts of data, and that we need to analyse it effectively to extract actionable knowledge. [MUSIC]
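
The traffic-light example from the DIKW discussion can be sketched in code. What follows is only an illustrative Python sketch and is not something shown in the video: the class names, the field names (including `controls_heading`), and the five-second threshold are all assumptions made for the example, and the final rule is a hard-coded stand-in for what the lecture describes as personal judgment.

```python
from dataclasses import dataclass


@dataclass
class GpsReading:
    """Raw data: one GPS sample from the vehicle (fields are hypothetical)."""
    lat: float
    lon: float
    heading: str  # direction of travel, e.g. "north"


@dataclass
class SignalReading:
    """Raw data: one status sample from a traffic light (fields are hypothetical)."""
    junction: str
    controls_heading: str     # direction of travel this light governs
    colour: str               # "green" or "red"
    seconds_to_change: float  # time left before the colour changes


def to_information(signal, soon_s=5.0):
    """Data -> information: is the light at this junction about to go red?

    The 5-second default is an assumed threshold, not a value from the lecture.
    """
    about_to_go_red = signal.colour == "green" and signal.seconds_to_change <= soon_s
    return {
        "junction": signal.junction,
        "controls_heading": signal.controls_heading,
        "about_to_go_red": about_to_go_red,
    }


def to_knowledge(info, gps):
    """Information -> knowledge: is the light for MY direction about to go red?"""
    return info["about_to_go_red"] and info["controls_heading"] == gps.heading


def to_decision(my_light_going_red):
    """Knowledge -> 'wisdom': a fixed rule standing in for the driver's judgment."""
    return "stop" if my_light_going_red else "proceed"


# Usage: a northbound vehicle approaching a light that governs northbound traffic.
gps = GpsReading(lat=0.0, lon=0.0, heading="north")
signal = SignalReading(junction="J1", controls_heading="north",
                       colour="green", seconds_to_change=3.0)

info = to_information(signal)
decision = to_decision(to_knowledge(info, gps))
print(decision)  # stop
```

Each function corresponds to one step up the pyramid, so the same raw readings yield a different decision when the light governs another direction of travel or has plenty of green time left.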