Hi. Last week we learned the main characteristics of a graph database, and how to create and query this type of database. This week we will look at the different architectures available for designing reliable, maintainable, and scalable data-intensive applications.

Let's start with some important concepts. Data-intensive applications store data so that they, or another application, can find it again later in a database. They put the results of expensive operations in a cache to speed up reads. They allow users to search data by keyword or filter it in various ways. They can send a message to another process to be handled asynchronously through stream processing, and periodically crunch large amounts of accumulated data through batch processing. An example of a data-intensive application is an online gaming service: such an application has to manage several thousand concurrent users and must be able to scale out at several points as needed.

Traditional database management systems have been used for data-intensive applications. However, as system requirements and the volume and availability of data increase, the task is no longer that simple. Furthermore, there are various approaches to caching, several ways of building search indexes, and so on. When building an application, we still need to figure out which tools and which approaches are the most appropriate for the task at hand, and it can be hard to combine tools when you need to do something that no single tool can do alone. Therefore, it is important to keep in mind some questions that should be asked and answered when designing data-intensive systems. For example: How do you ensure that the data remains correct and complete, even when things go wrong internally? How do you provide consistently good performance to clients, even when parts of your system are degraded? How do you scale to handle an increase in load? What does a good API for the service look like?
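To make the caching idea above concrete, here is a minimal sketch of the cache-aside pattern in Python. This is my own illustration, not part of the lecture: the function names are hypothetical, and a plain dict stands in for a real cache such as Redis or Memcached.

```python
import time

cache = {}  # a plain dict standing in for a real cache (e.g. Redis)

def expensive_query(user_id):
    """Hypothetical stand-in for a slow database query."""
    time.sleep(0.01)  # simulate database latency
    return {"user_id": user_id, "name": f"user-{user_id}"}

def get_user(user_id):
    """Cache-aside: check the cache first; on a miss, query the
    database and store the result so the next read is fast."""
    if user_id in cache:
        return cache[user_id]
    result = expensive_query(user_id)
    cache[user_id] = result
    return result
```

The first call to `get_user` pays the full query cost; subsequent calls for the same user are served from memory. A real system would also need an eviction and invalidation policy, which this sketch omits.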
There are many factors that may influence the design of a data system, such as the skills and experience of the people involved, legacy system dependencies, the timescale for delivery, your organization's tolerance of different kinds of risk, regulatory constraints, and so on. These factors depend very much on the situation. A data-intensive application should be reliable, scalable, and maintainable, even as the workload increases.

Reliability is an important characteristic of data-intensive applications. Reliability means that the system continues to work correctly even in the face of adversity: the application performs the functions the user expects; it can tolerate user mistakes or software being used in unexpected ways; its performance is good enough for the required use case under the expected load and data volume; and the system prevents any unauthorized access.

In the case of scalability, as a system grows in data volume, traffic volume, or complexity, there should be reasonable ways of dealing with that growth. Scalability is a system's ability to cope with increasing load. Discussing scalability means considering questions like: if the system grows in a particular way, what are our options for coping with the growth? How can we add computing resources to handle the additional load? A data-intensive application should perform well as the workload increases. How we describe the workload depends on the architecture of the system: it might be the number of requests per second to a web server, the ratio of reads to writes in a database, the number of simultaneously active users in a chat room, the hit rate on a cache, or something else.

Finally, a data-intensive application should be maintainable. By that I mean that, over time, many different people will work on the system (engineering and operations, both maintaining current behavior and adapting the system to new use cases), and they should all be able to work on it productively.
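One common way to make "good performance under load" measurable is to look at response-time percentiles rather than averages. The sketch below is my own illustration with made-up sample latencies, not part of the lecture; it uses a simple nearest-rank percentile, whereas production systems usually rely on histogram-based estimators.

```python
import math

def percentile(latencies_ms, p):
    """Nearest-rank p-th percentile (0 < p <= 100) of response times."""
    ranked = sorted(latencies_ms)
    k = max(1, math.ceil(p / 100 * len(ranked)))  # 1-indexed rank
    return ranked[k - 1]

# Hypothetical response times (milliseconds) for one web endpoint.
samples = [12, 15, 14, 130, 13, 16, 14, 15, 900, 14]

p50 = percentile(samples, 50)  # median: what a typical user sees
p99 = percentile(samples, 99)  # tail latency: what the slowest 1% see
```

Here the median is 14 ms while the 99th percentile is 900 ms, which is why tail latencies, not averages, are what degrade the user experience as load grows.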
There are three design principles that help software systems stay maintainable. Operability: make it easy for an operations team to keep the system running smoothly. Simplicity: make it easy for new engineers to understand the system, by removing as much complexity as possible. Evolvability: make it easy for engineers to make changes to the system in the future, adapting it for unanticipated use cases as requirements change.

In the case of operability, a good operations team is responsible for the following, and more: monitoring the health of the system and quickly restoring service if it goes into a bad state; tracking down the cause of problems such as system failures or degraded performance; keeping software and platforms up to date, including security patches; keeping tabs on how different systems affect each other, so that a problematic change can be avoided before it causes damage; and anticipating future problems and solving them before they occur.

Regarding simplicity, software applications should be as simple as possible. Small software projects can have delightfully simple and expressive code, but as projects get larger they often become very complex and difficult to understand. This complexity slows down everyone who needs to work on the system, increasing the cost of maintenance.

Finally, an application must be prepared to evolve, because system requirements change all the time: new facts and previously unanticipated use cases emerge, business priorities change, users request new features, new platforms replace old ones, legal or regulatory requirements change, growth of the system forces architectural changes, and so on.

Well, we have learned some characteristics that a data-intensive application should have. However, there are different types of information systems, and we will review how to implement them in the next session. I hope you enjoyed it.