There's a lot more to any Machine Learning enabled system than the question answering machine that gets most of the attention. In this video, we're going to walk through issues that arise from putting your QuAM into a real-life system. To actually make use of your model, you need a way to interact with it, which creates complications you can't address simply by testing the QuAM in isolation.

Machine Learning is all about data, so let's start there. In Course 3, we talked about the many intricacies of managing data. You need a way of extracting and storing raw data, then you need to combine it, clean it, process it, and turn it into useful features. Each project has its own ETL process particular to its needs, and each QuAM in production will have its own specific ETL process: extracting and transforming the relevant data, then loading it in to get the right answers. For each QuAM in use, have a clear and explicit understanding of how its ETL process carries information through the live system.

We've talked about the importance of your data pipeline a few times already, but what about the other end of our QuAM? Where do the answers go? How do we make use of them? This of course depends a lot on the purpose of the system, but there are some questions you'll always need to answer. First, where does the model live? Is it integrated right into the system, or accessed through an API, an Application Programming Interface? Is the actual code that makes up the model pasted into the same code as the rest of the product, or part of a library included in the code, or compiled as a totally separate program or process? Is the access synchronous or asynchronous, over a network through a remote procedure call, or across different processes on the same hardware?

Having a distributed system means you can use load balancing to manage scalability. If the model is integrated right into the system, it's harder to test different models in production, since the whole system has to be updated for every model change. On the other hand, if the model's kept separate, maybe running as a microservice that makes predictions on demand, that can create latency issues. You might ask the question and just be left hanging, waiting for the answer. In that case, you need a backup plan. Map out how your system will handle the inevitable delays; a minimal sketch of one such fallback appears below. Sometimes the production system has hardware limitations which mean you can't incorporate the model directly into the system, and we'll talk about that case explicitly in a later video.

Most of the time when we talk about question answering machines, and how we have the training phase and the using phase, we're technically talking about offline models. The training happens, the model's created and tested, and then that exact, precise model is deployed to be used. But sometimes you want the learning to keep going, and we call this online learning or continuous learning. You still have an extensive training and testing phase, but in some sense, your learning algorithm remains part of your question answering machine. In this case, the operational data gets fed into your QuAM and answers pop out. But at some point, that answer is validated, maybe immediately, maybe a bit later. When the answer is proven correct or not, that information is fed back into the model, and the learning algorithm updates the model accordingly. Integrating continuous learning systems into production requires special care, especially around monitoring the performance and parameters of the model.
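Coming back to the backup plan for a model that's kept separate: here's a minimal sketch in Python of what handling those inevitable delays might look like. The service URL, timeout value, and fallback answer are hypothetical placeholders, not part of the course material, and the requests library is just one common way to make the call.

```python
import requests

# Hypothetical prediction microservice endpoint and a safe default answer.
PREDICT_URL = "http://model-service.internal/predict"
FALLBACK_ANSWER = {"label": "unknown", "confidence": 0.0}

def get_prediction(features, timeout_seconds=0.2):
    """Ask the remote QuAM for an answer without being left hanging.

    If the call times out or fails, return a predefined fallback so the
    rest of the system can keep moving.
    """
    try:
        response = requests.post(PREDICT_URL, json=features, timeout=timeout_seconds)
        response.raise_for_status()
        return response.json()
    except requests.RequestException:
        # Covers timeouts, connection errors, and bad status codes.
        return FALLBACK_ANSWER
```

In a real system you'd also log these misses and watch how often the fallback fires, because a rising fallback rate is an early sign that the model service itself is in trouble.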
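And to make the continuous learning loop concrete, here's a minimal sketch along the same lines, assuming a scikit-learn style model that supports incremental updates. The training data shown is random placeholder data, and get_validated_label stands in for whatever feedback channel eventually tells you whether the answer was right.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# The extensive offline training and testing phase still happens first.
# (Random placeholder data here; your real pipeline supplies the features.)
X_train = np.random.rand(1000, 5)
y_train = np.random.randint(0, 2, size=1000)
model = SGDClassifier()
model.fit(X_train, y_train)

def answer_and_learn(features, get_validated_label):
    """Answer a live request, then fold the validated outcome back into the model."""
    features = np.asarray(features).reshape(1, -1)
    answer = model.predict(features)[0]

    # Maybe immediately, maybe a bit later, the true answer comes back.
    true_label = get_validated_label()
    if true_label is not None:
        # The learning algorithm stays attached to the QuAM: update the model in place.
        model.partial_fit(features, [true_label])

    return answer
```

Every validated answer can change the model in place, which is exactly why the monitoring mentioned above matters so much: the model running in production gradually drifts away from the exact version you originally tested.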
In highly dynamic situations, continuous learning systems can be the best solution. But if you're just starting to integrate Machine Learning models, go for simpler static models, and build up to the more complex cases.

The process for building, deploying, and continuously improving a Machine Learning application is more complex than a traditional software solution that doesn't include a learning component. A Machine Learning application can undergo changes in the model, in the data, and in the supporting code, so having a clear integration plan is crucial. Data people build pipelines to make their data accessible, ML scientists build and improve the ML model, and ML developers take care of integrating that model and releasing it to production. You might fill every one of these roles yourself, but even so, some separation is needed. In the end, it all has to work together smoothly. So make an explicit plan for integration, communicate it across teams, and you'll be fine. Or in the worst case, you'll know what you need to do differently next time.