In this module, we will further discuss ML Metadata for TFX pipelines. As we have discussed in previous modules, ML Metadata is a key foundation of TFX's task- and data-aware pipeline orchestration. Let's take a closer look at how ML Metadata works. Machine learning metadata includes structured information about your pipeline, models, and associated data. You can use metadata to answer questions such as: Who triggered this pipeline run? What hyperparameters were used to train the model? Where is the model file stored? When was the model pushed to production? Why was model A preferred over model B? How was the training environment configured?

Let's take a closer look at what is inside your TFX pipeline Metadata Store. First, TFX's Metadata Store has artifact type definitions and their properties. Your actual data will be stored along with other pipeline artifacts in Cloud Storage, and your TensorFlow SavedModel may be stored there as well, or hosted on AI Platform Prediction or a model repository such as AI Hub. Second, it contains component execution records and their associated runtime configurations, inputs, outputs, and artifacts. Third, it contains linkage records between artifacts. This enables full traceability: you can trace trained models back to their training runs and to artifacts such as their original training data. The purpose of the model versioning and validation step is to keep track of which model, set of hyperparameters, and datasets have been selected as the next version to be deployed. This is a key enabler of reproducibility for your machine learning experiments, and it is also increasingly important for compliance with legal and regulatory requirements. TFX's ML Metadata also enables you to systematically trace the source of model improvements and degradations. This example shows how you can use previous model runs to compare model evaluation metrics and measure improvements in your model's performance over time.
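To make the traceability idea concrete, here is a minimal plain-Python sketch of how a metadata store can link artifacts to executions and walk those linkage records back to a model's original training data. All class and field names here are hypothetical illustrations; the real TFX Metadata Store is backed by the ml-metadata library, not this code.

```python
# Illustrative sketch of ML-Metadata-style lineage tracking.
# Hypothetical names throughout; not the real ml-metadata API.
from dataclasses import dataclass, field


@dataclass
class Artifact:
    artifact_id: int
    type_name: str        # e.g. "Examples", "Model"
    uri: str              # where the payload lives, e.g. a Cloud Storage path
    properties: dict = field(default_factory=dict)


@dataclass
class Execution:
    execution_id: int
    component: str                               # e.g. "Trainer"
    inputs: list = field(default_factory=list)   # input artifact ids
    outputs: list = field(default_factory=list)  # output artifact ids


class MetadataStore:
    """Records artifacts, executions, and the linkage between them."""

    def __init__(self):
        self.artifacts = {}
        self.executions = {}

    def put_artifact(self, artifact):
        self.artifacts[artifact.artifact_id] = artifact

    def put_execution(self, execution):
        self.executions[execution.execution_id] = execution

    def trace_back(self, artifact_id):
        """Walk linkage records from an artifact back to its source artifacts."""
        sources = []
        for ex in self.executions.values():
            if artifact_id in ex.outputs:
                for input_id in ex.inputs:
                    sources.append(self.artifacts[input_id])
                    sources.extend(self.trace_back(input_id))
        return sources


store = MetadataStore()
store.put_artifact(Artifact(1, "Examples", "gs://bucket/data"))
store.put_artifact(Artifact(2, "Model", "gs://bucket/model"))
store.put_execution(Execution(10, "Trainer", inputs=[1], outputs=[2]))

# Trace the trained model back to its original training data.
lineage = store.trace_back(2)
print([a.uri for a in lineage])  # -> ['gs://bucket/data']
```

The key design point is the linkage record: because every execution stores both its input and output artifact ids, lineage queries are just a walk over those records.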
Next, TFX pipeline caching lets your pipeline skip over components that have already been executed with the same set of inputs in a previous pipeline run. If caching is enabled on your pipeline class instance, the pipeline reads ML Metadata and attempts to match the signature of each component (the component and its set of inputs) to one of this pipeline's previous component executions. If there is a match, the pipeline reuses the component's outputs from the previous run; if there is not, the component is executed. This provides significant savings in computation time and resources compared to recomputing artifacts on every pipeline run.

TFX Metadata also lets you incorporate previously computed artifacts back into your pipeline. This enables a number of advanced use cases. One is resuming model training from checkpoints, which is useful when a long model training procedure is interrupted and you need to reliably restart it to accurately update model weights. Another is benchmarking models in the Evaluator component: this lets you import the last blessed model with a pipeline resolver node, compute statistics, and compare it against the latest trained model in order to determine whether to push the latest trained model to serving in production. Lastly, TFX Metadata enables one of my personal favorite advanced use cases: warm starting. Let's briefly take a look at some pipeline code to configure warm starting and discuss some of its benefits. Traditionally, when training a neural network model, model weights are initialized to random values. Warm starting is a general alternative strategy where you instead initialize model weights by copying them from a previously trained model. This warm-starting approach enables you to start training from a better initial point on the loss surface, which often leads to higher-performing models.
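The signature-matching idea behind caching can be sketched in a few lines of plain Python: hash the component name together with its inputs, and reuse recorded outputs when the hash has been seen before. This is a conceptual sketch with hypothetical helper names, not the real matching logic inside the TFX orchestrator.

```python
# Illustrative sketch of input-signature caching, in the spirit of
# TFX pipeline caching backed by ML Metadata. Hypothetical helpers.
import hashlib
import json

_cache = {}  # signature -> previously computed outputs


def component_signature(component_name, inputs):
    """Hash the component name plus its inputs into a cache key."""
    payload = json.dumps({"component": component_name, "inputs": inputs},
                         sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def run_component(component_name, inputs, execute_fn):
    """Reuse prior outputs when the same component ran with the same inputs."""
    sig = component_signature(component_name, inputs)
    if sig in _cache:
        return _cache[sig]           # cache hit: skip re-execution
    outputs = execute_fn(inputs)     # cache miss: execute the component
    _cache[sig] = outputs
    return outputs


calls = []

def transform(inputs):
    calls.append(1)  # count real executions
    return {"transformed": sorted(inputs["examples"])}

run_component("Transform", {"examples": [3, 1, 2]}, transform)
run_component("Transform", {"examples": [3, 1, 2]}, transform)  # cache hit
print(len(calls))  # -> 1: the component body executed only once
```

Note that the inputs are serialized with `sort_keys=True` so that logically identical input dictionaries always produce the same signature.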
In doing so, warm starting leverages prior computation to dramatically reduce model training time, and it leads to significant computational resource savings, especially in a continuously training pipeline. Furthermore, you can incorporate larger models trained on general tasks from model repositories like TF Hub into your pipeline and fine-tune them on specialized tasks. This is a specialized case of warm starting, known as transfer learning, that can improve your model's performance with significantly less data. Incorporating warm starting into your TFX pipeline is straightforward. In the code on the slide, you can see the use of a resolver node to retrieve the latest blessed model's weights, and then how you pass these weights in through the base_model argument of the Trainer component.
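The weight-copying step itself can be sketched in plain Python: layers that the previous model also had are initialized from its weights, and any new layers fall back to random initialization. This is an illustrative sketch with hypothetical function names; in a real TFX pipeline the Trainer receives the previous model through its base_model channel rather than through code like this.

```python
# Plain-Python sketch of warm starting: initialize new model weights
# from a previously trained model instead of from random values.
# Hypothetical helpers for illustration only.
import random


def random_init(size):
    """Cold start: small random values, standing in for a real initializer."""
    return [random.gauss(0.0, 0.1) for _ in range(size)]


def init_weights(layer_sizes, previous_weights=None):
    """Warm-start layers the previous model also had; random-init the rest."""
    weights = {}
    for name, size in layer_sizes.items():
        prev = (previous_weights or {}).get(name)
        if prev is not None and len(prev) == size:
            weights[name] = list(prev)        # warm start: copy prior weights
        else:
            weights[name] = random_init(size)  # cold start: random values
    return weights


previous = {"dense_1": [0.5, -0.2, 0.1]}
weights = init_weights({"dense_1": 3, "new_head": 2}, previous_weights=previous)
print(weights["dense_1"])  # copied from the previous model: [0.5, -0.2, 0.1]
```

Only shape-compatible layers are copied here, which mirrors why transfer learning typically reuses a pretrained model's body while attaching a freshly initialized task-specific head.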