Hello, I'm Juan Vergara, welcome back. >> We just finished analyzing our initial three brain versions, where we ran several machine teaching experiments focused mainly on the stabilizing skill for our modular brain that controls the LunarLander. >> We are happy with our experimentation for the Stabilize concept, and now that we have a robust version of it, we are ready to start prototyping the other concept of interest, Move Right. For this concept, though, we will not start from scratch; instead, we will carry over everything we learned from the Stabilize experiments. Thus far we have learned, first, that aggregating left and right thrust into a single action really helps teach the brain and, second, that a small episode iteration limit helps find working policies more quickly. >> Our states and actions remain the same for the Move Right concept, so copy the last version of our brain by right-clicking and selecting Copy. On the new version, first replace the concept references to Stabilize with MoveRight, written without a space. To do so, go to the Teach tab, enable Visual Editor if it is not enabled already, then click your concept and rename it to MoveRight. >> After changing the concept name, we're ready to modify the goal. Our Move Right goal has four objectives defined in the AI spec. >> First, avoid crashing; second, minimize rotation; third, minimize vertical movement; and fourth, minimize the spaceship's deviation from moving toward the right. We translate these into Inkling objectives as follows. On your new brain version, go to the Teach tab and select the MoveRight concept if it is not selected already. Then click the Edit button within the goals section. Transfer the goals that are available in the instructions file attached to this video. We are good to save and close.
You will notice that we didn't include an objective for rotation. We assume that if the brain is able to find a steady-state direction toward the right without any vertical movement, the angle will be kept steady at the corresponding best orientation. It's time to visit the Notes tab, add a new line in front of the existing text, and enter the following on the first line. Actually, it seems that we had already embedded the comment for this version. The note should be: v3 + MoveRight BRAIN modified Goal objectives from Stabilize to MoveRight. Let's go to version 3 and remove the added note. You probably don't need to do this step. >> After making the changes described, we're ready to start training the new version: click the green Train button. >> If your simulation doesn't start by default, remember to click the Train button again and select the simulation for training. While the experiment is training, let's plan future experiments. So far we have looked at a couple of Bonsai training parameters as well as at the states and actions of the problem. Aside from the other available Bonsai training parameters, there are two things that we have overlooked. First, we have transferred the goals as they were given to us without a second thought and, second, we haven't configured any sim config on our sim nor any training lessons for our curriculum. It's time to review these two items, starting with goal definition. The Bonsai documentation has a set of examples on how to best use each of the goals available in the platform: avoid, drive, minimize, maximize, and reach. Let's see what happens if we change our two minimize goals to reach goals. Copy the current brain version that is now training, version 4. Then on version 5, the newly created one, go to the Teach tab, click the MoveRight concept, and hit the Edit button within the goal section.
You can copy and paste the goals from the instructions file attached to this video, or you can simply modify the two minimize statements to become reach objectives. Note that you might see an error when editing the goals, where the Action input, which is currently unused, is replaced with State. Consequently, Bonsai complains about having two inputs with the same name, State. Feel free to remove the second State input within the goals or change it back to Action:BrainAction instead. We are good to save and close. Now go to the Notes tab, enter a new line at the top of the notes, and comment: v4 + REACH objectives instead of Minimize ones. You're now ready to click the green Train button. We don't have much of a hypothesis for brain version 5. We are evaluating the effect of changing the goal definition, and we will have to wait for the results to draw any conclusions. >> For our next experiment, and whenever we talk about sim config, we want to think about deployment. Our simulation engineers did point out that the simulation had very simple initial conditions, which is not necessarily good for deployment. We want the brain to be robust against more unstable starting conditions. We talked to our simulation engineers and asked them whether there are any configuration variables that we can use to tune the starting point of the sim. Unfortunately, they let us know that the Gym LunarLander sim that our simulation relies on does not allow specific initialization conditions. After giving it some thought, we realized that we can apply random actions to the engines prior to asking the brain for an action, to further randomize the initial conditions. The simulation engineers put together a new package and uploaded it to Azure following these guidelines. Now our package accepts a configuration parameter to tune the desired randomization during episode start. The parameter is called randomized steps, and it can take values from 0 to 40.
This quantity is the number of steps run prior to asking the brain for an action. Note that random actions will be applied to both engines within their valid range, from -1 to 1. >> So as not to start this version completely disconnected from the current flow of experiments, and given how well training worked with minimize in the case of the Stabilize concept, let's look into sim config based on brain version 4 instead of version 5, in which we are currently exploring a new goal setup. Copy version 4 by right-clicking and selecting Copy Version. Note that our version 5 is not quite learning at this moment. We don't know what happened, but we know that we want to keep training this concept. Whenever you see this behavior, do not worry: just select Train again and choose the simulator that must be used for training, in your case LunarLander. Now that this concept is training, we're good to continue on the newly created version 6. Go to the Teach tab and click the MoveRight concept. From here, click on the simulator. Right beneath the definition of the sim action we have the config value; click the drop-down menu and select New Type. Then transfer the configuration struct from the instructions file attached to this video. Our sim config currently has a single parameter called randomized steps. The simulation engineers are the ones who integrate the reception of this config parameter and its application inside the simulation during episode start. Note that Bonsai has no way of knowing whether a given config parameter is being used inside the simulation, or whether it even exists. Hence the importance of having a clear JSON description file that is kept along with the simulation. This enables any future person working with the simulation to understand the full set of sim states, sim actions, and sim config accepted by the simulation. We are good to save and close for now.
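To make the randomized-start idea concrete, here is a minimal Python sketch of what the simulation side might do at episode start. It is not the package our simulation engineers built: `ToySim`, `reset_with_random_warmup`, and the two-value action tuple are hypothetical names chosen for illustration, and the toy sim only counts steps instead of running LunarLander physics. The only details taken from the discussion are the 0 to 40 limit on the number of steps and the -1 to 1 range for each engine.

```python
import random

MAX_RANDOMIZED_STEPS = 40  # upper bound described for the config parameter


class ToySim:
    """Hypothetical stand-in for the LunarLander sim; it only records steps."""

    def __init__(self):
        self.steps_taken = 0
        self.actions_seen = []

    def reset(self):
        self.steps_taken = 0
        self.actions_seen = []
        return {"steps_taken": self.steps_taken}

    def step(self, action):
        self.steps_taken += 1
        self.actions_seen.append(action)
        return {"steps_taken": self.steps_taken}


def reset_with_random_warmup(sim, randomized_steps, rng):
    """Reset the sim, then apply random engine actions before the brain acts."""
    if not 0 <= randomized_steps <= MAX_RANDOMIZED_STEPS:
        raise ValueError("randomized_steps must be between 0 and 40")
    state = sim.reset()
    for _ in range(randomized_steps):
        # One random value per engine, each inside the valid range [-1, 1].
        action = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        state = sim.step(action)
    # This state is the first one the brain would observe in the episode.
    return state


sim = ToySim()
first_state = reset_with_random_warmup(sim, 25, random.Random(0))
print(first_state)
```

Passing the random generator in as an argument keeps the warm-up reproducible, which helps when comparing brain versions against the same set of starting conditions.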
If you click the Errors & Outputs tab, you will see a notification that alerts you of the following: the source defines a configuration, but no lesson is specified for this curriculum. Bonsai will not send any initialization values to the simulation during episode start unless a lesson is defined within the curriculum. Time to add the lesson based on the sim config we just defined. Click the MoveRight concept now; then, on the right pane, click the plus button located inside the lesson section. Name the lesson randomize start, then click the Edit button and paste the randomization that is included in the instructions file attached to this video. For this lesson, randomized steps will be randomized between 20 and 30 during training. >> Our hypothesis for this experiment is that the brain will possibly take longer to train, and it might even fail at the task too. Nonetheless, it is a needed step, since we want the brain to be ready to handle complex scenarios. Click the green Train button and our new brain version will start training automatically. >> It is time to wait for the training sessions to finish. See you in a bit, once our training sessions are completed. It is your turn now: follow the steps described to create your own version 4, version 5, and version 6 of the experimentation pipeline we just discussed. Good luck.
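As a reference while you build version 6, here is a small Python sketch of what the lesson's randomization amounts to: one config value drawn per episode from the 20 to 30 range. This is only an illustration, not Bonsai's implementation. The function name `sample_episode_config` and the key `randomized_steps` are hypothetical, and treating the range as whole numbers with both ends included is an assumption.

```python
import random


def sample_episode_config(rng, low=20, high=30):
    """Draw one per-episode sim config from the lesson's range.

    Assumption: integer values with both ends of the range included.
    """
    return {"randomized_steps": rng.randint(low, high)}


rng = random.Random(42)
# Each training episode would start with its own freshly drawn config.
episode_configs = [sample_episode_config(rng) for _ in range(5)]
for config in episode_configs:
    print(config)
```

Every drawn value stays inside the 0 to 40 limit that the simulation accepts, so the lesson never asks the sim for a configuration it cannot honor.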