[MUSIC] Hello, and welcome to AI, Empathy, and Ethics. I'm Calvin Lewin. The very first picture we want to look at contains both a computer-generated image and an actual photo, and your task is to figure out which is the input and which is the output. This was state of the art back in 2017. Now that you've had a chance to think about it, let's analyse this picture. For a human, there are actually quite a few hints here about what is the input and what is the output. With the arrow going from left to right, from sunny to sunset, we can probably guess that the sunny scene is the input and the sunset scene is the output. What's striking is that even back in 2017, you really couldn't tell which was the real photo and which was computer generated. That's how good AI perception had gotten, and you can imagine it has gotten much better in subsequent years. So the first thing we want to look at is: how is that possible? To get to this level of being able to create new scenes, we first need the concept of taking something apart and putting it back together. Consider this stylistic artist's rendering of nature imagery, in this case the style image. If we can extract the distinct style from the image on the left and the content from the image on the right, then we can transfer the style and the content and put them together in a style-transfer image-creation process. It's not quite full generation of something completely new; it's taking existing pieces and mixing and matching them, and that's what deep learning was able to do to begin with, with a very limited set of known styles it had already seen.
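As a rough sketch of how "style" can be separated from "content": in the classic neural style-transfer approach, the style of an image is often summarised as correlations between an encoder's feature channels, known as a Gram matrix, while content is the feature activations themselves. A minimal NumPy sketch with made-up feature maps (the shapes here are illustrative assumptions, not the lecture's actual network):

```python
import numpy as np

def gram_matrix(features):
    """Style summary: correlations between feature channels.

    features: array of shape (channels, height, width), e.g. the
    activations of one convolutional layer of an encoder.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w)      # flatten the spatial dimensions
    return flat @ flat.T / (c * h * w)     # (c, c) channel-correlation matrix

# Hypothetical feature maps from an encoder layer:
rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 16, 16))
g = gram_matrix(feats)
print(g.shape)  # (8, 8)
```

Because the Gram matrix discards the spatial arrangement and keeps only which channels fire together, two images with different content but the same texture statistics end up with similar Gram matrices, which is what makes it a usable notion of "style".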
Subsequently, by using the adaptive layer that we see in this rough diagram, deep learning became able to take the style of any picture, photo, or painting and combine it with the content from anything else. You can see the style below, the orange-toned picture, and the content, the skyline of the cityscape, and the output is the result that combines the content of one with the style of the other. The very first generation of deep learning networks could only transfer styles they had been trained on, but subsequent innovations like this adaptive layer allow the network to transfer any unseen style onto any content. So the question is: based on this diagram and what you have seen, who or what is telling us that the result is correct, i.e., discriminating whether the output is a correct style transfer rather than gibberish, or something that transfers some of the style but not quite correctly? It's now up to you to figure out who or what in this diagram decides whether the output is actually correct, and what "correct" even means. Now that you've had a chance to think about it, you'll see that there is always an understanding, or an assumption, about the environment the system lives in. In this case it's the observer, the human AI designer, researcher, or practitioner, who decides. There's always that human in the loop. Whenever we have these kinds of systems, the diagram alone is not enough; you have to think about the overall environment these things exist in. So it turns out it was actually the humans who were deciding whether or not the style transfer was correct.
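The adaptive layer described here is commonly implemented as adaptive instance normalization (AdaIN): it rescales each channel of the content features so its mean and standard deviation match the style features' statistics, which is what lets an arbitrary, unseen style be applied. A minimal NumPy sketch (the array shapes are illustrative assumptions):

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """Adaptive instance normalization.

    Rescales each content channel so its mean/std match the style's
    channel statistics. Both arrays: shape (channels, height, width).
    """
    c_mean = content.mean(axis=(1, 2), keepdims=True)
    c_std = content.std(axis=(1, 2), keepdims=True)
    s_mean = style.mean(axis=(1, 2), keepdims=True)
    s_std = style.std(axis=(1, 2), keepdims=True)
    normalized = (content - c_mean) / (c_std + eps)   # zero mean, unit std
    return normalized * s_std + s_mean                # adopt style statistics

# Hypothetical encoder features for a content and a style image:
rng = np.random.default_rng(1)
content = rng.standard_normal((4, 8, 8))
style = rng.standard_normal((4, 8, 8)) * 3.0 + 5.0
out = adain(content, style)
```

The spatial arrangement of `out` still comes from the content image; only the per-channel statistics have been swapped to the style's, which is why no retraining is needed for a new style.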
With that, we can see that if something in the environment is doing this judging, we can automate it by putting it into the network itself. That is the basis of generative adversarial networks. Here, you have a generative model that tries to create something, as the name implies, and you also need a counterpart or adversary, a discriminative model, that tries to decide whether something is real or fake. You'll typically see definitions of these that are heavy on the math, but in general, if you extract the idea from the names themselves, one is generating and the other is discriminating, and they both have to be trained in a very balanced way, so that one does not overpower the other and they can both learn in a competitive but cooperative way. That's why they're called generative adversarial networks. So let's look at some of the details. Again, whether you see these diagrams online, in a blog post, or in an article, they can be daunting, but we can break them down, and throughout this course we'll learn to break each of these blocks down and understand what they all mean so that you can piece it together. In general, a lot of it is repeated. The very first block is encoding: as the name implies, you have some input and you're trying to encode some logic or understanding of the information in it. And what's the inverse, or opposite, of encoding? Decoding. So you have a series of serial encoding steps and then a subsequent series of decoding steps that reverse them. That's how we can extract either the style or the content from the original photos. Then, here, you can see something more than a strict serial connection.
This particular GAN network added the concept of skip connections, where each encoder output goes not just to the next encoder block, but also to the equivalent decoder block. That's what we mean by skipping: it skips ahead toward the output. What that allows is for the network to work at very different resolutions all together, and you will see in subsequent lessons why that's a good thing: it can extract different features at different sizes. Then you also need a way to put those features together, and that's concatenation. If you're from a programming background, joining two strings is usually called concatenation: putting them together. And when we talk about different resolutions, you'll see that this is reflected in the feature sizes in the diagram, like 256 by 256 by 3, or 2 by 2 by 512: those are exactly the dimensions of the input and of the information the network is working on at each stage. So you'll start to see that you can figure out what each of these pieces stands for later on. In general, when we look at a generative adversarial network, you'll see it's all based on similar network architectures like this; the details might differ, but the flow is generally the same. So if one generator and one discriminator makes a GAN, the next evolution is basically to duplicate it all, so that you have a generator and a discriminator, and then another generator and another discriminator, and you put them in a loop. That allows us to go from, say, black-and-white pictures to colour pictures and vice versa. This is known in mathematical circles as a fixed-point calculation: if you complete the loop, you should get the original back.
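The skip connection plus concatenation step amounts to joining an encoder feature map and a decoder feature map along the channel axis, so the decoder sees both the coarse upsampled signal and the fine detail from the matching encoder stage. A small NumPy sketch with hypothetical feature shapes (channels-first layout assumed):

```python
import numpy as np

# Hypothetical activations at the same spatial resolution, as
# (channels, height, width). In the diagram's notation, the network
# input itself would be 256 x 256 x 3.
encoder_feat = np.zeros((64, 128, 128))   # fine detail from the encoder
decoder_feat = np.zeros((64, 128, 128))   # coarse signal from the decoder

# The skip connection simply concatenates along the channel axis; the
# next decoder layer then mixes both sources.
merged = np.concatenate([encoder_feat, decoder_feat], axis=0)
print(merged.shape)  # (128, 128, 128)
```

Note that the spatial dimensions must match for the concatenation to be valid; only the channel count grows, which is why the diagram's channel sizes jump at each skip.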
This allows for that first picture you saw in this lecture, where we had a scene converted from sunny to sunset, and you should be able to convert it back afterwards. This is the architecture that allows that to happen. So, based on the innovation of the generative adversarial network, just by duplicating that structure and connecting it back in a loop, that's how we get CycleGANs. In general, you'll see this pattern in AI and in tech overall: blocks become part of a larger whole, and that whole in turn becomes an architectural block in something much, much bigger. There's no magic to it; this is how the innovation of CycleGAN was made. Thank you.
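The looped architecture described above is trained with two kinds of loss terms: the usual adversarial losses for each generator-discriminator pair, plus a cycle-consistency loss that penalizes round trips that fail to reproduce the input. A toy NumPy sketch of those terms, where `g` and `f` below are stand-in functions playing the role of the two generators, not real networks:

```python
import numpy as np

def d_loss(d_real, d_fake):
    """Adversarial loss for a discriminator: score real images near 1
    and generated images near 0 (binary cross-entropy)."""
    return -np.mean(np.log(d_real) + np.log(1.0 - d_fake))

def g_loss(d_fake):
    """Adversarial loss for a generator: fool the discriminator into
    scoring its outputs near 1."""
    return -np.mean(np.log(d_fake))

def cycle_loss(x, g, f):
    """Cycle consistency: mapping A -> B with g and back B -> A with f
    should return the original input (L1 penalty on the difference)."""
    return np.mean(np.abs(f(g(x)) - x))

# Stand-in "generators" that happen to be exact inverses of each other:
g = lambda x: 2.0 * x    # pretend: sunny -> sunset
f = lambda x: x / 2.0    # pretend: sunset -> sunny
x = np.array([1.0, 2.0, 3.0])

print(cycle_loss(x, g, f))   # a perfect round trip gives zero cycle loss
```

A full CycleGAN sums these terms, one adversarial loss per direction plus the cycle terms for both round trips, and updates the generators and discriminators in alternation so neither side overpowers the other.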