With all of that said about information records, let's move on naturally to encryption. We're going to break encryption up into three separate areas: symmetric, asymmetric, and hashing. For most of human history, the past 5,000 years or thereabouts, the principal use of encryption has been confidentiality: making sure nobody else can read our information. This goes back thousands of years to Ancient Greece and Egypt, and we have the Caesar cipher from roughly 2,000 years ago. More recently, since about 1950, we've seen an increase in the sophistication of encryption and in the use cases we have for it. We've developed our capabilities to manage the integrity of files, and hashing is a really good example of that. Encryption can also increase the availability of a file: if something is encrypted, we can share it more widely, and we will look at asymmetric encryption and how it helps us with that. It also enhances our approach to privacy, because the information is readable only by people who are authorized to access it.
Encryption has a very long history. Almost as soon as we learned to write and started to document things, we started to find ways to protect what we had written. Very early systems, up until the 1800s, relied on exchanging one letter for another, one character for another character, as we'll see on the next slide. As we moved through the 1800s and 1900s and saw mechanical and electromechanical systems come into being, we started to exchange letters for numbers, for numerical values. Computers do the same thing: each character on this slide is stored as a numerical representation, a number, using encoding formats like Unicode or ASCII. The important fact here is that the letter is being stored as a numerical value.
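
To make the point about characters being stored as numbers concrete, here is a minimal Python sketch. The word "feedback" is simply the example word used later in this talk, and the variable names are illustrative, not part of any particular system.

```python
# Each character has a numeric code point (Unicode; identical to ASCII for a-z).
text = "feedback"

# ord() returns the code point of a single character.
code_points = [ord(ch) for ch in text]
print(code_points)  # [102, 101, 101, 100, 98, 97, 99, 107]

# Encoding to UTF-8 gives the bytes actually stored; for plain ASCII letters
# the byte values match the code points.
utf8_bytes = list(text.encode("utf-8"))
print(utf8_bytes == code_points)  # True

# chr() goes the other way, from number back to character.
round_trip = "".join(chr(n) for n in code_points)
print(round_trip)  # feedback
```

Once the text is a list of numbers like this, arithmetic can be performed on it, which is the step the discussion turns to next.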
Now that it's being stored as a numerical value, we can start to perform computations on it. The value of the plaintext can be multiplied or divided by the value of the key, for example. The algorithm can now work like a mathematical algorithm: instead of substituting one character for another, we're doing mathematical calculations on the numerical values that represent characters. This was such a big leap in capability. Around 100 years ago it gave us an additional level of sophistication, so that when the silicon chip was invented in 1959, we could use computers' capabilities for performing calculations to help us with encryption. That is what drove such a massive increase in the strength and sophistication of crypto.
Let's take a look at a really simple example. Julius Caesar commanded his armies using a cipher. He used a plus three shift, the Caesar cipher. What he did was substitute one character for another. Substitution is swapping one thing for something else, and a shift is moving forward in the alphabet. If we had the word feedback and applied a plus one shift, every letter would move forward one place. Let's start with the character F in feedback: F moves forward one character and becomes G. The letter E becomes the letter F. As you can see at the bottom right-hand side of the slide, feedback becomes gffecbdl.
Is this effective? We've used a plus one shift; Caesar used a plus three shift. What's the potential problem here? Well, this is deterministic: the same letter always encrypts to the same letter, and you can encrypt something and decrypt it back. The person receiving the message can decrypt it and recover the plaintext. The key in this example is the plus one or the plus two, whatever the shift is. To unlock the encrypted message, to go from the ciphertext back to the plaintext, we need to know what the shift value is.
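
The shift described above can be sketched in a few lines of Python. This is a teaching sketch only, assuming lowercase English letters; the function name `caesar_shift` is made up for illustration, and anything that is not a lowercase letter is passed through unchanged.

```python
import string

ALPHABET = string.ascii_lowercase  # "abcdefghijklmnopqrstuvwxyz"


def caesar_shift(text, shift):
    """Shift each lowercase letter by `shift` places, wrapping around at z."""
    out = []
    for ch in text:
        if ch in ALPHABET:
            new_index = (ALPHABET.index(ch) + shift) % 26
            out.append(ALPHABET[new_index])
        else:
            out.append(ch)  # leave spaces, digits, punctuation as they are
    return "".join(out)


ciphertext = caesar_shift("feedback", 1)
print(ciphertext)  # gffecbdl

# Decryption is the same operation with the shift reversed: the shift is the key.
recovered = caesar_shift(ciphertext, -1)
print(recovered)  # feedback
```

Notice that both Es in "feedback" come out as F. The same input letter always produces the same output letter, which is the property the next part of the discussion picks up on.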
Now, the first problem with this is that there are only 26 possible shifts in our alphabet. As a number of different options, that's far too limited; an attacker can simply try them all. But also look at the word feed and the first four characters of the ciphertext. The two Es both become Fs. The most commonly used letter in the English language is the letter E. Is that a problem? Yes, it is. What we see here is a weakness to frequency analysis: where we have commonly used letters or commonly used words, we can start to look for them, and whatever the shift value is, you can uncover it. With G, F, F, E, if E is the most commonly used letter in English, it's reasonable to assume that those two Fs might be Es. Based on that, we can work backwards and uncover the key.
This is substitution, a really simple way of encrypting a message. We can also look at changing the order of the plaintext. If we had the word cat, we could change that to T, A, C. So we have substitution, changing one character for another, and we have permutation, changing the order of the characters: two examples of early encryption. What's interesting is that both substitution and permutation are still used in modern encryption, but as part of a much more sophisticated network of changes between plaintext and ciphertext.
We talked about the key. In order to reverse the operation of our cipher, we have a key, or a pair of keys, depending on the type of crypto we're using. To change our plaintext to ciphertext, we use a key. The key should be long and should be random, and the length is really important. For some of our ciphers, it's the easiest way to gauge the strength of the encryption in use. The longer the key is, the more possibilities there are.
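
Going back to the two weaknesses of the shift cipher mentioned earlier, here is a rough Python sketch of both attacks: guessing the shift by assuming the most frequent ciphertext letter stands for E, and simply trying every shift. The helper names `caesar_shift` and `guess_shift` are hypothetical, and on a message as short as one word the frequency guess can easily be wrong; it happens to work for this example.

```python
from collections import Counter
import string

ALPHABET = string.ascii_lowercase


def caesar_shift(text, shift):
    """Shift each lowercase letter by `shift` places, wrapping around at z."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + shift) % 26] if ch in ALPHABET else ch
        for ch in text
    )


def guess_shift(ciphertext, assumed_letter="e"):
    """Guess the shift by assuming the most common letter stands for `assumed_letter`."""
    letters = [ch for ch in ciphertext.lower() if ch in ALPHABET]
    most_common_letter, _count = Counter(letters).most_common(1)[0]
    return (ALPHABET.index(most_common_letter) - ALPHABET.index(assumed_letter)) % 26


ciphertext = "gffecbdl"

# Frequency analysis: F appears twice, so assume F stands for E.
shift = guess_shift(ciphertext)
plaintext = caesar_shift(ciphertext, -shift)
print(shift, plaintext)  # 1 feedback

# Brute force: with so few possible keys, just try them all and read the results.
all_candidates = [caesar_shift(ciphertext, -s) for s in range(26)]
print(len(all_candidates), "feedback" in all_candidates)  # 26 True
```

Either approach recovers the key without being told it, which is why a larger key space and less predictable output matter.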
When we look at the Caesar cipher, that shift value could be anywhere between one and 26. Twenty-six different options is not a very good key space. What we want is a much larger key space, which means a much longer key. Key length is important. It's one of the easiest ways to gauge the strength of crypto, and when you see regulatory standards for encryption, quite often they drive compliance based on key length: your key must be this long, or must not be longer than this length, for example.
When we share a key, let's say we password protect a document, that password is in effect operating as a key. If we've encrypted an Office document and sent it to somebody else, we then have to share the key. Key management is a really big headache for us. If you email the encrypted document and then email the key, and the attacker has access to the recipient's mailbox, they will get both the file and the key. Typically, when we're exchanging keys, we want to do it out of band, using a different channel from the one that carried the file. When we use a single key, it's hard to know who has used it. You can't easily revoke that key or stop it going any further. Where one key is shared, it is very difficult to control the use and sharing of that key and the file, and very hard to get any accountability.
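
To put some numbers on the key space point above, here is a small Python sketch comparing the 26 Caesar shifts with binary keys of various lengths. The key lengths chosen and the guessing rate of one billion keys per second are assumptions picked purely for illustration, not figures from this talk or from any standard.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
GUESSES_PER_SECOND = 1_000_000_000  # assumed attacker speed, illustrative only


def key_space_size(bits):
    """Number of possible keys for a key that is `bits` bits long."""
    return 2 ** bits


def years_to_try_all(keys):
    """Years needed to try every key at the assumed guessing rate."""
    return keys / GUESSES_PER_SECOND / SECONDS_PER_YEAR


examples = [
    ("Caesar shift", 26),
    ("56-bit key", key_space_size(56)),
    ("128-bit key", key_space_size(128)),
    ("256-bit key", key_space_size(256)),
]

for label, keys in examples:
    print(f"{label:>12}: {keys:.3e} keys, ~{years_to_try_all(keys):.3e} years to try all")
```

Every extra bit doubles the number of possible keys, so the gap between 26 options and even a modest binary key is enormous.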