So I want to show you how authenticated encryption is used in the real world, so let's use TLS as an example and see how TLS works. Data encryption in TLS is done using a protocol called the TLS record protocol. In this protocol, every TLS record starts with a header (we'll see the structure of the header in just a minute), followed by encrypted data that is sent from one side to the other. In TLS, records are at most sixteen kilobytes, and if more than sixteen kilobytes of data needs to be sent, then the data is fragmented into multiple records.
Now TLS uses what's called unidirectional keys, meaning that there's one key from browser to server, and a separate key from server to browser. So one key is used for sending messages from the browser to the server, and the other key is used for sending messages from the server to the browser, and of course both sides, the server and the browser, know both of these keys. Just to be clear, the browser uses the first key to send data to the server and the second key to read data from the server, and the server does exactly the same thing with the keys swapped. Both of these keys are generated by the TLS key exchange protocol, which we're gonna talk about in the second part of the course. Right now I'm gonna assume that these keys have already been established. They're known to both the server and the browser, and now the browser and server want to exchange information using those keys.
So the TLS record protocol uses what's called stateful encryption, which means that the encryption of every packet is done using certain state that's maintained inside the browser and the server. In particular, the state that's of interest to us is a pair of 64-bit counters: one for traffic from browser to server, and one for traffic from server to browser.
These counters are initialized to zero when the session is first initialized, and they're incremented every time a record is sent. So every time the browser sends a record to the server, the browser increments the first counter. When the server receives that record, it increments the counter on its side. And when the server sends a record to the browser, it increments the second counter, and when the browser receives this record it increments its copy of that counter. So this state, these two counters, exists both on the browser and on the server, and it's updated appropriately as records are sent from one to the other and received by the appropriate side. Now the purpose of these counters, as we'll see in just a minute, is to prevent replay attacks, so that an attacker can't simply record a record and then replay it at a later time, because by then the counters will have been incremented.
Okay, so let's look at the details of how the record protocol works. In particular I'll show you the mandatory cipher suite, which is encryption using AES-CBC and MACing using HMAC-SHA1. So remember, TLS uses MAC-then-encrypt, where the MAC algorithm is HMAC-SHA1 and the encryption algorithm is AES-128 in CBC mode. Let's look at how the browser sends data to the server, which, as I said, is done using the browser-to-server key. Now, the browser-to-server key is itself made up of a MAC key and an encryption key: two separate keys that, again, are negotiated during session setup. And I wanna be absolutely clear: there is one such pair of keys for browser to server and a separate pair for server to browser. So overall there are four keys: two MAC keys and two encryption keys, each one used in the appropriate direction. Okay, so here I wrote down the diagram of what a TLS packet looks like.
It begins with a header that contains the type of the packet, the version number of the protocol, and the length of the packet. Notice that the length of the packet is sent in the clear. Now, when encrypting a certain record, the encryption procedure works as follows. Of course, it takes the key as input, and it takes the current state as input. What it does first is MAC the following data: the actual payload is MACed, but the header is also MACed. In addition, the current value of the counter is also MACed, and of course the counter is then incremented to reflect the fact that one more record has been sent.
Now the interesting thing here is that even though the value of the counter is included in the tag, the value of the counter is actually never sent in the record. The reason it doesn't need to be sent is that the server on the other side already knows what the value of the counter needs to be. It implicitly knows what it is, and when it verifies the MAC, it can just use the value that it thinks the counter should be and verify the MAC in that fashion. So this is kind of an interesting approach: even though the two sides maintain these counters that function as nonces, there is no reason to send the nonces in the record, because both sides already know what counter value to expect for every record they receive.
Okay, so that's the tag. The tag is computed, as we said, over this triple: the counter, the header, and the data. The next thing that happens is that the tag is concatenated to the data. Remember, this is MAC-then-encrypt. So here we computed the MAC; now we're gonna encrypt the data along with the tag.
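As a rough illustration of the tag computation just described, here is a minimal Python sketch using only the standard library. The header layout, the sample key, and the function names (`make_header`, `compute_tag`, `tag_verifies`) are assumptions made for this example and are not the exact TLS wire format. What it tries to show is that the 64-bit counter is fed into HMAC-SHA1 but never appears in what gets sent.

```python
import hashlib
import hmac
import struct


def make_header(content_type: int, major: int, minor: int, length: int) -> bytes:
    # Illustrative header: type (1 byte), version (2 bytes), length (2 bytes).
    return struct.pack(">BBBH", content_type, major, minor, length)


def compute_tag(mac_key: bytes, counter: int, header: bytes, data: bytes) -> bytes:
    # The MAC input is the triple (counter, header, data).
    seq = struct.pack(">Q", counter)  # 64-bit counter, known to both sides
    return hmac.new(mac_key, seq + header + data, hashlib.sha1).digest()


def tag_verifies(mac_key: bytes, expected_counter: int,
                 header: bytes, data: bytes, tag: bytes) -> bool:
    # The receiver plugs in the counter value it expects; none is transmitted.
    expected = compute_tag(mac_key, expected_counter, header, data)
    return hmac.compare_digest(expected, tag)


mac_key = b"\x01" * 20                  # hypothetical browser-to-server MAC key
data = b"GET / HTTP/1.1\r\n"
header = make_header(23, 3, 2, len(data))
tag = compute_tag(mac_key, 0, header, data)
to_be_encrypted = data + tag            # the counter is not part of this
```

A receiver that expects counter value 0 accepts this tag, while one that expects any other value does not, even though the bytes on the wire are identical.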
So the header, the data, and the tag are padded to the AES block length, and I think we already said how this pad works: if the pad length is five, then the pad is formed by simply writing the byte five, five times. So if the pad length needs to be five, the pad would just be 5, 5, 5, 5, 5. Then we CBC encrypt the data and the tag using the encryption key, and we do that using a fresh random IV, which is embedded in the ciphertext. And then we prepend the header: the type, the version, and the length. That gives us the entire TLS record, which is then sent over to the server. The grayed-out fields in this diagram correspond to encrypted data, and the white fields correspond to plaintext data. So you can see that this is TLS's implementation of MAC-then-encrypt. The only difference from basic MAC-then-encrypt is that there is state, namely this counter, that's included in the input to the MAC. And again, as I said, that's done to prevent replays.
So let's see why that prevents replays. In particular, let's see how the record protocol decrypts an incoming record. So here comes an incoming encrypted record. The server is going to use its own copy of the key that corresponds to data from browser to server, and its own browser-to-server counter. The first thing it does is decrypt the record using the encryption key. After decryption, it checks the format of the pad. In other words, if the pad length is five bytes, it checks that the pad really is 5, 5, 5, 5, 5. And if it's not, it sends a bad record MAC alert message and terminates the connection, so that a new session key will have to be negotiated if more records need to be sent. If the pad format is correct, then removing the pad is really easy. All the server does is look at the last byte of the pad, say the last byte is equal to five, and then remove the last five bytes of the record. By doing that it strips off the pad.
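The padding rule and the pad check described above can be sketched in a few lines of Python. This follows the simplified convention used in this segment (a pad of length n is the byte n written n times), and the names `pad`, `strip_pad`, and `RecordRejected` are made up for the example.

```python
BLOCK = 16  # AES block length in bytes


class RecordRejected(Exception):
    """Single rejection signal that carries no reason."""


def pad(plaintext: bytes) -> bytes:
    # Pad length n is between 1 and 16; the pad is the byte n repeated n times.
    n = BLOCK - (len(plaintext) % BLOCK)
    return plaintext + bytes([n]) * n


def strip_pad(padded: bytes) -> bytes:
    # Check the pad format, then remove the last n bytes.
    if len(padded) == 0 or len(padded) % BLOCK != 0:
        raise RecordRejected()
    n = padded[-1]
    if n < 1 or n > BLOCK or padded[-n:] != bytes([n]) * n:
        raise RecordRejected()
    return padded[:-n]


padded = pad(b"A" * 11)        # 11 bytes of content need a pad of length 5
unpadded = strip_pad(padded)
```

Here an 11-byte input gets five bytes of value 5 appended, and `strip_pad` reads the last byte to learn how many bytes to drop.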
The next thing it does is extract the tag from the record. These are the trailing bytes in the record after the pad is removed. Then it verifies the tag on the header, the data, and its own value of the counter. If the MAC doesn't verify, again it sends a bad record MAC alert and tears down the connection. And if the tag does verify, it removes the tag, removes the header, and the remaining part of the record is the plaintext data that's given to the application.
Now you can see that if a record is ever replayed, in other words if an attacker records a particular record and then replays it to the server at a later time, then by then the value of the counter will have changed. As a result, the tag on the replayed record will simply not verify, because the tag was computed using one value of the counter, but when the replayed record is received at the server, the server's counter value is different from the value that was used to compute the tag. So these counters are a very elegant and simple way of preventing replays, and the nice thing is that because both sides know the value of the counter implicitly, there's never a need to send the counter in the record itself. So the counter doesn't increase the length of the ciphertext at all.
Now, we already mentioned that this particular approach to authenticated encryption, namely MAC-then-encrypt using CBC encryption, is in fact authenticated encryption. However, it's only authenticated encryption if no other information is leaked during decryption, and we're going to see some cute attacks on TLS when information is leaked during decryption. I should say that this bad record MAC alert basically corresponds to the decryption algorithm outputting the reject symbol, the bottom symbol, meaning that the ciphertext is invalid.
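To make the replay argument concrete, here is a small Python sketch of the counter logic alone. Padding and CBC encryption are left out, and the names (`DirectionState`, `protect`, `accept`, `RecordRejected`), the sample key, and the header bytes are assumptions for illustration rather than anything taken from a TLS implementation.

```python
import hashlib
import hmac
import struct

TAG_LEN = 20  # HMAC-SHA1 output length in bytes


class RecordRejected(Exception):
    """Plays the role of the reject symbol; it carries no reason."""


class DirectionState:
    """MAC key and 64-bit counter for one direction of traffic."""

    def __init__(self, mac_key: bytes):
        self.mac_key = mac_key
        self.counter = 0


def _tag(state: DirectionState, header: bytes, data: bytes) -> bytes:
    seq = struct.pack(">Q", state.counter)
    return hmac.new(state.mac_key, seq + header + data, hashlib.sha1).digest()


def protect(state: DirectionState, header: bytes, data: bytes) -> bytes:
    # Sender side: append the tag, then advance the counter.
    body = data + _tag(state, header, data)
    state.counter += 1
    return body  # padding and CBC encryption would follow; omitted here


def accept(state: DirectionState, header: bytes, body: bytes) -> bytes:
    # Receiver side: verify against the locally expected counter value.
    if len(body) < TAG_LEN:
        raise RecordRejected()
    data, tag = body[:-TAG_LEN], body[-TAG_LEN:]
    if not hmac.compare_digest(tag, _tag(state, header, data)):
        raise RecordRejected()
    state.counter += 1
    return data


key = b"\x02" * 20                       # hypothetical shared MAC key
browser, server = DirectionState(key), DirectionState(key)
header = b"\x17\x03\x02\x00\x05"         # illustrative header bytes
body = protect(browser, header, b"hello")
first = accept(server, header, body)     # accepted with counter 0
try:
    accept(server, header, body)         # same bytes again, counter is now 1
    replay_accepted = True
except RecordRejected:
    replay_accepted = False
```

The second call fails only because the server's counter has moved on; nothing in the replayed bytes changed.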
And as long as there's no way to tell why the ciphertext was rejected, in other words the decrypter only exposes the fact that a rejection took place but doesn't say why it happened, this is in fact an authenticated encryption system. However, if you differentiate and expose why the ciphertext was rejected, whether it was because of a bad pad or because of a bad MAC, then it turns out there's a very cute attack, which we're gonna see in the next segment.
What I showed you so far is TLS version 1.1. It turns out that earlier versions of TLS had significant mistakes in them, and as a result the earlier record protocol is vulnerable to a number of attacks. The first mistake is that the IV used in CBC encryption is predictable. And we said earlier that in CBC, if the IV is predictable, then the resulting CBC encryption is not CPA secure. Well, in the older versions, TLS 1.0 and earlier, the IV for the next record is simply the last ciphertext block of the current record. As a result, if the adversary can observe the current record, he knows the IV for the next record, and that allows him to break the semantic security of the next record. So the resulting scheme is not CPA secure. But not only is it not CPA secure, in fact there is a very cute attack called the BEAST attack that's able to decrypt the initial part of a TLS record simply based on the fact that this scheme is not semantically secure. I should say that this method of choosing the IV to be the last block of the previous record is called chained IVs. And you should remember that chained IVs should not be used in practice, because they always, always lead to an attack. Because of this, TLS 1.1 moved to what's called explicit IVs, where every TLS record has its own random, unpredictable IV. That fixes the problem: as soon as browsers and servers move to TLS 1.1, this will no longer be an issue.
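The reason chained IVs break CPA security can be illustrated with a toy Python sketch. Since the standard library has no AES, the block function here is a stand-in built from truncated HMAC-SHA256 (deterministic and keyed, but not invertible, so only encryption is modeled). The class `ChainedIVSender`, the helper `guess_matches`, and the sample plaintexts are all hypothetical. The idea shown is that an attacker who knows the next IV can submit a chosen plaintext that cancels it and so test a guess for an earlier block.

```python
import hashlib
import hmac
import os

BLOCK = 16


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_block(key: bytes, block: bytes) -> bytes:
    # Stand-in for AES encryption of one block (assumption for this sketch).
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def cbc_encrypt(key: bytes, iv: bytes, blocks: list) -> list:
    out, prev = [], iv
    for b in blocks:
        prev = toy_block(key, xor(prev, b))
        out.append(prev)
    return out


class ChainedIVSender:
    def __init__(self, key: bytes):
        self.key = key
        self.iv = os.urandom(BLOCK)      # initial IV, not seen by the attacker

    def send(self, blocks: list) -> list:
        ct = cbc_encrypt(self.key, self.iv, blocks)
        self.iv = ct[-1]                 # chained IV: last block becomes next IV
        return ct


def guess_matches(sender, iv_used, target, last_seen, guess):
    # Chosen plaintext that makes the cipher input equal guess XOR iv_used.
    chosen = xor(xor(guess, iv_used), last_seen)
    ct = sender.send([chosen])
    return ct[0] == target, ct[-1]


sender = ChainedIVSender(os.urandom(16))
rec0 = sender.send([b"P" * BLOCK])               # observed public record
secret = b"PIN=1234".ljust(BLOCK, b".")
rec1 = sender.send([secret])                     # record the attacker targets
iv_used, target, last_seen = rec0[-1], rec1[0], rec1[-1]

wrong, last_seen = guess_matches(sender, iv_used, target, last_seen,
                                 b"PIN=0000".ljust(BLOCK, b"."))
right, last_seen = guess_matches(sender, iv_used, target, last_seen,
                                 b"PIN=1234".ljust(BLOCK, b"."))
```

With a fresh random IV per record the attacker could not compute `chosen` in advance, which is the point of the explicit IVs in TLS 1.1.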
Now another mistake that was made in TLS 1.0 and earlier enabled what's called a padding oracle, which is something we're going to talk about in the next segment. What happened was that if the ciphertext was rejected due to an invalid pad, the server would send back an alert message saying decryption failed, whereas if the ciphertext was rejected due to a bad MAC, the server would send back a bad record MAC alert. As a result, an adversary who observes the alert sent back from the server can tell whether the pad in the ciphertext was valid or invalid. And this introduces a very significant attack called a padding attack, which we're gonna talk about in the next segment. The solution to this in TLS 1.1 was basically to say that instead of reporting decryption failed, we're gonna report bad record MAC even if the pad is incorrect. As a result, simply by looking at which alert is sent back from the server, an attacker can't tell whether a ciphertext was rejected because of a bad pad or a bad MAC. So this tries to mask this information.
Now the lesson from this is that when decryption fails, you should never, ever explain why. I guess this is something that comes out of networking protocols, where if there is a failure you wanna tell the peer why the failure occurred, whereas in cryptography, if you explain why the failure occurred, that very often leads to an attack. In other words, when decryption fails, just output reject and don't explain why the rejection happened.
Okay, so now that we've seen TLS 1.1, let's see a broken protocol. Of course I always like to pick on 802.11b WEP, which pretty much got everything wrong. So let's see how not to provide authenticated encryption. Let me remind you how 802.11b WEP works. Basically there's a message that the laptop wants to send to the access point.
The first thing that happens is that the laptop computes a cyclic redundancy checksum (CRC) on the message and concatenates the CRC checksum to the message. Then the result is encrypted using a stream cipher, in particular RC4. If you recall, the key that's used is the concatenation of an initial value IV that changes per packet and the long-term key K. And then the IV along with the ciphertext are transmitted to the other side. Now we already saw two problems with this approach. One was that if the IV is ever repeated, and in fact it is gonna be repeated, then you get a two-time pad attack. The other problem is that WEP uses very closely related keys. In other words, the key is simply IV concatenated with K, and the only thing that changes is the IV, while K is always fixed, which means that these PRG keys are very closely related to one another. As we said, the PRG that's used here, RC4, is not designed for this type of use, and it completely breaks if you use it with related keys, and as a result WEP has no security at all.
What I want to show you is that even the CRC mechanism that's used here in an attempt to provide integrity and prevent an adversary from tampering with the ciphertext, even that mechanism is completely broken, and it's actually very easy to tamper with ciphertexts en route. So let's see how that's done. The attack uses a particular property of the CRC checksum, namely that CRC is linear. What that means is that if I give you CRC(M) and I ask you to compute CRC(M XOR P), then it's very easy to do. Basically you just compute some well-known, public function F(P), you XOR the two together, and that in fact gives you CRC(M XOR P). So the XOR comes out of the parentheses, and that basically means the CRC is linear. Now here is how the attack works. Suppose the adversary intercepts a particular packet that's destined for the access point.
Now the packet, say, is destined for destination port 80, and the attacker knows that it's intended for port 80. What he wants to do is modify the ciphertext so that the destination port says 25 instead of 80. Maybe the attacker can read messages for port 25, and that's how he obtains the actual data in the packet. Recall that the CRC checksum is there exactly to make sure that the attacker cannot change data inside the ciphertext. But I want to show you that in fact it's really easy to modify data in the ciphertext, and CRC provides no security against tampering at all. So let's see how to do it.
What the attacker does is XOR a certain value XX into the bytes of the ciphertext that represent the 80. What he XORs in is the string 25 XOR 80. You remember that if I XOR a certain string XX into a ciphertext that was created using a stream cipher, then when that ciphertext is decrypted, the plaintext at that position will also be XORed with XX. As a result, after decryption the plaintext at this position will be the original 80 XOR 25 XOR 80, which gives us 25. Okay? So after decryption the plaintext at this position will now be 25.
The problem is that if that's all we did, then this attack would fail, because the CRC checksum would not validate. The CRC checksum was computed with 80 in the plaintext, but 25 gives a different plaintext that needs a different CRC. But that's not a problem, because we can easily correct the CRC checksum, even though the checksum is encrypted. What we do is XOR F(XX) into the ciphertext at the place where the CRC is supposed to be, and as a result, when the ciphertext is decrypted, we get the correct CRC checksum after decryption.
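The tampering steps above can be tried out with a short Python sketch. It is a toy model, not real WEP: the keystream is random bytes standing in for the RC4 output, the packet layout (a two-byte port followed by a payload) is invented for the example, and the CRC is the CRC-32 provided by `zlib`. For that particular CRC, the public function F(P) works out to CRC(P) XOR CRC(all-zero string of the same length).

```python
import os
import struct
import zlib


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def crc(data: bytes) -> bytes:
    return struct.pack("<I", zlib.crc32(data))


def crc_fixup(p: bytes) -> bytes:
    # F(P) such that CRC(M XOR P) = CRC(M) XOR F(P) when len(M) == len(P).
    return xor(crc(p), crc(bytes(len(p))))


def encrypt(msg: bytes, keystream: bytes) -> bytes:
    # Append the checksum, then XOR with the keystream (stream cipher).
    return xor(msg + crc(msg), keystream)


def decrypt(ct: bytes, keystream: bytes) -> bytes:
    pt = xor(ct, keystream)
    msg, check = pt[:-4], pt[-4:]
    if crc(msg) != check:
        raise ValueError("checksum mismatch")
    return msg


msg = struct.pack(">H", 80) + b"payload bytes"     # hypothetical packet layout
keystream = os.urandom(len(msg) + 4)               # stands in for RC4(IV || K)
ct = encrypt(msg, keystream)

# Attacker: sees only ct and knows the first two bytes encode port 80.
delta = struct.pack(">H", 80 ^ 25) + bytes(len(msg) - 2)
tampered = xor(ct, delta + crc_fixup(delta))

received = decrypt(tampered, keystream)            # checksum still validates
```

The attacker never uses `keystream` or the value of the original checksum; both corrections are XORed into the ciphertext blindly.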
So the interesting thing that happened here is that even though the attacker doesn't know what the CRC value is, he's able to correct it using this linearity property, so that when the ciphertext is decrypted, the correct CRC value appears in the plaintext. Okay? So the linearity property of CRC plays a critical role in making this attack work. The end conclusion here is that a CRC checksum provides absolutely no integrity at all against active attacks, and it should never, ever, ever be used as an integrity mechanism. Instead, if you want to provide integrity, you're supposed to use a cryptographic MAC, not an ad hoc mechanism like CRC.
Okay, so now we've seen how authenticated encryption is implemented in a real-world protocol like TLS. In the next segment, we're gonna look at real-world attacks on systems that happen to implement authenticated encryption incorrectly.