Next we'll look at separate chaining, a collision resolution strategy that makes use of elementary linked lists. So, what are we supposed to do when two different keys hash to the same index? The birthday problem tells us that we're going to have collisions — you'd need a quadratic amount of memory to avoid them. And the load-balancing and coupon-collector analyses tell us that the collisions are going to be evenly distributed around the table. So what we want is an easy way to deal with collisions. The first way we'll look at is called separate chaining, and it's a venerable idea, dating back to 1953: just build a linked list for each of the table positions. We'll have a table that's smaller than the number of keys, and the hash function will map each key to some integer. In this case, we have a table of size five and single-letter keys, so the hash function will map any key to an integer between zero and four. Then to do an insertion, we just keep a linked list at the table position corresponding to the hash value. So S hashes to position two — it'll be on the linked list whose first link is at position two. And E goes to zero, and A goes to zero as well. For search — say we want to know, is C in this table — we find the hash value for C and look down that list to see if we can find C. So we have to look through the whole list for a search, but you only have to look through one list out of all the lists. Essentially, if you have M entries in the hash table and N keys, the length of the list you're going to look at is about N over M, because the keys are evenly distributed. So that's a straightforward and simple scheme for implementing symbol tables with hashing.
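The scheme just described can be sketched in a few lines of code. This is a minimal illustration, not the lecture's exact implementation: the table size M = 5 matches the example, but the hash function here (`key % M` on the character code) is a stand-in, so the specific positions keys land in won't match the S-to-two, E-to-zero trace above.

```java
// Minimal sketch of separate chaining: an array of linked lists,
// insert at the front of one list, search by scanning one list.
public class ChainingDemo {
    static final int M = 5;             // table of size 5, as in the example
    static Node[] st = new Node[M];     // one list head per table position

    static class Node {
        char key; Node next;
        Node(char key, Node next) { this.key = key; this.next = next; }
    }

    static int hash(char key) { return key % M; }   // stand-in hash function

    static void insert(char key) {
        int i = hash(key);
        st[i] = new Node(key, st[i]);   // new node goes at the front of list i
    }

    static boolean search(char key) {
        // only scan the one list that key hashes to
        for (Node x = st[hash(key)]; x != null; x = x.next)
            if (x.key == key) return true;
        return false;
    }

    public static void main(String[] args) {
        for (char c : "SEARCH".toCharArray()) insert(c);
        System.out.println(search('S'));   // true: S is on its list
        System.out.println(search('X'));   // false: X's list doesn't contain it
    }
}
```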
Now, we could use a bag or some data structure like that and hide the linked-list structure underneath, and that's a perfectly fine way to proceed in modern programming; this implementation builds the linked lists directly. For a practical situation we pick some value of M. You could make it so that the hash table itself grows once it gets really huge, and such hybrid methods are easy to implement, so we won't talk too much about that. One implementation detail: our keys and values have to be plain Objects, because we can't have an array of generics in Java. Since we're making an array of nodes, a node would involve generics if we used Key and Value, so we make them Objects, and then when we get things off, we're going to have to cast. So this is the get procedure. To look for a key in the hash table we compute the hash value: our hash function pulls out the system hash code, makes it positive by masking off the sign bit, and then mods by M to get a number between zero and M minus one. We take that number, i, and go to that list, using the standard code for traversing a linked list: start at the first node, and as long as it's not null, go x = x.next. If you find a key that's equal to the key you're looking for, return the value — and we have to cast it to Value because of the generic array creation problem in Java — otherwise return null. So that's not much code, and it's trivial code at that, for doing an efficient symbol-table search using hashing. And insertion is not much more difficult: you do the same thing, and if you find a node whose key equals the given key on the linked list, reset the value and return. Otherwise, you make a new node and put it at the beginning of the linked list with the standard code: replace st[i] with a new node that links to the old st[i]. So again, very little code to implement search and insert using hashing, and that's why it's so popular. And what about the analysis?
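Putting those pieces together, a sketch of the get and put code along the lines just described might look like this — nodes hold plain Objects (no array of generics in Java), so get casts the value on the way out; the table size of 97 is just an arbitrary fixed choice for illustration:

```java
// Sketch of a separate-chaining symbol table as described above.
public class SeparateChainingST<Key, Value> {
    private int M = 97;                  // number of chains (fixed, for illustration)
    private Node[] st = new Node[M];     // array of list heads

    private static class Node {          // keys/values are Objects: no generic arrays
        Object key, val; Node next;
        Node(Object key, Object val, Node next) {
            this.key = key; this.val = val; this.next = next;
        }
    }

    private int hash(Key key) {
        // system hash code, sign bit masked off, mod M: a number in 0..M-1
        return (key.hashCode() & 0x7fffffff) % M;
    }

    public Value get(Key key) {
        // standard list traversal of the one list key hashes to
        for (Node x = st[hash(key)]; x != null; x = x.next)
            if (key.equals(x.key)) return (Value) x.val;   // cast on the way out
        return null;
    }

    public void put(Key key, Value val) {
        int i = hash(key);
        for (Node x = st[i]; x != null; x = x.next)
            if (key.equals(x.key)) { x.val = val; return; } // key found: reset value
        st[i] = new Node(key, val, st[i]);                  // else insert at front
    }

    public static void main(String[] args) {
        SeparateChainingST<String, Integer> st = new SeparateChainingST<>();
        st.put("S", 0); st.put("E", 1); st.put("A", 2);
        st.put("S", 6);                    // duplicate key: value gets reset
        System.out.println(st.get("S"));   // prints 6
        System.out.println(st.get("X"));   // prints null
    }
}
```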
Well, again, the standard probabilistic analysis of the balls-and-bins problem tells us a lot about what goes on. If the uniform hashing assumption holds, the probability that the number of keys in a list is within a constant factor of N over M is extremely close to one. So it means that we've divided the search cost — which would be N if we had sequential search — by a factor of M. And in many applications, even setting M = 100 or 1,000 is going to be very effective, and that's why so many system programmers use this method. So the number of probes for search and insert is proportional to N over M. Now typically, what a programmer would do is try to make M about equal to the number of keys divided by five, say. You can't make M too large, or you use too much space and you'll have empty chains or very short chains. And if you make M too small, then the chains are too long and you have to search through them all. So let's say M is N over five, and then you get constant-time searches and not much extra space. You have extra space for the links that implement the linked lists, but the table itself is not much extra space. And those are typical parameters. If you want a full-service symbol table, which is going to grow from small to huge and then back down to small again, then you'd want to use array resizing to make sure that M is always within a constant factor of N, but we will leave that detail out for now. So that brings us to this summary: with red-black trees, we were happy with log base two of N for search and insert; with separate chaining, you can really get it down to a constant number of operations for search and insert. So hashing is going to be preferred for short keys, where the hash function is easy to compute, and where we don't need ordered iteration or any of the ordered symbol-table operations, because it gives really fast access to the symbol table.
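The balls-and-bins claim above is easy to check with a quick simulation. The numbers here are arbitrary choices with N over M equal to five, the rule of thumb just mentioned; the seeded random generator stands in for a uniform hash function.

```java
import java.util.Random;

// Throw N keys into M chains and look at how long the chains get:
// the average is exactly N/M, and no chain should be much longer.
public class LoadBalanceDemo {
    public static void main(String[] args) {
        int M = 20000, N = 5 * M;          // load factor N/M = 5
        int[] count = new int[M];
        Random rnd = new Random(42);       // seeded for reproducibility
        for (int j = 0; j < N; j++) {
            // simulate a uniform hash: mask the sign bit, mod M
            int i = (rnd.nextInt() & 0x7fffffff) % M;
            count[i]++;
        }
        int max = 0;
        for (int c : count) max = Math.max(max, c);
        System.out.println("average chain length = " + (double) N / M);
        System.out.println("longest chain = " + max);
    }
}
```

The longest chain comes out only a small constant factor above the average, which is what the probabilistic analysis predicts.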
That's our first collision resolution method, hashing with separate chaining.