So in this section, we'll pick up the momentum from lecture set part one and do the same thing for creating confidence intervals for proportions and incidence rates. We'll see it's just more of the same: the general idea remains the same, and only the mechanics change slightly. In this lecture set, we'll estimate a 95 percent confidence interval for a population proportion based on the results of a single sample from the population. We'll show how to do that, and also how to estimate a 95 percent confidence interval for a population incidence rate based on the results of a single sample from the population.

So let's start with binary outcomes, using one of our favorite samples: the response to therapy in a random sample of 1,000 HIV-positive patients from a citywide population. We've seen this before. If we wanted to estimate what proportion of all such patients would respond, looking at the group in general and not subsetting by other characteristics like health status or baseline CD4 count, we could take the proportion we observe responding in this group and use it as an estimate of the underlying overall response. In this group, 206 of the 1,000 persons responded, for a response proportion of 20.6 percent. But how can we put confidence limits on this to account for the uncertainty in our estimate?

Well, you may recall that we had this conundrum with means as well, and we're going to use the same workaround here. The theoretical standard error of a sample proportion, denoted SE(p-hat), is given by the following formula:

SE(p-hat) = sqrt( p(1 - p) / n )

In other words, the standard error of an imperfect estimate of the true proportion is actually a function of the true proportion p and the size of our sample n. Of course, if we knew the true proportion, we wouldn't care about the uncertainty in imperfect estimates p-hat from any one sample. And if the standard error is based on this unknown truth, how can we compute it when all we have is an estimate of the truth? Just like we did with the standard error for sample means, which was based on the theoretical variability in our population, we're going to replace the unknown truth with our sample-based estimate. So the standard error of our proportion is estimated by replacing p with p-hat:

estimated SE(p-hat) = sqrt( p-hat (1 - p-hat) / n )

So the observed proportion we're using isn't just the starting point for the confidence interval; we also use it to estimate our variability. Just FYI, you may recall I mentioned very briefly that the standard deviation of a sample proportion is not a very useful quantity on its own, it doesn't tell us anything about the data in our sample above and beyond the proportion itself, but nevertheless that standard deviation equals sqrt( p-hat (1 - p-hat) ) for any sample of zeros and ones. So once again our standard error is the variability of the observations in an individual sample divided by the square root of n. It's very similar to the concept for means, even though we said that for binary data the standard deviation alone is not a very useful summary measure. So let's just crank out the standard error estimate for our example, where the observed sample proportion is 0.206 and the sample size is 1,000.
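Here's a minimal sketch of that calculation in Python; the helper name prop_se is just illustrative, not anything from the lecture:

```python
import math

def prop_se(p_hat, n):
    """Estimated standard error of a sample proportion: sqrt(p_hat * (1 - p_hat) / n)."""
    return math.sqrt(p_hat * (1 - p_hat) / n)

# HIV response example: 206 responders out of 1,000 patients
p_hat = 206 / 1000            # 0.206
print(round(prop_se(p_hat, 1000), 3))   # 0.013, i.e., about 1.3 percent
```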
Doing this, the standard error is 0.013, or 1.3 percent. Again, the standard error estimate quantifies how far, on average, sample proportion estimates from random samples of 1,000 from this population would fall from the true population proportion. We only have one instance of a random sample of 1,000, but an infinite number of such samples could have been taken, each potentially yielding different results, and the standard error quantifies how variable the sample proportion estimates would be across those samples.

So a 95 percent confidence interval for the true proportion of persons responding in this population, all HIV-positive persons in the city from which the sample was taken, is given by our p-hat estimate plus or minus two estimated standard errors: 0.206 plus or minus two times 0.013. That gives us a confidence interval, with rounding, from 18 percent to 23 percent. So we can estimate the truth to within plus or minus two times 0.013, which is 0.026, or 2.6 percent. This quantity we add and subtract to get the 95 percent confidence interval is sometimes called the margin of error of our estimate, meaning we can estimate the truth within this amount with 95 percent confidence. So the 95 percent confidence interval we've created here for the true proportion who would respond in the population from which the data were sampled is 0.18 to 0.23, or 18 percent to 23 percent. It gives us a sense of how effective the therapy could be on the lower end and the upper end if we were to treat everybody in the city based on the results from this initial study.

Let's look at another example we've been using: the seminal study that showed that treating HIV-positive pregnant mothers with AZT during their pregnancy was very effective at reducing the occurrence of maternal-infant HIV transmission. We haven't yet shown beyond reasonable doubt that it's truly effective, because we have yet to take sampling variability into account, we've only looked at our estimates, but this will be our first pass at doing so. You may recall that the proportion of children who ultimately contracted HIV among those born to mothers given AZT was 7 percent, compared to 22 percent in the group of untreated mothers. We can estimate the standard error for each of these two groups using the same formula, using only the data in each group. In the AZT group there were 180 children born to mothers who were given AZT, of whom 7 percent developed HIV, so the standard error is just our formula: p-hat of 0.07 times one minus p-hat, 0.93, divided by 180, all square-rooted, giving an estimated standard error on that proportion of 0.019, or 1.9 percent. For the standard error in the placebo group, the observed proportion was notably larger, 22 percent, so the standard error is a function of that 22 percent and its complement, 78 percent, divided by a very similar sample size, 183 as opposed to 180; this works out to about 0.031, or 3.1 percent. Here you can really see the difference in standard errors that comes from having different estimated proportions. The standard error for the proportion in the placebo group is notably larger than the standard error for the proportion in the AZT group, even though both are based on almost the same number of persons, and that's because of the difference in the estimated proportion in each group. So let's do the confidence intervals for these groups.
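As a quick sketch before we state the results, here are both intervals computed with the counts quoted above; the function prop_ci_95 is again just an illustrative name:

```python
import math

def prop_ci_95(p_hat, n):
    """Approximate 95% CI for a proportion: estimate +/- 2 estimated standard errors."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - 2 * se, p_hat + 2 * se

print(prop_ci_95(0.07, 180))   # AZT group:     roughly (0.032, 0.108)
print(prop_ci_95(0.22, 183))   # placebo group: roughly (0.159, 0.281)
```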
For the AZT group, if we do this out, we get a confidence interval from 0.032 to 0.108 for the proportion of children who would ultimately contract HIV when their mothers were treated with AZT. In the placebo group, the confidence interval shifts up, running from about 15.8 percent to 28.2 percent.

Let's just talk about this for a minute and start thinking substantively about what these confidence intervals can tell us. In the untreated group, this gives us an estimate, and then a confidence interval, for what the outcome would be across a group of mothers who were never treated during pregnancy while HIV-positive, and it suggests that if we didn't treat mothers, roughly 22 percent of the children would contract HIV. So this is a significant public health issue in the estimate. But if somebody pushed us and said, "Well, you only based this estimate on 183 mothers and 183 children, so there's potentially a lot of uncertainty in that; how can you account for it?" we'd say that after accounting for the uncertainty, we get a confidence interval for the true proportion of children who would develop HIV from about 15.8 percent to 28.2 percent. If they replied, "Wow, that's a pretty wide interval," we'd say, "Well, yes it is, but what it tells me is that even in the best-case scenario, we're talking about 16 percent of these children developing HIV." That, I would argue, is a huge public health burden and issue. If we could reduce it, it would be good for the health of children and good for resources. So this kind of confidence interval can be used for any single group in a substantive context in cases like this.

Note also that the confidence interval for the proportion of children who would develop HIV when their mothers were treated with AZT goes from about 3.2 percent to 10.8 percent. This is where confidence intervals become really illuminating, in my opinion: when we start to compare results between two or more groups, which we'll formalize in the next lecture set. If you look at these two confidence intervals, you'll notice that they don't overlap. Again, we saw an example like this in the previous section. Just think about that. What does that imply?

Here's one more example of confidence intervals in use. This is a study we looked at in which patients in a health system were randomized to one of four groups receiving different levels of assistance and intervention with regard to colorectal cancer screening, from receiving nothing from the health plan up through graded levels of intensity. What it showed was that more attention on the patient resulted in substantially higher proportions getting screened, and the results also include a confidence interval for each of those groups. In the usual-care group, where nothing special was done, only about a quarter, 26.3 percent, got screened, with a confidence interval of 23.4 percent to 29.2 percent. So the best the health plan could hope for under business as usual was a 29.2 percent screening proportion at best, and 23.4 percent on the low end. As soon as they moved to the automated intervention, the second tier of intensity, the observed value jumped to 50.8 percent, with a confidence interval of 47.3 percent to 54.4 percent. So even if we just had these two levels, usual care and automated, the results were much better for the automated group.
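Just as a sketch of how you might check that overlap numerically, reusing the prop_ci_95 helper from the sketch above: the per-group sample sizes aren't quoted in this excerpt, so the n values below are hypothetical, chosen only to roughly reproduce the reported interval widths.

```python
# Hypothetical sample sizes -- not given in the lecture excerpt
usual_low, usual_high = prop_ci_95(0.263, 925)   # roughly (0.234, 0.292)
auto_low, auto_high = prop_ci_95(0.508, 800)     # roughly (0.473, 0.543)

# The intervals don't overlap: the automated group's worst case
# still beats the usual-care group's best case.
print(auto_low > usual_high)   # True
```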
Even in the worst-case scenario for the automated group, the lower bound on the screening proportion, 47.3 percent, is still a lot larger than the upper bound, the best-case scenario, for the usual-care group. So those are just things to think about when interpreting such results in the literature.

It turns out the computation of a confidence interval for a single incidence rate follows suit from what we've been doing. I'll only give one example here; we'll give another in the additional examples section. It will be more interesting to talk about confidence limits on incidence-rate-related quantities when we start comparing groups, because single incidence rates in a vacuum, without a Kaplan-Meier curve and the like, are hard to interpret in my opinion. So let's just look at our primary biliary cirrhosis randomized clinical trial example. You may recall that patients with primary biliary cirrhosis were randomized to receive either the drug or a placebo. But right now, we're going to look at the overall incidence of death in the entire sample, not breaking out by drug group. There were 312 patients, and there were 125 deaths in 1,715 years of patient follow-up, for an estimated incidence rate of 125 deaths per 1,715 person-years, or 0.073 deaths per person-year. It turns out the standard error of an incidence rate can be estimated by taking the square root of the number of events in our sample divided by the total follow-up time in the sample. If we do that for this group, we observed 125 deaths, so the standard error is the square root of 125, which is about 11.2, divided by the total person-time of 1,715 years; when all the math is done, this turns out to be 0.0065 deaths per person-year. So to get a confidence interval for the population-level incidence rate, we take the estimated incidence rate plus or minus two standard errors: 0.073 deaths per person-year plus or minus two times 0.0065 deaths per person-year, which gives a confidence interval of 0.060 to 0.086 deaths per person-year in this group of patients. (A short code sketch of this calculation appears at the end of the section.)

So in summary, it was more of the same, really, mechanically. A 95 percent confidence interval for a population proportion p, based on data in a random sample taken from the population, can be constructed by taking our estimate plus or minus two standard errors, where we estimate the standard error as a function of the estimated proportion we're building the interval from and the sample size. Again, the standard error of the sample proportion quantifies the variation in sample proportions across random samples of the same size from the same population. Similarly, for incidence rates, we can do the same thing: we take our estimated incidence rate and then add and subtract two estimated standard errors, where the estimated standard error of the incidence rate is the square root of the number of events we see in our sample divided by our total follow-up time. So in the next section, we'll think a little more critically about what we've done here. It's mechanically easy, but we'll pay a little more attention now to what it means to have 95 percent confidence. What exactly are we getting at with that uncertainty bound?
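And here is that closing sketch for the incidence-rate interval, again a minimal illustration of the mechanics rather than anything from the lecture itself:

```python
import math

def rate_ci_95(events, person_time):
    """Approximate 95% CI for an incidence rate.
    The SE is estimated as sqrt(number of events) / total follow-up time."""
    rate = events / person_time
    se = math.sqrt(events) / person_time
    return rate - 2 * se, rate + 2 * se

# PBC trial, overall mortality: 125 deaths in 1,715 person-years of follow-up
print(rate_ci_95(125, 1715))   # about (0.060, 0.086) deaths per person-year
```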