Notes of M.Sc. II Biotechnology, Biostatistics sample size determination.pdf - Study Material
Page 1 :
Sample size determination is the act of choosing the number of observations or replicates to include in a statistical sample. The sample size is an important feature of any empirical study in which the goal is to make inferences about a population from a sample. In practice, the sample size used in a study is usually determined based on the cost, time, or convenience of collecting the data, and the need for it to offer sufficient statistical power. In complicated studies there may be several different sample sizes: for example, in a stratified survey there would be different sizes for each stratum. In a census, data is sought for an entire population, hence the intended sample size is equal to the population. In experimental design, where a study may be divided into different treatment groups, there may be different sample sizes for each group.
Page 2 :
Sample sizes may be chosen in several ways:
- using experience: small samples, though sometimes unavoidable, can result in wide confidence intervals and risk of errors in statistical hypothesis testing.
- using a target variance for an estimate to be derived from the sample eventually obtained, i.e. if a high precision is required (narrow confidence interval), this translates to a low target variance of the estimator.
- using a target for the power of a statistical test to be applied once the sample is collected.
- using a confidence level, i.e. the larger the required confidence level, the larger the sample size (given a constant precision requirement).
Page 3 :
Estimation of a proportion
A relatively simple situation is estimation of a proportion. For example, we may wish to estimate the proportion of residents in a community who are at least 65 years old.
The estimator of a proportion is $\hat{p} = X/n$, where X is the number of 'positive' observations (e.g. the number of people out of the n sampled people who are at least 65 years old). When the observations are independent, this estimator has a (scaled) binomial distribution (and is also the sample mean of data from a Bernoulli distribution). The variance of this distribution is p(1 – p)/n, and its maximum is 0.25/n, which occurs when the true parameter is p = 0.5. In practice, since p is unknown, the maximum variance is often used for sample size assessments. If a reasonable estimate for p is known, the quantity p(1 – p) may be used in place of 0.25.
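The claim that the variance of the sample proportion is largest at p = 0.5 can be checked numerically. The following is a minimal sketch, not part of the original notes; the function name `proportion_variance` and the grid of candidate values are assumptions made for illustration.

```python
def proportion_variance(p: float, n: int) -> float:
    """Variance p(1 - p)/n of the sample proportion X/n
    for n independent Bernoulli(p) observations."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return p * (1.0 - p) / n


# Illustrative sample size (an assumption, not taken from the notes).
n = 100

# Scan candidate values of p from 0.00 to 1.00 in steps of 0.01
# and pick the one giving the largest variance.
grid = [i / 100 for i in range(101)]
worst_p = max(grid, key=lambda p: proportion_variance(p, n))

print(worst_p)                          # value of p with the largest variance
print(proportion_variance(worst_p, n))  # the corresponding variance, 0.25/n
```

Using the worst-case value 0.25/n is the conservative choice: any other value of p gives a smaller variance, so a sample size planned for p = 0.5 is at least as large as needed.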
Page 4 :
For sufficiently large n, the distribution of $\hat{p}$ will be closely approximated by a normal distribution. Using this and the Wald method for the binomial distribution yields a confidence interval of the form
$$\left(\hat{p} - Z\sqrt{\frac{0.25}{n}},\ \hat{p} + Z\sqrt{\frac{0.25}{n}}\right)$$
Page 5 :
where Z is a standard Z-score for the desired level of confidence (1.96 for a 95% confidence interval).
If we wish to have a confidence interval that is W units total in width (W/2 on each side of the sample mean), we would solve
$$Z\sqrt{\frac{0.25}{n}} = \frac{W}{2}$$
for n, yielding the sample size
$$n = \frac{Z^2}{W^2}$$
in the case of using 0.5 as the most conservative estimate of the proportion. (Note: W/2 = margin of error.)
Otherwise, the formula would be
$$Z\sqrt{\frac{p(1 - p)}{n}} = \frac{W}{2}$$
which yields
$$n = \frac{4Z^2 p(1 - p)}{W^2}$$
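The sample size formulas above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not part of the original notes: the function names are hypothetical, the Z-score is taken from the standard library's `statistics.NormalDist`, and the result is rounded up to a whole number of observations, a step the notes do not state explicitly.

```python
import math
from statistics import NormalDist


def z_score(confidence: float) -> float:
    """Two-sided standard normal Z-score, e.g. about 1.96 for confidence = 0.95."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    return NormalDist().inv_cdf(0.5 + confidence / 2.0)


def sample_size_for_proportion(width: float, confidence: float = 0.95,
                               p: float = 0.5) -> int:
    """n = 4 Z^2 p(1 - p) / W^2, rounded up (rounding is an assumption).

    With the default p = 0.5 this reduces to the conservative n = Z^2 / W^2.
    """
    if width <= 0.0:
        raise ValueError("width must be positive")
    z = z_score(confidence)
    return math.ceil(4.0 * z ** 2 * p * (1.0 - p) / width ** 2)


def conservative_interval(x: int, n: int, confidence: float = 0.95):
    """Interval p_hat +/- Z * sqrt(0.25 / n), using the worst-case variance."""
    p_hat = x / n
    half_width = z_score(confidence) * math.sqrt(0.25 / n)
    return p_hat - half_width, p_hat + half_width


# Total width W = 0.10, i.e. a margin of error of 0.05, at 95% confidence.
print(sample_size_for_proportion(0.10))          # conservative case, p = 0.5
print(sample_size_for_proportion(0.10, p=0.2))   # with a prior estimate p = 0.2
print(conservative_interval(50, 100))
```

For W = 0.10 at 95% confidence the conservative formula gives 385 observations; supplying a prior estimate of p = 0.2 lowers this to 246, which illustrates why p(1 – p) is used in place of 0.25 when a reasonable estimate is available.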