Problem 8.58 (4 points)

Given \(Y\) has a binomial distribution with parameter \(p\). Find the sample size necessary to estimate \(p\) to within \(0.05\) with probability \(0.95\)
  1. if \(p\) is thought to be approximately \(0.9\). The standard error for \(\hat{p} = Y/n\) is \[ \sigma_{\hat{p}} = \sqrt{p\cdot (1-p)/n} \doteq \sqrt{0.9 \cdot 0.1 / n} = 0.3 / \sqrt{n}. \] We want \(P(\,|\hat{p} - p| < 0.05\,) \doteq 0.95\), and since \(\hat{p}\) is approximately normal for all sufficiently large \(n\), we estimate \[\begin{align*} \frac{0.05}{\sigma_{\hat{p}}} &\doteq z_{0.025} \doteq 1.96 \\ \implies n &\ge \left[ \frac{0.3 \cdot 1.96}{0.05}\right]^2 \doteq 138.3. \end{align*}\] We may take \(n \ge 139\).
  2. if no information about \(p\) is known. A little algebra shows that \(p\cdot (1-p) = 1/4 - (p - 1/2)^2\). From this we see that the maximum value of \(p\cdot (1-p)\) is \(1/4\), and that the maximum value occurs when \(p=1/2\). Thus, the standard error \[ \sigma_{\hat{p}} = \sqrt{p\cdot(1-p)/n} \le \sqrt{1/(4n)} = 0.5/\sqrt{n}. \] Therefore, a conservative estimate for \(\sigma_{\hat{p}}\) is \(0.5/\sqrt{n}\). Substituting this estimate into the calculation of part (a), we compute \[ n \ge \left[ \frac{0.5 \cdot 1.96}{0.05} \right]^2 \doteq 384.16. \] We may take \(n\ge 385\). Notice that this sample size is about \(2.8\) times as large as the sample size in part (a).
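
Both sample sizes above come from solving \(z_{0.025}\cdot\sqrt{p(1-p)/n} \le 0.05\) for \(n\), so they can be checked numerically. The following is a minimal sketch rather than part of the original solution; the function name `sample_size_proportion` and its parameter names are illustrative assumptions, and the critical value \(1.96\) is the table value quoted in the text.

```python
import math

def sample_size_proportion(p, bound, z):
    """Smallest integer n with z * sqrt(p * (1 - p) / n) <= bound.

    Relies on the normal approximation to the distribution of Y/n.
    """
    return math.ceil(z**2 * p * (1 - p) / bound**2)

# p thought to be about 0.9
n_a = sample_size_proportion(p=0.9, bound=0.05, z=1.96)
# No information about p: use the worst case p = 1/2
n_b = sample_size_proportion(p=0.5, bound=0.05, z=1.96)

print(n_a, n_b, round(n_b / n_a, 2))
```

Rounding up with `math.ceil` mirrors the step from \(138.3\) to \(139\) and from \(384.16\) to \(385\).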

Problem 8.60 (4 points)

  1. Assuming \(n_1 = n_2 = 1500\), and \(\hat{p}_1 \doteq \hat{p}_2 \doteq 0.75\), we have \[\begin{align*} \sigma^2_{\hat{p}_1} &\doteq \sigma^2_{\hat{p}_2} \doteq 0.75 \cdot 0.25 / 1500 \doteq 0.000125 \\ \implies \sigma_{\hat{p}_1 - \hat{p}_2} &\doteq \sqrt{\sigma^2_{\hat{p}_1} + \sigma^2_{\hat{p}_2}} \doteq \sqrt{0.00025} \doteq 0.0158. \end{align*}\] Thus, a two standard error bound on the error of estimation is approximately \(3\%\).
  2. We want \(P(\,| (\hat{p}_1 - \hat{p}_2) - (p_1 - p_2) | < 0.02\,) \doteq 0.9\). Using the normal approximation, this means we want to find an \(n\) so that \[ \frac{0.02}{\sigma_{\hat{p}_1 - \hat{p}_2}} \doteq z_{0.05} \doteq 1.645. \] Since \(\sigma^2_{\hat{p}_1 - \hat{p}_2} \doteq 2\cdot 0.75 \cdot 0.25 / n\), we want \[ n \ge \frac{2\cdot 0.75 \cdot 0.25\cdot(1.645)^2}{(0.02)^2} \doteq 2536.9. \] We may take \(n_1 = n_2 = 2537\).
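
The two parts of this problem can be reproduced with a few lines of arithmetic. This is a sketch under the same assumptions as the solution (normal approximation, \(p_1 \doteq p_2 \doteq 0.75\), equal sample sizes); the function names are hypothetical and the critical value \(1.645\) is the table value used above.

```python
import math

def se_diff_proportions(p1, p2, n1, n2):
    """Standard error of p1_hat - p2_hat for independent binomial samples."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

def equal_sample_size_diff_proportions(p1, p2, bound, z):
    """Smallest common n with z * sqrt((p1*q1 + p2*q2) / n) <= bound."""
    return math.ceil(z**2 * (p1 * (1 - p1) + p2 * (1 - p2)) / bound**2)

# Part 1: two-standard-error bound with n1 = n2 = 1500
two_se = 2 * se_diff_proportions(0.75, 0.75, 1500, 1500)

# Part 2: common sample size for a bound of 0.02 with probability 0.90
n = equal_sample_size_diff_proportions(0.75, 0.75, bound=0.02, z=1.645)

print(round(two_se, 4), n)
```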

Problem 8.62 (2 points)

Given \(\sigma \doteq 0.5\), then \(\sigma_{\overline{Y}} \doteq 0.5/\sqrt{n}\). We want \(P(\,|\overline{Y} - \mu| < 0.1\,) \doteq 0.95\). Using the normal approximation, this means we want to find an \(n\) so that \[\begin{align*} \frac{0.1}{\sigma_{\overline{Y}}} &\doteq z_{0.025} \doteq 1.96 \\ \implies n &\ge \left(\frac{0.5\cdot 1.96}{0.1}\right)^2 \doteq 96.04. \end{align*}\]

We may take \(n = 97\). Water specimens should be selected randomly and not from the same rainfall, so that all observations will be independent.
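
As a numerical check, the critical value need not come from a table. The sketch below uses `statistics.NormalDist` from the Python standard library to compute \(z_{0.025}\) and then solves for \(n\); the function name `sample_size_mean` is an illustrative assumption, not part of the original solution.

```python
import math
from statistics import NormalDist

def sample_size_mean(sigma, bound, confidence):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= bound."""
    alpha = 1 - confidence
    # Upper alpha/2 critical value of the standard normal distribution
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z, math.ceil((z * sigma / bound) ** 2)

z, n = sample_size_mean(sigma=0.5, bound=0.1, confidence=0.95)
print(round(z, 3), n)
```

The computed critical value agrees with the table value \(1.96\) to three decimal places, so the resulting sample size is the same.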

Problem 8.64 (2 points)

We are given estimates for \(\sigma_1\) and \(\sigma_2\); \(\sigma_1 \doteq s_1 \doteq 24.3\, \mu g\) and \(\sigma_2 \doteq s_2 \doteq 17.6\, \mu g\). If we assume \(n_1 = n_2 = n\), then an estimate for the standard error is \[ \sigma_{\overline{Y}_1 - \overline{Y}_2} \doteq \frac{\sqrt{(24.3)^2 + (17.6)^2}}{\sqrt{n}} \doteq \frac{30.004}{\sqrt{n}}. \] We want \(P(\,|(\overline{Y}_1 - \overline{Y}_2) - (\mu_1 - \mu_2)| < 5\,)\doteq 0.9\). Using the normal approximation, this means that we want to find an \(n\) so that \[\begin{align*} \frac{5}{\sigma_{\overline{Y}_1 - \overline{Y}_2}} &\doteq z_{0.05} \doteq 1.645 \\ \implies n &\ge \left[ \frac{(30.004)(1.645)}{5}\right]^2 \doteq 97.4. \end{align*}\]

We may take \(n_1 = n_2 = 98\).
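
The same calculation for two independent samples of a common size can be sketched as follows. The function name is hypothetical, the sample standard deviations stand in for \(\sigma_1\) and \(\sigma_2\) as in the solution, and \(1.645\) is the table value for \(z_{0.05}\).

```python
import math

def equal_sample_size_diff_means(s1, s2, bound, z):
    """Smallest common n with z * sqrt((s1^2 + s2^2) / n) <= bound."""
    return math.ceil(z**2 * (s1**2 + s2**2) / bound**2)

# Combined standard deviation that appears in the numerator of the standard error
combined_sd = math.sqrt(24.3**2 + 17.6**2)

n = equal_sample_size_diff_means(s1=24.3, s2=17.6, bound=5, z=1.645)
print(round(combined_sd, 3), n)
```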

Problem 8.70 (6 points)

\[\begin{array}{lcc}
n = 20 & \text{Verbal} & \text{Math} \\ \hline
\bar{y} & 419 & 455 \\
s & 57 & 69
\end{array}\]
  1. \(1-\alpha = 0.9\), \(df = \nu = 19\), \(t_{\alpha/2} = t_{0.05} \doteq 1.729\). For the verbal scores, \(\bar{y} \pm t_{\alpha/2}\cdot s/\sqrt{n} \doteq 419 \pm 1.729\cdot 57/\sqrt{20} \doteq 419 \pm 22.04\). Thus, a \(90\%\) confidence interval for the mean verbal score is \[ (396.96 , 441.04). \]
  2. Yes, the interval in (a) includes the true mean of 422. The interval in part (a) cannot, however, distinguish the true mean from any other point in the interval.
  3. For the math scores, \(\bar{y} \pm t_{\alpha/2}\cdot s/\sqrt{n} \doteq 455 \pm 1.729\cdot 69/\sqrt{20} \doteq 455 \pm 26.68\). Thus, a \(90\%\) confidence interval for the mean math score is \[ (428.32 , 481.68), \] which does contain the true mean of 474, but cannot distinguish it from other scores in the interval.
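
The two intervals in this problem can be recomputed from the summary statistics. This is a minimal sketch; the function name `t_interval` is an assumption, and the critical value \(t_{0.05} \doteq 1.729\) for \(19\) degrees of freedom is taken from the table rather than computed.

```python
import math

def t_interval(mean, s, n, t_crit):
    """Small-sample t interval: mean +/- t_crit * s / sqrt(n)."""
    half_width = t_crit * s / math.sqrt(n)
    return mean - half_width, mean + half_width

T_05_19DF = 1.729  # table value of t_{0.05} with 19 degrees of freedom

verbal_ci = t_interval(mean=419, s=57, n=20, t_crit=T_05_19DF)
math_ci = t_interval(mean=455, s=69, n=20, t_crit=T_05_19DF)

# Check whether the stated true means fall inside the intervals
verbal_contains_422 = verbal_ci[0] < 422 < verbal_ci[1]
math_contains_474 = math_ci[0] < 474 < math_ci[1]

print(verbal_ci, verbal_contains_422)
print(math_ci, math_contains_474)
```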

Problem 8.72 (2 points)

Data: \(3.85, 3.88, 3.90, 3.62, 3.72, 3.80, 3.85, 3.36, 4.01, 3.82\).

\(n = 10\), \(\bar{y} = 3.781\), \(s \doteq 0.18095\), \(df = \nu = 9\)

\(1-\alpha = 0.95\), \(t_{\alpha/2} = t_{0.025} \doteq 2.262\)

\(\bar{y} \pm t_{\alpha/2}\cdot s/\sqrt{n} \doteq 3.781 \pm 2.262\cdot 0.18095 / \sqrt{10} \doteq 3.781 \pm 0.129\).

A \(95\%\) confidence interval is \((3.652 , 3.910)\).

Problem 8.74 (2 points)

Data: \(16, 5, 21, 19, 10, 5, 8, 2, 7, 2, 4, 9\).

\(n = 12\), \(\bar{y} = 9\), \(s \doteq 6.4244\), \(df = \nu = 11\)

\(1-\alpha = 0.90\), \(t_{\alpha/2} = t_{0.05} \doteq 1.796\)

\(\bar{y} \pm t_{\alpha/2}\cdot s / \sqrt{n} \doteq 9 \pm 1.796 \cdot 6.4244 / \sqrt{12} \doteq 9 \pm 3.33\).

A \(90\%\) confidence interval is \((5.67 , 12.33)\).
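
The sample mean, sample standard deviation, and interval for Problems 8.72 and 8.74 can be recomputed directly from the raw data. The sketch below uses the standard-library `statistics` module; the helper name `t_interval_from_data` is hypothetical, and the critical values are the table values quoted in the solutions.

```python
import math
import statistics

def t_interval_from_data(data, t_crit):
    """Return (mean, s, lower, upper) for a small-sample t interval."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation, divisor n - 1
    half_width = t_crit * s / math.sqrt(n)
    return mean, s, mean - half_width, mean + half_width

data_872 = [3.85, 3.88, 3.90, 3.62, 3.72, 3.80, 3.85, 3.36, 4.01, 3.82]
data_874 = [16, 5, 21, 19, 10, 5, 8, 2, 7, 2, 4, 9]

# t_{0.025} with 9 df and t_{0.05} with 11 df, from the t table
result_872 = t_interval_from_data(data_872, t_crit=2.262)
result_874 = t_interval_from_data(data_874, t_crit=1.796)

print(result_872)
print(result_874)
```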

Problem 8.76 (8 points)

\[\begin{array}{lcc}
n_1 = n_2 = 15 & \text{Verbal} & \text{Math} \\ \hline
\text{Engineering} & \bar{y}_1 = 446,\ s_1 = 42 & \bar{y}_1 = 548,\ s_1 = 57 \\
\text{Language / Literature} & \bar{y}_2 = 534,\ s_2 = 45 & \bar{y}_2 = 517,\ s_2 = 52
\end{array}\]
  1. \(df = \nu = n_1 + n_2 - 2 = 28\), \(1-\alpha = 0.95\), \(t_{\alpha/2} = t_{0.025} \doteq 2.048\)

    \(s^2_p = [(n_1 - 1)s^2_1 + (n_2 -1)s^2_2]/(n_1 + n_2 - 2) \doteq [14\cdot 42^2 + 14\cdot 45^2]/28 \doteq 1894.5\). \(s_p \doteq 43.526\).

    \((\bar{y}_1 - \bar{y}_2) \pm t_{\alpha/2}\cdot s_p\cdot \sqrt{1/n_1 + 1/n_2} \doteq -88 \pm 32.55\).

    A \(95\%\) confidence interval for the difference in verbal means is \((-120.55, -55.45)\).

  2. \(s^2_p \doteq [14\cdot 57^2 + 14\cdot 52^2]/28 \doteq 2976.5\). \(s_p \doteq 54.56\).

    \((\bar{y}_1 - \bar{y}_2) \pm t_{\alpha/2}\cdot s_p\cdot \sqrt{1/n_1 + 1/n_2} \doteq 31 \pm 40.80\).

    A \(95\%\) confidence interval for the difference in mean math scores is \((-9.80, 71.80)\).

  3. Verbal SAT scores appear to be different for engineering and literature majors, since the \(95\%\) confidence interval does not contain \(0\).

    The interval for the mean math scores contains \(0\), so we cannot distinguish the observed difference from no difference in means.

  4. For the Verbal Category, we assume the engineering scores are well approximated by a normal distribution, and the language / literature scores are well approximated by a second normal distribution that has the same variance as the first normal distribution. We further assume that the observed scores are selected randomly and independently. The same assumptions apply to the Math category.
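
The pooled-variance intervals for the verbal and math scores can be checked with the sketch below. It assumes equal population variances, as stated in the last part of the solution; the function name `pooled_t_interval` is illustrative, and \(t_{0.025} \doteq 2.048\) for \(28\) degrees of freedom is the table value.

```python
import math

def pooled_t_interval(mean1, s1, n1, mean2, s2, n2, t_crit):
    """Two-sample t interval for mu1 - mu2 using the pooled variance."""
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    half_width = t_crit * math.sqrt(pooled_var) * math.sqrt(1 / n1 + 1 / n2)
    diff = mean1 - mean2
    return pooled_var, diff - half_width, diff + half_width

T_025_28DF = 2.048  # table value of t_{0.025} with 28 degrees of freedom

verbal = pooled_t_interval(446, 42, 15, 534, 45, 15, T_025_28DF)
math_scores = pooled_t_interval(548, 57, 15, 517, 52, 15, T_025_28DF)

# An interval that excludes 0 suggests the two means differ
verbal_excludes_zero = not (verbal[1] <= 0 <= verbal[2])
math_excludes_zero = not (math_scores[1] <= 0 <= math_scores[2])

print(verbal, verbal_excludes_zero)
print(math_scores, math_excludes_zero)
```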

Problem 8.80 (2 points)

Given: independent samples of sizes \(n_1\) and \(n_2\) from two normal populations with equal variances. In section \(8.8\) we find that \[ T = \frac {(\overline{Y}_1 - \overline{Y}_2) - (\mu_1 - \mu_2)} {S_p\cdot \sqrt{1/n_1 + 1/n_2}} \] is a \(t\)-distributed random variable with \(\nu = n_1 + n_2 - 2\) degrees of freedom. Now, \(P(-t_{\alpha} \le T) = 1 - \alpha\) implies that \(P(\mu_1 - \mu_2 \le \overline{Y}_1 - \overline{Y}_2 + t_{\alpha}\cdot S_p\cdot \sqrt{1/n_1 + 1/n_2}) = 1 - \alpha\). Therefore, a \((1 - \alpha)\cdot 100\%\) upper confidence limit for \(\mu_1 - \mu_2\) is \[ \overline{Y}_1 - \overline{Y}_2 + t_{\alpha}\cdot S_p\cdot \sqrt{1/n_1 + 1/n_2}. \]

Problem 8.82 (2 points)

Data: \(78, 66, 65, 63, 60, 60, 58, 56, 52, 50\)

\(n = 10\), \(s \doteq 7.97\), \(df = \nu = 9\), \(1-\alpha = 0.90\)

\(\chi^2_{1 - \alpha/2} = \chi^2_{0.95} \doteq 3.32511\), \(\chi^2_{\alpha/2} = \chi^2_{0.05} \doteq 16.9190\)

A \(90\%\) confidence interval for \(\sigma^2\) is \[ \left( \frac{9\cdot 7.97^2}{16.9190} , \frac{9\cdot 7.97^2}{3.32511} \right) \doteq (33.79 , 171.93). \]
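
The interval for \(\sigma^2\) can be recomputed from the raw data as a check. In this sketch the function and parameter names are assumptions, and the chi-square critical values for \(9\) degrees of freedom are the table values quoted above. Because the code uses the unrounded sample variance instead of \(7.97^2\), its endpoints differ slightly from the hand calculation in the second decimal place.

```python
import statistics

def variance_interval(data, chi2_upper, chi2_lower):
    """Confidence interval for sigma^2 from a normal sample.

    chi2_upper is the critical value chi^2_{alpha/2} (the larger one),
    chi2_lower is chi^2_{1 - alpha/2} (the smaller one).
    """
    df = len(data) - 1
    s2 = statistics.variance(data)  # sample variance, divisor n - 1
    return s2, df * s2 / chi2_upper, df * s2 / chi2_lower

data = [78, 66, 65, 63, 60, 60, 58, 56, 52, 50]
s2, lower, upper = variance_interval(data, chi2_upper=16.9190, chi2_lower=3.32511)

print(round(s2 ** 0.5, 2), round(lower, 2), round(upper, 2))
```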

Problem 8.84 (2 points)

A \((1 - \alpha)\cdot 100\%\) confidence interval for \(\sigma\) is \[ \left( \sqrt{ \frac{(n - 1)\cdot S^2}{\chi^2_{\alpha/2}} } , \sqrt{ \frac{(n - 1)\cdot S^2}{\chi^2_{1 - \alpha/2}} } \right). \]

Problem 8.86 (2 points)

Using the \(n = 20\) data points in this problem, we calculate \(s = 186.69\). With \(1 - \alpha = 0.99\), \(df = \nu = 19\), and \(\chi^2_{1 - \alpha} = \chi^2_{0.99} \doteq 7.63273\), a \(99\%\) upper confidence limit for \(\sigma\) is \[ \sigma < 186.69\cdot \sqrt{ \frac{19}{7.63273} } \doteq 294.55 \text{ hours.} \] Every value of \(\sigma\) below this limit, including the value \(150\), is consistent with the data at this confidence level, so this bound cannot distinguish among them.

Problem 8.88 (2 points)

Data: \(39, 54, 61, 72, 59\)

\(n = 5\), \(df = \nu = 4\), \(1 - \alpha = 0.99\), \(s \doteq 12.02\)

\(\chi^2_{1 - \alpha/2} = \chi^2_{0.995} \doteq 0.206990\), \(\chi^2_{\alpha/2} = \chi^2_{0.005} \doteq 14.8602\)

A \(99\%\) confidence interval for \(\sigma\) is \[ \left( 12.02\cdot \sqrt{ \frac{4}{14.8602} } , 12.02\cdot \sqrt{ \frac{4}{0.206990} } \right) \doteq (6.24 , 52.84). \]
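
Taking square roots of the endpoints for \(\sigma^2\) gives the interval for \(\sigma\), and the one-sided limit of Problem 8.86 follows the same pattern. The sketch below checks both; the function names are hypothetical, the chi-square critical values are the table values quoted in the solutions, and \(s = 186.69\) for Problem 8.86 is taken from the text because its raw data are not reproduced here.

```python
import math
import statistics

def sigma_interval(s, n, chi2_upper, chi2_lower):
    """Two-sided interval for sigma: s * sqrt(df / chi2) at each critical value.

    chi2_upper is chi^2_{alpha/2}; chi2_lower is chi^2_{1 - alpha/2}.
    """
    df = n - 1
    return s * math.sqrt(df / chi2_upper), s * math.sqrt(df / chi2_lower)

def sigma_upper_limit(s, n, chi2_lower):
    """One-sided upper limit for sigma, with chi2_lower = chi^2_{1 - alpha}."""
    return s * math.sqrt((n - 1) / chi2_lower)

# Problem 8.88: 99% interval from the raw data, 4 degrees of freedom
data_888 = [39, 54, 61, 72, 59]
s_888 = statistics.stdev(data_888)
ci_888 = sigma_interval(s_888, len(data_888), chi2_upper=14.8602, chi2_lower=0.206990)

# Problem 8.86: 99% upper limit from the reported s, 19 degrees of freedom
limit_886 = sigma_upper_limit(186.69, 20, chi2_lower=7.63273)

print(round(s_888, 2), ci_888, round(limit_886, 2))
```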

