Section 3E

How numbers can deceive ... exercises

...

Basic skills

Exercise 12 p.179 ... Jeter and Justice

| Player | 1995 | 1996 |
| --- | --- | --- |
| Jeter | 12 H, 48 AB | 183 H, 582 AB |
| Justice | 104 H, 411 AB | 45 H, 140 AB |

Notation: hits (H), at-bats (AB), batting average (AVG = H/AB). Which player had the higher AVG in 1995, in 1996, and over the two years combined?


Justice had the higher AVG both in 1995 (104/411 ≈ 0.253 vs 12/48 = 0.250) and in 1996 (45/140 ≈ 0.321 vs 183/582 ≈ 0.314).

Over the period of two years:

Jeter: 12+183=195 H, 48+582=630 AB, AVG = 195/630 ≈ 0.310

Justice: 104+45=149 H, 411+140=551 AB, AVG = 149/551 ≈ 0.270

Jeter has the higher AVG over the two years (0.310 vs 0.270).

Explanation: Jeter had very few at-bats (48) in 1995, so that season barely pulls his combined AVG away from his high 1996 AVG (0.314). Justice, in contrast, had most of his at-bats (411 of 551) in 1995, when his AVG was low.
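The comparison above can be checked with a short Python sketch. The numbers are copied from the table; the names `stats`, `avg`, and `combined_avg` are illustrative choices, not part of the exercise.

```python
# (hits, at-bats) per season, copied from the exercise table
stats = {
    "Jeter":   {1995: (12, 48),   1996: (183, 582)},
    "Justice": {1995: (104, 411), 1996: (45, 140)},
}

def avg(hits, at_bats):
    """Batting average AVG = H / AB."""
    return hits / at_bats

def combined_avg(seasons):
    """AVG over all seasons: total hits divided by total at-bats."""
    total_hits = sum(h for h, _ in seasons.values())
    total_at_bats = sum(ab for _, ab in seasons.values())
    return total_hits / total_at_bats

for player, seasons in stats.items():
    per_year = {year: round(avg(h, ab), 3) for year, (h, ab) in seasons.items()}
    print(player, per_year, "combined:", round(combined_avg(seasons), 3))
```

Note that the combined AVG is not the mean of the two yearly AVGs: each season is weighted by its number of at-bats, which is what lets the ranking flip.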

Test scores. Exercise 14 p.180.

Math SAT scores of high school students in 1988 and 1998

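| Grade Average | % Students 1988 | % Students 1998 | SAT Score 1988 | SAT Score 1998 | Change |
| --- | --- | --- | --- | --- | --- |
| A+ | 4 | 7 | 632 | 629 | −3 |
| A | 11 | 15 | 586 | 582 | −4 |
| A− | 13 | 16 | 556 | 554 | −2 |
| B | 53 | 48 | 490 | 487 | −3 |
| C | 19 | 14 | 431 | 428 | −3 |
| Overall average | | | 504 | 514 | +10 |
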
Observations (Exercise 14c, p. 180)

While within every grade category the average dropped, the overall average has increased from 504 to 514 points.

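That is because the share of students with higher grades is larger in 1998 than in 1988: the A+, A, and A− categories together make up 38% of students in 1998, compared with 28% in 1988.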
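As a rough check, the overall averages can be recomputed as weighted averages of the category scores, using the percentages of students as weights. This is a sketch; the helper name `overall` and the tuple layout of `rows` are illustrative assumptions.

```python
# (grade, % students 1988, % students 1998, SAT score 1988, SAT score 1998)
rows = [
    ("A+", 4, 7, 632, 629),
    ("A", 11, 15, 586, 582),
    ("A-", 13, 16, 556, 554),
    ("B", 53, 48, 490, 487),
    ("C", 19, 14, 431, 428),
]

def overall(weights, scores):
    """Weighted average of the scores, with the given weights."""
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

overall_1988 = overall([r[1] for r in rows], [r[3] for r in rows])
overall_1998 = overall([r[2] for r in rows], [r[4] for r in rows])

# Change of the average score within each grade category
changes = {r[0]: r[4] - r[3] for r in rows}

print(round(overall_1988), round(overall_1998), changes)
```

Every entry of `changes` is negative, while the weighted overall average rises, because the weights shift toward the higher-scoring categories.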

Weight training. Exercise 16 p.180.
Two cross-country running teams try weight training

| Team | Time Improvement with Weight Training | Time Improvement without Weight Training | Team Average Time Improvement |
| --- | --- | --- | --- |
| Gazelles | 10 s | 2 s | 6.0 s |
| Cheetahs | 9 s | 1 s | 6.2 s |

The Gazelles improved more than the Cheetahs both with and without supplementary weight training, yet it is the Cheetahs who showed the better overall team improvement.


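Explanation: a larger share of the Cheetahs trained with weights (improving by 9 s) than of the Gazelles (improving by 10 s), so the Cheetahs' team average is pulled closer to the larger improvement.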

Here is a specific example which yields this outcome.

Out of 20 Gazelles, 10 improved by 10 s and 10 by only 2 s,

with an average improvement of $$\frac{10 \times 10+10 \times 2}{20} = \frac{120}{20}=6$$ s.

Out of 20 Cheetahs, 13 improved by 9 s and 7 by only 1 s,

with an average improvement of $$\frac{13 \times 9+7 \times 1}{20} = \frac{124}{20}=6.2$$ s.
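The same example can be written as a small Python sketch. The group sizes (10 and 10, 13 and 7) are the hypothetical ones chosen above, not data from the exercise, and `team_average` is an illustrative helper name.

```python
def team_average(groups):
    """Average improvement for a team.

    groups is a list of (number_of_runners, improvement_in_seconds) pairs.
    """
    total_runners = sum(n for n, _ in groups)
    return sum(n * seconds for n, seconds in groups) / total_runners

# Hypothetical split of each 20-runner team: (runners, improvement in s)
gazelles = [(10, 10), (10, 2)]   # with weights, without weights
cheetahs = [(13, 9), (7, 1)]     # with weights, without weights

print(team_average(gazelles), team_average(cheetahs))
```

Changing the split (for example, 10 and 10 for the Cheetahs as well) makes the reversal disappear, which shows that it comes from the unequal group sizes.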

Better drug. Exercise 22 p.181
Two drugs, A and B, were tested on a total of 2000 patients, 1000 men and 1000 women.

| | Women | Men |
| --- | --- | --- |
| Drug A | 5 of 100 cured | 400 of 800 cured |
| Drug B | 101 of 900 cured | 196 of 200 cured |

Exercise 22 p.181 a) Support the claim that B is more effective than A

Drug A cured 5 out of 100 women, which is 5%, and 400 out of 800 men, which is 50%,

while drug B cured 101 out of 900 women, which is about 11.2%, and 196 out of 200 men, which is 98%.

Thus B is more effective for both men and women.

Exercise 22 p.181 b) Support the claim that A is more effective than B

Drug A cured 5+400=405 out of 900 patients, which is 45%,

while drug B cured 101+196=297 out of 1100 patients, which is 27%.

Thus A looks more effective when all patients are pooled together.

Exercise 22 p.181 c) Which claim makes more sense? Why?

Drug B is better.

Drug B is more effective for both men and women.

Note that both drugs are much more effective for men than for women.

Because drug B was tested mostly on women (900 of its 1100 patients), its overall cure rate looks lower than that of drug A.
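Both claims can be reproduced from the same table with a short Python sketch: one function computes the cure rate within a group, the other pools the groups. The names `trials`, `rate`, and `pooled_rate` are illustrative assumptions.

```python
# (cured, treated) for each drug and group, copied from the table
trials = {
    "A": {"women": (5, 100),   "men": (400, 800)},
    "B": {"women": (101, 900), "men": (196, 200)},
}

def rate(cured, treated):
    """Cure rate within one group."""
    return cured / treated

def pooled_rate(groups):
    """Cure rate over all groups combined."""
    cured = sum(c for c, _ in groups.values())
    treated = sum(t for _, t in groups.values())
    return cured / treated

for drug, groups in trials.items():
    by_group = {g: round(rate(c, t), 3) for g, (c, t) in groups.items()}
    print(drug, by_group, "pooled:", round(pooled_rate(groups), 3))
```

Within each group drug B has the higher rate, while the pooled rate favors drug A, since most of A's patients are men and most of B's patients are women.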

Exercise 17 p.180 Mammograms with 90% accuracy
Assuming that 1% of tumors are malignant

| | Tumor Is Malignant | Tumor Is Benign | Total |
| --- | --- | --- | --- |
| Positive Mammogram | 90 true positives | 990 false positives | 1080 |
| Negative Mammogram | 10 false negatives | 8910 true negatives | 8920 |
| Total | 100 | 9900 | 10,000 |

a) Verify that the numbers in the table are correct.
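One way to check part a) is to rebuild the table from the stated assumptions. The sketch below assumes 10,000 tumors in total (the table's grand total), that 1% of them are malignant, and that the 90% accuracy applies to malignant and benign tumors alike; the variable names are illustrative.

```python
total = 10_000       # grand total taken from the table
prevalence = 0.01    # assumed share of malignant tumors
accuracy = 0.90      # assumed to hold for both malignant and benign tumors

malignant = round(total * prevalence)
benign = total - malignant

true_positives = round(malignant * accuracy)   # malignant, positive mammogram
false_negatives = malignant - true_positives   # malignant, negative mammogram
true_negatives = round(benign * accuracy)      # benign, negative mammogram
false_positives = benign - true_negatives      # benign, positive mammogram

positive_total = true_positives + false_positives
negative_total = false_negatives + true_negatives

print(true_positives, false_positives, positive_total)
print(false_negatives, true_negatives, negative_total)
```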
Exercise 17b p.180 Mammograms with 90% accuracy

Suppose that a patient has a positive mammogram. What is the chance that she really has cancer?

Solution. There are 1080 women with positive mammograms. Out of them, there are 90 with cancer.

The chance is $$\frac{90}{1080}\approx 0.083$$, that is, about 8.3%.

Exercise 17c p.180 Mammograms with 90% accuracy
c) What is the chance of positive mammogram given a patient has cancer?

Out of 100 women with cancer, 90 got a positive mammogram.

The chance in question is 90%.

This is simply the 90% accuracy that was assumed for the mammogram.

Exercise 17d p.180 Mammograms with 90% accuracy
d) What is the chance for a patient with negative mammogram to have cancer?

Out of 8920 women with negative mammogram, 10 actually have cancer.

The chance in question is $$\frac{10}{8920}\approx 0.0011$$, which is about 0.11%.
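The three answers to parts b), c), and d) are all ratios of one table cell to a row or column total. A minimal Python sketch, with counts copied from the table and illustrative names, makes the difference between them explicit.

```python
# Counts copied from the mammogram table
counts = {
    ("positive", "malignant"): 90,
    ("positive", "benign"): 990,
    ("negative", "malignant"): 10,
    ("negative", "benign"): 8910,
}

def row_total(result):
    """Number of patients with the given mammogram result."""
    return sum(n for (r, _), n in counts.items() if r == result)

def column_total(tumor):
    """Number of patients with the given tumor type."""
    return sum(n for (_, t), n in counts.items() if t == tumor)

# b) chance of cancer given a positive mammogram
cancer_given_positive = counts[("positive", "malignant")] / row_total("positive")
# c) chance of a positive mammogram given cancer
positive_given_cancer = counts[("positive", "malignant")] / column_total("malignant")
# d) chance of cancer given a negative mammogram
cancer_given_negative = counts[("negative", "malignant")] / row_total("negative")

print(round(cancer_given_positive, 3),
      round(positive_given_cancer, 3),
      round(cancer_given_negative, 4))
```

Parts b) and c) use the same numerator, 90, but different denominators (1080 versus 100), which is why the two chances differ so much.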