# Basic Applied Statistics 200 Solutions to Practice Midterm 1

1.
1. (ii) Crunchy (stemplot is centered around higher values)
2. (ii) Crunchy is somewhat more variable (stemplot has more spread)
3. (ii) Crunchy (stemplot has longer right tail, whereas Creamy has a slightly longer left tail, when configured on an axis with lower values to the left, higher values to the right)
4. (i) Creamy (because it would appear far from the main clump of values; in fact, it is an official outlier according to the 1.5*IQR rule)
5. 1st, 5th, average of 9th and 10th, 14th, 18th: 34,42,51,62,80
6. (i) Creamy (min is 22, max is 68, etc.)
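The positions quoted in part 5 come from splitting the 18 sorted values into a lower and an upper half. The sketch below reproduces them, assuming the "median of each half" convention for quartiles, and also computes the 1.5*IQR fences for the summary 34, 42, 51, 62, 80. The function names are hypothetical, chosen only for illustration.

```python
def five_number_positions(n):
    """Return the 1-based positions of the five-number summary for n sorted
    values, taking each quartile as the median of the lower or upper half.
    A pair of positions means the two values are averaged."""
    def median_pos(lo, hi):
        # Positions lo..hi inclusive: one middle position if the count is
        # odd, otherwise the two middle positions.
        count = hi - lo + 1
        mid = lo + (count - 1) // 2
        return (mid,) if count % 2 == 1 else (mid, mid + 1)

    half = n // 2  # size of each half (middle value excluded when n is odd)
    return {
        "min": (1,),
        "Q1": median_pos(1, half),
        "median": median_pos(1, n),
        "Q3": median_pos(n - half + 1, n),
        "max": (n,),
    }


def iqr_fences(q1, q3):
    """Return the (lower, upper) outlier fences under the 1.5*IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


positions = five_number_positions(18)
fences = iqr_fences(42, 62)
print(positions)  # Q1 at position 5, median at 9 and 10, Q3 at position 14
print(fences)     # (12.0, 92.0)
```

With Q1 = 42 and Q3 = 62 the IQR is 20, so any value below 12 or above 92 would be flagged; both 34 and 80 fall inside the fences.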
2.
1. .9505 (look up proportion LESS than +1.65)
2. approximately 1
3. .0160 (subtract proportion for -1.4 from proportion for -1.3; proportions CANNOT be negative, so I took off 3 pts. for an answer of -.016)
4. Look up a proportion of .8000 below, and find z=.84
5. Look up a proportion of .1000 and find z=-1.28
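The table lookups above can be checked in software. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative, and the results differ from the printed table only by rounding.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Proportion below z = +1.65
below_165 = Z.cdf(1.65)

# Proportion between z = -1.4 and z = -1.3: larger area minus smaller area,
# so the result cannot be negative
between = Z.cdf(-1.3) - Z.cdf(-1.4)

# z-values with a given proportion below them (reverse lookup)
z_80 = Z.inv_cdf(0.80)
z_10 = Z.inv_cdf(0.10)

print(f"{below_165:.4f} {between:.4f} {z_80:.2f} {z_10:.2f}")
```

Subtracting in the order "proportion below the larger z minus proportion below the smaller z" is what keeps the answer to part 3 positive.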
1. Half are below the mean, 3.0
2. Take mean plus or minus 3 standard deviations: between 2.7 and 3.3
3. x>3.14 means z>+1.4; by symmetry, the proportion above +1.4 equals the proportion below -1.4: .0808
4. .09 below has z=-1.34, so x=3-1.34(.1)=2.866
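The same steps (standardize, then look up) can be sketched in Python for a normal distribution with mean 3.0 and standard deviation 0.1, the values used in the answers above. The variable names are assumptions for illustration.

```python
from statistics import NormalDist

X = NormalDist(mu=3.0, sigma=0.1)

# Range covering nearly all values: mean plus or minus 3 standard deviations
low, high = X.mean - 3 * X.stdev, X.mean + 3 * X.stdev

# Standardize x = 3.14, then find the proportion above it
z = (3.14 - X.mean) / X.stdev
prop_above = 1 - X.cdf(3.14)

# Value with a proportion of .09 below it: x = mean + z * sd
cutoff = X.inv_cdf(0.09)

print(f"z = {z:.2f}, proportion above = {prop_above:.4f}, cutoff = {cutoff:.3f}")
```

The z-score for 3.14 is positive because 3.14 lies above the mean, and the cutoff is below the mean because less than half of the values lie below it.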
1. gender: it could affect use of contraceptives, whereas use of contraceptives of course cannot affect gender
2. 510/2000=.255
3. 400/2000=.20
4. (i) females (400/800 as opposed to 530/1200)
5. 210/560=.375
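The proportions in this problem are ratios of counts from the two-way table. The sketch below uses only the counts quoted in the answers; the variable names are illustrative, and reading 800 and 1200 as the female and male totals is an assumption based on part 4.

```python
from fractions import Fraction

# Counts quoted in the answers above; names are illustrative assumptions.
GRAND_TOTAL = 2000

part2 = Fraction(510, GRAND_TOTAL)   # proportion out of everyone surveyed
part3 = Fraction(400, GRAND_TOTAL)   # proportion out of everyone surveyed
female_rate = Fraction(400, 800)     # conditional proportion among females
male_rate = Fraction(530, 1200)      # conditional proportion among males
part5 = Fraction(210, 560)           # conditional proportion within one group

for label, value in [
    ("part 2", part2),
    ("part 3", part3),
    ("females", female_rate),
    ("males", male_rate),
    ("part 5", part5),
]:
    print(f"{label}: {float(value):.3f}")
```

Comparing the groups requires dividing by each group's own total (800 and 1200), not by the grand total of 2000.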
3.
1. number of persons
2. (i) increase
3. (ii) moderate
4. (v) .63 (this could also be found by taking the square root of R-Sq=.396)
5. (ii) stay the same; r is independent of units of measurement
6. 4.55+0.996(1)=5.546
7. 6.83-5.546=1.284
8. (ii) Take the number of people and add 4.55 (the slope, 0.996, is approximately 1)
9. 2 (Number of persons=2 clearly has the most extreme residual, and it's even singled out as an outlier in the output at the bottom of the page. However, I only took off 2 points if you didn't notice the residual for 2, and thought the one for 6 persons was the most extreme.)
10. (ii) 10 people discarding 4 pounds (This is the only one of the three that is way off the regression line, and the fact that its x-value is far from the rest could give it a lot of influence.)
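The arithmetic in parts 4, 6, and 7 can be sketched as follows, using the intercept 4.55, slope 0.996, and R-Sq = .396 quoted above. The function and variable names are hypothetical.

```python
import math

INTERCEPT = 4.55   # from the regression output quoted above
SLOPE = 0.996
R_SQUARED = 0.396


def predict(persons):
    """Predicted response for a given number of persons."""
    return INTERCEPT + SLOPE * persons


predicted = predict(1)

# Residual = observed value minus predicted value
residual = 6.83 - predicted

# r is the square root of R-Sq, with the same sign as the slope
r = math.copysign(math.sqrt(R_SQUARED), SLOPE)

print(f"predicted = {predicted:.3f}, residual = {residual:.3f}, r = {r:.2f}")
```

The square root alone gives only the size of r; the positive slope is what makes the correlation positive here.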
1. (ii) (In this design, a treatment, encouraging mothers to breast-feed, is imposed.)
2. (i) 200 is the sample size, the number of individuals actually studied
3. (ii) infants not participating are the control
4. (ii) income/education is a possible lurking variable, which is tied in with whether or not a baby is breastfed, and could also impact the likelihood of infection. (i) is in fact the explanatory variable, (iii) is the response.
5. (ii) random assignment to treatment or control is essential; I mentioned coin flipping in class as a viable way to make assignments
6. (ii) The fact that mothers must know they are being encouraged to breast-feed rules out the possibility of a double-blind study. It would certainly be possible to make it blind on the part of doctors: they wouldn't have to know if a baby has been breast-fed when they are diagnosing for infection.
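Random assignment by coin flipping, as mentioned in part 5, might look like the sketch below. The subject IDs, the seed, and the function name are assumptions made for illustration; only the sample size of 200 comes from the answers above.

```python
import random


def assign_by_coin_flip(subject_ids, seed=None):
    """Assign each subject to treatment or control with a fair coin flip."""
    rng = random.Random(seed)  # a fixed seed makes the assignment reproducible
    treatment, control = [], []
    for sid in subject_ids:
        # "Heads" (probability 0.5) goes to treatment, "tails" to control
        if rng.random() < 0.5:
            treatment.append(sid)
        else:
            control.append(sid)
    return treatment, control


# Hypothetical subject IDs 1..200
treatment, control = assign_by_coin_flip(range(1, 201), seed=200)
print(len(treatment), len(control))
```

Independent coin flips usually produce groups of slightly different sizes. Shuffling the list and splitting it in half is a common alternative when equal group sizes are wanted, though that is a different technique from the coin flipping described above.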
1. (i) bar graph (2 categorical variables)
2. (i) compare percentages
1. (ii) histogram (1 quantitative variable)
2. (ii) mean and standard deviation, because the distribution should be fairly normal
1. (iv) scatterplot (2 quantitative variables)
2. (iv) report the correlation