# Homework 5 Solutions

By Patricia Sullivan, 2014-05-07 13:47

303 Spring 2003

Steve Fienberg

Tuesday, March 11

Part III

a) Find the sampling distribution of ȳ (the sample mean) for:

i) A simple random sample of size 3 with replacement.

To do this, all of the possible samples need to be enumerated. With replacement, there are 512 equally likely ordered samples (8 choices for the first draw, 8 for the second and 8 for the third: 8*8*8=512). To create these 512 combinations in Minitab, go to Calc_Make Patterned Data_Arbitrary Set of Numbers.

Store patterned data in C1, arbitrary set of numbers 1,2,4,4,7,7,7,8, list each value 1 time, list whole sequence 64 times.

Store patterned data in C2, arbitrary set of numbers 1,2,4,4,7,7,7,8, list each value 8 times, list whole sequence 8 times.

Store patterned data in C3, arbitrary set of numbers 1,2,4,4,7,7,7,8, list each value 64 times, list whole sequence 1 time.
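To see why these three patterns cover every ordered sample exactly once, here is a minimal Python sketch of the same layout. The helper `patterned` is a hypothetical stand-in for the Minitab dialog, and the unit labels 0 to 7 are an assumption used only so that the two 4s and the three 7s stay distinguishable.

```python
from itertools import product

population = [1, 2, 4, 4, 7, 7, 7, 8]

def patterned(values, each, whole):
    """Hypothetical helper: list each value `each` times, then the whole sequence `whole` times."""
    one_pass = [v for v in values for _ in range(each)]
    return one_pass * whole

# Assumption: work with unit labels 0..7 rather than the values themselves,
# so repeated values (4, 4 and 7, 7, 7) are treated as separate units.
labels = list(range(len(population)))

c1 = patterned(labels, each=1, whole=64)   # cycles fastest
c2 = patterned(labels, each=8, whole=8)    # changes every 8 rows
c3 = patterned(labels, each=64, whole=1)   # changes every 64 rows

rows = list(zip(c1, c2, c3))

# Every one of the 8*8*8 ordered triples of units appears exactly once.
print(len(rows), len(set(rows)))
print(set(rows) == set(product(labels, repeat=3)))
```

The three columns behave like the digits of a base-8 counter, which is why no triple is missed or repeated.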

Next, get the average for each of the possible samples: Calc_Calculator, store result in C4, expression (C1+C2+C3)/3. To determine the frequency of each average, choose Stat_Tables_Tally, select column C4, and check off counts and percents. I got the following distribution.

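| C4 | Count | Percent |
|---|---|---|
| 1.00000 | 1 | 0.20 |
| 1.33333 | 3 | 0.59 |
| 1.66667 | 3 | 0.59 |
| 2.00000 | 7 | 1.37 |
| 2.33333 | 12 | 2.34 |
| 2.66667 | 6 | 1.17 |
| 3.00000 | 21 | 4.10 |
| 3.33333 | 33 | 6.45 |
| 3.66667 | 15 | 2.93 |
| 4.00000 | 47 | 9.18 |
| 4.33333 | 48 | 9.38 |
| 4.66667 | 12 | 2.34 |
| 5.00000 | 63 | 12.30 |
| 5.33333 | 57 | 11.13 |
| 5.66667 | 21 | 4.10 |
| 6.00000 | 57 | 11.13 |
| 6.33333 | 36 | 7.03 |
| 6.66667 | 6 | 1.17 |
| 7.00000 | 27 | 5.27 |
| 7.33333 | 27 | 5.27 |
| 7.66667 | 9 | 1.76 |
| 8.00000 | 1 | 0.20 |

N= 512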
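As a cross-check outside Minitab, the same tally can be sketched in Python. This is only an illustration, not the method used in the solution; the variable names are my own, and exact fractions are used so that equal means are grouped without rounding issues.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [1, 2, 4, 4, 7, 7, 7, 8]

# All 8*8*8 ordered samples of size 3 drawn with replacement.
samples = list(product(population, repeat=3))

# Tally the sample means exactly.
tally_with = Counter(Fraction(sum(s), 3) for s in samples)

# Print mean, count and percent in the same layout as the tally table.
for mean in sorted(tally_with):
    count = tally_with[mean]
    print(f"{float(mean):.5f} {count:3d} {100 * count / len(samples):5.2f}")
print("N=", len(samples))
```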

ii) A simple random sample of size 3 without replacement.

For a simple random sample without replacement, there are 336 equally likely ordered samples (8*7*6=336). The way I created them is a bit contorted, but I think it works. To create these samples, go to Calc_Make Patterned Data_Arbitrary Set of Numbers.

Store data pattern in C6, Arbitrary set of numbers 1 2 4 4 7 7 7 8, list each value 42 times, list whole sequence 1 time

Store data pattern in C11, Arbitrary set of numbers 2 4 4 7 7 7 8, list each value 6 times, list whole sequence 1 time

Store data pattern in C12, Arbitrary set of numbers 1 4 4 7 7 7 8, list each value 6 times, list whole sequence 1 time

Store data pattern in C13, Arbitrary set of numbers 1 2 4 7 7 7 8, list each value 6 times, list whole sequence 2 times

Store data pattern in C14, Arbitrary set of numbers 1 2 4 4 7 7 8, list each value 6 times, list whole sequence 3 times

Store data pattern in C15, Arbitrary set of numbers 1 2 4 4 7 7 7, list each value 6 times, list whole sequence 1 time

Manip_Stack_Stack Columns, Stack the following columns C11 C12 C13 C14 C15 in a column of the current worksheet C7.

Store data pattern in C16, Arbitrary set of numbers 4 4 7 7 7 8, list each value 1 time, list whole sequence 1 time

Store data pattern in C17, Arbitrary set of numbers 2 4 7 7 7 8, list each value 1 time, list whole sequence 2 times

Store data pattern in C18, Arbitrary set of numbers 2 4 4 7 7 8, list each value 1 time, list whole sequence 3 times

Store data pattern in C19, Arbitrary set of numbers 2 4 4 7 7 7, list each value 1 time, list whole sequence 1 time

Store data pattern in C20, Arbitrary set of numbers 4 4 7 7 7 8, list each value 1 time, list whole sequence 1 time

Store data pattern in C21, Arbitrary set of numbers 1 4 7 7 7 8, list each value 1 time, list whole sequence 2 times

Store data pattern in C22, Arbitrary set of numbers 1 4 4 7 7 8, list each value 1 time, list whole sequence 3 times

Store data pattern in C23, Arbitrary set of numbers 1 4 4 7 7 7, list each value 1 time, list whole sequence 1 time

Store data pattern in C24, Arbitrary set of numbers 2 4 7 7 7 8, list each value 1 time, list whole sequence 1 time

Store data pattern in C25, Arbitrary set of numbers 1 4 7 7 7 8, list each value 1 time, list whole sequence 1 time

… and so on …

Then stack the appropriate columns the appropriate number of times in C8. Next, use Calc_Calculator, store result in variable C9, expression (C6+C7+C8)/3. Get the counts and percentages as computed above.

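| C9 | Count | Percent |
|---|---|---|
| 2.33333 | 12 | 3.57 |
| 3.00000 | 6 | 1.79 |
| 3.33333 | 24 | 7.14 |
| 3.66667 | 6 | 1.79 |
| 4.00000 | 36 | 10.71 |
| 4.33333 | 48 | 14.29 |
| 4.66667 | 12 | 3.57 |
| 5.00000 | 36 | 10.71 |
| 5.33333 | 42 | 12.50 |
| 5.66667 | 18 | 5.36 |
| 6.00000 | 36 | 10.71 |
| 6.33333 | 36 | 10.71 |
| 7.00000 | 6 | 1.79 |
| 7.33333 | 18 | 5.36 |

N= 336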
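Because the Minitab construction for the without-replacement case is long, a short Python sketch may be an easier way to cross-check the tally. It is an illustration under my own naming, not the procedure used above. It relies on `itertools.permutations` treating list positions as distinct, so the repeated 4s and 7s count as separate units.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

population = [1, 2, 4, 4, 7, 7, 7, 8]

# All 8*7*6 ordered samples of size 3 drawn without replacement.
# permutations() works on positions, so equal values are still distinct units.
samples = list(permutations(population, 3))

# Tally the sample means exactly.
tally_without = Counter(Fraction(sum(s), 3) for s in samples)

# Print mean, count and percent in the same layout as the tally table.
for mean in sorted(tally_without):
    count = tally_without[mean]
    print(f"{float(mean):.5f} {count:3d} {100 * count / len(samples):5.2f}")
print("N=", len(samples))
```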

b) Is ȳ unbiased for each of the two sampling schemes? Explain by using the information

The quantity of interest is the population mean, (1+2+4+4+7+7+7+8)/8 = 5.

Note that, the way the samples were created, they are all equally likely. Thus, the expected value of the sample mean under each sampling scheme is just the average of the corresponding column (C4 or C9). To get the average of a column, use Calc_Column Statistics, statistic mean, input variable (the column number). Doing this, I see that both of the columns have an average of 5. Thus, the sample mean is unbiased under both sampling schemes.
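The column averages can also be checked with exact arithmetic. The sketch below is a hedged illustration: `expected_mean` is a hypothetical helper of mine that averages the sample means over an equally likely list of samples.

```python
from fractions import Fraction
from itertools import permutations, product

population = [1, 2, 4, 4, 7, 7, 7, 8]
pop_mean = Fraction(sum(population), len(population))

def expected_mean(samples):
    """Hypothetical helper: average of the sample means over equally likely samples."""
    samples = list(samples)
    return sum(Fraction(sum(s), len(s)) for s in samples) / len(samples)

e_with = expected_mean(product(population, repeat=3))      # plan with replacement
e_without = expected_mean(permutations(population, 3))     # plan without replacement

print(pop_mean, e_with, e_without)  # prints: 5 5 5
```

All three values agree, which is the unbiasedness statement in numerical form.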

c) Draw histograms.

Graph_histogram variables C4 and C9.

For plan i)

[Histogram of C4: Frequency (0 to 140) against C4 (0 to 9)]

For plan ii)

[Histogram of C9: Frequency (0 to 90) against C9 (0 to 9)]

We can see that the distribution for plan ii) has a smaller spread. This is because the “extreme” samples made of a repeated unit cannot occur (i.e. 1,1,1 and 8,8,8 are not possible). In addition, plan i) is equivalent to sampling from an infinite population, while plan ii) samples from a finite population. Thus, the improvement factor (the finite population correction) applies to plan ii), giving it a smaller variance.
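The size of the difference can be checked by brute force. The sketch below assumes the usual textbook forms, variance σ²/n with replacement and (σ²/n)(N-n)/(N-1) without replacement, where σ² is the population variance with divisor N. Whether the course defines the improvement factor in exactly this form is an assumption on my part.

```python
from fractions import Fraction
from itertools import permutations, product

population = [1, 2, 4, 4, 7, 7, 7, 8]
N, n = len(population), 3

mu = Fraction(sum(population), N)
# Population variance with divisor N (an assumed convention).
sigma2 = sum((y - mu) ** 2 for y in population) / N

def var_of_mean(samples):
    """Hypothetical helper: exact variance of the sample mean over equally likely samples."""
    means = [Fraction(sum(s), len(s)) for s in samples]
    center = sum(means) / len(means)
    return sum((m - center) ** 2 for m in means) / len(means)

var_with = var_of_mean(product(population, repeat=n))
var_without = var_of_mean(permutations(population, n))

print(var_with, sigma2 / n)                                 # with replacement
print(var_without, sigma2 / n * Fraction(N - n, N - 1))     # without replacement
```

Here σ² = 6, so the variance of the sample mean is 2 with replacement and 10/7 without replacement, matching the narrower histogram for plan ii).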

d) Suppose that now you wish to draw samples of size 5. Explain why you do not need to do any further calculations in order to demonstrate that the unbiasedness property holds for this sample. (I’m assuming this is for the without replacement plan)

This has to do with the symmetry of the problem. The population has 8 units, so picking a sample of 3 also determines the 5 units left over, and every sample of 5 is the leftover of exactly one sample of 3. Since the samples of 3 are equally likely, so are the samples of 5. To determine the sum of the remaining 5 values, realize that the total of all the values is 40. Thus, the average of the remaining 5 values is (40 - 3*Avg3)/5, where Avg3 is the average of the sample of 3. This is because 3*Avg3 is just the sum of the sample of 3, subtracting it from 40 gives the sum of the remaining 5, and dividing by 5 gives their average. To get the expected value of this, we see that

E((40 - 3*Avg3)/5) = 1/5*E(40 - 3*Avg3) = 1/5*(40 - 3*E(Avg3)) = 1/5*(40 - 3*5) = 1/5*25 = 5,

where E(Avg3) = 5 since we showed the average is unbiased. Thus we know that the samples of size 5 would also be unbiased.
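The complement argument can be illustrated with a small Python sketch. It uses unordered samples and unit labels 0 to 7, both of which are my own choices for the illustration. Each choice of 3 units leaves exactly one set of 5.

```python
from fractions import Fraction
from itertools import combinations

population = [1, 2, 4, 4, 7, 7, 7, 8]
total = sum(population)            # 40
units = range(len(population))     # assumed unit labels 0..7

means5 = []
for chosen in combinations(units, 3):
    mean3 = Fraction(sum(population[i] for i in chosen), 3)
    rest = [population[i] for i in units if i not in chosen]
    mean5 = Fraction(sum(rest), 5)
    # The leftover mean is a linear function of the size-3 mean.
    assert mean5 == (total - 3 * mean3) / 5
    means5.append(mean5)

# Average over all equally likely size-5 samples.
expected5 = sum(means5) / len(means5)
print(len(means5), expected5)
```

Since every size-5 mean is the same linear function of a size-3 mean, the expected value follows from the size-3 result with no further enumeration.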
