
One Sample t-test

By Sally Long, 2014-12-12 01:10

______________________________________________________________________

A manufacturer of fertiliser claims that on average one bag of his fertiliser will be sufficient to cover 14 square metres of lawn. A firm buying the bags takes a random sample of 12 bags and measures the area covered by each. The data are:

    13.2 13.8 13.4 12.4 14.3 13.2

    13.6 13.9 13.2 14.5 12.6 12.6

Is the manufacturer's claim true?

i.e. test H0: μ = 14 against H1: μ ≠ 14.

1. Enter the 12 data values into C1 in a Minitab worksheet and name the column area.

2. Select

Stat > Basic Statistics > 1-sample t

and complete the dialog box as shown. To carry out the test with the appropriate alternative hypothesis, select the Options box and set the Alternative to Not equal.

The resulting output is

One-Sample T: area

Test of mu = 14 vs mu not = 14

Variable     N      Mean     StDev   SE Mean
area        12    13.392     0.665     0.192

Variable          95.0% CI            T       P
area        ( 12.969, 13.814)     -3.17   0.009

Note that:

- The p-value is 0.009 < 0.05, so the null hypothesis can be rejected at the 5% significance level. There is evidence that the manufacturer's claim is not correct.

- The 95% confidence interval indicates that the true mean is between 13.0 and 13.8 square metres. Because the whole interval lies below 14, the results indicate that the manufacturer is exaggerating the area covered.

- If a one-tailed test were appropriate, the Alternative in Options would be set to less than or greater than. Note that a confidence interval is only given when a two-tailed alternative is used.
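The figures in the Minitab output can also be checked by hand. The following is a minimal Python sketch, not part of the original handout, that recomputes the mean, standard deviation, standard error, t statistic, two-sided p-value and 95% confidence interval for the area data using only the standard library. The helper name t_cdf is hypothetical; it integrates the Student t density numerically with Simpson's rule. The critical value 2.201 (97.5th percentile of t with 11 degrees of freedom) is assumed from standard t tables.

```python
import math
import statistics


def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) for Student's t by Simpson's rule (steps must be even)."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central = total * h / 3  # P(0 <= T <= |t|)
    return 0.5 + central if t >= 0 else 0.5 - central


# Area covered (square metres) by each of the 12 sampled bags.
area = [13.2, 13.8, 13.4, 12.4, 14.3, 13.2, 13.6, 13.9, 13.2, 14.5, 12.6, 12.6]
mu0 = 14.0  # mean claimed under the null hypothesis

n = len(area)
mean = statistics.mean(area)
sd = statistics.stdev(area)          # sample standard deviation (n - 1 divisor)
se = sd / math.sqrt(n)               # standard error of the mean
t_stat = (mean - mu0) / se
p_two_sided = 2 * t_cdf(-abs(t_stat), n - 1)

T_CRIT = 2.201  # assumed from t tables: 97.5th percentile, 11 df
ci = (mean - T_CRIT * se, mean + T_CRIT * se)

print(f"N = {n}  Mean = {mean:.3f}  StDev = {sd:.3f}  SE Mean = {se:.3f}")
print(f"T = {t_stat:.2f}  P = {p_two_sided:.3f}")
print(f"95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

The printed values should agree with the Minitab output to the precision shown there.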

3. Go back to the dialog box for the t-test and select the graph option to obtain the following boxplot. The red line indicates the 95% confidence interval and the blue dot is at 14, the mean stated in the null hypothesis.

[Figure: Boxplot of area (with Ho and 95% t-confidence interval for the mean). Horizontal axis: area, with ticks at 12.5, 13.5 and 14.5.]

4. The t-test is based on the assumption that the data follow a normal distribution. One way of checking this is to obtain a normal probability plot of the data. Select

Stat > Basic Statistics > Normality Test

Enter area in the Variable box. You should obtain the following plot. The fact that the plot is approximately linear indicates that the assumption of normality is reasonable.

[Figure: Normal Probability Plot of area. Vertical axis: Probability (.001 to .999); horizontal axis: area (12.5 to 14.5). Average: 13.3917, StDev: 0.665321, N: 12. Anderson-Darling Normality Test: A-Squared: 0.235, P-Value: 0.732.]
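The idea of "approximately linear" can be made a little more concrete. The sketch below, which is an illustration and not Minitab's own procedure, pairs each ordered data value with the standard normal quantile at plotting position (i - 0.5)/n and reports the correlation between the two. The plotting-position formula is an assumption (Minitab may use a slightly different convention), and pearson_r is a hypothetical helper name. A correlation close to 1 corresponds to a nearly straight probability plot.

```python
import math
from statistics import NormalDist, mean

area = [13.2, 13.8, 13.4, 12.4, 14.3, 13.2, 13.6, 13.9, 13.2, 14.5, 12.6, 12.6]

n = len(area)
ordered = sorted(area)

# Assumed plotting positions (i - 0.5) / n for i = 1..n.
probs = [(i - 0.5) / n for i in range(1, n + 1)]
# Standard normal quantiles: the positions the points would take if the data were normal.
z = [NormalDist().inv_cdf(p) for p in probs]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(z, ordered)

for zi, xi in zip(z, ordered):
    print(f"{zi:7.3f}  {xi:5.1f}")   # coordinates of the probability plot
print(f"probability-plot correlation r = {r:.3f}")
```

This is only an informal check; the Anderson-Darling test reported on the Minitab plot remains the formal test used in this handout.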

    Example 1

    Established records show that the lengths of mussels from an estuary are normally distributed with a mean of 30 mm. The following data are lengths from a random sample of 25 mussels taken from a polluted beach. It is suspected that the effect of the pollution will be to inhibit the growth of the mussels.

    27.72 17.44 19.72 42.39 22.31

    30.87 20.06 18.03 16.29 24.95

    19.15 32.22 27.33 35.88 18.57

    22.02 27.45 26.56 22.32 31.40

    19.12 43.56 40.63 36.12 26.95

a) Write down appropriate null and alternative hypotheses for this example.

b) Enter the data into Minitab and carry out the appropriate test. From the results, comment on whether or not pollution appears to inhibit the growth of mussels.

c) Obtain each of the graphs available in the t-test options and make sure you understand the information given on these.

d) Obtain a normal plot to test the assumption of normality.

e) Obtain a 95% confidence interval for the mean length of the mussels from the polluted beach.

Example 2

The angles, measured in degrees, between the first two segments of fifteen hyphae of a certain fungus were as follows:

115 124 126 121 135 113 119 116 116 112 123 122 130 113 134

a) A researcher has postulated that the mean angle should be 120 degrees. Carry out a t-test, using a 5% significance level, to assess whether the data are compatible with this claim.

b) What assumption have you made and is it valid?

Example 3

    The weights (g) of eight adult starlings caught at a roost were as follows:

    78 82 88 81 87 80 88 80

a) Calculate a 95% confidence interval for the mean weight of starlings in the roost. (Use the one-sample t dialog box, entering any value for the mean.)

b) Calculate a 99% confidence interval for the mean weight. (Go to Options in the one-sample t dialog box and change 95 to 99.) How do the intervals compare?

c) What assumptions are you making to calculate these confidence intervals? Check that they are valid.

__________________________________________________________________

Hypothesis testing: Paired and two-sample tests

The manufacturer of a suntan lotion wants to know whether or not a new ingredient increases the protection against sunburn. Seven volunteers have their backs exposed to a sun lamp with the old lotion on one side and the new lotion on the other side of the spine. A higher number indicates more burning. Does the new ingredient improve the effectiveness of the lotion? The data are:

volunteer                      1   2   3   4   5   6   7
burn without new ingredient   42  51  31  61  44  55  48
burn with new ingredient      38  53  36  52  33  49  36

1. Enter the data into two columns in Minitab, named without and with.

2. Select

Stat > Basic Statistics > Paired t

and complete the dialog box.

Note:

a) To select the appropriate form of the alternative hypothesis, select Options.

b) The difference evaluated is the first sample minus the second. The order is not important but should be consistent throughout.

c) You can also obtain a boxplot of the differences by selecting Graphs.

3. Click OK. You should obtain the following output.

Paired T-Test and Confidence Interval

Paired T for without - with

              N    Mean   StDev   SE Mean
without       7   47.43    9.71      3.67
with          7   42.43    8.54      3.23
Difference    7    5.00    6.48      2.45

95% CI for mean difference: (-1.00, 11.00)
T-Test of mean difference = 0 (vs > 0): T-Value = 2.04  P-Value = 0.044
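A paired t-test is a one-sample t-test applied to the within-pair differences. The following minimal Python sketch, added for illustration, recomputes the Difference row, the T-Value and the 95% confidence interval from the raw burn scores. The variable name with_new is a stand-in for the Minitab column with (which is a reserved word in Python), and the critical value 2.447 (97.5th percentile of t with 6 degrees of freedom) is assumed from standard t tables.

```python
import math
import statistics

# Burn scores for the seven volunteers (higher means more burning).
without = [42, 51, 31, 61, 44, 55, 48]    # old lotion
with_new = [38, 53, 36, 52, 33, 49, 36]   # lotion with the new ingredient

# Differences are taken as first sample minus second: without - with.
diffs = [a - b for a, b in zip(without, with_new)]

n = len(diffs)
mean_d = statistics.mean(diffs)
sd_d = statistics.stdev(diffs)        # sample standard deviation of the differences
se_d = sd_d / math.sqrt(n)            # standard error of the mean difference
t_stat = mean_d / se_d                # tests mean difference = 0

T_CRIT = 2.447  # assumed from t tables: 97.5th percentile, 6 df
ci = (mean_d - T_CRIT * se_d, mean_d + T_CRIT * se_d)

print("differences:", diffs)
print(f"N = {n}  Mean = {mean_d:.2f}  StDev = {sd_d:.2f}  SE Mean = {se_d:.2f}")
print(f"T-Value = {t_stat:.2f}")
print(f"95% CI for mean difference = ({ci[0]:.2f}, {ci[1]:.2f})")
```

Reversing the order of subtraction would change the sign of the mean difference and of T, which is why the order has to be kept consistent with the chosen alternative hypothesis.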

______________________________________________________________________

Two-sample t-test

    Eleven seedlings are divided at random into two groups. The first group is grown under normal light conditions; the second group in conditions of restricted light. After 4 months the heights of the seedlings are measured and recorded:

normal 46 39 38 41 36

    restricted 39 28 31 37 41 34

Does restricting the light inhibit the growth of the seedling?

1. Enter the data into two columns, one for plants grown in normal light conditions, one for those in restricted conditions, e.g.

C1 (normal)      46 39 38 41 36
C2 (restricted)  39 28 31 37 41 34

2. Select

Stat > Basic Statistics > 2-sample t

and complete the dialog.

Note:

a) An alternative way to enter the data is to have the measurements in one column, with a subscript giving the sample number in another column, e.g.

C1  46 39 38 41 36 39 28 31 37 41 34
C2   1  1  1  1  1  2  2  2  2  2  2

b) If it does not seem reasonable to assume that the populations have the same variance, then the test can be carried out in Minitab without this assumption.

The output assuming equal variances is:

    Two Sample T-Test and Confidence Interval

Two sample T for normal vs restricted

            N    Mean   StDev   SE Mean
normal      5   40.00    3.81       1.7
restrict    6   35.00    4.94       2.0

95% CI for mu normal - mu restrict: ( -1.1, 11.1)
T-Test mu normal = mu restrict (vs >): T = 1.85  P = 0.049  DF = 9
Both use Pooled StDev = 4.47

[Figure: Boxplots of normal and restrict (means are indicated by solid circles). Vertical axis ticks from 30 to 45; groups: normal, restrict.]
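The pooled standard deviation and the T value in the output can be reproduced from the raw heights. The sketch below is a minimal illustration in Python using only the standard library; it computes the equal-variance (pooled) statistic shown above and, for comparison, the statistic and approximate degrees of freedom obtained without the equal-variance assumption (the Welch-Satterthwaite form, which is assumed here to be what the unpooled option corresponds to). The critical value 2.262 (97.5th percentile of t with 9 degrees of freedom) is assumed from standard t tables.

```python
import math
import statistics

normal = [46, 39, 38, 41, 36]          # heights under normal light
restricted = [39, 28, 31, 37, 41, 34]  # heights under restricted light

n1, n2 = len(normal), len(restricted)
m1, m2 = statistics.mean(normal), statistics.mean(restricted)
v1, v2 = statistics.variance(normal), statistics.variance(restricted)

# Equal-variance (pooled) version.
df_pooled = n1 + n2 - 2
sp = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / df_pooled)   # pooled StDev
se_pooled = sp * math.sqrt(1 / n1 + 1 / n2)
t_pooled = (m1 - m2) / se_pooled

T_CRIT = 2.262  # assumed from t tables: 97.5th percentile, 9 df
ci = (m1 - m2 - T_CRIT * se_pooled, m1 - m2 + T_CRIT * se_pooled)

# Unequal-variance version with Welch-Satterthwaite degrees of freedom.
se_welch = math.sqrt(v1 / n1 + v2 / n2)
t_welch = (m1 - m2) / se_welch
df_welch = se_welch ** 4 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))

print(f"Pooled StDev = {sp:.2f}  T = {t_pooled:.2f}  DF = {df_pooled}")
print(f"95% CI for mu normal - mu restrict = ({ci[0]:.1f}, {ci[1]:.1f})")
print(f"Without equal variances: T = {t_welch:.2f}  DF approx {df_welch:.2f}")
```

For these data the two versions give similar T values, so the choice of option would be unlikely to change the conclusion much.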

______________________________________________________________________

Example 1

    A researcher wished to test whether smoking a cigarette increased the aggregation of platelets in the blood. Blood samples were taken from 11 individuals before and after they smoked a cigarette. The data are shown below:

Before 25 25 27 44 30 67 53 53 52 60 28

After 27 29 37 56 46 82 57 80 61 59 43

a) An alternative design of the experiment would be to take 11 individuals and test them after smoking a cigarette and another 11 individuals and test them without the cigarette. Explain the flaw in such a design and why a paired design is, in this case, preferable.

b) Formulate appropriate null and alternative hypotheses.

c) Enter the data into Minitab, putting the results for before in one column and the results for after in another. Carry out a paired t-test, stating any assumptions you have to make to do so.

d) Comment on the effect of the smoking on the aggregation of platelets in the blood.

______________________________________________________________________

Example 2

    The following are the lengths of cuckoo eggs found in the nests of hedge sparrows and reed warblers.

Egg lengths in hedge sparrow nests (mm)

21.98 23.41 22.39 22.79 24.30 22.39 23.78

    21.69 22.78 22.77 23.78 24.06 22.48 23.42

Egg lengths in reed warbler nests (mm)

21.46 22.32 21.89 21.95 22.86

    21.28 21.94 21.94 21.71 22.54

a) Write down appropriate null and alternative hypotheses to test the claim that the cuckoo alters the size of the egg to suit the nest.

b) Carry out the test on Minitab assuming equal variances.

c) Obtain a boxplot to illustrate the results.

d) Do the data support the hypothesis that the cuckoo alters the size of egg to suit the nest?

e) Carry out the test without the Assume equal variances option. Does it make any difference to the conclusions reached? Explain your answer.

______________________________________________________________________

Example 3

For 28 heart attack patients, their cholesterol levels were recorded 2 days after the attack and again at 4 days and 14 days. Data were also obtained from a control group of 30 patients who had not had a heart attack. Select File > Open Worksheet. Look in the drive Deptdata, then select the folder Statistics followed by Minitab, then Data. Highlight the file cholest and click OK. This file contains the data described above. A '*' in the column for 14 days indicates missing data.

Thinking carefully about whether a paired or a two-sample test is appropriate, test the following hypotheses, illustrating your findings with an appropriate diagram in each case.

a) The blood cholesterol level after two days for those patients who had a heart attack is higher than the level for the patients in the control group.

b) The blood cholesterol level on the 4th day is lower than that on the 2nd day after the attack.

c) The blood cholesterol level on the 14th day is lower than that on the 4th day after the attack.

______________________________________________________________________

Example 4

    The data file pulse contains information on students in an introductory statistics class. Each student recorded their pulse rate (C1) and other information such as height, weight, whether or not they smoked (C4, 1-no, 2-yes) and sex (C5, 1-male, 2-female). Obtain the data set and comment on the following hypotheses using the appropriate test. Obtain suitable diagrams to illustrate your findings.

    a) The pulse rate for smokers differs from the pulse rate for non-smokers.

    b) The pulse rate for males differs from the pulse rate for females.

Solutions

    One-sample t-test

1. a) H0: μ = 30, H1: μ < 30
b) t = -2.00, p-value ≈ 0.03 (one-tailed). There is evidence to suggest that pollution inhibits growth.
d) The assumption of normality is reasonable.
e) (23.42, 30.11) (use one-sample t with the alternative set to not equal)

2. a) The p-value for a two-tailed test is 0.522. There is no reason to doubt the claim.
b) The assumption is that the data are normally distributed. The normal probability plot shows that this assumption is valid.

3. a) (79.62, 86.38)
b) (78.01, 87.99). The greater the level of confidence, the wider the interval.
c) Assumes the data are normally distributed. Valid.
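The comparison in solution 3 can be illustrated numerically. The sketch below is a minimal Python example, not part of the original solutions, that builds both intervals for the starling weights as mean ± t × SE. The function name t_interval is hypothetical, and the critical values 2.365 and 3.499 (97.5th and 99.5th percentiles of t with 7 degrees of freedom) are assumed from standard t tables, so the last digit may differ slightly from Minitab's.

```python
import math
import statistics

# Weights (g) of the eight adult starlings.
weights = [78, 82, 88, 81, 87, 80, 88, 80]


def t_interval(data, t_crit):
    """Return (lower, upper) for mean +/- t_crit * standard error."""
    m = statistics.mean(data)
    se = statistics.stdev(data) / math.sqrt(len(data))
    return (m - t_crit * se, m + t_crit * se)


T_95 = 2.365  # assumed from t tables: 97.5th percentile, 7 df
T_99 = 3.499  # assumed from t tables: 99.5th percentile, 7 df

ci95 = t_interval(weights, T_95)
ci99 = t_interval(weights, T_99)

print(f"95% CI = ({ci95[0]:.2f}, {ci95[1]:.2f})  width = {ci95[1] - ci95[0]:.2f}")
print(f"99% CI = ({ci99[0]:.2f}, {ci99[1]:.2f})  width = {ci99[1] - ci99[0]:.2f}")
```

Both intervals are centred on the same sample mean; only the multiplier changes, which is why the 99% interval is the wider of the two.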

Paired and two-sample t-tests

1. a) The alternative design takes no account of the variation between individuals.
b) H0: μd = 0, H1: μd > 0, where μd is the mean difference (after - before).
c) Paired t-test, p-value = 0.001, so H0 can be rejected. Assumes the differences are normally distributed.
d) The results suggest that smoking does increase the aggregation of platelets in the blood, by 10.3 units on average.

2. a) H0: μA = μB, H1: μA ≠ μB, where A indicates hedge sparrow and B reed warbler.
b) p-value = 0.002.
d) The data do support the hypothesis that the cuckoo alters the size of the egg to suit the nest. The eggs laid in the hedge sparrows' nests are larger on average than those laid in the reed warblers' nests.
e) p = 0.001. The conclusions are the same.

3. a) Two-sample, one-tailed t-test, p-value = 0.000. There is evidence that the cholesterol levels of the patients who had a heart attack are higher than those in the control group.
b) Paired, one-tailed t-test, p-value = 0.002. This suggests that the cholesterol level after 4 days is lower than after 2 days.
c) Paired, one-tailed t-test, p-value = 0.116. This suggests that the cholesterol level after 14 days is not significantly lower than after 4 days.

4. a) Using a two-sample, two-tailed test, there is no evidence that the pulse rates for smokers and non-smokers differ, p = 0.285.
b) Using a two-sample, two-tailed test, there is evidence that the pulse rates for males and females differ, p = 0.008. Male pulse rates are lower.
