Statistical Analysis of Data: Null hypothesis tests to determine whether habitat features differed between pools and riffles
~ Understand the construction of null and alternative hypotheses.
~ Understand the logic behind statistical comparisons of means using t-tests and
comparisons of categorical data using chi-square tests.
~ Manually perform t-test calculations.
~ Manually perform chi-square calculations.
What to Turn In
~ A document that collates results from each step of the analyses that
compare mean depth, mean velocity, and substrate composition between
riffles and pools
Scientists are often interested in comparing the means of two samples to determine if the populations from which the samples were drawn are different from each other. One might be interested in comparing, say, the average size of steelhead trout between two different rivers. Your data on fish sizes represent two different samples, each of which is one of an infinite number of outcomes that you might have obtained. Had you sampled on a different day, had you managed to catch that one last trout that “got away”, or had you sampled in different locations, your data would be different. The challenge then is to use these data to infer something about the true differences between two rivers.
Consider our field trip to Pipers Creek: we collected data on pools and riffles. These were random samples, and so far we've looked at differences in the samples to evaluate to what extent pools and riffles might differ. As above, our samples were but one of an infinite number of outcomes we might have gotten (recall our sample locations were randomized). How can we say anything about the "true" differences between riffles and pools if all we have is a sample from these two groups?
Statistics is the body of work that allows us to draw inferences about the true state of the world from the limited view of our data. By far, the most common way that this is conducted is through “null hypothesis testing”, where you evaluate the probability of obtaining your data (or more “extreme” data) if some null hypothesis was true. Typically we define the null hypothesis as being one that we aim to reject: there is no difference between control and treatment groups. If the data are highly unlikely if this null hypothesis is true, then we reject that null hypothesis and accept the alternative, that there is a difference between control and treatment groups.
Here are the steps for null hypothesis testing:
(1) State the null hypothesis.
(2) State the alternative hypothesis, which must be true if the null hypothesis is false.
(3) Calculate the probability of getting your data, or more extreme data, if the null
hypothesis is true.
(4) If the probability of obtaining your data is very low if the null hypothesis is true,
you reject the null hypothesis and accept the alternative hypothesis.
(5) Otherwise, you fail to reject the null hypothesis.
Importantly, just because you have insufficient evidence to reject the null hypothesis, it does not necessarily follow that the null hypothesis is correct. It simply means that you do not feel safe ruling out that hypothesis given your data.
Comparison of means; the t-test.
One of the standard ways in which we can compare whether the mean values between two groups truly differ from each other is through the Student’s t-test (the name has nothing to do with academic position: it was derived by a statistician who published under the pseudonym “Student”).
Suppose we have replicated samples of some quantity measured in two groups: pools and riffles. The sample means are easily calculated from the data, and most often they will be different from each other. In null hypothesis testing, we ask whether the difference in sample means is bigger than we would expect from random chance alone.
To do this, we first evaluate a quantity that tells us something about the variability among the samples: the standard deviation.
Standard Deviation: within a distribution of outcomes, the standard deviation (σ) can be thought of as the average distance each possible outcome is from the mean (μ). The standard deviation of a sample (s) is calculated from the sample mean, $\bar{x}$, and the number of samples, n, as:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
where $x_i$ is the i-th observation in the sample.
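The sample standard deviation can also be checked outside of Excel. The following is a minimal Python sketch, assuming the usual n - 1 denominator (the same definition Excel's stdev uses); the depth values are hypothetical and are not the class data.

```python
import math
import statistics

# Hypothetical transect depths (cm); NOT the class data.
depths = [12.0, 15.0, 9.0, 14.0, 10.0]

def sample_sd(values):
    """Sample standard deviation using the n - 1 denominator."""
    n = len(values)
    mean = sum(values) / n
    # Sum of squared distances of each observation from the sample mean.
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))

manual = sample_sd(depths)
library = statistics.stdev(depths)  # standard library, same n - 1 definition
print(manual, library)
```

The manual calculation and the library function should agree, which is a useful check that the formula was entered correctly.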
We use the difference in sample means, the sample size, and the sample standard deviation to calculate the probability of getting the observed data (or even larger differences in sample means) if there really is no difference between riffles and pools. We call this value a p-value.
If the p-value is very small, we reject the null hypothesis that there is no difference between pools and riffles.
When we reject the null hypothesis, we can claim that there is a statistically significant difference in the sample means. The largest p-value at which you would still reject the null hypothesis is called the α level. Typically, we use α = 0.05 to determine whether a difference is statistically significant.
Do pools and riffles have different mean depths?
Here, our null hypothesis is that the mean depths are identical between pools and riffles. We want to evaluate our samples to determine whether we have evidence to reject that hypothesis.
Step 1: calculate sample means for riffles and pools
Use the supplied Excel spreadsheet to calculate the sample mean transect depths for pools and riffles (aggregating data from upper and lower regions). Denote these as $\bar{X}_p$ and $\bar{X}_r$ for the sample means for pools and riffles, respectively. You may find it helpful to set up your spreadsheet like this:
Step 2: Calculate standard deviations
Calculate the sample standard deviation of transect depths for riffles and pools using the “=stdev()” command in Excel.
We will assume that the actual standard deviations are the same in the pools and riffles. We must therefore calculate a single standard deviation based on the calculations in pools and riffles. Calculate the “pooled” standard deviation using the following equation:
$$s_{pooled} = \sqrt{\frac{(n_p - 1)\, s_p^2 + (n_r - 1)\, s_r^2}{n_p + n_r - 2}}$$
where $n_p$ and $s_p$ denote the sample size and standard deviation from pools, and $n_r$ and $s_r$ denote the sample size and standard deviation from riffles. Don’t confuse $s_{pooled}$ with $s_p$, the sample standard deviation for pools! When using Excel to calculate $s_{pooled}$, pay close attention to where you place the parentheses in your equation.
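To check the placement of parentheses, the pooled standard deviation can be sketched in Python. This assumes the standard equal-variance pooling formula, with each group's variance weighted by its sample size minus one; the function name and the input values are hypothetical.

```python
import math

def pooled_sd(n_p, s_p, n_r, s_r):
    """Pooled standard deviation for two groups assumed to share one true SD.

    n_p, s_p: sample size and standard deviation for pools
    n_r, s_r: sample size and standard deviation for riffles
    """
    # Weight each sample variance (s squared) by its degrees of freedom.
    numerator = (n_p - 1) * s_p ** 2 + (n_r - 1) * s_r ** 2
    denominator = n_p + n_r - 2
    return math.sqrt(numerator / denominator)

# Hypothetical inputs; NOT the class data.
s_pooled = pooled_sd(n_p=6, s_p=4.0, n_r=6, s_r=2.0)
print(s_pooled)
```

Note that the standard deviations are squared before they are combined, and the square root is taken only at the end. The pooled value always falls between the two group standard deviations.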
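Step 3: Calculate the t-statistic
The t-statistic is simply the ratio of the difference in sample means divided by a measure of the precision of your estimate:
$$T = \frac{\bar{X}_p - \bar{X}_r}{s_{pooled} \sqrt{\dfrac{1}{n_p} + \dfrac{1}{n_r}}}$$
where $\bar{X}_p$ and $\bar{X}_r$ are the sample means for pools and riffles, $n_p$ and $n_r$ are the sample sizes, and $s_{pooled}$ is the pooled standard deviation from Step 2.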
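The calculation can be sketched in Python as follows. This assumes the equal-variance two-sample form, in which the denominator is the pooled standard deviation times the square root of (1/n for pools + 1/n for riffles); the function name and all numbers are hypothetical.

```python
import math

def t_statistic(mean_p, mean_r, s_pooled, n_p, n_r):
    """Two-sample t-statistic assuming a common standard deviation."""
    # Standard error of the difference between the two sample means.
    standard_error = s_pooled * math.sqrt(1 / n_p + 1 / n_r)
    return (mean_p - mean_r) / standard_error

# Hypothetical values (depths in cm); NOT the class data.
t = t_statistic(mean_p=30.0, mean_r=18.0, s_pooled=6.0, n_p=8, n_r=8)
print(t)
```

A larger difference in means, a smaller pooled standard deviation, or larger sample sizes all push the t-statistic further from zero.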
Step 4: Calculate the degrees of freedom
For a two-sample t-test with a constant standard deviation assumed between the two populations, the “degrees of freedom” is the number of sites sampled minus 2, that is, d.f. = $n_p + n_r - 2$ (in case you’re wondering, you subtract two because you “lose” one degree of freedom for each quantity that you estimate from the data before calculating the standard deviation; in this case you estimate two sample means, one for pools and one for riffles).
Step 5: Calculate the p-value
Excel will calculate a p-value, given a t-statistic and the number of degrees of freedom. The formula to use is:
=tdist(abs(T), d.f., 2)
where abs(T) is the absolute value of your T-statistic, d.f. is the calculated degrees of freedom, and the last number, 2, indicates that you are performing a 2-sided test (you are equally likely to consider that the difference in sample means might be positive or negative).
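For readers who want to see what sits behind the Excel function, here is a rough Python sketch of a two-sided p-value. It is not how Excel computes the value: it numerically integrates the Student's t density with Simpson's rule, which is an approach chosen here only because it needs nothing beyond the standard library. The function names, the step count, and the input numbers are all hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log(1 + x * x / df))

def two_sided_p_value(t, df, steps=2000):
    """Probability of a t-statistic at least as far from zero as |t|."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # area under the density from 0 to |t|
    # The density is symmetric, so both tails together are 1 - 2 * area.
    return max(0.0, 1.0 - 2.0 * central_area)

# Hypothetical values; NOT the class data.
n_p, n_r = 8, 8
df = n_p + n_r - 2          # degrees of freedom from Step 4
p_value = two_sided_p_value(4.0, df)
print(df, p_value)
```

If SciPy happens to be installed, `2 * scipy.stats.t.sf(abs(t), df)` should give the same answer and is the more practical choice.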
Step 6: Reject or fail to reject the null hypothesis based on your p-value.
Using an α level of 0.05, do you reject or fail to reject the null hypothesis? Can you claim that the mean depths are significantly different between pools and riffles?
Repeat the procedure above, but this time compare the mean water velocities between pools and riffles. Turn in results from steps 1 – 6.
Chi-Square test: categorical data
Our substrate data are “ordinal”, because they are discrete categories but they have an order to them (0 is smaller than a 1, 1 is smaller than a 2, etc.). We might get away with calculating the mean substrate score and use a t-test. However, this might mask other substrate composition trends (say one habitat has either 1’s or 5’s, while another has only 2’s and 3’s). For that reason today we’ll treat these data as categorical.
Step 1: Calculate the observed frequency of each substrate code by habitat
Use the provided Excel spreadsheet to produce a table that lists the total number of substrate samples scored as each of the substrate categories (to maintain independence of sample sites, only a single value for each transect is used) among all pools and riffles. Because a score of 0 was never used in the lab, use only substrate categories 1 through 5. Your table should look like this:
Habitat 1 2 3 4 5
Pools x x x x x
Riffles x x x x x
where the x’s are the numbers you calculate from the table.
Step 2: Calculate row and column totals
1. Using your table from the step above, calculate the total number of observations
in each row (“row totals”) and in each column (“column totals”).
2. Calculate the grand total number of observations.
Step 3: Create table of expected values
1. Set up a table that looks like the one you created above (same column and row
labels), but instead of listing the observed values in each category, you
will calculate the expected value under the null hypothesis.
2. The expected value in row y and column x equals the row y total times column x
total divided by the grand total. Write Excel formulas in each cell of your table to
calculate the expected numbers of observations.
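The expected-value rule (row total times column total, divided by grand total) can be sketched in Python. The counts below are hypothetical and are not the class data, and the function name is only illustrative.

```python
# Hypothetical observed counts for substrate categories 1 through 5.
observed = {
    "Pools":   [4, 6, 5, 3, 2],
    "Riffles": [1, 3, 6, 7, 3],
}

def expected_table(observed):
    """Expected counts under the null hypothesis of no habitat difference."""
    n_columns = len(next(iter(observed.values())))
    row_totals = {habitat: sum(counts) for habitat, counts in observed.items()}
    column_totals = [sum(counts[j] for counts in observed.values())
                     for j in range(n_columns)]
    grand_total = sum(row_totals.values())
    # Expected value = row total * column total / grand total.
    return {
        habitat: [row_totals[habitat] * column_totals[j] / grand_total
                  for j in range(n_columns)]
        for habitat in observed
    }

expected = expected_table(observed)
for habitat, values in expected.items():
    print(habitat, values)
```

A quick check on any expected table is that its row totals and column totals match those of the observed table.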
Step 4: Calculate the χ² statistic
1. Create a third table set up in the same manner as above (same column and row
labels).
2. In each cell, calculate (Observed - Expected)² / Expected for each habitat
and substrate combination, using cell references to the two tables you created
above.
3. The χ² statistic is calculated by summing all of these values.
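As a sketch of this step in Python, the function below sums (Observed - Expected)² / Expected over every cell. The observed and expected counts are hypothetical (the expected values were derived from the observed counts using row total times column total divided by grand total), and the function name is only illustrative.

```python
# Hypothetical tables for substrate categories 1 through 5; NOT the class data.
# Rows are pools, then riffles.
observed = [
    [4, 6, 5, 3, 2],
    [1, 3, 6, 7, 3],
]
expected = [
    [2.5, 4.5, 5.5, 5.0, 2.5],
    [2.5, 4.5, 5.5, 5.0, 2.5],
]

def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all cells."""
    total = 0.0
    for observed_row, expected_row in zip(observed, expected):
        for o, e in zip(observed_row, expected_row):
            total += (o - e) ** 2 / e
    return total

chi2 = chi_square_statistic(observed, expected)
print(chi2)
```

Because each difference is squared, cells where the observed count is above or below the expected count both add to the statistic. When observed and expected agree exactly, the statistic is zero.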
Step 5: Calculate the degrees of freedom.
The calculation of degrees of freedom is different for a χ² test than for a t-test. The
degrees of freedom equals (# rows - 1) x (# columns - 1).
Step 6: Calculate the p-value.
Use the “=chidist(χ², degrees of freedom)” command in Excel to calculate the p-value from the class data.
Using an α level of 0.05, do you reject or fail to reject the null hypothesis? Can you claim that the substrate composition is significantly different between pools and riffles?
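The degrees of freedom and p-value steps can also be sketched in Python. This is not Excel's implementation: the right-tail probability is obtained here from a series expansion of the incomplete gamma function, chosen only because it needs nothing beyond the standard library. The function names and the input statistic are hypothetical.

```python
import math

def degrees_of_freedom(n_rows, n_columns):
    """(# rows - 1) x (# columns - 1) for a table of counts."""
    return (n_rows - 1) * (n_columns - 1)

def chi_square_p_value(chi2, df, max_terms=10_000):
    """Right-tail probability of a chi-square value at least this large."""
    if chi2 <= 0:
        return 1.0
    a, x = df / 2, chi2 / 2
    # Series for the lower incomplete gamma function, divided by gamma(a).
    term = 1.0 / a
    total = term
    for k in range(1, max_terms):
        term *= x / (a + k)
        total += term
        if term < total * 1e-15:
            break
    lower = total * math.exp(a * math.log(x) - x - math.lgamma(a))
    return max(0.0, 1.0 - lower)

# Two habitats (rows) and five substrate categories (columns).
df = degrees_of_freedom(2, 5)
# Hypothetical chi-square statistic; NOT the class result.
p_value = chi_square_p_value(4.69, df)
alpha = 0.05
print(df, p_value, "reject" if p_value < alpha else "fail to reject")
```

The series loses precision for extremely small p-values, so treat this as an illustration of the logic rather than a replacement for Excel. If SciPy happens to be installed, `scipy.stats.chi2.sf(chi2, df)` should give the same right-tail probability.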
What to Turn In.
In a Word document, collate your analyses and results to clearly indicate:
1. Results from Steps 1-6 for the comparison of mean depth
2. Results from Steps 1-6 for the comparison of mean velocity
3. Results from Steps 1-6 for the comparison of substrate composition