Session Two Real and Perceived Distances

By Larry Elliott,2014-01-20 03:33


    This Lab was modified by Patricia Humphrey and John Rafter from Spurrier, J.D. et al,

    Elementary Statistics Laboratory Manual, Duxbury Press, 1995.

    Real and Perceived Distances


    One of the most important aspects of data analysis is the study of relationships between variables.

    How does a cricket’s chirping rate change as temperature decreases? How does the yield of a

    chemical reaction change when pressure is increased? This session uses graphical and descriptive

    tools to help quantify relationships between variables.

The Setting:

    Often the measurement we really wish to make on an object is difficult to make (or expensive,

    toxic, or destructive). If we can find an easily measured variable that is closely linked to the

    difficult one, we may be able to use the easy one in place of the difficult one. Before doing so,

    we should do an experiment, called a regression experiment, that involves measuring both

    variables on each of several objects and then studying the manner in which the easy

    measurement tends to change with the difficult one. We may then be able to adjust the easy

    measurement to better approximate the difficult one. This process is called calibration. This

    session is a regression and calibration experiment to study the manner in which guessed

    distances between objects (an easy measurement) denoted by X, the predictor variable, vary in

    relation to true distances (a more difficult measurement) denoted by Y, the response variable.


    It is well known that people tend to underestimate the size of faraway objects. Do we also tend to

    underestimate the distance to faraway objects, or do we tend to overestimate these distances? Or,

    do we guess right, on average?

    The Experiment

Step 1: Data Collection

    The class as a group will go to a pre-chosen spot, with handouts and pencils. The instructor will

    first identify a fixed reference point, such as a lamppost or a fire hydrant. Next the instructor

    will identify a landmark. You should write a brief description of this landmark in column 2 of

    Table 2.1. Each student will then be asked to guess the distance between the reference point and

    the landmark. Please keep your guess to yourself so as not to influence others. To simplify calculations later, guess in units of feet only. Silently record your guessed distance in column 3

    of Table 2.1. Then the instructor will ask you to guess the distance between the reference point

    and a second landmark, to be recorded in Table 2.1, and then another, and so on, for a total of 13

    landmarks. Don’t worry that your guesses might be bad. You will calibrate them later.

The class will then be split into teams to measure the true distances to the landmarks. Each team

    will have three members:


     1. The Base: This person holds the tape end at the reference point and at intermediate

    points along the way if the landmark is too far away to measure in one tape length. He or she

    also advises the other team members if they are not walking straight toward the landmark and

    keeps track of the number of full tape lengths that have been used en route to the landmark.

     2. The Point: This person takes the tape roll and carefully walks straight toward the

    landmark, until it is reached or the tape runs out. If the tape runs out, the Point is responsible for

    keeping track of exactly where the starting point for the next tape length will be while the Base

    comes forward. Also, the Point verifies the final reading made by the Eyes.

     3. The Eyes: This person walks beside the Point. When the landmark is reached, the Eyes

    reads the tape and (in conjunction with the Base and the Point) calculates the final measured

    distance to the landmark and records it in the appropriate row of column 4 in Table 2.1. The Eyes

    is also the spokesperson for the team in class discussion.

Each of the first 12 landmark distances will be independently measured by at least 3 teams.

    There are two serious errors that occur with surprising frequency:

     1. It is very easy to forget how many tape lengths have been used when measuring distant

    landmarks. It is the Base’s responsibility to remember this, but the other team members should

    help, too.

     2. If the end of the tape is reached, the Point should be very careful where the start of the

    next tape length is marked. For example, if you are using 25-foot tape measures, the tape is

    actually longer than 25 feet, but the new start point should be at the 25-foot mark, not the tape

    end. The Eyes should back up the Point to help prevent this error.
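    The arithmetic behind the final measured distance can be sketched as follows. This is only an illustration: the function name and the example readings are hypothetical, and the 25-foot tape length is the one used as an example in the text above.

```python
TAPE_LENGTH_FT = 25  # nominal tape length from the example in the text


def measured_distance(full_tape_lengths, final_reading_ft, tape_length_ft=TAPE_LENGTH_FT):
    """Total distance = full tape lengths used + the final partial reading."""
    return full_tape_lengths * tape_length_ft + final_reading_ft


# Hypothetical example: three full tape lengths, then a final reading of 12.5 feet.
correct = measured_distance(3, 12.5)

# Error 1 from the list above: the team forgets one full tape length.
miscounted = measured_distance(2, 12.5)

print("correct:", correct, "feet")
print("miscounted:", miscounted, "feet")
```

    Forgetting a single tape length shifts the result by a full 25 feet, which is why the Base keeps the count and the other team members double-check it.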

Do not measure the 13th landmark distance. It will be a test case. Its true distance has been

    measured in advance by your instructor, and we discuss it later.

    Table 2.1: Distances Between a Fixed Point and Several Landmarks

     Landmark Number | Landmark Description | Guessed Distance (feet), X | Measured Distance (feet) | Median Measured Distance (True Distance), Y

    1 Railing on handicap ramp

    2 Heat Pump by Biology door

    3 Left end of the close Bike Rack

    4 Post at center point of lawn

    5 Near steps, bottom right corner

    6 Large Magnolia on left

    7 Storm drain toward bike racks

    8 Left railing on main steps to MPP

    9 Third tree by Biology

    10 Right corner of far sidewalk

    11 Far right corner of near sidewalk

    12 Right side of second bike racks

    13 Sprinkler control toward Pecan


Step 2: Individual Data Analysis

    First, the instructor will lead the class in resolving team-to-team differences in measured

    distances. The median of all the measured distances for each landmark will be used as the true

    distance. Fill in column 5 of Table 2.1 with these medians as the discussion proceeds. Notice that,

    by using the median of at least three measurements, if one of the teams messed up in a big way

    (resulting in an outlier), its mistake will not have much effect on the final number, because the

    median is resistant to extreme values.
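    A small sketch of why the median is preferred here. The three measurements are hypothetical (not class data); the third team is assumed to have lost count of one 25-foot tape length.

```python
from statistics import mean, median

# Hypothetical measurements (feet) of one landmark by three teams.
# The third value is an assumed outlier from a miscounted tape length.
measured = [112.5, 113.0, 88.0]

print("median:", median(measured))
print("mean:", round(mean(measured), 1))
```

    The median stays with the two teams that agree, while the mean is pulled toward the bad measurement.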

    We are now ready to examine the relationship between the true distances (response) and your guessed distances (predictor). The most useful graphical tool for examining the relationship

    between two variables is the scatter plot. A scatter plot of true distances versus guessed distances

    locates a point on the Cartesian plane for each landmark, with the point’s coordinates given by

    (horizontal coordinate, vertical coordinate) = (guessed distance, true distance). Note that when

    we say “true distance versus guessed distance” we mean that guessed distances are to be on the

    horizontal (X) axis. Also, when we describe a scatter plot, it is “Y versus X” or “Y against X.” When we discuss regression, it is “Y on X.”

We can make a scatter plot with an added 45° line, to help you judge whether you are an
    accurate guesser, as follows:

    1. Press Y= and enter X by pressing the X,T,θ,n key. (Figure 2.1) Exit from this screen by
    pressing 2nd, MODE.

    2. Enter the true distances (Y) into L1 and your guessed distances (X) into L2. DO NOT
    ENTER THE GUESSED (or true) DISTANCE FOR THE 13TH POINT.

    3. Press 2nd, Y=, and select a Plot.

    4. Be sure the ON is highlighted. Use the down arrow key to get to Type:. Choose the scatter
    plot icon (first plot type), and press ENTER. Set Xlist: to L2 and Ylist: to L1. Choose
    whichever Mark: you prefer. (Figure 2.2)

    5. To display the scatter plot and the line Y = X, press ZOOM, 9.

     Figure 2.1 Figure 2.2

    After a short delay, the plot window should open and you should see your scatter plot and the
    45° line. If the points on your scatter plot tend to lie above the 45° line, you tend to
    underestimate the true distances to landmarks. We would then say that as a guesser you are
    negatively biased. This is the case for the majority of guessers. Some guessers are fairly
    accurate on the average with their guesses. That is, the points on their scatter plot tend to fall
    along the 45° line. We say they are unbiased guessers. A few individuals tend to overestimate
    the true distances. The points on their scatter plot tend to lie below the 45° line. We say they are
    positively biased guessers. Of course, one’s ability as a guesser may vary from situation to situation.
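    The same judgment can be made numerically. The sketch below is an illustration only: the data are hypothetical and the function name classify_bias is not part of the lab. It counts how many points lie above and below the line Y = X and averages the guess errors (guess minus true distance).

```python
# Hypothetical guessed (X) and true (Y) distances in feet for five landmarks.
guessed = [20, 35, 50, 60, 80]
true_dist = [24, 41, 57, 72, 95]


def classify_bias(guessed, true_dist):
    """Return (points above Y = X, points below Y = X, mean of guess - true)."""
    pairs = list(zip(guessed, true_dist))
    above = sum(1 for x, y in pairs if y > x)   # true > guess: underestimate
    below = sum(1 for x, y in pairs if y < x)   # true < guess: overestimate
    mean_error = sum(x - y for x, y in pairs) / len(pairs)
    return above, below, mean_error


above, below, mean_error = classify_bias(guessed, true_dist)
print("points above the line:", above)
print("points below the line:", below)
print("mean guess error (feet):", round(mean_error, 1))
```

    A negative mean error with most points above the line corresponds to a negatively biased guesser; a positive mean error with most points below the line corresponds to a positively biased one.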

Reproduce the scatter plot on a piece of paper to turn in with this assignment. Be sure to label
    the x-axis and y-axis and to title your plot. To make this easier, either press the TRACE
    key to find the x and y values at each point or plot the values directly from Table 2.1. After you
    determine the regression equation (see Step 4), plot it on your scatter plot.

Step 3: Calibration

    If the points on your own scatter plot lie approximately on a straight line, then the relationship

    between your guessed distances and the true distances is said to be approximately linear. If there

    seems to be a U-shaped curve, the relationship is said to be convex. If the curve is an inverted

    U-shape, the relationship is said to be concave. In the writing assignment (at the end of this
    session), you will be asked to describe the relationship shown by the points on your plot.

If we consider you, as a distance guesser, to be a new sort of measuring instrument, we can use

    the regression line to calibrate you. Calibration is an activity or operation for correcting bias in a
    measuring device and is an example of one very important use for regression experiments. Here
    is a graphical method for calibrating your guess for the 13th landmark.

     1. Locate the point on the horizontal (X) axis of your plot that corresponds to your initial
    guess for the 13th landmark.

     2. With a straightedge, draw a vertical line from that point up the plot until the line

    touches the sketched regression line. (See Step 4)

     3. From that point, draw a horizontal line to the vertical (Y) axis.

     4. The adjusted guess is the reading on the Y axis found at the end of this horizontal line.

    Figure 2.3 shows the result of using the calibration to adjust an initial guess of 55 feet. The

    calibration leads to an adjusted guessed distance of 64 feet in this case, an increase of 9 feet from

    the initial guess. This makes a lot of sense under the observation that these guesses are

    negatively biased; if the initial guess is 55 feet, it should be adjusted upward.

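    [Figure 2.3: Calibration Example. Plot titled "My Guesses of Distance to 12 Landmarks", with
    Guessed Distance (ft) on the horizontal axis and Real Distance (ft) on the vertical axis, showing
    the line Y = X and the sketched regression line. A guess of 55 feet calibrates to a true 64 feet.]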

Step 4: Regression Equation

    The numerical equivalent to the graphical approach for adjusting your initial guess is to
    substitute your guess into the regression equation. To calculate the regression equation for this
    data, use your guesses as the predictor (X) variable and the true distances as the response (Y)
    variable. Use the following steps:

    1. Recall that the true distances are in L1 and your guessed distances are in L2.

    2. Press STAT and arrow to CALC, then press 8.

    3. Enter L2 as the X list and L1 as the Y list and store the resulting line in function Y1. This is
    done by first pressing 2nd, 2, a comma, 2nd, 1, a comma, and Vars. Next, arrow to Y-Vars,
    select Function, and press Enter. Then press Enter again to choose Y1. The screen should
    look like Figure 2.4.

    4. After pressing Enter (for the third time), the regression equation will appear. In this case,

    the X represents your guessed distance and the Y represents the true distance. (Figure 2.5)

     Figure 2.4 Figure 2.5

    Record your values of a, b, r-squared, and r in the space below.

Now we can once again calculate an estimated true distance for the 13th landmark, this time by
    substituting your guess for x in your regression equation. If this calibration equation is useful,
    the calculated distance will be closer to the true distance than your guess was.
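    For readers who want to check the calculator's output, here is a minimal sketch of the same least-squares calculation, written in the form y = a + bx. The data, the guess of 55 feet, and the function name fit_line are hypothetical and chosen only for illustration.

```python
def fit_line(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b, r)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                  # slope
    a = y_bar - b * x_bar          # intercept
    r = sxy / (sxx * syy) ** 0.5   # correlation coefficient
    return a, b, r


# Hypothetical data in feet (not from the lab): guesses X and true distances Y.
guessed = [20, 35, 50, 60, 80]
true_dist = [24, 41, 57, 72, 95]

a, b, r = fit_line(guessed, true_dist)
print(f"y = {a:.2f} + {b:.2f}x, r-squared = {r * r:.3f}, r = {r:.3f}")

# Calibrate a hypothetical guess of 55 feet for the 13th landmark.
guess_13 = 55
calibrated = a + b * guess_13
print(f"calibrated distance: {calibrated:.1f} feet")
```

    The value of r-squared, read as a percentage, is the share of the variation in true distance that is accounted for by the guesses.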

Step 5: Residual Plots

    Residual Plots can be of use in determining whether the model used (in this case a straight line)

    was a good choice. Plotting residuals against X will indicate any curvature, outliers, and

    problems with nonconstant variance.
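    A residual is the observed true distance minus the distance predicted by the regression line. The sketch below computes residuals for hypothetical data (not class data); the function name residuals is an assumption made for this example.

```python
def residuals(x, y):
    """Fit y = a + b*x by least squares and return the residuals y - (a + b*x)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


# Hypothetical guesses (X) and true distances (Y) in feet.
guessed = [20, 35, 50, 60, 80]
true_dist = [24, 41, 57, 72, 95]

res = residuals(guessed, true_dist)
for x, e in zip(guessed, res):
    print(f"guess {x:3d} ft   residual {e:+.2f} ft")
```

    Residuals from a least-squares line always sum to zero (up to rounding), so it is the pattern in the plot that matters: a curve suggests the straight line is the wrong model, and a fan shape suggests nonconstant variance.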

To put the residuals in list L3, press 2nd, 1, -, Vars, arrow to Y-Vars, Function, Y1, (, 2nd, 2, ),
    STO>, 2nd, 3, Enter. The resulting screen should look like the figure below. Then make a
    scatter plot using your guessed distances (L2) as the X variable and the residuals (L3) as the Y
    variable.

    Reproduce the residual plot on a piece of paper to turn in with this assignment. Be sure to label
    the x-axis and y-axis as well as title your plot. To more easily do this, either press the TRACE
    key to find the x and y values at each point or plot the values directly using the editor to display
    L2 and L3.

Parting Glances:

    In this experiment, we used a regression line to calibrate a “measuring instrument”, in this case, a

    human being guessing distances between objects. A more important calibration exercise was

    performed to improve verification of nuclear weapons tests under the Threshold Test Ban Treaty

    between the United States and Russia. After the Cold War, the two countries embarked on an

    effort to make onsite yield measurements of each other’s nuclear tests, for the purpose of

    calibrating a monitoring system based on seismic measurements. That is, there are two

    measurements of a nuclear explosion’s force:

     1. The onsite measurement

     2. The seismic disturbance as measured by a seismograph halfway around the world.

    Once a reliable calibration method is constructed, each country should be able to monitor the

    other’s nuclear tests using at-home seismic measurements, instead of traveling overseas to make onsite measurements. Note that this difficult onsite measurement will become impossible if

    relations between the United States and Russia return to Cold War levels. Also, as more is

    learned about the relationships between the two measuring methods, nuclear testing in countries

    other than Russia may be more reliably monitored in the United States by seismologists. (Picard,

    Richard and Bryson, Maurice (1992), “Calibrated Seismic Verification of the Threshold Test

    Ban Treaty,” Journal of the American Statistical Association, 87, 293-299.)

    Short Answer Writing Assignment

    All answers should be complete sentences. Also turn in the table of guesses and measured
    distances, the scatterplot of your data including the regression line, and the scatterplot of
    the residuals.

    1. Name two factors (variables) that might cause your ability as a guesser to vary from situation
    to situation.

    2. Describe the form of your scatter plot. (i.e., Do points lie approximately on a straight line, a

    convex curve, a concave curve, or in some other pattern?)

    3. Do your distance guesses tend to be negatively biased, positively biased, or approximately
    unbiased?

    4. What was your initial guess for the mystery 13th landmark? Write your regression equation.
    What was the estimated distance to the 13th landmark from the regression equation? Did the

    regression lead to a more accurate guess? If not, what was unusual about your line or guess

    that led to the “calibrated” guess being less accurate?

    5. What percent of variation in true distance was accounted for by your guesses?

    6. Give a meaningful interpretation of the slope for your regression line.

    7. Describe the pattern of your residual plot. Indicate any problems with using the regression

    line for prediction that are suggested by the residual plot.
