Richard B. Harris
Wildlife Biology Program
University of Montana
Missoula, Montana 59812
On estimating wildlife densities from line transect data
RICHARD B. HARRIS, University of Montana, Missoula, MT USA 59812
KENNETH P. BURNHAM, Colorado State University, Fort Collins, CO USA 80523
[English version: Published in Acta Zoologica Sinica (动物学报)] 48: 812-818 (2002)
Abstract: Line transects are one of the best ways to estimate the density of wildlife populations over large areas. However, density estimates will be unreliable if they rely on mathematical procedures that, although simple and easy to use, do not correspond with reality. We argue here that a naïve estimator, in which the mean of observed perpendicular distances is equated with effective strip width, is unlikely to yield reliable results. Even if surveys are conducted correctly, density estimates using this equation will most often be too high. Instead, we urge investigators to use program DISTANCE, and to familiarize themselves with the underlying theory by reading Buckland et al. (1993).
Key words: density estimation, detection function, Fourier series, line transect, negative exponential distribution, program DISTANCE
It is now well known that estimating the abundance of wildlife populations is fraught with difficulties. As pointed out by Sheng and Xu (1992), line transect methods are among the best for medium and large-sized animals when estimation over large areas is required. Thus, the line transect method has been increasingly used by Chinese wildlife scientists (e.g., Liu and Yi 1993, State Forestry Administration 1995, Gao and Yao 1997). However, even this method can produce unreliable results if critical assumptions are violated in the field, and/or if inappropriate mathematical analyses are applied afterwards.
It is worthwhile reviewing the underlying assumptions of line-transect estimation here (Anderson et al. 1979, Burnham et al. 1980, Buckland et al. 1993):
1. Objects on the center line must be observed with probability = 1.0 (i.e., every
object on the line must be detected);
2. Transect lines are placed randomly, or at least objectively, with respect to the
population being studied;
3. Objects (i.e., animals or animal groups) do not move toward or away from the
transect line in response to the observer before distances are measured;
4. Distances from the transect line to each object are measured accurately;
5. Transect line segments are straight;
6. The size of the object (or, if objects occur in groups, the size of the group)
does not affect the probability of observation (if it does, analyses that account for
size-bias must be used); and
7. Objects encountered are independent (i.e., observing an object does not affect the
probability of observing any other object).
Additionally, sample sizes (number of objects observed) must be sufficient to provide robust estimates of the detection function and its variance (Burnham et al. 1980 proposed a minimum of 40 for any single estimate). If sample sizes are too small, results can be accurate in theory, but unreliable in practice. Under field conditions, it is difficult to comply with all these assumptions and still obtain reasonably large sample sizes (Southwell 1994, Harris 1996).
After data have been appropriately collected, equations or numerical methods are used to model the detection function, from which density is estimated. A number of competing models of how detection decreases with distance have been proposed. Common sense, empirical data and simulation modeling have supported use of detection functions with a “shoulder” near the center line, such as the Fourier series (Burnham et al. 1980) and the half-normal (Buckland et al. 1993). Detection functions with a “shoulder” are likely to reflect reality better than other shapes, because objects are often only slightly less detectable when near the center line than on it, whereas detectability often drops off at some distance from the center line. Equally importantly, modern theory has stressed that detection functions may differ among taxa, habitats, sighting conditions, and other factors. Thus, computer programs, such as DISTANCE (Thomas et al. 1998), provide alternative detection functions as well as metrics comparing the fit of each, allowing the user to choose the most appropriate based on a priori or empirical information (Burnham and Anderson 1998).
A simple model of declining detection with distance is the negative exponential (Eberhardt 1968, Gates et al. 1968, Eq. 1).
g(x) = e^{-ax} (Eq. 1)
g(x) = probability of detecting the animal at perpendicular distance x,
assuming that all animals on the transect line are seen, i.e., g(0) = 1
a = parameter fitted to the data
x = perpendicular distance
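The shape of Eq. 1 near the center line can be compared with a curve that has a shoulder. The sketch below is illustrative only: the half-normal form exp(-x^2 / (2 sigma^2)) is the standard one rather than something defined in this paper, and the parameter values (a = 1, and sigma chosen so that both curves enclose the same area) are hypothetical.

```python
import math

def negative_exponential(x, a):
    """Eq. 1: g(x) = exp(-a * x), with g(0) = 1."""
    return math.exp(-a * x)

def half_normal(x, sigma):
    """Standard half-normal detection function, with g(0) = 1 (assumed form)."""
    return math.exp(-x ** 2 / (2.0 * sigma ** 2))

# Hypothetical parameters: both curves integrate to 1 distance unit,
# so they describe surveys with the same effective strip half-width.
a = 1.0
sigma = math.sqrt(2.0 / math.pi)

for x in (0.0, 0.1, 0.2, 0.5, 1.0):
    print(f"x = {x:.1f}  neg. exp. = {negative_exponential(x, a):.3f}  "
          f"half-normal = {half_normal(x, sigma):.3f}")
```

Under these assumed parameters the half-normal curve stays close to 1 for small x, whereas the negative exponential begins falling immediately at x = 0.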
However, the negative exponential function lacks a shoulder; in fact, the steepest decline in detection is closest to the center line. An even simpler approach to treating distance data is to equate the mean of recorded perpendicular distances with the effective width of a sampled strip, and then to proceed with calculations (Eq. 2) as though a strip transect had been conducted (Sheng and Xu 1992, State Forestry Administration 1995).
D = ns / (2LW) (Eq. 2)
D = estimated density of animals (or animal groups)
n = number of animals (or animal groups) seen
s = mean group size
L = length of transect line(s)
W = mean perpendicular distance of animals (or groups) seen
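As a minimal sketch of the arithmetic in Eq. 2, the function below computes the naïve estimate directly. The function name, argument names and survey values are hypothetical and serve only to show how the terms combine; L and W must be expressed in the same length unit.

```python
def naive_density(n, mean_group_size, transect_length, mean_perp_distance):
    """Eq. 2: D = n * s / (2 * L * W), with W the mean perpendicular distance."""
    if transect_length <= 0 or mean_perp_distance <= 0:
        raise ValueError("L and W must be positive")
    return (n * mean_group_size) / (2.0 * transect_length * mean_perp_distance)

# Hypothetical survey: 12 groups, 3.5 animals per group on average,
# 20 km of line walked, mean perpendicular distance of 0.25 km.
density = naive_density(n=12, mean_group_size=3.5,
                        transect_length=20.0, mean_perp_distance=0.25)
print(f"{density:.2f} animals/km^2")  # 4.20 animals/km^2
```

Note that the calculation returns a single number with no accompanying measure of uncertainty.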
However, the point estimate of density obtained using Eq. 2 will only be accurate if the underlying (i.e., true) detection function is negative exponential. As well, Eq. 2 lacks a theoretical basis and a method to estimate its variance.
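The link between Eq. 2 and the negative exponential can be made explicit with a short derivation. It is a sketch that assumes untruncated distances and uses the standard line-transect result D = ns / (2 L mu), where mu is the effective strip half-width (the area under the detection function) and f(x) is the probability density of observed distances. The symbols mu and f(x) are introduced here for illustration and do not appear in the equations above.

```latex
\begin{align}
\mu  &= \int_0^\infty g(x)\,dx = \int_0^\infty e^{-ax}\,dx = \frac{1}{a}, \\
f(x) &= \frac{g(x)}{\mu} = a\,e^{-ax}, \\
E[x] &= \int_0^\infty x\,f(x)\,dx = \frac{1}{a} = \mu .
\end{align}
```

Only in this case does the expected perpendicular distance equal the effective strip half-width, so that substituting the mean distance for W in Eq. 2 reproduces D = ns / (2 L mu).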
Our objective here is to examine the use of Eq. 2 (and the conceptually similar Eq. 1) for estimating density from distance data, and to encourage Chinese scientists to use alternative methods that have been found superior.
PROBLEMS IN USING EQUATION 2
Equation 2 is inflexible and will usually show a positive bias
If the true decline in detectability with distance follows a negative exponential distribution, point estimates produced by either Eq. 1 or Eq. 2 will be approximately correct. However, they will be unreliable if other detection functions characterize the data. Buckland et al. (1993) recommend using a modified half-normal parametric detection function. If the half-normal detection function is true, and the “mean distances” approach is used, a positive bias in the resulting density of 57% can be expected.
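The 57% figure can be checked numerically. The sketch below assumes an untruncated half-normal detection function of the standard form exp(-x^2 / (2 sigma^2)) and uses a simple trapezoid rule. The helper name and the choice sigma = 1 are illustrative assumptions, and the ratio does not depend on sigma.

```python
import math

def half_width_and_mean_distance(g, upper, steps=200_000):
    """Trapezoid-rule estimates of the area under g and of the mean observed distance."""
    h = upper / steps
    area = 0.0    # integral of g(x): effective strip half-width
    moment = 0.0  # integral of x * g(x)
    for i in range(steps + 1):
        x = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        gx = g(x)
        area += weight * gx * h
        moment += weight * x * gx * h
    return area, moment / area

sigma = 1.0  # arbitrary scale (assumption)

def half_normal(x):
    return math.exp(-x * x / (2.0 * sigma * sigma))

mu, mean_distance = half_width_and_mean_distance(half_normal, upper=10.0 * sigma)

# Density is inversely proportional to the strip width used, so replacing
# mu by the mean distance inflates the estimate by a factor of mu / mean_distance.
relative_bias = mu / mean_distance - 1.0
print(f"relative bias of Eq. 2 = {relative_bias:.1%}")  # 57.1%
```

The ratio of effective half-width to mean distance equals pi/2 for the half-normal, which is the source of the 57% positive bias.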
Burnham et al. (1980) conducted simulations to assess the performance of alternative detection functions when the true, underlying detection function was known. Table 1 reprints a portion of their results, comparing the negative exponential distribution (Eq. 1) with the much more flexible Fourier series. As can easily be seen, the negative exponential model performed well when the detection probability did, in fact, decline exponentially. Under these conditions, the Fourier series produced a negative bias of about 12-16%. However, when any other detection probability was simulated, the negative exponential function produced highly biased results, from 10% to almost 66% too high, while the Fourier series returned relatively unbiased results. Thus, Burnham et al. (1980) recommended using the Fourier series because it was more robust to varied underlying detection functions.
One obvious way to compare the appropriateness of alternative detection functions is to apply all of them in a situation in which density is already known. In a test of various sighting methods performed prior to the development of rigorous line-transect theory, Robinette et al. (1974) demonstrated that Eq. 2 produced positive proportional biases of 19% to 89%, with a mean proportional bias of 48%. Similarly, Parmenter et al. (1989) showed that modeling detectability using the negative exponential model always resulted in an upward bias.
Laake (1978) conducted experiments in which observers documented perpendicular distances to wooden stakes placed in the ground at a known density (37.5/ha). Even in this well-controlled experiment, observers often failed to record all objects directly on the line, violating a critical assumption. An example of the detection function for one of the experiments is shown in Figure 1a, where we have corrected for the fact that, in this case, g(0) = 0.82 (rather than g(0) = 1.0). In this example, the Fourier series estimator using program DISTANCE estimated the density as 42.5/ha, about 13% above the true value. Had Eq. 2 been used with these data (as illustrated in Figure 1b), the estimated density would have been 67.9/ha (biased positively by 81%), and there would have been no way to assess the amount of uncertainty in this estimate.
This example is not an isolated case. In a recent experimental survey of the desert tortoise in the southwestern United States, Anderson et al. (2001) had 12 teams estimate the abundance of artificial tortoises for which the true number was known. The 12 estimates varied from a negative bias of 7% to a positive bias of 13%, with a mean bias of -4% (Table 2). Had Eq. 2 been used instead, biases would have varied from 62% to 93%, with a mean of 70% (Table 2).
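The proportional biases quoted for the stake experiment follow from simple arithmetic on the reported densities. The short sketch below uses only the three values given in the text (37.5, 42.5 and 67.9 stakes/ha); the function name is illustrative.

```python
def percent_bias(estimate, truth):
    """Proportional bias of an estimate, as a percentage of the true value."""
    return 100.0 * (estimate - truth) / truth

true_density = 37.5  # stakes/ha, known by design (Laake 1978)

fourier_bias = percent_bias(42.5, true_density)  # Fourier series, program DISTANCE
naive_bias = percent_bias(67.9, true_density)    # Eq. 2

print(f"Fourier series: {fourier_bias:+.0f}%")  # +13%
print(f"Eq. 2:          {naive_bias:+.0f}%")    # +81%
```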
Equation 2 allows calculation without inspecting the data
Most published applications of line transects in the Chinese literature lack raw data with which to compare competing mathematical approaches. However, Gao and Yao (1997) displayed their raw data on line-transect surveys of argali (Ovis ammon) in Xinjiang. They calculated densities from line transects with sample sizes of 4, 1, 1, 3, 3, 2, 2, 2, and 3 argali groups/transect. Even if transects from each study area had been combined (the more appropriate procedure), total sample sizes for the 2 study areas would have been 9 and 14, both far smaller than the recommended minimum of 40 (Burnham et al. 1980).
A cursory examination of histograms for the 2 study sites suggests that detection did not decline with distance (Fig. 2). Thus, the fundamental assumptions for fitting any of the possible detection functions were evidently not met. The only function that is truly consistent with these (admittedly few) data is one in which detection probability was approximately invariant at least as far as the furthest group of argali seen. Thus, for the Hami study area (Fig. 2a), a more appropriate estimate of the width of strip “effectively” sampled would not have been the mean perpendicular distance of 327 m, but instead the largest perpendicular distance of 380 m. Doing so would have reduced their estimated density of 0.53 argali/km² to 0.41 argali/km². Similarly, for the Mulei study area (Fig. 2b), the estimated density of 0.82 argali/km² would have been more appropriately estimated as 0.54 argali/km². By using Eq. 2, Gao and Yao (1997) had no need to examine histograms of their data, or to consider the implications of assuming the negative exponential.
AN EXAMPLE FROM FIELD WORK IN CHINA
Field work in China is particularly difficult, and many of the means for conducting line transects used in the West (e.g., aircraft) are not available. However, sufficient observations can sometimes be obtained even under the difficult conditions in which Chinese scientists usually find themselves. From these, detection functions can be estimated using program DISTANCE, rather than using Eq. 2. For example, Harris (1996, see also Harris and Miller 1995, Harris et al. 1996) walked randomly placed line transects in Qinghai to estimate densities of Tibetan gazelle (Procapra picticaudata). The sample sizes obtained (N = 64 groups) allowed the estimation of the detection function using the Fourier series (Fig. 3). It appears that even here, some “heaping” occurred in those distance categories closest to the center line, which should be avoided if possible.
It may be tempting to apply Eq. 2 because it is so easy to calculate. However, the accumulated experience in western countries, illustrated briefly by the examples provided here, is that it forms an unreliable basis for estimating density from distance data. Considerable effort has gone into providing user-friendly computer software (Thomas et al. 1998) and explanatory text material (Burnham et al. 1980, Buckland et al. 1993) for methods that are known to be more reliable. Both the software (program DISTANCE) and the accompanying textbook (Buckland et al. 1993) are available at no cost over the internet from the site http://ruwpa.st-and.ac.uk/distance. Now that computers and internet access are becoming more common in China, the methods provided by program DISTANCE should be used whenever possible. It is true that program DISTANCE is not a panacea; users can (and will) choose differing ways of treating data, resulting in slightly different density estimates. As well, the most robust and precise estimators often underestimate true density slightly because real data rarely match the ideal perfectly.
However, recent work has shown that, given an introduction to the important concepts, conscientious investigators will produce results varying by less than 10% from one another using program DISTANCE (Anderson and Southwell 1995). Thus, even if a small negative bias cannot be avoided, results will be fairly consistent from survey to survey.
However, no detection function will perform well when sample sizes are very small. For example, Gao and Yao (1997) reported density estimates from lines surveyed in which only a single group of animals (i.e., n = 1) was observed. Such an estimate is, of course, theoretically possible using Eq. 2. However, investigators should not be fooled into thinking that they really know much about the density of animals in an area when they have only a single group with which to model detection. Unless there are persuasive reasons to avoid doing so, it is appropriate to combine data from replicate surveys, or portions of a study area, to achieve reasonable sample sizes in estimating a detection function. However, if, after having combined similar lines, sample sizes are still quite small (e.g., 10-20), it is then best to avoid estimating densities from distances. Instead, it is more prudent to simply report the number and type of animals observed (as well as thoroughly documenting the methods used), and to treat the results as an index to abundance. This index cannot be used to determine absolute density or abundance, but might still be useful if repeated periodically to obtain a rough idea of population trends.
Similarly, estimates using program DISTANCE are invalidated to the degree that field procedures violate the fundamental assumptions of line-transect sampling. If sampling cannot be conducted in a way that minimizes assumption violations, it is again advisable to simply report the raw data and methods used, rather than attempt to derive a density when no model exists to do so.
Finally, the importance of a rigorous, objective sampling regime cannot be stressed enough. Extrapolations of density are only valid if the transects sampled truly represent an unbiased selection of all possible transects within the area of interest.
Work in China was funded by the Robert M. Lee Foundation and the Liu Guo Lit Charitable Trust. S. T. Buckland and J. L. Laake provided suggestions to improve the manuscript.
Anderson, D.R., Laake, J.L., Crain, B.R., and Burnham, K.P. (1979). Guidelines for line
transect sampling of biological populations. Journal of Wildlife Management.
Anderson, D.R. and Southwell, C. (1995). Estimates of macropod density from line
transect surveys relative to analyst expertise. Journal of Wildlife Management.
Anderson, D.R., Burnham, K.P., Lubow, B.C., Thomas, L., Corn, P.S., Medica, P.A., and
Marlow, R.W. (2001). Field trials of line transect methods applied to estimation of
desert tortoise abundance. Journal of Wildlife Management 65: 583-597.