Trust, Automation, and Feedback: An Integrated Approach
Younho Seong
Department of Industrial & Systems Engineering
North Carolina A&T State University
Greensboro, NC 27411
Ann M. Bisantz
Gordon J. Gattie
Department of Industrial Engineering
University at Buffalo, The State University of New York
Buffalo, NY 14260
A large body of research uses Lens Model outcomes to provide cognitive feedback to a human judge, in order to improve the judge’s own policies and performance. Our research extends this work by using modeling outcomes to provide assessments of one agent’s judgment performance to other agents in a multi-agent
judgment system. More specifically, meta-information about an automated decision aid, which comprised parameters measuring the aid’s performance, was developed based on concepts of cognitive feedback. In such systems, trust serves as an important intervening variable in decisions on the part of the human agent to utilize outputs from automation. Problems of miscalibration of trust can lead to both under-use of good quality automated systems, as well as overuse of poor quality systems (Parasuraman & Riley, 1997). Providing operators with information to enhance their understanding of the quality of the automation’s outputs and its processes may provide a means of improving trust calibration.
This chapter reviews research in which an integrative framework, based on the Lens Model, was developed to address these issues. The framework characterizes aspects of automated systems that affect human trust in those systems, and allows assessment of aspects of human trust through the application of Lens Modeling outcomes. Implications are drawn from an empirical study which relied on this integrative framework, and are provided for the design of training systems and displays.
Cognitive Feedback: An Application of Lens Model Outcomes
One application of outcomes from the Lens Model has been in the area of cognitive feedback. Human judges may be viewed as information processors that can
change their judgment strategies based on evaluations provided from previous judgments. The Lens Model Equation can generate useful feedback information to provide these evaluations. Cognitive feedback based on the Lens Model Equation includes information about the properties of, and relations between, an individual’s judgment policy and the environmental structure. Since feedback information can be delivered before future judgments are made, such information may be considered both feedback and feedforward information (Cooksey, 1996). While feedback provides information about the
“correctness” of prior judgments relative to the true states of the environment, cognitive feedback can also be used in a feedforward manner, allowing human operators to use the information in making future judgments. This distinction is important for identifying information that not only summarizes judgment performance, but provides judges with information to enhance their understanding of the task environment and their own judgments.
Feedback regarding judgment quality can be readily obtained from Lens Model parameters. This type of detailed feedback, termed cognitive feedback, has been shown to improve judgment performance, particularly compared to outcome feedback, in which participants are only provided with the correct judgment (Adelman, 1981; Balzer, Doherty, & O'Connor, 1989; Balzer, Sulsky, Hammer, & Sumner, 1992; Lindell, 1976; Steinman, Smith, Jurdem, & Hammond, 1977). Cognitive feedback can include information about the task environment (e.g., ecological predictability, R_e; ecological validities; or the relationship forms linking cue values to criterion values), cognitive information about the judge’s policy (e.g., cognitive consistency, R_s; cue utilization validities; or the relationship forms linking cue values to judged criterion values), or functional validity information, which shows the relationship between the task environment and actions of the judge (e.g., overall achievement, r_a; linear knowledge, G; or degree of non-linear knowledge, C; Doherty & Balzer, 1988).
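To make these parameter definitions concrete, the sketch below (Python; the function and variable names are our own illustrations, not from the chapter) estimates R_e, R_s, G, C, and r_a from paired cue, criterion, and judgment data using ordinary least squares, following the standard Lens Model Equation decomposition r_a = G·R_e·R_s + C·sqrt(1 − R_e^2)·sqrt(1 − R_s^2).

```python
import numpy as np

def lens_model_stats(cues, criterion, judgments):
    """Estimate standard Lens Model parameters from paired data.

    cues: (n, k) array of cue values
    criterion: (n,) true environmental criterion values
    judgments: (n,) the judge's estimates of the criterion
    """
    X = np.column_stack([np.ones(len(cues)), cues])
    # Fit linear models of the environment and of the judge's policy
    be, *_ = np.linalg.lstsq(X, criterion, rcond=None)
    bs, *_ = np.linalg.lstsq(X, judgments, rcond=None)
    ye_hat, ys_hat = X @ be, X @ bs

    def r(a, b):
        return float(np.corrcoef(a, b)[0, 1])

    Re = r(criterion, ye_hat)     # ecological predictability
    Rs = r(judgments, ys_hat)     # cognitive consistency
    G = r(ye_hat, ys_hat)         # linear knowledge
    ra = r(criterion, judgments)  # overall achievement
    # Non-linear knowledge: correlation between the two sets of residuals
    C = r(criterion - ye_hat, judgments - ys_hat)
    return {"Re": Re, "Rs": Rs, "G": G, "C": C, "ra": ra}
```

Because OLS residuals are orthogonal to the fitted values, the decomposition holds exactly for parameters computed this way, which is convenient for checking an implementation.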
Cognitive feedback can also include comparative information, such as the relative size of environmental cue weights compared to a judge’s cue weights. Feedback can be presented verbally, in terms of textual descriptions of the forms of the cue-criterion relationships (e.g., the criterion value increases linearly as the cue value increases); numerically, in terms of correlations for R_e and R_s, or numeric cue weights; or graphically, using bar graphs to show sizes of cue weights or line graphs to depict cue-criterion function forms. Typically, cognitive feedback is provided by first collecting data on several participant judgments and then computing Lens Model parameters based on those values. Participants are then either provided with the information immediately (e.g., after a block of 20 judgments) or after some time period has passed (e.g., before the next experimental session). Gattie and Bisantz (2002) performed an experiment in a dental training domain to investigate the effects of different types of cognitive feedback under different environmental and expertise conditions. Specifically, the task environments (consisting of cases drawn from actual patients) were either well represented by a linear model or not, as measured by R_e; and participants were either medical trainees, primarily in dental medicine, or medical novices. Feedback was provided after each case decision. Results indicated that performance and consistency initially declined with the introduction of cognitive feedback, but recovered after some experience with interpreting the detailed feedback. Overall, medically trained participants performed better with task information, and the presence of cognitive information allowed untrained participants to perform at similar levels to trainee participants.
Multi-agent Judgment Systems and Trust
As discussed, cognitive feedback has generally been utilized to provide information to a judge regarding their own judgment policy, the environmental structure, and the individual-environment relationship, in order to improve judgment performance. Additionally, information drawn from the Lens Model, analogous to cognitive feedback, may prove useful as a diagnostic tool to help individuals understand the behavior of other judgment systems with which they interact. These systems could include automated as well as human agents. Collaboration among such systems of judges may be made more effective and efficient if judges have an understanding of each other’s judgment policies.
In particular, two types of information regarding a judgment agent - cognitive information and functional validity information - may provide critical information in understanding the behavior of the agent, by providing an indication of both the quality of judgments produced by the agent (i.e., their functional validity), and the reliability with which they are produced (i.e., through the cognitive information, specifically the agent’s R_s).
Many complex work environments involve multi-judge systems in which human operators must assess, and choose whether or not to rely on, the outputs from one type of judgment agent: automated decision aids. Such aids provide estimates and information which support situation assessment judgments and choices made by the human operator. Bass and Pritchett (this volume) present a methodology which explicitly addresses the measurement of human interaction with an automated judge.
Issues regarding human judgments in conjunction with such decision aids include the following:
1. Because of the inherent uncertainty residing within the environment, judgment
and decision making tasks challenge human information processing capabilities.
Furthermore, the environmental uncertainty is transferred to the situational
estimates generated by automated decision aids, which subsequently places the
human operator in the difficult position of determining whether they should rely
on the decision aid or not. Such decisions are related to the circumstances, and the
level of automation autonomy that has been selected (Endsley & Kaber, 1999;
Parasuraman, Sheridan, & Wickens, 2000).
2. Automated decision aids are subject to multiple types of failures, such as simple
mechanical malfunctions, environmental disturbances, and intentional attacks,
which can make decisions to rely on the aid difficult (Bisantz & Seong, 2001).
In systems with automated judgment agents, questions of human operators’ trust in those agents become important. Empirical studies have shown that operators’ trust in automated controllers is a critical intervening component affecting operators’
performance in controlling the systems as well as in using the automated controllers (Lee & Moray, 1992; Muir & Moray, 1996). Researchers have suggested many characteristics which influence the level of an individual’s trust in automated systems (Lee & Moray, 1992; Lee & See, 2004; Muir & Moray, 1996; Sheridan, 1988), including reliability, validity, transparency, utility, and robustness. Among these, an understanding of the reliability of an automated system (i.e., the consistency with which it performs) has been suggested as critical in allowing humans to correctly calibrate their trust in, and subsequently use, the automated system. Interestingly, one component of cognitive feedback, as applied within a multi-agent judgment framework, represents information regarding the reliability of the other judgment agents: the R_s of other judgment systems. In this sense, cognitive feedback can be used to diagnose the reliability of automated systems, which in turn may affect the level of an individual’s trust in such systems.
Other characteristics also play important roles in determining the trustworthiness of automated decision aids. For instance, Sheridan (1988) discussed validity as an important characteristic. Within a framework in which the automation is providing situational estimates, such a characteristic corresponds to the validity (or correctness) of the estimates provided. Again, the functional validity information provided by the Lens Model can serve as a measure of this validity.
Additionally, Sheridan (1988) suggested that the understandability and transparency of a system play an important role in determining the level of trust in an automated system. If operators can better understand and evaluate the actions or judgments provided by an automated system, they may be better able to calibrate their trust in, and reliance on, such systems. Providing reliability and functional validity information as described above may enhance the human judge’s understanding of an automated system.
The Lens Model with its extensions (Brunswik, 1952; Cooksey, 1996) can provide a framework within which aspects of human interaction with an automated judgment agent, such as that described above, can be modeled. In particular, the n-system
design in which multiple agents make judgments about the same environmental criterion provides a framework to represent a situation where the human judge is supported by automated technology. In this case, a human operator and an automated system are making judgments about the same environmental state (see Pritchett and Bisantz, this volume, for a more extensive description of the n-system Lens Model). However, in contrast to a standard n-system model, these judgments are not separate: the human judgment is based not only on the cues themselves, but also on the output of the decision aid. Thus, the hierarchical Lens Modeling framework also has applicability, in that the cues are utilized by the automated decision aid, which subsequently provides a situational estimate to the human judge, who partially bases their judgment on the decision aid’s judgment. However, the hierarchical framework does not capture the fact that the human judge may have access to the same cues as the automated decision aids. To address this modeling need, a hybrid approach was taken, combining elements of both the n-system and hierarchical models.
Figure 1 depicts the components of this n-system/hierarchical hybrid Lens Model. Judgments of some environmental criterion (e.g., the intent of a hostile aircraft) are made, based on a set of cues, by both an automated decision aid (ADA) and the human operator without the decision aid (HO) (the top and bottom agents in the figure, respectively). In a system where the human operator has access to the outputs from the aid (shown as the second judgment agent), the judgment made by the aid serves as an input – essentially
another cue – to the human’s judgment (note the link from the aid’s judgment to the human+aid agent). As with the standard single system Lens Model, the degree of judgment competence of all three of these systems (the ADA, the human judge acting alone, or the human judge acting with aid input – HO+ADA) can be assessed by measuring the correspondence between the environmental criterion and the output of the system (i.e., using the standard measure of achievement). Additionally, one can compare the judgment outputs across systems (as shown by the links on the right side of the figure) to evaluate the degree to which the human and ADA correspond (see Pritchett and Bisantz, this volume, for an elaboration of this comparison); the degree to which the judgment from an unaided human is similar to that with the aid (to understand the impact of the aid on the human’s judgment); and the similarity of the aided human judgment with the output from the aid itself (to assess the degree to which the human mimics the output from the aid).

Figure 1. Hybrid n-system/hierarchical Lens Model. Three judgment agents (an automated decision aid (ADA), a human acting alone, and a human acting in concert with the automated aid) are shown. Note that the output from the ADA acts as an additional cue for the human + ADA system.
The parameters identified within this modeling framework may be able to illuminate important characteristics of human operators’ trust in the decision aid and their
resulting strategies in considering the decision aid’s estimates. For example, assume that the automated decision aid produces accurate estimates of the environmental states. If the human judge is well calibrated to the performance of the aid, this should result in a high performance level of the human operator in combination with the decision aid, and will also result in a high degree of correspondence between the ADA and HO+ADA estimates. The extent to which the aid is having an influence, over and above the human operator, would be reflected in the level of correspondence between the unaided HO, and the HO+ADA. In contrast, if the aid is not performing well, but the operator is able to compensate (i.e., still maintain a high degree of judgment performance), there should be a low degree of correspondence between the HO+ADA and the aid alone, but a high degree of correspondence between the HO and the HO+ADA.
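The interpretive logic of this paragraph can be expressed as a simple decision rule over the correspondence measures. The sketch below is a heuristic illustration only; the thresholds and category labels are our assumptions, not values from the chapter.

```python
def diagnose(reliance_on_ada, impact_of_ada, aided_achievement,
             hi=0.8, lo=0.4):
    """Heuristic reading of the correspondence pattern among HO, ADA,
    and HO+ADA. Thresholds hi/lo are illustrative placeholders."""
    if aided_achievement >= hi and reliance_on_ada >= hi:
        # Accurate aid, and the aided judgments track it closely
        return "well-calibrated reliance on an accurate aid"
    if aided_achievement >= hi and reliance_on_ada < lo and impact_of_ada >= hi:
        # Aided judgments stay close to the unaided human, diverge from aid
        return "operator compensating for a poorly performing aid"
    return "indeterminate; inspect individual Lens Model parameters"
```

Such a rule could flag trust miscalibration candidates for closer inspection, rather than serving as a definitive classifier.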
One can also consider the Lens Model outcomes from each of the judgment
Figure 2. Single-system Lens Model of the automated decision aid. (Figure labels include the aid’s linear model, aid competence, r_a(ADA), and unmodeled knowledge, C(ADA).)