Synthetic Interview Development
This document describes the general processes for Synthetic Interview development, rule-base
development for an Intelligent Synthetic Interview, and the specific tasks for each phase as
performed for the GFS Distance Learning System project. The goal of the DLS project is to
assist an automotive sales consultant/trainee with the sales interaction of a customer looking to
buy a car.
Basic Synthetic Interview Process and Technology Description
The Synthetic Interview is a technology and technique that creates an anthropomorphic interface
into multimedia data of a particular kind: video of a person responding to questions (interacting
with another person). The responses of the interviewee are presented in such a way as to simulate
the experience of interacting with a live person.
The process of creating a Synthetic Interview is split into four principal phases: Pre-production/Production
(domain/biographical analysis, video pre-production and production), Language Analysis (indexing and
the creation of language models relevant to the domain of discourse), Integration (video, HTML, and
other media with the SI index), and Testing.
Pre-production/Production
Pre-production and production are similar to a traditional video project. Tasks include scripting,
assembly of crew, casting, location selection, video format selection, special effects, interface
design, and scheduling. The principal difference is the domain analysis and "pool" capture. In a
Synthetic Interview, it is necessary to develop and anticipate the questions likely to be asked by
the target audience. Still, it is impossible to predict every possible question. It is important that
the interface, script, and overall experience are designed to set user expectations. That is, if the
users believe they are interacting with a cardiologist, they are unlikely to ask questions about
football. Conversely, users are unlikely to discuss heart pain with a sports figure.
Equally important is the ability to deal with unexpected questions. We have developed a series of
pool topics and associated questions to handle events such as: out-of-bounds questions and
statements, follow-on questions, exceptions, transitions, and transformations.
Transitions include phrases like "I disagree with you." Transformations change invalid
statements to valid ones, such as "I don't really know about that, but let me discuss something
else of interest." Follow-on statements are handled by specific transitions such as "That's really
all I have to say" or "As I was saying." Out-of-bounds questions are recognized but not
answered, e.g., admonishments for obscene questions. Exceptions handle unrecognized
questions: "I don't have anything to say" or "Please repeat yourself."
Language Analysis - indexing
For indexing and retrieval, we apply a combination of manual and automatic language expansion
to the base set of interview questions. Manual techniques are used for semantic expansion and
automatic techniques for syntactic expansion. For example, assume a base question/answer pair:
Q1: When were you born?
A1: I was born on April First, 1968, in Chicago, Illinois.
Manual semantic expansion would include generating a set of questions mapping to this answer:
Q1a: How old are you?
Q1b: What's your age?
Q1c: Where were you born?
Q1d: What's your birthday?
Depending on the content of the full interview, one might even map "Where did you grow up?"
to A1. Since listeners fill in much in natural conversation, A1 will typically be acceptable if no
specific response is available, and is likely better than responding with a pool answer such as
"I don't have an answer for that question." Simple automatic syntactic expansion would include:
Q1d -> Q1d': What is your birthday? (from grammatical expansion)
Q1d -> Q1d'': What's [What is] your date of birth? (from grammatical and synonymic expansion)
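A minimal sketch of automatic syntactic expansion, assuming a small contraction table and synonym table. Both tables and the `expand` function are hypothetical illustrations; the report does not describe the actual expansion tooling. The sketch shows how "What's your birthday?" could yield the grammatical and synonymic variants given above.

```python
# Hypothetical lookup tables; the real expansion resources are not given in the report.
CONTRACTIONS = {"what's": "what is", "where's": "where is"}
SYNONYMS = {"birthday": ["date of birth"]}


def normalize(question):
    """Lower-case the question and drop the trailing question mark."""
    return question.lower().rstrip("?").strip()


def expand(question):
    """Return the normalized question plus its grammatical and synonymic variants."""
    variants = {normalize(question)}
    # Grammatical expansion: spell out known contractions.
    for short, full in CONTRACTIONS.items():
        for v in list(variants):
            if short in v:
                variants.add(v.replace(short, full))
    # Synonymic expansion: substitute known synonyms in every variant so far.
    for word, alternatives in SYNONYMS.items():
        for v in list(variants):
            if word in v:
                for alt in alternatives:
                    variants.add(v.replace(word, alt))
    return sorted(variants)


variants = expand("What's your birthday?")
```

Each variant would then be indexed against the same answer (A1) as the base question.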
Integration
Integration includes video encoding, creation of indices and catalogs (an automatic process),
incorporation of graphics, HTML, and merging with any special-case software (such as Flash
applications). Much of this is done incrementally and in parallel with both the production and
language analysis phases.
Testing
We try to bound the domain of discourse by the experience itself. Nonetheless, users will still be
quite broad in their dialogue. Therefore, Synthetic Interviews benefit greatly from incremental
development; continual user testing is essential.
The indexer/retrieval system takes any typed sentence and retrieves what it believes to be the
most relevant response. As a consequence of this design, there are two principal error types to
test for: 1) indexing and retrieval errors, wherein incorrect responses are presented for proper
sentences; and 2) sentences or topics that were not covered during pre-production domain analysis.
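The report does not say how relevance is scored, so the following is only a sketch under a stated assumption: word-overlap (Jaccard) similarity between the typed sentence and each indexed question form, with a fallback to a pool answer below a threshold. The `INDEX` contents, the `POOL_ANSWER` identifier, and the threshold value are all hypothetical.

```python
import re

# Hypothetical index: expanded question forms mapped to answer identifiers.
INDEX = {
    "when were you born": "A1",
    "how old are you": "A1",
    "what is your birthday": "A1",
}
POOL_ANSWER = "POOL_DONT_UNDERSTAND"  # assumed identifier for an exception pool clip


def tokens(sentence):
    """Split a sentence into a set of lower-case word tokens."""
    return set(re.findall(r"[a-z']+", sentence.lower()))


def retrieve(sentence, threshold=0.5):
    """Return the answer whose indexed question overlaps most with the input."""
    query = tokens(sentence)
    best_id, best_score = POOL_ANSWER, 0.0
    for question, answer_id in INDEX.items():
        indexed = tokens(question)
        union = query | indexed
        score = len(query & indexed) / len(union) if union else 0.0
        if score > best_score:
            best_id, best_score = answer_id, score
    # Weak matches fall back to the pool instead of a likely wrong answer.
    return best_id if best_score >= threshold else POOL_ANSWER
```

In this sketch the first error type corresponds to `retrieve` returning the wrong answer identifier for a well-formed question, and the second to a valid question that has no entry in `INDEX` at all.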
Testing is best accomplished free-form, with naïve users. We can capture full user sessions to
analyze whether appropriate responses were generated from valid questions, determine whether there were recognition errors or indexing errors, and identify valid questions that were not anticipated during pre-production.
Rule-Base Development Process
In order to model appropriate discourse and persona personality, a rule-base mediates all interactions between the user and the basic Synthetic Interview. In practice, rule-base development and language analysis occur simultaneously and are interdependent.
Distance Learning System Prototype Development
Domain knowledge was obtained from MBUK course materials, interviews with expert trainers, and observation of actual MBUK sales skills classes. MBUK course material contained detailed information on how to elicit information from clients and structure negotiations, from introduction to close of sale, in a series of pre-defined stages. (Please see Appendix A for a listing of all of the stages.) The stages modeled by the DLS prototype include
1. Establish contact
2. Establish rapport
3. Agree on agenda
4. Discover and uncover needs
5. Handling Objections
9. Gaining Commitment
An integral part of the course exposes students to the concept of personality types, trains students on questioning techniques appropriate for various personality types, and provides techniques for discovering the client's personality.
Four behavioral types defined in the MBUK course are:
1. Driver (the director)
2. Analytic (the clinician)
3. Amiable (the friend/supporter)
4. Expressive (the socializer).
The DLS prototype models the Analytic personality. (Please see Appendix B for more information on the Analytic.)
Finally, based on the MBUK course, a hypothetical customer profile was developed. The customer profile includes background information such as socio-economic status, education level, and prior car experience. The DLS prototype profile was partially based on input from the UK about which type they would most like to see included. Our sample Analytic profile is a single, female academic buying her first car. (For her complete profile please see Appendix C.)
Language Analysis and Rule-Base development
After the customer profile was completed, increasingly rich scenarios were developed, working through each of the stages to be covered in the SI. The first scenario gave a very simple description of a sales interaction:
The customer comes into dealership interested in purchasing a Mercedes. This
will be her first car purchase and is possibly interested in leasing options.
Once the customer profile, background, and initial scenario are complete, the critical needs for the customer are established. Critical needs help drive question development and provide guides for changes in customer behavior (i.e., when needs have or haven't been identified by the trainee). Needs comprise two categories: rational and emotional. Rational needs include economic value, profit, utility, convenience, quality, efficiency, health, durability, speed, safety, appearance, security, versatility, company growth, and self-development. Emotional needs include confidence, appearance, fear, envy, pride, esteem, respect, survival, comfort, self-satisfaction, pleasure, safety, security, belonging, novelty, self-development, vanity, love, loyalty, and competitive spirit. The most critical needs for this particular profile were determined to be safety, comfort, and appearance.
To vary moods for the customer, we developed personality attributes that would change over the course of the interaction with the trainee. We identified four attributes (dissatisfaction, unhappiness, frustration, and skepticism) that were important for this personality profile. Other types of attributes may be appropriate for other personalities. The values of the four attributes were averaged for each interaction to determine the current customer state. Other attributes can also be added to this character, if necessary, to add more complex behavioral reactions.
Details were added to the scenarios, considering the customer's needs and possible changes in behavior. As a first approximation, one likely path an interaction could take from start to finish was defined. Simultaneously, places where behavior changes could take place were identified and rules to effect these changes were created (please see Appendix D). This generated a set of typical questions from which a hypothetical discourse was scripted.
Next, stages, topics, specificity levels, and "nuggets" were identified. Stages, roughly matching the course stages, are used to keep track of where the trainee is in the course of the conversation with the customer. Topics are clusters of questions, at varying levels of specificity, about a subject (e.g., number of seats in car). Questions within a topic were assigned specificity levels to differentiate levels of detail; more detailed questions had higher specificity levels and more
detailed answers. "Nuggets" are the most detailed responses (highest specificity level) in a topic,
providing the trainee clues about the customer and guiding the trainee through the interactions.
When a trainee asks a nugget question, he or she receives the most information from the
customer about that particular topic. A topic within each stage of the interaction includes both
nugget and open questions. The topic titles for this prototype include:
Stage 1: introduction, small talk, weekend, climb1, week, weather
Stage 2: assist, what currently own, tell me a little bit about yourself, profession, how
long with current employer, decision to buy a car, benefits in a new car,
information, family status, children, where live, neighborhood
Stage 3: agree
Stage 4: consult with anyone else, primary car, priority, mileage, miles per gallon,
number of seats, back seat, boot space, work use of car, use car for business travel,
commuting to work, leisure driving, road conditions, off-road driving, towing
anything, quality, friends car problems, service, safety, safety features for children,
car theft, anti-theft, garage, engine size, remote locking system, automatic
transmission, size of interior, power features, cruise control, smoking in the car,
dual temperature control, car tires, leather interior, convertible, exterior car color,
roof rack, bike carrier, how much spend on the car, flexible with the spending
amount, current yearly income, additional money down for the car, when will be
purchasing the new car, stereo system
Stage 9: closing thoughts, buying or leasing, financing, time to think, any additional help,
schedule an appointment, thanking customer, good bye / close
Stage Recovery: sorry, discuss topic later
Please see Appendix E for an example of a topic.
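The relationship between stages, topics, specificity levels, and nuggets can be sketched as a small data structure. This is an illustration only: the class names, field names, and sample questions are assumptions, not the prototype's actual representation (Appendix E gives a real topic).

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    text: str
    specificity: int  # higher values mean more detailed questions and answers


@dataclass
class Topic:
    title: str
    stage: int
    questions: list = field(default_factory=list)

    def nugget(self):
        """The nugget is the question at the highest specificity level."""
        return max(self.questions, key=lambda q: q.specificity)


# Hypothetical questions for the "number of seats" topic in Stage 4.
seats = Topic(
    title="number of seats",
    stage=4,
    questions=[
        Question("What are you looking for in a car?", 1),
        Question("How many people usually ride with you?", 2),
        Question("How many seats do you need?", 3),
    ],
)
```

Here `seats.nugget()` returns the most specific question, the one that would draw the most detailed answer from the customer.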
Each behavioral type also has a backup style. This is behavior that is initiated if the customer is
unhappy with the interaction. A customer becomes unhappy when the trainee asks questions that
are not consistent with the customer's behavioral type. When this happens the customer will
revert to the backup behavior for that behavioral type. The trainee must be able to identify what
'triggers' this mismatch and then identify how to modify his or her behavior to get back on track.
Backup behavioral styles include: Autocrat (Driver), Avoider (Analytic), Compliant (Amiable),
Aggressor (Expressive). Backup style for the Analytic is the Avoider and characteristics of this
style include being overcritical of others, unwilling to be influenced by others, risk-avoiding as
they seek security, and likely to become a procrastinator. (Please see Appendix B for more
information on the backup style.)
Answers to questions were first scripted assuming everything was going fine in the
customer/trainee interaction and that the customer was happy (state=happy). A second level of
response was added assuming the customer was unhappy and had gone into her backup style
(state=avoider). The change between these two levels was fairly abrupt and a midlevel response
was added to provide transition (state=neutral). Three levels were determined to be sufficient to
provide an adequate range of response. There is nothing to preclude adding additional levels of
answers. Please see Appendix F for an example of the different levels.
The response level (happy, neutral, avoider) provided by the customer to any particular question
depended on the value of the customer's personality attributes at that point in time. Attributes
can be set at different levels at the beginning of the interaction with the trainee to start the
customer off in different states.
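As a sketch of how the averaged attributes could select a response level: the four attribute names come from the report, but the 0-to-1 scale and the two threshold values below are assumptions made purely for illustration.

```python
from statistics import mean

ATTRIBUTES = ("dissatisfaction", "unhappiness", "frustration", "skepticism")


def response_level(attrs, neutral_at=0.33, avoider_at=0.66):
    """Map the average of the four attributes to happy, neutral, or avoider.

    The 0..1 scale and both thresholds are hypothetical, not taken from the report.
    """
    score = mean(attrs[name] for name in ATTRIBUTES)
    if score >= avoider_at:
        return "avoider"
    if score >= neutral_at:
        return "neutral"
    return "happy"


# Starting the customer off in different states by choosing initial values.
calm_start = dict.fromkeys(ATTRIBUTES, 0.1)
tense_start = dict.fromkeys(ATTRIBUTES, 0.9)
```

With these assumed thresholds, `response_level(calm_start)` gives "happy" and `response_level(tense_start)` gives "avoider".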
Once developed, topics [1] were integrated with rules of behavior. The combination of topics and
rules guided the iterative development of question-answer pairs for each topic. (Note: to simulate
behavioral states there are actually multiple answers for each question.) Behavioral changes
drive the adaptation and addition of both topics and discourse within topics. (Please see
Appendix G for the rules.) Ideally, role-playing exercises should be videotaped and analyzed to
refine topics and rules of behavior.
As important as scripting anticipated questions is the ability to deal with unexpected questions.
Such events are managed by a series of pool topics and associated questions (e.g., I don't
understand, I've already answered that, I'm leaving). The following are examples from each of
the pool categories. The 'Don't Understand Pool' includes phrases like "I'm sorry, I don't
understand the question" or "I'm not sure." A 'Storm out answer' would be "I've had enough.
I'm leaving." If the system recognizes that the same question has been asked consecutively, the
'I've already answered that' pool will respond with an answer such as "I gave you an answer
already" or "Didn't we just talk about that?"
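A minimal sketch of pool selection for unrecognized and repeated questions. The pool phrases are taken from the text; the function name, its parameters, and the random choice among pool phrases are assumptions for illustration.

```python
import random

POOLS = {
    "dont_understand": [
        "I'm sorry, I don't understand the question.",
        "I'm not sure.",
    ],
    "already_answered": [
        "I gave you an answer already.",
        "Didn't we just talk about that?",
    ],
}


def pick_response(matched_question, previous_question, answer, rng=random):
    """Choose a pool phrase for exceptional cases, else the scripted answer."""
    if matched_question is None:
        # Nothing in the index matched the trainee's input.
        return rng.choice(POOLS["dont_understand"])
    if matched_question == previous_question:
        # The same question was asked twice in a row.
        return rng.choice(POOLS["already_answered"])
    return answer
```

Choosing randomly within a pool is one way to keep repeated fallbacks from sounding identical; the report does not state how the prototype picks among pool clips.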
For the Analytic behavioral style, once the Analytic is sent into backup mode (the Avoider
mode), some examples of pool answers include "I don't think that's any of your business," "I'm
not interested in what other people are doing," "I prefer not to discuss that right now," and "I'll
have to think about it." Transitions include phrases like "Ok, but..." and "As I mentioned..."
Generic pool answers include "Yes," "No," "Maybe," and "Could be." During Stage 5, the
viewing of the car, a separate set of pool answers was developed covering three levels of
interest: not at all interested, semi-interested, and interested. An example of a 'not at all
interested' answer is "This is not important to me." A semi-interested answer would be "I'm not
sure" or "That sounds fine." The interested pool answers include "What a great feature" and
"I'm definitely interested in that."
Three classes of rules were developed for the Synthetic Interview: administrative, state-changing,
and state-effect rules. Administrative rules keep track of the current state as well as items like
what topic was active, a list of closed topics, etc. State-changing rules deal with modifying the
emotional state of the customer based on the question asked and/or the history of questions.
State-effect rules use the current customer state to determine the "correct" answer to the asked question.
[1] We were not exhaustive in our topic list. We felt the number of topics was sufficient to provide
a reasonable test of the SI. More time will need to be spent on this stage of development to make
a more complete list.
State rules were first developed in natural language, then converted into pseudocode, and finally into Visual Basic code. A small set of core rules was coded, leaving the option to expand the rule base in the future. Testing of the rules was initially done manually, on paper, to determine state changes based on changes in attributes. Once the system was implemented, the rules were tested again to verify that they functioned correctly. Please see Appendix H for an example of conversion from natural language to pseudocode.
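To make the three rule classes concrete, here is a sketch with one rule of each kind. It is written in Python rather than the Visual Basic used in the prototype, and the specific rule (returning to a closed topic raises frustration), the increment, and the threshold are invented for illustration; the actual rules are in Appendix G.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Session:
    frustration: float = 0.0
    active_topic: Optional[str] = None
    closed_topics: set = field(default_factory=set)


def administrative_rule(session, topic):
    """Administrative: track the active topic and the list of closed topics."""
    if session.active_topic and session.active_topic != topic:
        session.closed_topics.add(session.active_topic)
    session.active_topic = topic


def state_changing_rule(session, topic):
    """State-changing (hypothetical): revisiting a closed topic raises frustration."""
    if topic in session.closed_topics:
        session.frustration = min(1.0, session.frustration + 0.25)


def state_effect_rule(session, answers):
    """State-effect: pick the answer variant for the current customer state."""
    level = "avoider" if session.frustration >= 0.5 else "happy"
    return answers[level]


session = Session()
for topic in ["safety", "garage", "safety", "garage"]:
    state_changing_rule(session, topic)  # uses the history before it is updated
    administrative_rule(session, topic)
```

After this sequence the trainee has jumped back to a closed topic twice, so the assumed state-effect rule would select the avoider-level answer.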
Finally, project members performed manual semantic and syntactic expansion (permutations). Two main techniques were used in the process for developing permutations: 1) each team member was asked to provide five different forms of each question; and 2) all terms used in the questions were run through a thesaurus to increase vocabulary. Results from each process were integrated. After integration the resulting question sets were reviewed and added to as necessary. MBUK training experts reviewed the questions for both coverage and idiomatic form. Alpha testing provided a second-order expansion of the question forms.
Production
The video was shot in CMU's Media Services studio. This provided a controlled environment for optimal audio and video production values. The actress was composited into an image of a dealer showroom. This affords great flexibility and permits easy follow-up production should new video be required. Editing was performed by CMU Media Services, as was final encoding for web streaming.
Five tables were created in the database: question, topic, stage, answer, and event. The question table contains the questions themselves, tracking information on the questions, and the code for the appropriate answer for each of the three response levels. The topic table contains tracking information for all of the topics. The stage table contains tracking information for all of the stages. The answer table contains all of the possible answers that can be given (pointers to the video clips), including pool responses. The event table contains all the possible ways the customer can change state. Please see Appendix I for a detailed description of the tables and fields. See the Interface Design section for a discussion of the integration of the UI with the database.
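The five tables might be laid out roughly as follows. Only the table names and their general purpose come from the report; every column name here is an assumption (the real fields are in Appendix I), and SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical columns; only the five table names are from the report.
SCHEMA = """
CREATE TABLE stage  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE topic  (id INTEGER PRIMARY KEY, stage_id INTEGER REFERENCES stage(id), title TEXT);
CREATE TABLE answer (id INTEGER PRIMARY KEY, clip TEXT, is_pool INTEGER DEFAULT 0);
CREATE TABLE question (
    id INTEGER PRIMARY KEY,
    topic_id INTEGER REFERENCES topic(id),
    text TEXT,
    happy_answer INTEGER REFERENCES answer(id),
    neutral_answer INTEGER REFERENCES answer(id),
    avoider_answer INTEGER REFERENCES answer(id)
);
CREATE TABLE event  (id INTEGER PRIMARY KEY, description TEXT);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO answer (id, clip) VALUES (?, ?)",
    [(1, "seats_happy"), (2, "seats_neutral"), (3, "seats_avoider")],
)
conn.execute(
    "INSERT INTO question (id, text, happy_answer, neutral_answer, avoider_answer) "
    "VALUES (1, 'How many seats do you need?', 1, 2, 3)"
)

LEVEL_COLUMNS = {
    "happy": "happy_answer",
    "neutral": "neutral_answer",
    "avoider": "avoider_answer",
}


def clip_for(question_id, level):
    """Look up the video clip for a question at the given response level."""
    column = LEVEL_COLUMNS[level]  # fixed whitelist, so safe to interpolate
    row = conn.execute(
        f"SELECT a.clip FROM question q JOIN answer a ON a.id = q.{column} WHERE q.id = ?",
        (question_id,),
    ).fetchone()
    return row[0] if row else None
```

Storing one answer code per response level on the question row mirrors the description above, where each question carries the code for the appropriate answer at each of the three levels.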
Eight subjects completed a full discourse with the Synthetic Interview. As each subject entered questions into the Synthetic Interview system, a database file collected the questions submitted, the question the system thought the user was asking, and the response to the question. For this initial trial the Synthetic Interview database contained 73 topics and 224 question sets covering 1079 permutations.
Two members of the project went through the first set of interactions using the coding scheme listed in Appendix J. Both members individually coded the 253 question/answer sets between the trainee and the Synthetic Interview [2]. From this first trial, 52% of responses from the Synthetic Interview sessions [3] were appropriate matches for the question asked. Upon reviewing
the logs from the Synthetic Interview sessions, we were able to add 16 new topics, 16 new question sets, and 1138 new permutations (these included permutations for existing and new topics).
We retested the system using the logs of the questions from the first test. Upon reanalysis of the
question/answer sets, with the inclusion of the new topics and permutations, 95% of the answers
provided by the Synthetic Interview were appropriate matches for the questions asked. There was
agreement on all question and answer sets between both coders for both the first test and the
second test of the Synthetic Interview.
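The two figures reported here, the share of appropriate matches and the agreement between coders, can be computed from coded logs as sketched below. The binary appropriate/inappropriate coding is an assumption (the actual scheme is in Appendix J), and the short lists are made-up toy data, not the study's results.

```python
def match_rate(codes):
    """Fraction of question/answer sets coded as appropriate matches."""
    return sum(codes) / len(codes)


def agreement(coder_a, coder_b):
    """Fraction of question/answer sets on which both coders gave the same code."""
    same = sum(a == b for a, b in zip(coder_a, coder_b))
    return same / len(coder_a)


# Toy data only: True means the response was coded as an appropriate match.
coder_a = [True, False, True, True]
coder_b = [True, False, True, True]
```

For the toy lists, `match_rate(coder_a)` is 0.75 and `agreement(coder_a, coder_b)` is 1.0, the latter corresponding to the full agreement between coders reported above.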
Interface Design of the Synthetic Interview Prototype
The Synthetic Interview (SI) prototype is composed of four frames. The top left frame houses the
SI component. The bottom left frame contains the car selection interface. The Show & Tell
interface resides in the vertically oriented middle frame. The frame on the far right was added to
keep the other frames from stretching to fill the screen and is empty. Because this last frame is
not important to the design of the interface, it will not be referred to in the rest of this document.
The Show & Tell frame will be referred to as the right-most frame. The entire interface was
designed to fit within an 800 x 600 display with the browser set to full screen. None of the
frames should scroll.
All three frames have been implemented with DHTML because of its capability of exact pixel
placement across multiple browsers and platforms. Each frame has a full background image that
connects seamlessly with the other frames, creating the illusion of one coherent interface, rather
than three separate frames. The SI component was placed in the top left frame because that is the
initial point of interaction, as well as the most important and often used component. The
progression of use and the components' sizes intuitively lead to the placement of the rest of the
interface around the SI component.
The visual aesthetic was influenced by the CBT course CD-ROMs, the S-Class CD in particular,
and the DaimlerChrysler websites. The images used were taken from the MBUSA and MBUK
websites. The light color scheme was chosen for the first several phases to match the atmosphere
of the dealership that was keyed into the SI video clips. The change to black during phase four
was done to accommodate the design of the Show & Tell interface, which works better on a dark
background. It also serves to switch focus and suggests a change in location, though there are
subtler ways of doing this that may be more appropriate.
[2] Every question/statement input by the subjects received an answer from the Synthetic Interview,
resulting in a series of question/answer sets that were analyzed after the Synthetic Interview
session was complete.
[3] This does not include Stage 5 interactions (the automobile interface; see the discussion of the
interface below). There were 109 of these interactions (talking about car specifics), all of which were
correct. Technically this isn't really the Synthetic Interview, since all inputs have an exact match, which is why these interactions are not included in these numbers.
The background image of the SI component extends the background of the video, allowing the
video window to merge with the rest of the interface. This enhances the immersiveness of the
experience.
The text entry field used to enter a question is the one piece that does not fit with the rest of the
interface. Due to technological constraints, the standard HTML form field was the only option. It
is possible that a custom field could be created using Java [4]. Regardless, the field should be
centered beneath the video with the "Ask Question" button directly below it and right aligned
with the field. The field must be constrained to the size of the frame, but should be large enough
that an average question will not require scrolling.
In phase four, a button appears in the top right corner of the frame that allows the user to "leave
the car," consequently changing the backgrounds of the frames to the original color scheme, and
removing the show and tell interface. The background makes it appear as if it is part of the Show
& Tell interface. The button highlights when the cursor moves over it. Ideally, the "Ask
Question" button should highlight as well. There seems to be an HTML form constraint that
keeps the type of button used from working. There is most likely a way to solve this problem
with JavaScript.
Car Selection Frame
The car selection interface was created in Macromedia Director as a Shockwave movie. There
are two versions, each accommodating one of the background colors. The car models are
categorized per the MBUK website classification. Left and right arrows allow the user to select a
category, displaying photos of the corresponding models. Selecting a model causes all three frames to
switch to the black background and loads the Show & Tell component.
Auto Feature Demonstration Frame
The auto feature demonstration interface (Show & Tell) was created to allow the user to show
car features to the customer. This component was also created in Macromedia Director as a
Shockwave movie. The resulting file is fairly large and takes too long to download, so another
means of implementation, or more concern for small image files, may be necessary.
All features of the car have been categorized by general physical components of the car (e.g.
information about the tires, brakes, and suspension are all accessed by clicking on one of the
wheels of the car). Some features are listed under multiple components (e.g. all of the safety
features are listed when one of the safety features, such as the airbag, is selected). To make the
components accessible to the user, several views of the car were chosen: front, side, back, and
two interior shots. Each view can be accessed through the diagrammatic buttons at the top of the
frame. The components that can be selected are outlined and highlight when the cursor moves
over them. Clicking on a component brings up the list of features associated with it. In addition,
the outline of the component fills in to show that it is selected, and the rest of the car fades out,
indicating that it is no longer active. Each feature in the list has a button that allows the user to
show the specific feature to the customer, prompting her to respond. A close box hides the
feature list so that another component may be selected.
[4] Style sheets can also be used.