STUDENT PERCEPTIONS OF THE FACULTY
COURSE EVALUATION PROCESS: AN EXPLORATORY STUDY
Heine, Richard P.
Maddox, E. Nick
ABSTRACT

Student evaluation of teaching (SET) has long been the subject of research, primarily in two areas. The first addresses the accuracy of students' perceptions of their teachers' in-class performance. The second focuses on uncovering the sources of students' perceptions about teaching effectiveness and quality. Most such studies analyze actual SET data sets generated by student course evaluations, using variables such as class size, gender, and expected grade in statistical analyses. This paper, however, describes a study not of SET data itself but of student perceptions of the entire faculty class evaluation process. Following a series of student focus group discussions, a 16-item Faculty and Course Evaluation Questionnaire was developed. The survey results indicate a number of statistically significant differences in responses related to gender and class ranking, as well as to other process issues. Female students were found to take the evaluation process more seriously than their male counterparts and to rate the process as more important. Male students, indicating some cynicism about the class evaluation process, differed significantly from female students in the negative direction: they more strongly perceived that the higher the grade they projected, the higher their evaluation of a professor, and more strongly believed that professors adjust their in-class behavior at the end of the semester to achieve higher evaluations. These and other results are discussed, along with implications for future research.
Conversations among university colleagues on the topic of student course and faculty evaluations are typically animated and full of opinions, myths, war stories, and frustrations. Almost every discipline within schools of business forms professional associations, holds conferences, and publishes journals focused solely on teaching students in that discipline, not on organizational research within it. Predictably, given this self-interest, many articles in such journals focus not on how to teach the discipline more effectively but on how faculty teaching is evaluated.
Published studies of course and faculty evaluation by students generally fall into two separate but related areas: first, the accuracy of students' perceptions of faculty performance and, second, the sources of students' perceptions about teaching effectiveness. The first area, accuracy of perceptions, often concerns grading leniency as positively related to student evaluations, a commonly held perception among faculty. Cognitive dissonance theory, for example, suggests that students who expect poor grades rate instructors poorly to minimize psychological or ego threat. One study (Maurer, 2006) found support for cognitive dissonance as a significant variable affecting the accuracy of student perceptions. Another study (Heckert, Latier, Ringwald-Burton & Drazen, 2006), however, found that students' perceived "effort appropriateness" was more positively related to faculty evaluation than expected grade alone. That is, students who extended effort, learned more, and were subsequently rewarded rated instructors more highly than expected grade by itself could explain. The role of the rating scale and of question sequencing on the student evaluation form is another approach that has been studied (Sedlmeier, 2006). That author concluded that both had significant effects on the accuracy, or reliability, of student evaluations: participant ratings were strongly influenced by manipulating the polarity of the scale and the sequencing of questions.
A second, and perhaps more common, research approach to analyzing the faculty and class evaluation process is to assess the sources of student perceptions about teaching effectiveness. Such studies often focus on student demographics as well as the characteristics of the course or delivery method. Females were found to have more effective "evaluation abilities" than males and evaluated some aspects of courses more favorably than males, particularly in open-ended formats; for example, females had more accurate recall in articulating course events (Darby, 2006b). Another study by the same author found that elective courses were more highly rated than required courses (Darby, 2006a). In a complex longitudinal study, McPherson (2006) found that variables such as student class level, and even time of day, related significantly to student evaluation scores. Comparing the same course and instructor across a traditional format and a two-way interactive television delivery, researchers found that the traditional format was more highly rated (Mintu-Wimsatt, et al., 2006). The courses in the present study were of the traditional, face-to-face format.
A third, and more novel, approach is presented in this paper. The authors focused on student perceptions of the entire faculty and course evaluation process, not just on student perceptions of a single course or faculty member. Only one study could be found that addressed how students perceive the evaluative process. It suggested that faculty and students should partner in evaluating courses and pedagogy to overcome erroneous perceptions, such as about why faculty emphasize certain topics, and it found that course effectiveness improved through such partnering (Giles, et al., 2004).
We will describe the origins, methodology and results of our study, discuss their implications, and finally provide suggestions for future research in this area of great personal and professional significance to educators.
METHODOLOGY AND ANALYSIS
A focus group of undergraduate students in a human resource management course met four times over four weeks to generate ideas and perceptions related to their views of the required faculty/course evaluation process. Discussion was facilitated by the instructor and included dialogue about the forms currently used in faculty evaluation, the timing and subsequent use of SET by faculty and administrators, student candor and comfort level with the process, and faculty behavior surrounding SET.
As a result of these focus group discussions, a 16-item Faculty and Course Evaluation Questionnaire (FCEQ) was developed; the 16 items are presented below. Fifteen of the items dealt with the class evaluation process as experienced by students. The instrument was divided into three sections addressing: 1) Student Responses about Themselves; 2) Student Responses about Professors; and 3) Student Responses about the Evaluation Process. Item 16 was an omnibus "effectiveness" item used as the dependent variable in a multiple regression on the other 15 items. All 16 items were scored on a 1-4 scale from Strongly Disagree (1) to Strongly Agree (4).
Demographic data were collected on gender, school (Business versus Arts and Sciences), and class ranking (first year through senior). A copy of the instrument is provided in the Appendix of this paper.
The FCEQ was administered by the students in the original focus group to approximately 320 randomly selected students, across campus and in a variety of classes in the School of Business Administration, at a small, AACSB-accredited liberal arts university during the spring semester of 2006. A comparison of the sample demographics to the actual population indicates some skewing: first-year students were underrepresented and fourth-year students overrepresented, while the second- and third-year proportions almost exactly matched the population. The gender split of the sample closely matched that of the population. Table 1 presents these data.
TABLE 1
COMPARISON OF SAMPLE TO POPULATION DEMOGRAPHICS

              Sample (N=316)    Population (N=2228)
First Year        13.6%               32.1%
Second Year       24.0                24.0
Third Year        26.3                22.0
Fourth Year       36.1                21.9
Male              48.3                41.4
Female            51.7                58.6
The data set was analyzed for overall means and standard deviations, for gender differences on the 16 individual items (ANOVA), and for class differences on the same items (ANOVA). Additionally, a multiple regression was performed with Item 16 as the dependent variable and the other scale items as predictors. Table 2 presents each item's mean and standard deviation.
TABLE 2
ITEM NUMBERS, MEANS AND STANDARD DEVIATIONS

Item #   Mean   Std. Dev.
1.       3.03     .80
2.       1.84     .75
3.       3.23     .84
4.       2.46     .71
5.       2.45     .80
6.       2.05     .90*
7.       2.91     .83
8.       2.54     .76
9.       2.32     .78
10.      2.56     .77
11.      2.36     .76
12.      2.37     .78
13.      2.16     .96
14.      3.29     .62
15.      2.98     .68
16.      2.64     .78

*Analysis of this item omitted due to reverse coding error.
The survey results indicate a number of statistically significant differences in responses related to gender and class ranking. For gender differences, female students were found to take the evaluation process more seriously than their male counterparts (Item #1, Males = 2.87, Females = 3.17, F = 11.41, p = .001). Additionally, female students rated the evaluation process as more important than did the males in the sample (Item #7, Males = 2.79, Females = 3.01, F = 5.72, p = .017). Results also indicate that male students may hold some cynicism about the class evaluation process. Males differed from female students in the negative direction in their perception that the higher the grade projected, the higher their evaluation of a professor (Item #5, Males = 2.54, Females = 2.37, F = 3.62, p = .058, marginally significant) and in their belief that professors adjust their in-class behavior at the end of the semester to achieve higher evaluations (Item #12, Males = 2.49, Females = 2.27, F = 6.031, p = .015).
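The per-item gender and class-ranking comparisons above rest on one-way ANOVA. As an illustration only, the F statistic for such a comparison can be computed as below; the response lists are hypothetical 1-4 Likert values standing in for the actual FCEQ data, which is not reproduced here.

```python
# Illustrative one-way ANOVA F statistic, of the kind used for the
# per-item gender and class-ranking comparisons.  The sample responses
# below are hypothetical 1-4 Likert values, not the actual FCEQ data.
from statistics import mean

def one_way_anova_f(*groups):
    """Return the F statistic for a one-way ANOVA across the groups."""
    k = len(groups)                                 # number of groups
    n = sum(len(g) for g in groups)                 # total observations
    grand = mean(x for g in groups for x in g)      # grand mean
    # Between-group and within-group sums of squares.
    ssb = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ssb / (k - 1)) / (ssw / (n - k))

# Hypothetical item responses split by gender.
males = [2, 3, 3, 4]
females = [3, 3, 4, 4]
print(one_way_anova_f(males, females))  # F for the gender contrast
```

Comparing this F value to the F distribution with (k-1, N-k) degrees of freedom yields the p values reported above.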
In regard to class ranking, three significant differences were identified. First-year students (more so than sophomores, juniors, and seniors) tended not to rate female professors higher than their male counterparts (Item #2, First Year = 1.58, Second Year = 1.88, Third Year = 1.98, Fourth Year = 1.82, F = 3.098, p = .027); upperclassmen in the sample, in other words, reported rating their female professors higher than their male professors. The first-year students also reported believing that faculty take evaluation comments more seriously than did sophomores, juniors, and seniors (Item #8, First Year = 2.93, Second Year = 2.43, Third Year = 2.52, Fourth Year = 2.49, F = 4.628). Lastly, in addressing the relevance of the evaluation questions to actual professor evaluation, first-year students reported a higher perception of question relevance than did any of the three upper-class groups (Item #15, First Year = 3.31, Second Year = 2.91, Third Year = 2.94, Fourth Year = 2.96, F = 3.809, p = .010). Consistently across these three findings, the upper-class students did not differ significantly from one another on any of the three variables while departing significantly from the views of the first-year students. After evaluating many courses and professors, experienced students may be less likely to see the questions as relevant.
The regression analysis, in which 14 process variables (Items 1-15, excluding Item 6) were regressed on the evaluation effectiveness omnibus item (Item 16), rendered a significant ANOVA result (F = 23.46, p < .001), with approximately 51% of the variance explained by the model. Item 6 was dropped from the analysis because it was incorrectly coded in the database.
Five items emerged as statistically significant contributors to the understanding of students' overall perceptions of the effectiveness of the faculty class evaluation process. In order of significance, the five items that loaded significantly on perceived evaluation process effectiveness were: 1) Evaluation Question Relevance in Reference to Teaching Evaluation (Item 15, t = 7.36, p < .001); 2) Importance of the Process, Student Perspective (Item 7, t = 4.26, p < .001); 3) Use of Evaluations by Faculty to Improve Courses (Item 10, t = 3.50, p = .001); 4) Faculty Adjustment of Teaching in Response to Low Evaluations (Item 11, t = 3.20, p = .002); and 5) Comfort with Giving a Negative Evaluation of a Professor (Item 3, t = 2.16, p = .031). Item 8 (Professors Take Evaluation Comments Seriously) was marginally significant under the .05 convention (t = 1.88, p = .06).
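The variance-explained figure above comes from an ordinary least squares fit of Item 16 on the remaining items. As a simplified, single-predictor sketch of that computation, R-squared reduces to the squared Pearson correlation between one item and the omnibus item; the response vectors below are hypothetical, not the actual FCEQ data.

```python
# Simplified sketch of the variance-explained (R-squared) computation.
# The study regressed Item 16 on all remaining items; for brevity this
# single-predictor version uses the squared Pearson correlation, and
# the response vectors are hypothetical, not the actual FCEQ data.
from statistics import mean

def r_squared_simple(x, y):
    """R-squared of a one-predictor OLS fit of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)  # squared Pearson correlation

# Hypothetical ratings: question relevance (Item 15) vs. omnibus Item 16.
item15 = [1, 2, 3, 4, 4, 2]
item16 = [2, 2, 3, 4, 3, 2]
print(r_squared_simple(item15, item16))
```

With all predictors included, the same quantity is computed as 1 minus the ratio of residual to total sum of squares from the multivariate fit.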
DISCUSSION

First, we should address the pattern of means as positive, negative, or neutral. We would suggest that any item with a mean above about 2.7 is a positive indicator of student perceptions of that item, and anything below about 2.3 is a negative one worthy of commentary; those in the middle, between 2.3 and 2.7, are probably not very meaningful for understanding student perceptions. Overall, the students appear to hold a fairly ambivalent collective perceptual set relative to the process and to the individual items related to it. The highest means indicate that students generally take the process seriously, feel comfortable giving bad professors negative evaluations, and find the questions asked on the form both clear and relevant.
Some specific item responses warrant further discussion. One of the lowest-rated items is #9, the perceived use of evaluation data in tenure and salary decisions. Many students simply do not understand the importance and use of these data in the overall functioning or administration of faculty. The focus group that produced the instrument was also less aware of the use of the data than faculty generally perceive; in fact, some individual students actually disagreed that evaluative data were used at all. Relatedly, as previous studies suggest, students might be less susceptible to cognitive dissonance (giving a professor a poor evaluation in anticipation of receiving a bad grade) if they knew the importance of the process. Just as supervisors are trained in performance appraisal, perhaps students should be given a statement of the importance and use of their evaluations. Finally, faculty should feel secure in the fact that, in general, students do take the process seriously (despite the misperceptions of some) and feel comfortable giving a professor a poor evaluation. With all of the professional rancor and inconclusive research on exactly what student evaluations measure, our data at least suggest that students bring a fair amount of authenticity to the process.
The regression analysis explained a significant proportion of variance (51%). The first-ranked item was Question Relevance to students. The current faculty/course evaluation form used at our School of Business Administration is relatively simple: six Likert-type questions related to course objectives, learning environment, communication style, and the like, plus two general questions asking for a global rating of, first, the course and, finally, the instructor. The remainder of the form lists typical open-ended questions, such as what students liked best and least and suggestions for improvement. These data indicate that simplicity may be the best approach in designing a course evaluation form, one generic enough to be used across disciplines. Students using the same form for all courses may also tend to take the process more seriously, to believe that faculty will use the feedback, and to be better able to compare professors and, therefore, feel more comfortable giving "bad" professors a negative evaluation. Other data and studies provide some support for this discussion.
Overall then, while the current FCEQ is hardly a perfect instrument, it has helped us to understand better how our students see the evaluation process. With further instrument development and replications of this original study, we hope to learn more about the perceptual bases of students in an effort to make the evaluation process more meaningful and relevant to them.
DIRECTIONS FOR FUTURE RESEARCH
Given the exploratory nature of this research, our intention is to conduct a refined replication in the near future. Our goal will be to increase the sample size so that we can complete a factor analysis of the data. Further, we will rewrite a number of the questions to make them clearer and to eliminate compound sentences that may confuse respondents; this effort may also lead to the addition of several questions to the current instrument. Lastly, we are likely to change the instrument to a 5-point Likert scale so that driving and restraining perceptions are easier to delineate than with the current 4-point scaling.
APPENDIX: QUESTIONS ON THE FACULTY AND COURSE EVALUATION QUESTIONNAIRE
As noted above, each item was rated on a 1 to 4 scale, with 1 representing strong disagreement and 4 representing strong agreement. Items are presented below as they appear on the FCEQ.
1. I take evaluating the professors in my courses seriously.
2. I tend to evaluate female professors higher than male professors.
3. I feel comfortable giving a negative evaluation for a bad professor.
4. I rate professors based on their personality and enthusiasm and not on what I learned.
5. The higher the grade that I expect to receive in a class, the more positive my evaluation
of the class and professor.
6. I don’t write many comments on the evaluation form for fear of being identified.
7. Overall, I think the professor and course evaluation process is important.
8. Professors take my evaluation and comments seriously.
9. My evaluations are used in professor tenure and salary raise decisions.
10. Professors use their evaluations to improve their courses.
11. When students give low evaluations, professors adjust to improve their teaching.
12. Professors adjust their behavior at the end of semesters to get better evaluations.
13. Completing the evaluation form at the beginning of a class period is better than later.
14. The questions asked on the form are clear to me.
15. The questions asked on the form are relevant to evaluating a course/professor.
16. Overall, I think the professor and course evaluation process is effective.
REFERENCES

Darby, Jenny (2006a, March). "The Effects of Elective or Required Status of Courses on Student Evaluations." Journal of Vocational Education & Training, Volume 58, Issue 1, 19-29.

Darby, Jenny (2006b, April). "Evaluating Courses: An Examination of the Impact of Student Gender." Educational Studies, Volume 32, Issue 2, 187-199.

Giles, Anna; Martin, Sylvia; Bryce, Deborah; and Hendry, Graham (2004). "Students as Partners in Evaluation: Student and Teacher Perspectives." Assessment and Evaluation in Higher Education, Volume 29, Number 6 (December), 681-685.

Heckert, Theresa (2006). "Relations among Student Effort, Perceived Class Difficulty Appropriateness, and Student Evaluations of Teaching: Is It Possible to 'Buy' Better Evaluations Through Lenient Grading?" College Student Journal, Volume 40, Issue 3.

Maurer, Trent (2006). "Cognitive Dissonance or Revenge? Student Grades and Course Evaluations." Teaching of Psychology, Volume 33, Issue 3, 176-179.

McPherson, Michael (2006). "Determinants of How Students Evaluate Teachers." The Journal of Economic Education, Volume 37, Number 1 (Winter), 3-20.

Mintu-Wimsatt, Alma; Ingram, Dendra; Milward, Mary Anne; and Russ, Courtney (2006). "On Different Teaching Delivery Methods: What Happened to Instructor Course Evaluations?" Marketing Education Review, Volume 16, Number 3 (Fall), 49-57.

Sedlmeier, Peter (2006). "The Role of Scales on Student Ratings." Learning and Instruction, Volume 16, Issue 5 (October), 401-415.