Basic Concepts in Assessment
How can we use assessment as a tool to improve our teaching?
Assessments as Tools
• Assessment is a process of observing a sample of students’ behavior
and drawing inferences about their knowledge and abilities.
• We use a sample of student behavior to draw inferences about
Forms of Educational Assessment
• Informal vs. formal assessment
• Paper-pencil assessment vs. performance assessment
• Traditional assessment vs. authentic assessment
• Standardized test vs. teacher-developed assessment
Informal vs. formal assessment
• Informal assessments are spontaneous, day-to-day observations of
students’ performance in class.
• Formal assessment is planned in advance & used for a specific
purpose to determine what is learned in a specific domain.
Paper-pencil vs. Performance assessment
• Paper-pencil: asks students to respond in writing to questions.
• Performance: asks students to demonstrate knowledge or skills in
some other fashion. Students perform in some way.
Traditional vs. authentic assessment
• Traditional: assesses basic knowledge & skills separate from real-world tasks.
• Authentic: assesses students’ ability to use what they’ve learned in
tasks similar to those in the outside world.
Standardized test vs. teacher-developed test
• Standardized test: developed by test experts, published for use in
• Teacher-developed tests: developed by a teacher for use in individual
Purposes for assessment
• Formative evaluation: assessing what students know before & during
instruction. We can redesign lesson plans as needed.
• Summative evaluation: assessment after instruction to determine what
students have learned, to compute grades.
• Assessments as motivators
• Assessments as mechanisms for review
• Assessments as influences on cognitive processing- studying more
effectively for types of test items.
• Assessments as learning experiences
• Assessments as feedback
Qualities of good assessments- RSVP (reliability, standardization, validity, practicality)
Reliability
• The extent to which the instrument gives consistent information about
the abilities being measured.
• Reliability coefficient- a correlation coefficient, ranging from -1 to +1
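One common way to estimate a reliability coefficient is to give the same test twice and correlate the two sets of scores (test-retest reliability). The sketch below assumes that approach; the function name and the student scores are hypothetical and serve only as an illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary in both lists")
    return sxy / sqrt(sxx * syy)

# Hypothetical scores for five students on two administrations of one test.
first_try = [70, 82, 65, 90, 77]
second_try = [72, 80, 68, 91, 75]
reliability = pearson_r(first_try, second_try)
print(f"test-retest reliability: {reliability:.2f}")
```

A value near +1 means students kept roughly the same relative standing on both administrations; a value near 0 means the two sets of scores tell us little about each other.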
Standard error of measurement
• SEM- estimates how close a student’s observed score is likely to be to the true score.
• A true score is the ideal score for a student on a subject based on past
• The test manual reports the SEM for the test. Scores should be
reported as a range around the observed score- the confidence interval.
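As a rough illustration, classical test theory computes the SEM from the test's standard deviation and its reliability coefficient: SEM = SD × sqrt(1 − reliability). The sketch below assumes that formula and a band of one SEM on either side of the observed score (roughly a 68% confidence interval if errors are normally distributed); the function names and all numbers are hypothetical.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability), the classical test theory formula."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * sqrt(1.0 - reliability)

def confidence_interval(observed, sem, z=1.0):
    """Score band of observed +/- z SEMs (z=1 gives roughly a 68% band)."""
    return observed - z * sem, observed + z * sem

# Hypothetical test: SD of 15, reliability of 0.91, observed score of 105.
sem = standard_error_of_measurement(15, 0.91)
low, high = confidence_interval(105, sem)
print(f"SEM = {sem:.1f}; 68% band = {low:.1f} to {high:.1f}")
```

The more reliable the test, the smaller the SEM and the narrower the band around the observed score.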
Enhancing the reliability of classroom assessments
• Use several tasks in each instrument
• Define each task clearly enough so students know what is being asked.
• Use specific, concrete criteria
• Keep expectations out of judgment.
• Avoid assessing a child when s/he is ill, tired, or out of sorts in some way.
• Use the same techniques and environment for assessing all kids.
Standardization
• The concept that assessment instruments must have similar, consistent
content, format, & be administered & scored in the same way for
• Standardized tests reduce error in assessment results & are considered
to be more reliable.
Validity
• The extent to which an instrument measures what it is designed to measure.
• Content validity- items are representative of skills described
• Predictive validity- how well an instrument predicts future
performance (e.g., SAT, ACT).
• Construct validity- how well an instrument measures an abstract,
internal characteristic- motivation, intelligence, visual-spatial ability.
Essentials of testing
• An assessment tool may be more valid for some purposes than for others.
• Reliability is necessary to produce validity.
• But reliability doesn’t guarantee validity.
Practicality
• The extent to which instruments are easy to use.
• How much time will it take?
• How easily is it administered to a group of children?
• Are expensive materials needed?
• How easily can performance be evaluated?
• Criterion-referenced scores show what a student can do in accord with
• Norm-referenced scores compare a student’s performance with that of other
students on the same task.
• Norms are derived from testing large numbers of students.
Types of standardized tests
• Achievement tests- to assess how much students have learned of what
has been taught
• Scholastic aptitude tests- to assess students’ capability to learn, to
predict general academic success.
• Specific aptitude tests- to predict how students are likely to perform in
a content area.
Technology and Assessment
• Allows adaptive testing
• Can include animation, simulation, videos, audios
• Enables easy assessment of specific problems
• Assesses students’ abilities with varying levels of support
• Provides immediate scoring
Guidelines for choosing standardized tests
• Choose a test with high validity for your purpose & high reliability.
• Be sure the test’s norm group is relevant to your population.
• Follow directions closely.
Types of test scores
• Raw scores- based on number of correct responses.
• Criterion-referenced scores- compare performance to criteria or
standards for success.
• Norm-referenced scores- compare student’s performance to the
average of students the same age.
• Grade-equivalents and age-equivalents compare a student’s
performance to the average performance of students at the same age/grade.
• Percentile ranks- show the percentage of students at the same
age/grade who made lower scores than the individual.
• Standard scores- show how far the individual performance is from
the mean by standard deviation units.
• Normal distribution- bell curve
• Standard deviation- variability of a set of scores.
• IQ scores- mean of 100, SD of 15
• ETS scores- (Educational Testing Service tests- SAT, GRE)
mean of 500, SD of 100
• Stanines- for standardized achievement tests- mean of 5, SD of 2
• z-scores- mean of 0, SD of 1- used statistically
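The scales above are all re-expressions of the same z-score, so a short sketch can show how they relate. It assumes scores are normally distributed (needed for the percentile estimate) and approximates stanines by rounding 5 + 2z and limiting the result to 1 through 9. The function names and the norm-group numbers are hypothetical.

```python
from statistics import NormalDist

def z_score(raw, norm_mean, norm_sd):
    """How many standard deviations a raw score lies from the norm-group mean."""
    return (raw - norm_mean) / norm_sd

def to_scale(z, mean, sd):
    """Re-express a z-score on a scale with the given mean and SD."""
    return mean + z * sd

def stanine(z):
    """Stanine (mean 5, SD 2), rounded and limited to the range 1-9."""
    return max(1, min(9, round(5 + 2 * z)))

def percentile_rank(z):
    """Percent of a normal distribution falling below this z-score."""
    return 100 * NormalDist().cdf(z)

# Hypothetical norm group: mean raw score 40, SD 8; the student scored 48.
z = z_score(48, 40, 8)            # one SD above the mean
print(to_scale(z, 100, 15))       # IQ-style scale
print(to_scale(z, 500, 100))      # ETS-style scale
print(stanine(z))
print(round(percentile_rank(z)))
```

A student one standard deviation above the mean lands at the corresponding point on every scale, which is why the scales can be translated into one another.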
Norm- vs. criterion-referenced scores
• Norm-referenced scores- grading on the curve, based on the class
average. Sets up a competitive environment, not a sense of community.
May be used in performance tests- who gets to be first chair in band.
• Criterion-referenced scores show if students have mastered objectives.
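The contrast can be made concrete by reading one score both ways. This is a minimal sketch with hypothetical quiz scores and a hypothetical mastery cutoff; the function names are illustrative only.

```python
def percent_scoring_lower(score, class_scores):
    """Norm-referenced view: percent of classmates who scored below this score."""
    lower = sum(1 for s in class_scores if s < score)
    return 100 * lower / len(class_scores)

def has_mastered(score, cutoff):
    """Criterion-referenced view: did the score reach the mastery cutoff?"""
    return score >= cutoff

# Hypothetical quiz scores (out of 20) and a hypothetical mastery cutoff of 16.
class_scores = [12, 14, 15, 17, 18, 18, 19, 20]
student = 17
print(percent_scoring_lower(student, class_scores))  # standing relative to peers
print(has_mastered(student, cutoff=16))               # standing relative to the objective
```

Here the same score sits below the middle of a strong class yet still meets the objective, which is the distinction the two kinds of scores are meant to capture.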
Interpreting test scores
• Compare two norm-referenced test scores only when those scores come
from equivalent norm groups.
• Have a clear rationale for cutoff scores for acceptable performance.
• Never use a single test score to make important decisions.
High-stakes testing and accountability
• High-stakes testing- making major decisions on the basis of a single test.
• Accountability- holding teachers, administrators responsible for
students’ performance on those tests.
• Some tests determine whether a student passes a grade or graduates.
Problems with high-stakes testing
• Tests don’t always reflect instructional objectives.
• Teachers spend time teaching to the tests.
• Low achievers or special ed students are often not included.
• Criteria are often biased against students from lower-SES backgrounds.
• Not enough emphasis on helping schools/students improve.
Potential solutions to the problems
• Identify what is most important for students to know.
• Educate the public about what test scores can do.
• Look at alternatives to tests.
• Use multiple measures in making high-stakes decisions.
Confidentiality & communication of test results
• Family Educational Rights & Privacy Act (FERPA)- limits testing to
achievement/scholastic aptitude.
• Restricts access to test results to students, parents, & teachers.
• Restricts students from grading others’ papers, posting scores
publicly, or going through student papers to find one’s own
• Parents/ students can review test scores & school records.
Communicating classroom assessment results
• Assessment is primarily to help students learn & achieve more
• Class results must be communicated to parents to enable student
Explaining standardized test results
• Be sure you understand the test results yourself.
• It may be sufficient to explain test results in general terms.
• Use percentile ranks rather than IQ or grade equivalents.
• Describe the SEM & confidence intervals if you know them.
Taking student diversity into account
• Developmental differences
• Test anxiety
• Cultural bias
• Language differences
Accommodating students with special needs
• Modify format of test
• Modify response format
• Modify timing
• Modify setting
• Administer only part of the test
• Use instruments that are more compatible with students’ level