EPSY440 - Evaluation


Chapter Eleven Notes (Nitko, 2001)


Performance assessments are more appropriate when learning calls for the combination of complex thinking and skills and their application.

Performance assessment - presents hands-on tasks and uses clearly defined criteria for evaluation.

These usually require having students make something, produce a report, or give a demonstration.

Rule-of-Thumb: Simple learning targets require simple assessment; complex learning targets require complex assessment.

Every performance assessment requires a task and criteria.

Adequate criteria will improve the reliability and validity of your evaluation.

Having criteria informs the student of what is expected and what constitutes good and poor performance, and it allows the assessor to monitor and evaluate progress.

Performance task - an activity which requires the student to demonstrate achievement by producing an extended written or spoken answer, engaging in group or individual activities, or creating a product.

Two aspects of a student's performance that can be assessed are the process and the product.

Because performance tasks are complex, they offer an opportunity to assess students on several learning targets and at different levels of correctness.

Scoring rubric - a coherent set of rules used to assess a performance, usually in the form of a checklist or rating scale.

For scales, each point on the continuum is accompanied by verbal descriptions of what is required for that score (see figure 11.2 on p. 243).

General scoring rubric - not specific to a particular task but serves to develop specific rubrics.

Specific scoring rubric - a scoring scale that applies a general scoring rubric to a specific situation.

General scoring rubrics that can be applied across situations help to develop specific scoring rubrics for each situation, and increase the reliability and validity of evaluation (see figure 11.2 on p. 243).

When performance tasks relate to real-world experiences (e.g., planning a trip, making budgets, comparing local politicians' viewpoints from the press), they help students connect classroom learning to real contexts.

Again (this is review), validity is improved when assessment tasks are aligned with the learning targets and measure the range of targets taught.

Also as a review point, multiple assessment formats lead to the greatest reliability and validity in evaluating achievement of the range of learning targets.

Alternative assessment and authentic assessment are not necessarily the same as performance assessment:

Authentic assessments present students with tasks that are educationally meaningful and connect learning to real-world scenarios (e.g., reviewing literature to form an opinion).

Alternative assessment is simply another term for assessments that stand in contrast to standardized or objective-based assessments.

Performance assessments are alternative assessments, but they are not always authentic assessments.

Authentic assessments require:

  1. emphasis on application,
  2. focus on direct assessment,
  3. use of realistic problems, and
  4. encouragement of open-ended thinking.

 

Howard Gardner's Theory of Multiple Intelligences

Howard Gardner proposed a theory of seven intelligences (then eight, now nine) in place of the single, unified intelligence that is usually postulated. Many educators feel students perform differently depending on their proficiency in each of these intelligences, and they also feel performance assessment is a valid way of measuring them. That is, performance assessment is especially amenable to the theory of multiple intelligences. These intelligences are as follows:

Linguistic - the capacity to use one or more languages for self-expression and understanding others.

Logical-mathematical - capacity to understand scientific/logical principles, or to use quantitative and mathematical reasoning.

Spatial - ability to represent the world spatially in your mind the way pilots, chess players, artists, or architects do.

Bodily-kinesthetic - using parts or all of the body to solve a problem, make a product, or perform in ways similar to athletes, actors, or dancers.

Musical - ability to mentally process music in a way that recognizes and remembers patterns, and manipulates music to solve problems or express understanding.

Interpersonal - the ability to understand other people and relate to them meaningfully.

Intrapersonal - the ability to understand and know yourself, your limitations, strengths, weaknesses, goals, etc.

Naturalist - the capacity to understand nature and the modern world by being able to discriminate and classify living, non-living, and human-made things.

Existential - this is an intelligence still under investigation that consists of a natural inclination to ask questions about the world and our existence, and to explore the answers to these.

These intelligences:

  1. are necessary for survival,
  2. are evolutionary capacities, and
  3. can be detected by specific brain activity

and inform teaching practices by teaching students (and teachers) that:

  1. students can be smart in several different ways,
  2. students can value the differences of intelligences in themselves and others,
  3. students can use practice to improve their intelligences to the best of their abilities, and
  4. students can demonstrate achievement of learning targets using two or more intelligences.

Multiple Intelligence Assessment Menu - the menu lists several types of activities that can be used to assess students on the various intelligences (see figure 11.4 on p. 246).

Although the literature on multiple intelligences is extensive and the theory is a favorite of teachers and students alike, research does not support its validity. Linking activities and assessment to the theory does seem to enhance motivation, however, which indirectly leads to higher achievement.

 

Types of Performance Assessments

Structured, on-demand tasks - the teacher decides on the materials, specifies the instructions for performance, describes the outcomes students should strive for, and gives students opportunities to prepare for the assessment.

paper & pencil tasks - focus is on a written product or the process a student uses in solving a problem; can be closed-response (the question constrains the answer) or open-response (multiple acceptable answers expressed in a variety of ways).

Example - write an alternative ending to a story

tasks requiring other equipment and resources - require students to do something with equipment and resources other than paper and pencil.

Example - perform a chemical experiment by mixing substances and explaining the results

Naturally occurring (typical performance) tasks - waiting for the performance to occur naturally and assessing it at that point.

Example - watching how a student reacts when faced with mental disequilibrium or conflict

Long-term projects - activities assessed over a length of time integrating many skills and learning targets.

individual student projects - an activity that results in a product, model, functional object, substantial report, or a collection.

Example - constructing a poster board representing major life events occurring over a semester

group projects - an assessment technique that evaluates the ability of students to work together cooperatively and appropriately to produce a high-quality project.

Example - students reading a chapter then having to "teach" the class the material

combined group-individual projects - activities where students work collaboratively on a project then prepare individual reports without assistance of group members.

Example - conducting a group experiment then each individual writing their own results and conclusions based on the findings

Portfolios - a limited collection of a student's work to either present his/her best work or show growth or progress over time.

best-work portfolios - focus on presenting the student's best final products.

Example - the portfolios common to artists and architects

growth and learning-progress portfolios - focus on monitoring learning and thinking progression to diagnose difficulties and guide new learning and thinking.

Example - a creative writer periodically hands in the portfolio for critique and feedback, then makes revisions and incorporates new learning into future pieces

Demonstrations - an on-demand performance where a student is required to show he/she can use knowledge and skills to complete a well-defined complex task.

Example - baking a cake or making pasta in a home economics course

Experiments - another on-demand performance where students are required to plan, conduct, and interpret the results of an empirical research study, then report the results.

Example - conducting research as part of an advanced placement biology course

Oral Presentations and Dramatizations - permitting students to verbalize knowledge and use oral skills via interviews, speeches, or oral presentations.

Example - two debate teams arguing the pros and cons of military reaction to terrorism

Simulations and contrived situations - on-demand events that happen under controlled conditions and attempt to mimic naturally occurring events.

standardized patient format - an actor is trained to display symptoms of a particular disorder or other malady.

Example - the tasks common in CPR courses

computerized adaptive audiovisual simulations - multimedia simulations using the latest technologies to recreate real world scenarios to which students must respond.

Example - flight simulators or car driving scenarios/simulations

computerized adaptive text scenarios - similar to audiovisual simulations except text replaces multimedia.

Example - computerized "invasion" scenario where students are presented with ongoing dilemmas and must choose among options, each of which leads to a consequent scenario
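A branching text scenario of this kind can be pictured as a small map from each dilemma to the dilemmas its options lead to. The sketch below is only an illustration under assumed names: the `SCENARIO` contents, the node labels, and the `play` function are all hypothetical and are not taken from Nitko or from any particular software.

```python
# Hypothetical branching scenario: each node holds the text shown to the
# student and a mapping from each available option to the next node.
SCENARIO = {
    "start": {
        "text": "Scouts report a foreign force gathering at the border.",
        "options": {"negotiate": "talks", "fortify": "siege"},
    },
    "talks": {
        "text": "The envoy offers terms.",
        "options": {"accept": "treaty", "refuse": "siege"},
    },
    "siege": {"text": "The city is surrounded.", "options": {}},
    "treaty": {"text": "A treaty is signed.", "options": {}},
}


def play(scenario, choices, start="start"):
    """Follow a list of student choices and return the nodes visited."""
    node = start
    path = [node]
    for choice in choices:
        options = scenario[node]["options"]
        if not options:
            # An ending has been reached; ignore any remaining choices.
            break
        if choice not in options:
            raise ValueError(f"{choice!r} is not an option at {node!r}")
        node = options[choice]
        path.append(node)
    return path


path = play(SCENARIO, ["negotiate", "accept"])
```

The returned path records which dilemmas the student faced, which is the kind of record an assessor could later compare against a rubric.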

All of the above types of performance assessments (except those requiring groupwork) can be used for individuals, groups, or a combination of both.

Shortcomings of naturally occurring events include:

  1. consuming large amounts of time waiting for the event to happen,
  2. having little control over when and how performances will occur,
  3. not being able to ensure all students will perform the same task,
  4. not being able to ensure students will be performing under the same conditions, and
  5. inefficiently using the teacher's time.

Projects require creativity, originality, and a sense of aesthetics. Their usefulness as an assessment tool depends on:

  1. the teacher and students being clear that the project focuses on one or more learning targets,
  2. each student doing his or her own work,
  3. each student having equal access to resources, and
  4. fairly evaluating each project irrespective of type and materials used.

Inequalities and biases in evaluating individual projects can be overcome by:

  1. explicitly defining the learning targets being assessed,
  2. identifying specific characteristics and qualities of the final project most strongly linked to learning target(s),
  3. defining levels of quality for each characteristic,
  4. defining the weight given to each characteristic,
  5. defining the scoring rubric you will be using, and
  6. limiting the resources with which students can complete the project.
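As a worked illustration of steps 3 through 5, the sketch below combines quality levels and weights into a single project score. The characteristics, the weights, the four-level scale, and the `weighted_score` function are all assumptions made up for this example; they are not drawn from the textbook.

```python
# Hypothetical rubric: the weight shows how strongly each characteristic
# counts toward the learning target (integers keep the arithmetic exact).
WEIGHTS = {"content accuracy": 5, "organization": 3, "presentation": 2}
MAX_LEVEL = 4  # assumed quality levels: 1 (poor) through 4 (excellent)


def weighted_score(ratings, weights=WEIGHTS, max_level=MAX_LEVEL):
    """Convert per-characteristic quality levels into a 0-100 score."""
    if set(ratings) != set(weights):
        raise ValueError("ratings must cover exactly the rubric characteristics")
    for name, level in ratings.items():
        if not 1 <= level <= max_level:
            raise ValueError(f"rating for {name!r} must be 1 to {max_level}")
    earned = sum(weights[name] * ratings[name] for name in weights)
    possible = sum(weights.values()) * max_level
    return 100 * earned / possible


# One student's project, rated on each characteristic.
ratings = {"content accuracy": 4, "organization": 3, "presentation": 2}
score = weighted_score(ratings)  # (5*4 + 3*3 + 2*2) / (10*4) = 33/40
```

Because every project is rated on the same characteristics with the same weights, two projects built from very different materials are compared on the learning targets rather than on appearance.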

Group projects require management goals that include:

  1. monitoring individual students to make sure they are making real progress,
  2. mentoring students to help them overcome operational problems beyond their control (e.g., attrition),
  3. mentoring students to keep them focused on the project, and
  4. monitoring procedures and processes students are using to assure they will address the learning targets.

Group projects require management strategies that include:

  1. clarifying the outcome you expect,
  2. putting your expectations in writing,
  3. clarifying standards to be used for evaluation,
  4. letting students participate in setting up the standards,
  5. clarifying deadlines,
  6. requiring progress reports, and
  7. minimizing plagiarism opportunities.

In group projects, both group and individual standards must be set up. Research has shown that group and individual accountability leads to higher achievement.

A general scoring rubric (see Figure 11.9 on p. 254) can be used to assess collaboration and cooperation standards for group projects.

Portfolios are usually assessed using annotated holistic scoring rubrics where the student is placed into levels depending on the match between the level description and the student's work (refer to Figure 11.10 on p. 256 used to assess the mathematics portfolio outlined in Table 11.3 on p. 255).

Portfolio culture model - a model of conceptual change used to assess growth and learning-progress portfolios where the portfolio is made the center of instructional planning and activities such that students interact intensively with the portfolio contents. These portfolios should include:

  1. authentic work,
  2. a record of conceptual development, and
  3. a record of reflective activity.

Not all performance activities conducted during a course will be assessment activities as well, but they can be used for both formative assessment and the final (summative) evaluation.

In order for your performance assessment tasks to be authentic, they must:

  1. require students to use knowledge to do a meaningful task,
  2. be complex and require students to use a combination of different skills, knowledge, and abilities,
  3. require high-quality, polished, complete, and justifiable responses, performances, or products,
  4. clearly specify standards and criteria for assessing the possibly multiple correct answers, performances, or products,
  5. simulate the way students will use their knowledge, skills, and abilities in the real world, and
  6. present students with "ill-structured" activities representative of the real world where problems are not simple ones requiring clear-cut answers.

Advantages of performance assessments include:

  1. clarifying the meaning of complex learning targets,
  2. assessing the ability "to do,"
  3. consistency with the modern learning theory of constructivism,
  4. requiring the integration of knowledge, skills, and abilities,
  5. the closer linking of activities with assessment,
  6. broadening the approach to student assessment, and
  7. letting teachers assess the processes students use as well as the products.

Performance assessments also have several disadvantages including:

  1. difficulty in constructing complex tasks,
  2. difficulty in constructing high-quality rubrics,
  3. completion of tasks requiring large amounts of time,
  4. scoring tasks requiring large amounts of time,
  5. scoring having lower reliability (can be ameliorated with well-constructed rubrics),
  6. performance on one task providing little evidence of performance on other tasks (generalizability),
  7. performance tasks not being able to assess all learning targets well (e.g., recall),
  8. complex tasks becoming discouraging to lower-performing students,
  9. tasks underrepresenting the learning of different cultural groups (likely to accentuate differences), and
  10. performance assessments becoming corruptible through "coaching."


This Webpage designed and updated (11/4/01) by Ron Dugan, University at Albany, State University of New York.