EPSY440 - Evaluation


Chapter Eight Notes (Nitko, 2001)


multiple-choice items - test items that contain a stem followed by alternatives where the student must choose the correct/best alternative that answers the stem.

stem - the part of the item that contains the question or statement.

alternatives, responses, choices, options - different names for the list of possible answers provided for the stem.

keyed answer, keyed alternative, key - alternative names for the correct or best answer to the stem.

distractors, foils - the incorrect alternatives which are designed to distract or foil the less knowledgeable student.

Foils and distractors need to be plausible for the test taker, especially the student who doesn't have the required knowledge.

interpretive exercises - multiple-choice items that depend on additional information (interpretive material), such as graphs, charts, paragraphs, objects, or pictures, to assess higher levels of knowledge (e.g., application).

Remember that valid assessments require student information from several different formats (as in this class, you take objective tests, write reflections, and participate in in-class exercises for observation).

The nice thing about multiple-choice (MC) items is that you can adjust their difficulty level by adjusting the homogeneity of the alternatives: the more alike the alternatives are, the finer the distinction the student must make and the harder the item becomes. That is, learning for any specific target falls on a continuum that can be measured through MC items.

The basic purpose of assessment is to identify students who have attained the appropriate level of knowledge regarding any particular learning target.

MC items come in several varieties (refer to Figure 8.4 in the textbook). The most common are the best-answer and correct-answer types, but educators should also be familiar with the other varieties.

The MC item is appropriate for certain types of skills, such as comprehension and application, but other formats (e.g., essays) are more appropriate for measuring integration, organization, and other types of skills.

Advantages of MC items include:

  1. assessment of a greater variety of learning targets,
  2. minimization of bluffing or "dressing up" answers,
  3. a focus on reading and thinking,
  4. less chance of guessing correctly than with true-false items, and
  5. diagnosis of learning deficiencies from the distractors students choose.
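
The guessing advantage in point 4 can be made concrete with a little arithmetic. The sketch below assumes a student who guesses blindly and picks every option with equal probability, which is an idealization not stated in these notes. The function names are hypothetical and serve only as illustration.

```python
from fractions import Fraction


def blind_guess_chance(num_options: int) -> Fraction:
    """Chance of hitting the key when choosing uniformly at random (assumed model)."""
    if num_options < 2:
        raise ValueError("an item needs at least two options")
    return Fraction(1, num_options)


def expected_correct_by_guessing(num_items: int, num_options: int) -> Fraction:
    """Expected number of items answered correctly by blind guessing alone."""
    return num_items * blind_guess_chance(num_options)


print(blind_guess_chance(2))                 # true-false item: 1/2
print(blind_guess_chance(4))                 # four-option MC item: 1/4
print(expected_correct_by_guessing(20, 2))   # 20 true-false items: 10
print(expected_correct_by_guessing(20, 4))   # 20 four-option MC items: 5
```

Under this assumption, a 20-item true-false test yields about 10 correct answers from guessing alone, while a 20-item test with four options per item yields about 5.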

Disadvantages of MC items include:

  1. choice of fixed options eliminating creativity and expression,
  2. poorly written, superficial items measuring only recall of factual knowledge,
  3. brighter students being penalized by having to select only one choice from poorly constructed items,
  4. giving the false impression that knowledge is standardized and that there is one correct answer to every problem, and
  5. misuse through sole reliance on MC tests for high-stakes testing.

decontextualized items - items taken from artificial contexts that don't resemble real life.

Educators need to make sure they match the assessment task to the learning targets and student achievement they want to measure.

Some researchers recommend not using MC items when:

An exception to the above restrictions is when there is a need to assess a large number of students over a large number of learning targets and machine scoring is available.

Other researchers suggest not using the MC item when there are only a few students and the test will only be used once. However, if you plan on teaching the same subject (as I do), then you want to develop a pool of tested items.

Proper crafting of MC items requires 5 basic skills as follows:

  1. focusing items to assess specific learning targets,
  2. making the stem a question or problem to be solved,
  3. writing concise alternatives free of ambiguity,
  4. writing plausible distractors, and
  5. editing items to remove flaws.

Editing test items is a necessity, and you should have your test reviewed by colleagues or peers who are knowledgeable about the subject and/or testing.

Suggestions for crafting quality MC items include:

  1. Asking a direct or implied question to make clear the intent of the item.
  2. Putting alternatives at the end of the stem to avoid confusion and extra mental work.
  3. Controlling sentence structure and vocabulary to assess students at the appropriate level.
  4. Avoiding "window dressing" that detracts from the intent of the item.
  5. Avoiding negatively worded items that confuse (or highlighting the negative wording).
  6. Avoiding asking for personal opinions which are neither correct nor incorrect.
  7. Avoiding the use of verbatim wording from the text which encourages memorization (vs. comprehension).
  8. Avoiding dependent items where answers are clued by previous items (except for interpretive exercises).
  9. Putting definitions in the stem (note that the textbook suggests putting them in the alternatives; this is incorrect).
  10. Avoiding the use of specific determiners (always, never, every, often, usually, frequently).

When crafting the alternatives or foils:

  1. Make sure alternatives are plausible and functional.
  2. Construct homogeneous alternatives which are appropriate to the stem.
  3. Place repeated words in the stem to increase readability.
  4. Use correct and consistent punctuation to avoid clueing and increase readability.
  5. Arrange the alternatives in some type of logical, numerical, or alphabetical order.
  6. Make sure the alternatives are grammatically consistent with the stem (or vice-versa).
  7. Avoid overlapping alternatives.
  8. Avoid using a set of true-false alternatives that would be more appropriate for Multiple True-False format.
  9. Avoid "none of the above" options, which are less reliable and make the item more difficult.
  10. Avoid "all of the above" or use it sparingly as it gives clues to those with partial knowledge.
  11. Avoid verbal clues in the alternatives that may "clang" or associate with the stem.
  12. Avoid technical or unfamiliar wording that is beyond the knowledge base of the student.
  13. Don't make the distractor so plausible it could be the correct answer.

Additional suggestions for crafting the correct alternative:

  1. Make sure there is only one correct or best answer.
  2. Ensure knowledgeable others can agree on the correct answer.
  3. Make the correct answer grammatically consistent with the stem.
  4. Avoid placing correct answers in any type of pattern.
  5. Don't use textbook wording or stereotyped phrasing (except with adult students).
  6. Make the correct answer about the same length as the incorrect alternatives.

The idea behind these suggestions is to make sure the items are not measuring the wrong thing, and that less knowledgeable students cannot answer the items correctly based on clues or other faults.
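
Two of the clues mentioned above, a pattern in the position of the correct answer and a correct answer that stands out by its length, can be checked mechanically. The following is a minimal sketch, not a method from the textbook: the item layout (a dictionary with "options" and "key") and both function names are assumptions made for illustration, and the sample items are invented.

```python
from collections import Counter


def key_position_counts(items):
    """Count how often each option letter holds the keyed answer."""
    return Counter(item["key"] for item in items)


def items_with_longest_key(items):
    """Return the indexes of items whose key is longer than every distractor."""
    flagged = []
    for index, item in enumerate(items):
        options = item["options"]
        key_text = options[item["key"]]
        distractors = [text for letter, text in options.items()
                       if letter != item["key"]]
        if all(len(key_text) > len(text) for text in distractors):
            flagged.append(index)
    return flagged


# Invented sample items in the assumed layout.
items = [
    {"options": {"A": "54", "B": "56", "C": "58", "D": "64"}, "key": "B"},
    {"options": {"A": "key", "B": "foil",
                 "C": "the stem, which states the question or problem",
                 "D": "option"}, "key": "C"},
    {"options": {"A": "45", "B": "54", "C": "56", "D": "63"}, "key": "B"},
]

print(key_position_counts(items))      # how often each letter is the key
print(items_with_longest_key(items))   # items where length may clue the answer
```

A lopsided count or a flagged item is only a prompt to look again. Whether the item is actually flawed remains a judgment call for the test writer.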

Use the following checklist to assess the quality of MC items constructed by you or others:

  1. Does the item assess an important aspect of the relevant learning target?
  2. Does the item match the test blueprint (i.e., the Table of Specifications)?
  3. Does the item ask a direct question or pose a specific problem?
  4. Is the item paraphrased (vs. verbatim from the textbook)?
  5. Are vocabulary and sentence structure at an appropriate and nontechnical level?
  6. Are the alternatives plausible?
  7. Are alternatives based on common misconceptions of the less knowledgeable?
  8. Is the correct answer independent of answers to other items?
  9. Are alternatives a homogeneous set and appropriate to the content of the stem?
  10. Is there minimal use of "none of the above" and "all of the above"?
  11. Is there only one correct or best answer?

matching exercise - an item that provides a list of premises (statements, concepts) and a list of responses, as well as the directions for matching the two.

Matching exercises can have more premises than responses or more responses than premises, but they should never have an equal number of both (perfect matching), because the last match can then be made by elimination.
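
The problem with perfect matching can be shown with a short sketch. It assumes each response may be used only once, which the notes do not state explicitly, and the function name is hypothetical.

```python
def responses_left(responses, already_matched):
    """Responses still available once the student has matched what they know."""
    return [r for r in responses if r not in already_matched]


# Perfect matching: four premises and four responses.
# A student who knows three matches gets the fourth for free.
print(responses_left(["A", "B", "C", "D"], ["A", "B", "C"]))
# ['D']

# Two extra responses: the same student still has to choose among three.
print(responses_left(["A", "B", "C", "D", "E", "F"], ["A", "B", "C"]))
# ['D', 'E', 'F']
```

With the extra responses in place, the last premise still requires knowledge rather than elimination.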

Matching exercises are similar to multiple-choice items in that each premise is a separate item (and should be numbered accordingly) but they have the same list of alternatives to choose from.

Matching exercises should be used when you have several questions that require the same alternatives.

Advantages: Matching exercises are space-saving, compact, objective, able to assess associations and relationships, and can match words and phrases with pictures and objects or locations on maps and diagrams.

Disadvantages: Matching exercises encourage rote memorization of lists and are often used by teachers to assess rote associations of names and dates.

One of the hard things about constructing good matching exercises is coming up with a list of homogeneous premises and responses that refer to the same category of things, because every response should be a plausible alternative for each premise.

Table 8.8 in the textbook provides a checklist for assessing the quality of matching exercises constructed by you or someone else. These are discussed below with question numbers in parentheses:

(1 & 2) - Items should measure what is important and match the assessment plan (i.e., Table of Specifications).

(3) - Matching exercises should be homogeneous, although the degree depends on the level of maturity and educational attainment of the students.

(4) - The basis for matching should be clearly explained and long directions should be avoided.

(5) - All responses should be functional alternatives for each premise and usual clues should be eliminated.

(6) - The lists of premises and responses should be short because:

(7) - Perfect matching should be avoided by including one or two responses that don't match any premise.

(8) - Use longer phrases in premises, shorter phrases in responses to make items easier to read.

(9) - Responses should be arranged in some type of logical order to add to clarity.

(10) - Identify premises with numbers (as each is a separate item) and responses with letters.
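
Points (7), (9), and (10) describe a layout that is easy to sketch. The example below is only an illustration under stated assumptions: the function name is hypothetical, alphabetical order stands in for "some type of logical order," and the sample content was made up for this sketch rather than taken from the textbook.

```python
import string


def format_matching_exercise(directions, premises, responses):
    """Lay out a matching exercise: numbered premises, lettered responses."""
    lines = [directions, ""]
    # Each premise is a separate item, so it gets its own number.
    for number, premise in enumerate(premises, start=1):
        lines.append(f"{number}. {premise}")
    lines.append("")
    # Responses are put in alphabetical order and identified with letters.
    for letter, response in zip(string.ascii_uppercase, sorted(responses)):
        lines.append(f"{letter}. {response}")
    return "\n".join(lines)


directions = "Match each description with the planet it describes."
premises = [
    "The planet closest to the Sun",
    "The planet known as the Red Planet",
    "The largest planet in the solar system",
]
# Two responses match no premise, so the matching is not perfect.
responses = ["Mercury", "Venus", "Mars", "Jupiter", "Saturn"]

print(format_matching_exercise(directions, premises, responses))
```

The premises are the longer phrases and the responses are single words, in line with point (8).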


This Webpage designed and updated (10/14/01) by Ron Dugan, University at Albany, State University of New York.