EPSY440 - Evaluation
Chapter Nine Notes (Nitko, 2001)
Essays have a long (perhaps the longest) history among paper and pencil tests.
Teachers use essays to assess higher-order thinking skills such as explanation, communication, comparison, contrasting, analysis, synthesis, and evaluation, as well as to assess writing skills.
The two major types of essays are:
restricted response - the item restricts or limits what the student is required to answer.
extended response - the item allows students free rein in expressing their ideas and in relating and organizing those ideas; no single answer is correct.
Essay items should ask students for more than simple recall; they should ask students to apply their knowledge to new situations.
interpretive exercises (context-dependent) - the student is required to write the essay based on accompanying material.
Extended response essays allow students to express subject-matter knowledge and general writing ability.
The unique quality of essays is they allow students an opportunity to show their ability to write about, organize, express, and explain interrelationships among concepts and ideas.
Preparing for essays encourages the study of broad concepts, whereas studying for objective tests encourages the study of facts.
There is conflicting evidence on the usefulness of essays for improving writing skills - some argue essays increase this skill and students write more, while others argue students may write more, but the writing isn't necessarily of any better quality.
A continually emphasized point throughout this course is that multiple assessments lead to more valid interpretations.
Essays limit the range of content that can be covered, but they allow more in-depth coverage of the learning target being assessed.
To compensate for the narrow range of targets covered by essays, multiple essays should be used over an extended period of time.
Factors that affect reliability of scoring essays include:
Scoring essays carelessly because of time restrictions or a large number of essays to grade is a violation of professional ethics and responsibilities.
A decision needs to be made early whether essays are the appropriate method for assessment to avoid careless scoring.
Using well defined criteria and scoring rubrics can assist in scoring essays appropriately and in a timely manner.
Checklist for judging the quality of essay items:
Again, revision is an absolute and necessary step in the construction of items.
Items #1 & #2 above are considerations in any item format. It is best to focus on the type of response you want rather than on how well the learning target is stated.
Higher-level thinking is assessed best when knowledge is applied to new situations; otherwise you may be assessing the simple recall of factual material.
Your assessment plan should cover a wide range of content and thinking skills (i.e., the range of levels on Bloom's Taxonomy).
Because essays take time to both complete and score, they should be balanced with objective items that cover a range of skills and learning targets.
If essay items are not focused, it will be impossible to distinguish those who know the material from those who do not.
It is good practice to have colleagues or students review your essay items to ensure that everyone interprets what is being asked in the same way.
As with other item formats, the wording and vocabulary should be carefully controlled to avoid confusion and allow for maximum readability.
You should use short-answer, completion, T-F, MC, or matching items if you are assessing simple recall or recognition, and essay items for higher-order thinking (e.g., application, synthesis, evaluation).
Make sure the framework of the item is clearly outlined so students know what they are to respond to, in what amount of time, at what level, and for what audience they are writing.
Students need to know how they will be evaluated so they can focus their response accordingly.
Some key words and phrases that assist in constructing various types of essay items usually include:
Optional essay questions (a choice of which ones to respond to) have been found to lead to inequities in assessment because students cannot be compared as they are answering different questions.
If the only thing being evaluated is students' general writing ability, then the use of optional items would be appropriate.
Two general methods for scoring essays are analytic and holistic scoring rubrics.
analytic scoring rubric - an outline of the major elements students need to include in the response, along with the points assigned to each specific element (the assignment of partial credit should also be outlined).
holistic scoring rubric - a judgment is made of the overall quality of the response.
Holistic scoring rubrics are usually more appropriate for evaluating extended response essays which assess synthesis and creativity.
Holistic scoring rubrics can be set up by:
Holistic scoring rubrics help to score papers faster and view the paper as a working whole, but they don't point out details and can introduce your own bias and errors.
Analytic scoring rubrics can give more detailed information about student strengths and weaknesses by showing which elements gave students the most problems, and some elements can be weighted more heavily than others. However, they are slower to use, well-defined elements are harder to come up with, and the time needed to prepare them can be frustrating.
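The arithmetic behind a weighted analytic rubric can be sketched in a few lines of Python. This is only an illustration, not a procedure from Nitko: the element names, point values, weights, and the helper analytic_score are all hypothetical. It shows how partial credit and unequal weights combine into a total, and how the per-element results point to the area that gave a student the most trouble.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """One scored element of a hypothetical analytic rubric."""
    name: str
    max_points: float    # full credit for this element
    weight: float = 1.0  # larger weight = element counts more heavily


def analytic_score(rubric, awarded):
    """Return (weighted total, weighted maximum, fraction earned per element)."""
    total = 0.0
    maximum = 0.0
    fractions = {}
    for element in rubric:
        # Missing elements earn zero; anything below max_points is partial credit.
        points = awarded.get(element.name, 0.0)
        if not 0 <= points <= element.max_points:
            raise ValueError(f"points out of range for {element.name}")
        total += points * element.weight
        maximum += element.max_points * element.weight
        fractions[element.name] = points / element.max_points
    return total, maximum, fractions


# Illustrative rubric and scores (assumed values, not taken from the text).
rubric = [
    Element("thesis", 4, weight=2.0),
    Element("evidence", 4, weight=2.0),
    Element("organization", 4),
    Element("mechanics", 4, weight=0.5),
]
awarded = {"thesis": 3, "evidence": 2, "organization": 4, "mechanics": 4}

total, maximum, fractions = analytic_score(rubric, awarded)
weakest = min(fractions, key=fractions.get)
print(f"{total:g} / {maximum:g}; weakest element: {weakest}")
```

Writing the elements, point values, and weights down before scoring is the time-consuming part noted above; once they exist, the same table can be applied to every paper in the same way.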
annotated holistic rubric - a combination (hybrid) of the analytic and holistic scoring rubrics: quality levels are defined and papers are scored holistically, then brief comments are made pointing out strengths and weaknesses of the response.
More scoring suggestions:
This Webpage designed and updated (10/29/01) by Ron Dugan, University at Albany, State University of New York.