Faculty Development

Writing Assessment Questions

The primary goals of assessment in a CME activity are to ensure that the learner has a satisfactory understanding of the material presented, to promote active learning, and to identify areas of deficiency in need of further learning. These goals cannot be achieved unless careful thought is given to the assessment questions so that they test important concepts, follow the activity objectives, evaluate application of knowledge rather than recall of isolated facts, are unambiguous, and do not introduce irrelevant difficulty.

In general, single-best answer (multiple choice) questions are preferable to true-false questions. True-false questions tend to assess recall of random facts, and they often require the examinee to guess what the item writer had in mind because the statement is neither completely true nor completely false but somewhere in between. One type of single-best answer question that is popular to write but equally problematic is the question framed as “Which of the following statements is correct?” or “Each of the following is correct EXCEPT,” which is actually a thinly veiled true-false question. Finally, single-best answer questions that include a “None of the above” option also turn the item into a true-false question, since each option has to be evaluated as more or less true than the universe of unlisted options. In addition, with such an option you can’t tell if the learner really knew the correct answer.

Here are the basic principles of writing a high-quality single-best answer question:

  1. Each item should address ONE important concept that is aligned with an activity objective.

  2. Items assessing application of knowledge are superior to those testing recall of an isolated fact.

  3. Questions should be FOCUSED. An excellent way to evaluate whether an item is focused is the “cover-up test”: cover up the answer options and determine if the question is clear and if a knowledgeable examinee could come up with the answer based only on the stem. “Which of the following statements is correct?” is unfocused and fails the cover-up test.

  4. Distractors (incorrect answer options) should be attractive to the uninformed examinee. Thus they should be homogeneous (in the same category as the correct answer, e.g., diagnoses, tests, treatments, or prognoses), plausible, grammatically consistent and logically compatible with the stem, listed in logical or alphabetical order, and of the same relative length as the correct answer. If you cannot come up with 4 plausible distractors, go with 3 rather than include implausible ones. If 3 options are collectively exhaustive (e.g., Increases, Decreases, Remains the same), that’s all you should use. Use of 4- and 3-option items does not adversely impact reliability and discrimination. Avoid "double options" (e.g., do Y then Z) unless the correct answer and all distractors are double options.

  5. Technical item flaws that benefit the “testwise” examinee and issues that introduce irrelevant difficulty should be avoided. The rationale here is that technical flaws and irrelevant difficulty distract the examinee from the actual content of interest and can impact performance. Examinees may use test-taking strategy (e.g., selecting the longest, most comprehensive option, which is statistically more likely to be the correct answer) rather than content knowledge. In fact, flawed items have been demonstrated to disproportionately penalize those examinees with better content knowledge (i.e., those who score higher on the non-flawed items).

Technical Flaws.

  1. Correct answer is the longest or most comprehensive
  2. Grammatical cues: distractor doesn’t follow grammatically from the stem
  3. Repetition of words in the stem and in the correct answer
  4. Absolute terms (e.g., "always" or "never")
  5. Convergence strategy: correct answer includes the most elements in common with other options
  6. "All of the above": often an obvious give-away answer, which can encourage guessing

Issues Related to Irrelevant Difficulty.

  1. Negatively phrased questions (e.g., those with “except” or “not”)
  2. Tricky or overly complicated stems
  3. Long, complicated, or double options
  4. Inconsistently stated numerical data
  5. Vague terms in the options (e.g., "rarely" or "usually")
  6. "None of the above" as an option
  7. “K-type” items, where one or more options may be correct (e.g., A; B; C; A & B; A, B, & C) 
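Several of the flaws listed above are mechanical enough to screen for automatically before an item goes to review. The following is a minimal sketch in Python, not an established tool: the `Item` class, the `find_flaws` function, and the word lists are hypothetical names chosen for illustration, and the check covers only a handful of the flaws (longest keyed answer, absolute and vague terms, "All of the above," "None of the above," and negative phrasing). Flaws such as grammatical cues or convergence still need a human reader.

```python
import re
from dataclasses import dataclass

# Illustrative word lists taken from the examples in the text; extend as needed.
ABSOLUTE_TERMS = {"always", "never"}
VAGUE_TERMS = {"rarely", "usually"}


@dataclass
class Item:
    stem: str
    options: list[str]
    answer: int  # zero-based index of the keyed (correct) option


def find_flaws(item: Item) -> list[str]:
    """Return labels for the mechanically detectable flaws found in an item."""
    flaws = []

    # Testwise cue: the keyed answer is strictly longer than every distractor.
    lengths = [len(option) for option in item.options]
    longest = max(lengths)
    if lengths[item.answer] == longest and lengths.count(longest) == 1:
        flaws.append("keyed answer is the longest option")

    option_text = [option.lower() for option in item.options]
    option_words = set(re.findall(r"[a-z]+", " ".join(option_text)))
    if option_words & ABSOLUTE_TERMS:
        flaws.append("absolute term in options")
    if option_words & VAGUE_TERMS:
        flaws.append("vague term in options")
    if any("all of the above" in text for text in option_text):
        flaws.append("all-of-the-above option")
    if any("none of the above" in text for text in option_text):
        flaws.append("none-of-the-above option")

    # Irrelevant difficulty: NOT or EXCEPT in the stem.
    if re.search(r"\b(not|except)\b", item.stem, re.IGNORECASE):
        flaws.append("negatively phrased stem")

    return flaws


if __name__ == "__main__":
    item = Item(
        stem="Where does the sun set?",
        options=["North", "South", "East", "West", "None of the above"],
        answer=3,
    )
    print(find_flaws(item))
```

An empty list means only that none of these particular patterns were found; it does not show that the item is focused or that it passes the cover-up test.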


Let’s look at a few examples that highlight the issues associated with true-false questions and with multiple choice questions that include technical flaws or irrelevant difficulty.


True-False

The sun sets in the West. True or False?

This seems like a straightforward question with an obvious answer: everybody knows the sun “rises in the east and sets in the west.” But in fact, that is only a generalization. The informed student of astronomy knows that the sun sets due West on only two days of the year: March 21 and September 21 (the spring and fall equinoxes); on other days of the year, it will set a little north or south of “due west.” So how should a test-taker answer this question? Either answer could be correct (the statement is true two days of the year and sort of true but not exactly the other days), so they have to guess how much precision the instructor wants.


None of the Above

Where does the sun set?

  a. North
  b. South
  c. East
  d. West
  e. None of the above

As mentioned earlier, “none of the above” turns the item into a true-false question, where the test-taker has to evaluate each of the options against all the possible options that were not listed and determine how much truth the instructor wants. The informed student who realizes that the sun actually sets in the northwest or southwest most days of the year will struggle between the fourth option (West) and the fifth (None of the above), since either one could be correct. The partially informed student who has heard the saying about where the sun rises and sets will choose West, and the completely uninformed student may play the odds and select None of the above.


Which of the following statements is correct?

Which of the following statements is correct?

  a. The sun sets in the West.
  b. The sun sets north of true West during the summer.
  c. The sun sets south of true West during the winter.
  d. During most days of the year, the sun sets north or south of true West.

Again, we are dealing with a true-false paradigm, and since very few things in life are absolute certainties, the examinee needs to decide which of the statements is more true than the others. The sun sometimes sets due West and generally sets in a western (southwest or northwest) direction, so (a) could be correct; (b) and (c) are both true statements if you live in the northern hemisphere, but not if you live in Australia; (d) seems like a better option than the others, but it too is not always true since it doesn’t apply to the North or South Pole. So the examinee is left trying to figure out which statement the instructor thought had the most truth.


K-type questions

 Where does the sun set?

  a. North
  b. South
  c. West
  d. A and C
  e. B and C
  f. A, B, and C

K-type questions, which have a variable number of correct answers, similarly turn a multiple choice question into a true-false one. Should (c) West be the correct answer? Or (f) A, B, and C to account for northwest and southwest setting and the Poles? K-type questions have been demonstrated to be less discriminating, less reliable, and more likely to “cue” examinees to the correct answer.


All of the Above

Where does the sun set?

  a. North
  b. South
  c. West
  d. All of the above

For the same reason as with the K-type question, the knowledgeable examinee will be unsure whether to choose (c) West or (d) All of the above.


Negative phrasing

When does the sun NOT set in the west?

  a. In Australia
  b. At the North Pole
  c. At the fall equinox
  d. At the spring equinox

Negatively phrased questions (e.g., ones that include EXCEPT or NOT in the stem) are popular because such questions are often easier to write, but they are problematic because they impose unnecessary cognitive load on the examinee. Moving through the options, examinees can easily forget they are supposed to choose the worst answer and instead instinctively choose the best answer, as they are accustomed to doing. So they may get the question wrong not because they didn’t know the material but because of irrelevant technical difficulty.


Effective Assessment Items

With that background and your (possibly new) knowledge of sun setting behavior, how would you design an effective single-best answer assessment item: an UNAMBIGUOUS question that is FOCUSED and addresses ONE important concept, with homogeneous, attractive distractors and no technical flaws or irrelevant difficulty?


Evaluate this one:

In the United States, the sun sets due west on the 21st day of which of the following months?

  a. January
  b. March
  c. June
  d. August
  e. December

This question tests a single concept and is focused, as evidenced by the cover-up test. Knowing that the sun sets due west only at the spring equinox (March 21) and the fall equinox (September 21), the knowledgeable examinee can answer this question without even looking at the options. Distractors are homogeneous, plausible, and attractive to the uninformed examinee, who might recognize the first days of summer and winter. Inclusion of location (United States) reduces ambiguity by eliminating the exceptions of the North and South Poles. None of the technical flaws or issues of irrelevant technical difficulty discussed above are present.


And this one:

Which of the following best describes the position of the setting sun in the southern hemisphere at the time of the winter solstice compared to its position at the fall equinox?

  a. Sets further north
  b. Sets further south
  c. Sets at the same position

This question also tests a single concept and passes the cover-up test with no technical flaws. It is superior to the last one, though, because it tests application of knowledge (how the sun’s apparent motion across the sky is affected by seasonal changes and hemisphere) rather than recall of an isolated fact (date of the spring equinox). It includes only 3 answer options because there are only 3 plausible relative positions of the setting sun. As discussed earlier, use of lower-option items does not reduce psychometric quality.
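Item writers who want to confirm the keyed answer to an application question like this one can check it numerically. The sketch below is an illustration under simplifying assumptions, not part of the original guidance: it uses the standard geometric relation cos(A) = sin(declination) / cos(latitude) for the sunrise bearing A measured from north, takes sunset as the mirror image on the western side, ignores atmospheric refraction and the size of the sun's disk, and uses about 23.44 degrees for the solstice declination. The function name `sunset_azimuth_deg` and the sample latitude of 35 degrees south are hypothetical choices.

```python
import math


def sunset_azimuth_deg(latitude_deg: float, declination_deg: float) -> float:
    """Compass bearing of sunset in degrees (0 = north, 270 = due west)."""
    lat = math.radians(latitude_deg)
    dec = math.radians(declination_deg)
    cos_az = math.sin(dec) / math.cos(lat)
    if abs(cos_az) > 1:
        # Polar day or polar night: the sun does not cross the horizon.
        raise ValueError("no sunset at this latitude for this declination")
    # acos gives the sunrise bearing east of north; sunset mirrors it to the west.
    return 360.0 - math.degrees(math.acos(cos_az))


if __name__ == "__main__":
    latitude = -35.0  # assumed sample latitude in the southern hemisphere
    equinox = sunset_azimuth_deg(latitude, 0.0)
    # The southern winter solstice falls in June, when the declination is positive.
    winter_solstice = sunset_azimuth_deg(latitude, 23.44)
    print(f"equinox: {equinox:.1f}, winter solstice: {winter_solstice:.1f}")
```

A bearing greater than 270 degrees lies north of due west, so under these assumptions the sun sets further north at the southern winter solstice than at the equinox, which matches the first option.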


