Yale Center for Teaching and Learning

Designing Quality Multiple Choice Questions

Multiple choice questions remain a commonly used tool to assess student learning because of their ease of implementation. They allow instructors to sample from a wide range of course material, and are also relatively easy to grade. Additionally, instructors often have access to test bank questions that they can use in their courses.  In designing or choosing multiple choice questions for assessments, instructors can take steps to make sure that their questions are accurate and reliable measures of student achievement.

With regard to format, multiple choice questions are composed of (1) a question stem and (2) a set of answer choices: one correct option plus several distractors.  In general, the literature reports that a minimum of three answer choices is sufficient (Haladyna et al., 2002).  Instructors can consider the general guidelines below when designing high quality stems and distractors for their multiple choice questions.

In particular, good stems:

  • Are clear and brief.
  • Include the central idea.
  • Are worded in the positive, and avoid words such as “not” or “except.”  

Good distractors:

  • Target major student misconceptions on the material.
  • Are plausible. 
  • Avoid phrases such as “all of the above.”
  • Include “none of the above” only sparingly and with caution. 
  • Do not overlap with one another in terms of content. 
  • Are presented in a logical order.  
  • Vary the location of the correct answer from question to question. 
  • Are written carefully so as not to give clues that make them easy to rule out. For example, they are grammatically consistent with the stem and similar in format and length to the correct answer.  

Examples

In the examples below, basic recall multiple choice questions are used to illustrate stems and distractors of varying quality. Consider in these examples that one of the course learning objectives is for students to be able to define potential energy, the energy that is stored by an object.

#1. Good Stem, Poor Distractors

Potential energy is: 

  • a) the energy of motion of an object.
  • b) not the energy stored by an object.
  • c) the energy stored by an object.
  • d) not the energy of motion of an object.

In this scenario, the stem is of good quality: it is clear, brief, contains the central idea of the question, and is worded in the positive sense. The distractors, however, are problematic. Options b and d are written in the negative sense, which is confusing. Options c and d also overlap in content (both are true statements about potential energy), making it difficult for students to narrow the choices to a single best answer. Further, the answer choices are not arranged in a logical order and would benefit from grouping similar content together.

#2. Poor Stem, Good Distractors

Potential energy is not the energy: 

  • a) of motion of a particular object. 
  • b) stored by a particular object.  
  • c) relative to the position of another object. 
  • d) capable of being converted to kinetic energy.  

In this question, the stem contains the word “not.”  As students read the choices, the item becomes more a test of parsing the negative wording than an assessment of their understanding of potential energy. Ideally, the stem should be written in the positive sense.   

#3. Good Stem, Good Distractors 

Potential energy is: 

  • a) the energy of motion of an object. 
  • b) the energy stored by an object.  
  • c) the energy emitted by an object.  

In this example, both the stem and the distractors are of good quality: the stem is clear, brief, and worded in the positive sense, and the distractors are plausible, do not overlap, and are similar in format and length.   

Recommendations

  • Align MC questions with course learning objectives and class activities. Use backward course design: first write course learning objectives, then develop class activities and assessments, including multiple choice questions, that align with those objectives (Wiggins & McTighe, 2005).
  • Use Bloom’s Taxonomy to develop MC questions that assess different levels of cognition. The revised (2001) taxonomy describes six levels of thinking skills, from lower- to higher-order: Remember, Understand, Apply, Analyze, Evaluate, and Create. Develop or select questions that target a range of these levels, keeping in mind that it may be more challenging to write multiple choice questions at the Evaluate and Create levels (Anderson et al., 2001; Krathwohl, 2002).
  • Provide opportunities for students to practice taking MC questions before exams. A common misconception is that students learn optimally by solely reviewing their course notes and/or reading the textbook. Research has shown that more active cognitive involvement, in particular practicing retrieval of material, leads to better learning. Instructors can support such learning in several ways. They can use multiple choice questions in class in a low-stakes manner, giving students retrieval practice as they prepare for the test and giving themselves feedback they can use to modify their instruction. They can also encourage students to answer practice multiple choice questions aligned with course material while studying for tests outside of class (Karpicke & Blunt, 2011).
  • Use best practices for MC item writing. Haladyna et al. (2002) reviewed the literature and published a revised taxonomy of multiple choice item-writing guidelines, which is useful to consult when designing MC questions.
  • Perform item analysis to identify questions that perform poorly on item difficulty and item discrimination measures, and revise them as appropriate. Many institutions have statistical software programs that provide reports on the quality of multiple choice questions. Two common measures are item difficulty and item discrimination. Item difficulty is typically reported as the percentage of test takers who chose the correct answer for an item; working within classical test theory, McCowan and McCowan (1999) describe an optimal difficulty guideline of 63% for a 4-option multiple choice test. Item discrimination measures how well an item distinguishes between higher and lower scorers on the test. It is typically reported as the point-biserial, a correlation coefficient: items that discriminate well are answered correctly more often by higher scoring students and therefore show a higher positive correlation. Discrimination indexes of 0.4 and above are good; 0.2 and below are poor. Instructors can use these data to remove or revise lower quality MC questions (a brief computational sketch of both measures follows this list). 
  • Explore the quiz tools available in Canvas, Yale’s learning management system.    
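
For instructors who want to compute item statistics themselves, the sketch below illustrates the two measures described above. It is a minimal example in Python, assuming that scored responses are available as a 0/1 matrix (one row per student, one column per item); the function name, the NumPy-based implementation, and the sample responses are illustrative and are not drawn from any particular institution's reporting tools.

    # Minimal sketch of classical item analysis (illustrative only).
    # Assumes a 0/1 score matrix: rows = students, columns = items.
    import numpy as np

    def item_analysis(scores):
        """Return (difficulty, discrimination) for each item in a 0/1 score matrix."""
        scores = np.asarray(scores, dtype=float)
        n_students, n_items = scores.shape

        # Item difficulty: proportion of students who answered the item correctly.
        difficulty = scores.mean(axis=0)

        # Item discrimination: point-biserial correlation between the item score
        # and the total score on the remaining items (the "corrected" total).
        discrimination = np.empty(n_items)
        for i in range(n_items):
            rest_total = scores.sum(axis=1) - scores[:, i]
            discrimination[i] = np.corrcoef(scores[:, i], rest_total)[0, 1]

        return difficulty, discrimination

    # Hypothetical responses from six students on four items (1 = correct).
    responses = [
        [1, 1, 1, 0],
        [1, 0, 1, 1],
        [1, 1, 0, 1],
        [0, 1, 1, 0],
        [1, 0, 0, 0],
        [0, 0, 1, 0],
    ]

    difficulty, discrimination = item_analysis(responses)
    for i, (p, r) in enumerate(zip(difficulty, discrimination), start=1):
        print(f"Item {i}: difficulty = {p:.0%}, point-biserial = {r:.2f}")

Under the guidelines above, an item whose difficulty falls far from roughly 63% on a 4-option test, or whose point-biserial is at or below 0.2, would be a candidate for revision or removal.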

References  

Anderson, L. W. (Ed.), Krathwohl, D. R. (Ed.), Airasian, P. W., Cruikshank, K. A., Mayer, R. E., Pintrich, P. R., Raths, J., & Wittrock, M. C. (2001). A taxonomy for learning, teaching, and assessing: A revision of Bloom’s Taxonomy of Educational Objectives (Complete ed.). New York: Longman.

Haladyna, T. M., Downing, S. M., & Rodriguez, M. C. (2002). A review of multiple-choice item-writing guidelines for classroom assessment. Applied Measurement in Education, 15(3), 309-334.

Karpicke, J. D., & Blunt, J. (2011). Retrieval practice produces more learning than elaborative studying with concept mapping. Science, 331, 772-775.

Krathwohl, D. R. (2002). A revision of Bloom’s Taxonomy. Theory Into Practice, 41(4), 212-218.

McCowan, R. J., & McCowan, S. C. (1999). Item analysis for criterion-referenced tests.

Wiggins, G. P., & McTighe, J. (2005). Understanding by design. Moorabbin, Vic.: Hawker Brownlow Education.