The following guide to developing questionnaire items is based on best practices (Gehlbach & Brinkworth, 2011; Gehlbach & Artino Jr., 2018) that have been tested across more than 40 years of research (Krosnick & Presser, 2010; Schwarz, 1999).
Word items as questions rather than statements and avoid “agree-disagree” response options
Agree-disagree response options may introduce acquiescence bias, the tendency to agree with an item regardless of its content (Wright, 1975). Asking respondents to rate their level of agreement with different statements is also cognitively demanding, which increases respondent error and reduces respondent effort (Fowler, 2009). Instead, use verbally labelled response options that reinforce the underlying topic (e.g., the responses for “How happy are you?” would be not at all happy, slightly happy, somewhat happy, quite happy, extremely happy). Empirical evidence demonstrates that agree-disagree response options diminish item quality (Saris, Revilla, Krosnick, & Shaeffer, 2010); they are among the “worst ways to present items” (Gehlbach & Artino Jr., 2018, p. 361).
Use verbal labels for each response option
Use verbal labels for each response option, rather than labelling only the end points or pairing numbers with the verbal labels. This focuses respondents’ attention and reduces measurement error (Artino Jr. & Gehlbach, 2012).
Ask about one idea at a time
Ask about one idea at a time rather than using double-barrelled items, which ask about two or more ideas in the same question (e.g., instead of asking, “How happy and engaged are you?” ask two questions, one about happiness and one about engagement). If you use double-barrelled items, you risk students responding to only one part of that item (Dillman, Smyth, & Christian, 2014).
Phrase questions with positive language
Phrase questions with positive language rather than with reverse-scored or negative wording, which students tend to have trouble understanding. Negative words are more difficult to process cognitively, so negatively worded items take longer to answer and produce more misresponses (Swain, Weathers, & Niedrich, 2008).
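If you do inherit an instrument that already contains negatively worded items, their scores must be reverse-coded before analysis, which adds a step where coding errors and undetected misresponses can creep in. A minimal sketch of that recoding in Python (the 1–5 scale and variable names are illustrative, not from any particular instrument):

```python
# Recode a reverse-scored item on a 1-5 scale:
# a response of 1 becomes 5, 2 becomes 4, and so on.
SCALE_MIN, SCALE_MAX = 1, 5

def reverse_score(response: int) -> int:
    """Map a raw response onto the reversed scale."""
    return SCALE_MIN + SCALE_MAX - response

raw = [1, 2, 3, 4, 5]
print([reverse_score(r) for r in raw])  # [5, 4, 3, 2, 1]
```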
Use at least five response options per scale
Use at least five response options per scale to capture a wider range of perceptions. Research indicates that the “sweet spot” for the number of response options is about five (Nielsen, Makransky, Vang, & Dammeyer, 2017; Weng, 2004). Relatedly, a scale of about five items that assesses a representative cross-section of a student’s experience should also improve measurement (Gehlbach & Artino Jr., 2018).
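Weng (2004) evaluates the number of response options in terms of coefficient (Cronbach’s) alpha and test-retest reliability. As a rough illustration of how coefficient alpha is computed from a respondents-by-items score matrix, here is a minimal Python sketch (the response data are fabricated for the example):

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Coefficient alpha for a respondents-by-items score matrix."""
    k = scores.shape[1]                         # number of items
    item_vars = scores.var(axis=0, ddof=1)      # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)  # variance of scale totals
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Fabricated responses: 6 respondents x 5 items, each on a 1-5 scale.
scores = np.array([
    [4, 5, 4, 4, 5],
    [2, 2, 3, 2, 2],
    [5, 4, 5, 5, 4],
    [3, 3, 3, 4, 3],
    [1, 2, 1, 2, 1],
    [4, 4, 5, 4, 4],
])
print(f"alpha = {cronbach_alpha(scores):.2f}")
```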
Maintain equal spacing between response options and use additional space to visually separate non-substantive response options
Maintain equal spacing between response options, and use additional space to visually separate non-substantive response options. This reinforces the notion that, conceptually, there is equal distance between each response option, which yields less biased responses. It also helps align the visual midpoint of the scale with its conceptual midpoint, reducing measurement error (Artino Jr. & Gehlbach, 2012). This is especially important if you are administering your questionnaire on paper. Electronic survey platforms such as Qualtrics will space response options equally, but you will still need to add extra space to separate non-substantive response options (e.g., ‘N/A’). To see some examples, check out the resources for evaluating self-efficacy and take a look at this visual guide.
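As a toy illustration of the layout principle (a plain-text mock-up only, not a configuration for any particular survey platform), the sketch below prints a response row with equal gaps between the substantive options and a visibly wider gap before ‘N/A’:

```python
# Plain-text mock-up: equal gaps between substantive options,
# with a wider gap setting the non-substantive 'N/A' apart.
options = ["Not at all happy", "Slightly happy", "Somewhat happy",
           "Quite happy", "Extremely happy"]
gap = " " * 4        # equal spacing between substantive options
na_gap = " " * 12    # extra space visually separates 'N/A'
print(gap.join(options) + na_gap + "N/A")
```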
References
Artino Jr., A. R., & Gehlbach, H. (2012). AM last page: Avoiding four visual-design pitfalls in survey development. Academic Medicine, 87(10), 1452. Retrieved from https://www.researchgate.net/profile/Hunter_Gehlbach/publication/231210670_AM_Last_Page_Avoiding_Four_Visual-Design_Pitfalls_in_Survey_Development/links/5a835de6aca272d6501eb6a3/AM-Last-Page-Avoiding-Four-Visual-Design-Pitfalls-in-Survey-Development.pdf
Dillman, D. A., Smyth, J. D., & Christian, L. M. (2014). Internet, phone, mail, and mixed-mode surveys: The tailored design method (4th ed.). Hoboken, NJ: John Wiley & Sons.
Fowler, F. J., Jr. (2009). Survey research methods (4th ed.). Thousand Oaks, CA: SAGE.
Gehlbach, H., & Artino Jr., A. R. (2018). The survey checklist (manifesto). Academic Medicine, 93(3), 360-366. Retrieved from https://journals.lww.com/academicmedicine/fulltext/2018/03000/The_Survey_Checklist__Manifesto_.18.aspx#pdf-link
Gehlbach, H., & Brinkworth, M. E. (2011). Measure twice, cut down error: A process for enhancing the validity of survey scales. Review of General Psychology, 15(4), 380-387. Retrieved from https://dash.harvard.edu/bitstream/handle/1/8138346/Gehlbach%20-%20Measure%20twice%208-31-11.pdf?sequence=1&isAllowed=y
Krosnick, J. A., & Presser, S. (2010). Question and questionnaire design. In P. V. Marsden, & J. D. Wright (Eds.), Handbook of Survey Research. Bingley, England: Emerald Group Publishing.
Nielsen, T., Makransky, G., Vang, M. L., & Dammeyer, J. (2017). How specific is specific self-efficacy? A construct validity study using Rasch measurement models. Studies in Educational Evaluation, 53, 87-97.
Saris, W. E., Revilla, M., Krosnick, J. A., & Shaeffer, E. M. (2010). Comparing questions with agree/disagree response options to questions with item-specific response options. Survey Research Methods, 4, 61-79.
Schwarz, N. (1999). Self-reports: How the questions shape the answers. American Psychologist, 54, 93-105.
Swain, S. D., Weathers, D., & Niedrich, R. W. (2008). Assessing three sources of misresponse to reversed Likert items. Journal of Marketing Research, 45, 116-131.
Weng, L.-J. (2004). Impact of the number of response categories and anchor labels on coefficient alpha and test-retest reliability. Educational and Psychological Measurement, 64, 956-972. Retrieved from https://journals.sagepub.com/doi/pdf/10.1177/0013164404268674
Wright, J. D. (1975). Does acquiescence bias the ‘Index of Political Efficacy’? The Public Opinion Quarterly, 39(2), 219-226.