Estimating Sample Size for Usability Testing
One strategy for ensuring that an interface meets user requirements is usability testing. When planning such testing, one of the unknowns is the sample size. Because extensive testing is costly, minimizing the number of participants can contribute greatly to the successful resource management of a project. Although many models have been proposed for estimating sample size in usability testing, there is still no consensus on the optimal size. Several studies claim that 3 to 5 users suffice to uncover 80% of the problems in a software interface, but many other studies challenge this assertion. This study analyzed data collected from user testing of a web application to verify this rule of thumb, commonly known as the “magic number 5”. The analysis showed that the 5-user rule significantly underestimates the sample size required to achieve reasonable levels of problem detection.
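The arithmetic behind the 5-user rule can be illustrated with the binomial problem-discovery model often used in this literature, in which n users are expected to uncover a share 1 - (1 - p)^n of the problems when each user detects a given problem with probability p. The sketch below assumes that model. The function names and the two values of p are illustrative assumptions, not data or methods from this study. With p = 0.31, a value often quoted in support of the rule, five users reach roughly 84% discovery; with a lower rate such as p = 0.10, the same five users find only about 41%.

```python
def discovery_rate(p: float, n: int) -> float:
    """Expected share of problems found by n users, assuming each user
    independently detects any given problem with probability p."""
    return 1.0 - (1.0 - p) ** n


def users_needed(p: float, target: float) -> int:
    """Smallest number of users whose expected discovery rate reaches target."""
    if not (0.0 < p < 1.0) or not (0.0 < target < 1.0):
        raise ValueError("p and target must lie strictly between 0 and 1")
    n = 1
    while discovery_rate(p, n) < target:
        n += 1
    return n


if __name__ == "__main__":
    # Illustrative detection probabilities (assumptions, not results of this study).
    for p in (0.31, 0.10):
        rate_at_five = discovery_rate(p, 5)
        n_for_80 = users_needed(p, 0.80)
        print(f"p={p:.2f}: 5 users find {rate_at_five:.1%}; "
              f"{n_for_80} users needed for 80%")
```

Under these assumptions the required sample size depends strongly on p: the 80% target needs 5 users at p = 0.31 but 16 users at p = 0.10, which is one way a fixed 5-user rule can underestimate the sample size needed.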
This work is licensed under a Creative Commons 4.0 License.