Questionnaire administration via the WWW: A validation & reliability study for a user satisfaction questionnaire.

Ben Harper
Dept. of Psychology, Lab for Automation Psychology
University of Maryland, College Park, MD
Tel: (301) 405-5936
bharper@wam.umd.edu

Laura Slaughter
College of Library and Information Services (CLIS)
University of Maryland, College Park, MD
Tel: (301) 405-5938
lauras@wam.umd.edu

Kent Norman
Dept. of Psychology, Lab for Automation Psychology
University of Maryland, College Park, MD
Tel: (301) 405-5924
kn8@umail.umd.edu

The Questionnaire for User Interaction Satisfaction (QUIS) was created to gauge the satisfaction component of software usability in a standard, reliable, and valid way. The QUIS was first implemented as a paper-and-pencil form using a nine-point Likert scale [1]. Several computer-based versions of the questionnaire have since been created and have been shown to be as reliable as the paper-and-pencil version [5]. These computer versions aided data collection and were somewhat configurable, but they proved difficult to maintain, distribute, and customize. A pending update to the QUIS content provided an opportunity to migrate the QUIS software to a web-based form that addresses the shortcomings of previous versions. An early version of the updated QUIS was implemented using HTML forms and extensive JavaScript for data validation and processing. This questionnaire was then administered to a large number of participants recruited from the WWW in order to test the validity and reliability of the new QUIS sections. Web-based data collection proved very effective for this purpose, but a number of limitations were found.

The Questionnaire for User Interaction Satisfaction (QUIS)

The QUIS focuses on the user's perception of interface usability as it is expressed in specific aspects of the interface (i.e., overall reaction to the system, screen factors, terminology and system feedback, learning factors, and system capabilities). Each of the specific interface factors and optional sections has a main component question followed by related sub-component questions. Each item is rated on a scale from 1 to 9, with positive adjectives anchoring the right end and negative adjectives anchoring the left; "not applicable" is also listed as a choice. The questionnaire also includes space for the rater's comments, headed by a statement that prompts the rater to comment on each of the specific interface factors.
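For concreteness, in the web-based form described below a single item of this type reduces to a small block of form markup. The following sketch is illustrative only; the field name and exact wording are assumptions, not the actual questionnaire source:

    <p>
      Characters on the screen:<br>
      hard to read
      <input type="radio" name="q4_1" value="1"> 1
      <input type="radio" name="q4_1" value="2"> 2
      <!-- radio buttons 3 through 8 omitted for brevity -->
      <input type="radio" name="q4_1" value="9"> 9
      easy to read
      <input type="radio" name="q4_1" value="NA"> Not Applicable
    </p>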

The current version (5.5) has proven reliable and valid when applied to many interface styles [1]. Though QUIS 5.5 is a powerful tool for interface evaluation, it is of limited use in assessing "cutting edge" systems. Additional sections covering technical manuals and on-line help, on-line tutorials, multimedia, Internet access, and software installation have therefore been proposed. These sections contain 48 new hierarchically organized questions that must now be evaluated for their effectiveness as predictors of satisfaction with these particular types of interface.

Development of the Web-Based QUIS

The questionnaire was implemented using standard HTML forms that let users select items from pull-down lists, click on check boxes and radio buttons, and enter text and comments into text areas. The style is very similar to the paper version of the questionnaire, displaying multiple questions per page with comment areas at the end of each section. To ensure that users considered each question, a rating or the answer "Not Applicable" was required for every question. Client-side JavaScript was used both to validate the user's responses and to gather them into a consistent, standardized format. The data for each section of the QUIS were time-stamped and recorded on the client computer using magic cookies. At the end of the questionnaire, the data from all sections were gathered back into the browser and sent to the server as a single submission. This method of data collection ensures that only completed questionnaires are entered and prevents data from concurrent users from becoming mixed. A secondary advantage is that all the "work" of data cleaning is done on the client computer, allowing a simple commercial off-the-shelf CGI script to be used for final data collection.
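A minimal sketch of this client-side logic appears below. It is written in present-day JavaScript; the function names, field names, and cookie format are illustrative assumptions rather than the original QUIS 7.0 source:

    // Sketch of the per-section validation and storage described above.
    // Field names ("q1", "q2", ...) and the cookie format are assumptions.
    function validateSection(form, itemCount) {
      // Require a 1-9 rating or "Not Applicable" for every item.
      for (let i = 1; i <= itemCount; i++) {
        const group = form.elements["q" + i];   // radio group for item i
        if (!Array.from(group).some(r => r.checked)) {
          alert("Please answer item " + i + " or mark it Not Applicable.");
          return false;
        }
      }
      return true;
    }

    function saveSection(sectionName, form, itemCount) {
      // Collect the checked value of each item into one record.
      const values = [];
      for (let i = 1; i <= itemCount; i++) {
        const group = form.elements["q" + i];
        const checked = Array.from(group).find(r => r.checked);
        values.push(checked ? checked.value : "NA");
      }
      // Time-stamp the record and store it as a cookie on the client.
      const record = Date.now() + "|" + values.join(",");
      document.cookie = sectionName + "=" + encodeURIComponent(record);
    }

On the final page, the section cookies are read back and concatenated into a single submission before being posted to the server-side script.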

Web-Based Validation of the Updated QUIS

The World Wide Web (WWW) was used to collect data for a reliability and validation assessment of the new version of the QUIS. Using the WWW for this experiment provided an appropriate population of users for testing this particular type of questionnaire, standardized questionnaire administration, and effortless data processing. In addition, this method cost less and took less time for data collection than a traditional experiment of the same nature. Together, these factors allowed us to include a greater number of subjects, which in turn permits more reliable statistical tests.

Eighty-eight participants (61 males, 27 females) voluntarily completed the on-line questionnaire, which was accessible from the Internet. They ranged in age from 14 to 76. Fifty-seven percent stated that they had worked more than six months with the software they were rating. Fifty-eight participants rated a WWW browser of their choice, 14 rated a software product they disliked, and 16 rated a software product they liked. A total of 29 different software products were evaluated.

QUIS 7.0 is an updated and expanded version of the previously validated QUIS 5.5 [1]. It is arranged in a hierarchical format and contains: (1) a demographic questionnaire, (2) six scales that measure overall reaction ratings of the system, (3) four measures of specific interface factors (screen factors, terminology and system feedback, learning factors, and system capabilities), and (4) optional sections that evaluate specific components of the system: technical manuals and on-line help, on-line tutorials, multimedia, Internet access, and software installation. Items follow the format described above: a main component question with related sub-component questions, each rated on the nine-point scale with a "not applicable" option, and a comment area for each section.
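The hierarchical arrangement can be pictured as a nested structure in which each section pairs a main component question with its sub-components. The sketch below is a hypothetical data layout for illustration only; item wordings are abbreviated and do not reproduce the instrument:

    // Hypothetical layout of the QUIS 7.0 hierarchy (illustrative only):
    // each specific interface factor or optional section pairs a main
    // component question with related sub-component questions.
    const sections = [
      {
        name: "Screen factors",
        main: "Overall reaction to the screen",              // main component
        subs: ["Characters on the screen", "Screen layouts"] // sub-components
      },
      {
        name: "Learning factors",
        main: "Learning to operate the system",
        subs: ["Getting started", "Exploring features by trial and error"]
      }
      // ...terminology and system feedback, system capabilities, and the
      // optional sections follow the same main/sub pattern.
    ];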

The on-line questionnaire was made available through the WWW. Subjects learned of the study through advertisements at Yahoo, the mailing list utest@hubcap.clemson.edu, and the newsgroups comp.human-factors, comp.infosystems.www.browsers.mac, comp.infosystems.www.browsers.ms-windows, comp.infosystems.www.browsers.x, and comp.cog-eng. The questionnaire began with two introductory pages: the first explained the purpose of the experiment, and the second gave directions for completing the questionnaire. Subjects were able to quit the questionnaire at any time and to proceed at their own pace. They were instructed to complete all the questions in each section and were not permitted to go on to the next page until they had done so. At the end of the questionnaire, a comment page allowed participants to give feedback and ask questions about the questionnaire. Subjects were tracked by their IP number, and only one entry from each IP number was allowed.

Results of the Validation

Subject Characteristics: Demographic data for the 88 subjects revealed that 70% were male and that 82% ranged in age from 20 to 45 years. Sixty-two percent of respondents had between one month and one year of experience with the product they were evaluating, while 26% had more than a year of experience.

Reliability: The overall reliability of the 48 new questions and 6 summative questions of the QUIS 7.0 was a Cronbach's alpha of 0.95. Mean question scores varied from 4.85 to 8.07, with standard deviations ranging between 1.34 and 2.68. The mean correlation between sub-items was 0.86 (SD = 0.06), and the mean correlation between sub-items and their parent items was 0.86 (SD = 0.09). Items within a component section had a mean correlation of 0.66 (SD = 0.15).
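For reference, the coefficient reported here is the standard Cronbach's alpha, which for k items is

    \alpha = \frac{k}{k - 1}\left(1 - \frac{\sum_{i=1}^{k} \sigma_i^2}{\sigma_t^2}\right)

where \sigma_i^2 is the variance of item i and \sigma_t^2 is the variance of the total score; values near 1 indicate that the items consistently measure the same construct.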

Validity: Construct validity was measured by correlating item scores with the six concurrent general satisfaction questions validated in previous studies. The mean correlations between each main item and the general satisfaction scale ranged from .49 to .61 (SD = .09-.12). This suggests good agreement between the new sections of the QUIS and general satisfaction, while not being so derivative as to be redundant.

Comparable Results to Previous Versions

The reliability of this extension to the QUIS (alpha = 0.95) is comparable to that of previous versions of the QUIS (alpha = 0.96 and 0.88) [1], and is well above the minimum reliability of 0.70 suggested by Lewis [4]. The strong relationships between sub-items and their parent items, and among items within component sections, suggest that the questionnaire has a hierarchical structure.

Self-Selective Subjects

Although the demographics of this sample are very similar to those found by other surveys of the Internet population for the same time frame, there is no way of knowing how this sample might differ from the total user population. Without convergent demographic measurements of the Internet population from both on-line questionnaires and conventional survey methods, we cannot determine the agreement between random samples of users and the volunteer samples that can be collected from the Internet. Until this information is available, the generalizability of results will be somewhat limited.

The Need for a Web-Based QUIS

Previous versions of the QUIS were implemented in Apple's HyperCard(TM) and Spinnaker's PLUS(TM) in the hope that customers could add or subtract questions from the QUIS to suit their particular needs. Unfortunately, this required them to purchase copies of these tools, which was rarely done. In addition, these solutions proved either to support a single platform well or to support several poorly. A web-based approach offered a truly multi-platform solution that would not need to be "ported" to support a range of operating systems. It also offered the advantage of being extensible by anyone with a basic knowledge of HTML and a text editor.

Another advantage of a web-based questionnaire is that data can be collected automatically from many users in many locations simultaneously. Moreover, data can be properly formatted as they are collected, eliminating the costly, time-consuming, and error-prone process of manual data entry. Responses from the form should be directly accessible to statistical packages or spreadsheets in a standard form, so that a standard suite of descriptive statistics can be created and applied automatically to all data sets.
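A sketch of how the collected records might be flattened into a spreadsheet-ready form is shown below. It assumes the per-section cookie format from the earlier sketch and is illustrative only, not the actual QUIS processing code:

    // Hypothetical sketch: read each section cookie ("timestamp|v1,v2,...")
    // and flatten the ratings into one comma-separated row per respondent.
    function buildCsvRow(sectionNames) {
      const fields = [];
      for (const name of sectionNames) {
        const match = document.cookie.match(new RegExp(name + "=([^;]*)"));
        if (match) {
          const record = decodeURIComponent(match[1]);
          fields.push(record.split("|")[1]); // keep the ratings, drop the stamp
        }
      }
      // One row per completed questionnaire; a header row of item numbers
      // makes the file directly loadable by a stats package or spreadsheet.
      return fields.join(",");
    }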

The QUIS is often used to gather user feedback in a lab setting after a set exposure to a standard application or interface. The real advantage of a web-based questionnaire, though, is its ability to be linked directly to other web-based content and to collect information from real users of that content in their natural habitat. Those users may be the on-campus users of a web-based directory or the intended end users of a web-based application located across the Atlantic. In either case, the experimenter has access to a highly specialized user population that may be difficult to query in person, and the delay between exposure to the interface and the measurement of satisfaction is minimal.

REFERENCES

1. Chin, J.P., Diehl, V.A., & Norman, K.L. (1988). Development of an instrument measuring user satisfaction of the human-computer interface. In CHI '88 Conference Proceedings: Human Factors in Computing Systems (New York, 1988), ACM Press, pp. 213-218.

2. Harper, B.D., & Norman, K.L. Improving user satisfaction: The Questionnaire for User Interaction Satisfaction version 5.5. In Proceedings of the Mid-Atlantic Human Factors Conference (Virginia Beach, February 23-26), pp. 224-228.

3. Ives, B., Olson, M.H., & Baroudi, J.J. The measurement of user information satisfaction. Communications of the ACM, 26 (1983), 785-793.

4. Lewis, J.R. IBM computer usability satisfaction questionnaires: Psychometric evaluation and instructions for use. International Journal of Human-Computer Interaction, 7, 1 (1995), 57-78.

5. Slaughter, L.A., Harper, B.D., & Norman, K.L. Assessing the equivalence of the paper and on-line formats of the QUIS 5.5. In Proceedings of the Mid-Atlantic Human Factors Conference (Washington, DC, February 23-26, 1994), pp. 87-91.

6. Slaughter, L.A., Norman, K.L., & Shneiderman, B. Assessing users' subjective satisfaction with the Information System for Youth Services (ISYS). In Proceedings of the Mid-Atlantic Human Factors Conference (Blacksburg, VA, March 23-26, 1995), pp. 22-26.