RESNA 26th International Annual Conference

Technology & Disability: Research, Design, Practice & Policy

June 19 to June 23, 2003
Atlanta, Georgia

The Process of Assessment Test Development and Validation

Laura J. Cohen PT, ATP1,2; Shirley Fitzgerald PhD1,2; Michael L. Boninger MD1,2,3
1Human Engineering Research Laboratories,
VA Pittsburgh Healthcare System, Pittsburgh, PA 15206
2Department of Rehab. Sciences & Technology,
SHRS, University of Pittsburgh, Pittsburgh, PA 15261
3Department of Physical Medicine & Rehab.,
University of Pittsburgh, Pgh, PA 15213


There is a scarcity of assessment tests available to evaluate the impact of educational experiences or clinical practice on the ability to make clinical decisions. New assessment tools and outcome measures are needed. The purpose of this paper is to illustrate the steps involved in developing and validating an assessment tool, using the development and validation of the Seating and Mobility Script Concordance Test (SMSCT) as an example.


A review of the literature reveals a dearth of research on effective means of increasing the competence and expertise of professionals working in the field of seating and mobility (SM), or, for that matter, in any other area of medical practice (1). With ever-changing and emerging technologies in SM and the medical fields, it is imperative that clinicians continually update their knowledge, skills, and clinical competencies in order to provide quality care. Still, at present, the most widely accepted means of upgrading professional training is participation in continuing education activities. While the need to train more skilled practitioners is clear, the most effective means of training, and the tools to assess the effectiveness of training programs, have yet to be identified.


Due to the shortage of assessment tools, there are limited ways to assess the impact of educational experiences or clinical practice on the ability to make clinical decisions. New assessment tools are needed: a greater understanding of expertise and of different levels of practice would enable better professional preparation and, ultimately, improve the quality of care delivered to clients.


This process of test development and validation is designed to follow the standards for educational and psychological testing in the specification and development of tests (2).


The first and most important step in test development is to delineate the purpose of the test and the nature of the inferences intended from test scores (2). A clear statement of test purpose contributes significantly to appropriate test use in practical contexts (3) and also provides the test developer with an overall framework for test specification, item development, trial and review (2). Table 1 provides a brief overview of the SMSCT test development process.


Test validity applies to the process of gathering evidence to support the ways a test is interpreted; it is not an assessment of the test instrument itself (2-4). Assessment results have different degrees of validity for different purposes and in different situations. Judgments about the validity of interpretations can be made only after several types of validity evidence have been studied. Table 2 describes the different types of validity evidence. Table 3 briefly describes the SMSCT validity studies, indicating the category of validity evidence collected in each.

Table 1: SMSCT development process

SMSCT Purpose:

The SMSCT is designed to be an assessment tool that probes whether the organization of clinical knowledge (i.e., the nature of the links between items of knowledge) allows adequate clinical decisions. The development of this test is based on the work of Charlin (5, 6).

Intended Use:

The SMSCT is intended 1) to distinguish between individuals with novice, intermediate, and expert knowledge of SM for individuals with spinal cord injury (SCI), 2) to assess the results of pre- and post-service educational programs, and 3) to serve as a competency test for student and novice therapists.

SMSCT Content:

SM practice encompasses a large breadth of knowledge and content; for the purposes of this project, the content domain was narrowed to items pertaining to individuals with traumatic SCI.

Item Development:

Two expert clinicians were asked to describe problematic clinical situations representative of adult SM problems common to individuals with SCI. For each situation, they were asked to specify:

• the relevant hypotheses, assessment strategies, or intervention options;

• the questions they ask, physical examinations they perform, and interventions they would use in order to solve the problem; and

• what clinical information, positive or negative, they would look for in these inquiries (5).

Test items were built using the materials obtained at this stage.

Content and Item Review and Revision:

Draft SMSCT test items were reviewed by 3 expert clinicians for content, to determine whether they reflect real assessment and intervention situations and whether clinicians typically pose these types of questions to themselves during practice. SMSCT items were also reviewed by 3 clinicians for clarity, terminology, and brevity. Based on this feedback, items were revised, and the content and item review process was repeated for the revised items.

Development of Scoring System:

The SMSCT will be mailed to a select group of 15 SM experts involved in prescribing SM equipment for patients with SCI. The answers of these expert SCI clinicians will be used to build the scoring system by assigning each response a weighted value corresponding to the proportion of experts who select it (6). A score of 100 signifies that the examinee gives, on each item, the answer that most experts provide; the lower the score, the further the examinee's answers are from the experts' consensus.
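The aggregate scoring described above can be sketched in code. The following is a minimal illustration, not the SMSCT implementation itself: it assumes each response option is weighted by the number of experts who chose it relative to the modal (most frequent) expert answer, following the aggregate method of Charlin et al. (6); function names and the answer format are hypothetical.

```python
from collections import Counter

def item_weights(expert_answers):
    """Weight each response option by the proportion of experts who
    chose it; the modal expert answer receives full credit (1.0).
    (Assumed reading of the aggregate scoring method.)"""
    counts = Counter(expert_answers)
    modal = max(counts.values())
    return {answer: n / modal for answer, n in counts.items()}

def score_test(examinee_answers, expert_panel):
    """expert_panel: one list of expert answers per item.
    Returns a 0-100 score; 100 means the examinee gave the modal
    expert answer on every item."""
    total = sum(
        item_weights(experts).get(answer, 0.0)
        for answer, experts in zip(examinee_answers, expert_panel)
    )
    return 100.0 * total / len(expert_panel)
```

For example, on a two-item test where the expert panel splits 2-to-1 on the first item, an examinee choosing a minority expert answer earns partial credit for that item, while an answer no expert chose earns none.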


Table 2: Types of validity evidence

Content Evidence

refers to the content representativeness and relevance; how well the assessment task represents the domain of important content.

External Structure Evidence

refers to the relationship between an assessment and an external criterion (another test, or a known quality such as experience), and determines whether the assessment results converge with or diverge from the criterion in the expected manner.

Internal Structure Evidence

investigates the relationships among the assessment tasks. Do all the assessment tasks contribute toward assessing the quality of interest?

Generalization Evidence

is collected to determine whether there are any significant differences in results when the assessment is used with subjects of different backgrounds or abilities.

Consequential Evidence

refers to studies conducted to describe the intended outcomes of the given assessment procedure and to determine the degree to which these outcomes are attained for all subjects.


Table 3: SMSCT validity studies and specification of type of validity evidence

Qualitative Study

Four "expert" PTs and OTs who regularly prescribe seating and wheeled mobility equipment to individuals with SCI each participated in an approximately 60-minute interview. One researcher conducted all interviews, transcribed the audiotapes, and analyzed the data with a qualitative analysis program to develop a theoretical model based on similarities across the interviews. (Content Evidence)

Expert Study

Ten orthopedic PT experts will be recruited and compared to a group of 10 SM experts. Orthopedic PT experts were chosen to provide a homogeneous group whose expertise includes the musculoskeletal spine but is not specific to SM service provision. Demographic information about participants will be collected, and the SMSCT will be administered to all volunteers. (External Structure Evidence)

Phase 1

The SMSCT will be administered to 100 volunteer PT and OT clinicians with varying levels of expertise. Demographic information about participants will be collected. (Internal Structure Evidence, External Structure Evidence, Generalization Evidence)

Phase 2

A 20-30 item subset of the SMSCT test will be administered in a pretest/posttest format to 100 seating and mobility clinicians participating in an 8-hour continuing education session. (Consequential Evidence)


The validity of an assessment depends on the appropriateness of the scores, their intended use, and the social consequences of their use (3). Initial psychometric studies of Script Concordance Tests developed by Charlin et al. (5) and administered to physicians show encouraging results in terms of reliability and of content and internal structure evidence (5). It is intended that the SMSCT will demonstrate similar results, so that a greater understanding of SM expertise and of different levels of practice will enable better professional preparation for the next generation of SM clinicians. Other purposes of measuring and assessing SM expertise may include professional credentialing to protect the public, benchmarking of best practice, and establishment of clinical pathways. Links between different levels of practice and client outcomes may then be explored to argue for clinical effectiveness or to demonstrate the value of professional practice (7).


  1. Davis, D., O'Brien, M. A., Freemantle, N., Wolf, F. M., Mazmanian, P., and Taylor-Vaisey, A. Impact of Formal Continuing Medical Education: Do Conferences, Workshops, Rounds, and Other Traditional Continuing Education Activities Change Physician Behavior or Health Care Outcomes? JAMA 1999; 282(9): 867-74.
  2. American Educational Research Association, American Psychological Association, and National Council on Measurement in Education. Standards for Educational and Psychological Testing. Washington, DC: American Educational Research Association; 1999.
  3. Nitko, A. J. Educational Assessment of Students. 3rd ed. Upper Saddle River, NJ: Prentice-Hall; 2001.
  4. Wass, V., Van der Vleuten, C., Shatzer, J., and Jones, R. Assessment of Clinical Competence. The Lancet 2001; 357: 945-9.
  5. Charlin, B., Roy, L., Brailovsky, C., Goulet, F., and Van der Vleuten, C. The Script Concordance Test: a Tool to Assess the Reflective Clinician. Teach Learn Med 2000; 12(4): 189-95.
  6. Charlin, B., Desaulniers, M., Gagnon, R., Blouin, D., and Van der Vleuten, C. Comparison of an Aggregate Scoring Method With a Consensus Scoring Method in a Measure of Clinical Reasoning Capacity. Teach Learn Med 2002; 14(3): 150-6.
  7. Manley, K. and Garbett, R. Paying Peter and Paul: Reconciling Concepts of Expertise With Competency for a Clinical Career Structure. Journal of Clinical Nursing 2000; 9(3): 347-60.


This work was supported by the VA Center for Excellence for Wheelchair and Related Technology, F2181C, and the University of Pittsburgh Model Center on Spinal Cord Injury. Special thanks to Barbara Crane and Jean Minkel for their assistance with SMSCT item and content development and revision.


Laura J. Cohen PT, ATP
Human Engineering Research Laboratories
VA Pittsburgh Healthcare System 151R-1
7180 Highland Drive, 151R-1
Pittsburgh, PA 15206
