RESNA 26th International Annual Conference

Technology & Disability: Research, Design, Practice & Policy

June 19 to June 23, 2003
Atlanta, Georgia


Lin Ma, Dr. Richard C. Simpson,
Dr. Marcus J. Huber, LieLa Garcia
Dept. of Rehab. Science & Technology, University of Pittsburgh, Pittsburgh, PA 15260
Intelligent Reasoning Systems,
4976 Kasseb Drive, Oceanside, CA 92056
Freedom Scientific Blind/Low Vision Group,
31st Court North, St. Petersburg, FL 33716


The goal of the intelligent screen reader project is to increase the effectiveness of screen readers in accessing information on the World Wide Web (WWW). The design and current status of the software are presented in this paper. Future work includes continuing development of the software and evaluating it with visually impaired users.


Currently, only 46% of individuals of working age (21-64 years old) with visual impairments are employed [1]. A survey performed by Earl and Leventhal found that many screen reader users had difficulty performing tasks associated with web browsing, which is a critical computer skill [2]. Another study revealed that visually impaired computer users were significantly slower at performing several web-browsing tasks compared to able-bodied users [3]. Thus, a screen reader that provides effective information access will benefit visually impaired users in the domains of education and employment.


There are many screen readers on today's market, such as JAWS for Windows, IBM's Home Page Reader, and Window-Eyes, and much active research focuses on making computers more accessible to people who are blind [4] [5] [6]. However, none of the current products or research incorporates knowledge about what the user is actually doing. JAWS does provide a mechanism, called scripts, for adapting its behavior. Unfortunately, the scripting language is very complex, so most users never take advantage of it. We want to develop software that (1) observes what the user is doing, (2) creates a script that can perform that task automatically in the future, and (3) tells the user when a script exists that is relevant to the task the user is performing manually.



Figure 1. Overview of Intelligent Screen Reader Software
Figure 1 shows an overview of the intelligent screen reader project. The JAWS macro recorder records the user's actions during web browsing and saves them as scripts in the script library. These observations are fed into the Plan Recognition Engine, which uses Plan Recognition Networks (PRNs) from the PRN Library to predict the user's intention based on the observed actions captured by JAWS. The Automated Synthesis of Plan Recognition Networks (ASPRN) generates PRNs from the scripts in the script library. The Script Generation Interface (SGI) makes use of the plan recognized by the Plan Recognition Engine and outputs an optimized script, which is then added to the script library for future use.

The components of the system are shown in Figure 1. The script library will be built to store scripts that perform common tasks. Two kinds of scripts are collected into the library. Small scripts perform basic task units such as following a link or looking for a keyword. These task units will be very useful in the script generation and dynamic plan recognition process. Large scripts in the library record higher-level tasks such as retrieving a weather forecast or finding the price of a stock.
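The two-tier organization described above can be illustrated with a brief sketch. All names and data structures here are hypothetical, chosen only to show how large scripts could be composed from small task units; the project's actual script representation is the JAWS scripting language.

```python
# Hypothetical sketch of a two-tier script library: small scripts hold
# primitive task units, large scripts compose those units by name.
class ScriptLibrary:
    def __init__(self):
        self.small = {}   # name -> list of primitive actions
        self.large = {}   # name -> list of small-script names

    def add_small(self, name, actions):
        self.small[name] = actions

    def add_large(self, name, unit_names):
        self.large[name] = unit_names

    def expand(self, name):
        """Flatten a large script into its primitive action sequence."""
        actions = []
        for unit in self.large[name]:
            actions.extend(self.small[unit])
        return actions

lib = ScriptLibrary()
lib.add_small("FollowLink", ["Tab", "Tab", "Enter", "WaitForPage"])
lib.add_small("FindKeyword", ["OpenFind", "TypeText", "Enter"])
lib.add_large("GetWeatherForecast", ["FollowLink", "FindKeyword"])
print(lib.expand("GetWeatherForecast"))
```

Because large scripts reference small scripts by name, improving a task unit automatically improves every higher-level task built on it.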

The Script Generation Interface (SGI) is capable of generating a script from a list of the user's actions, with or without intervention from the user. It takes a "raw" JAWS script file (created by a macro recorder built into JAWS) as input and outputs an optimized script. The SGI allows the user to manually modify and optimize the script line by line. It also provides automatic script generation using Plan Recognition Networks (PRNs), which can reduce a sequence of lines to a single equivalent function call. The user interface of the SGI is shown in Fig. 2.

The Automated Synthesis of Plan Recognition Networks (ASPRN), developed by Intelligent Reasoning Systems, provides a powerful representation and reasoning scheme for plan recognition. It is a collection of theories and algorithms that takes a script representation as input and outputs a specially constructed belief network, known as a Plan Recognition Network. An example of a simple script representation and its PRN is shown in Fig. 3.

The Plan Recognition Engine uses PRNs in the PRN Library to predict the user's intention based on observations of the user's actions captured by JAWS. Once it finds a script that matches the observed actions with high probability, it will prompt the user that there is a script which closely matches his or her current activity. The user can then modify the existing script or use it unchanged. Scripts can be added into the script library for future use.
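The matching idea can be sketched in simplified form. The real engine performs Bayesian inference over PRNs; the fragment below substitutes a naive prefix-matching score just to illustrate how observed actions are compared against known scripts, and all script and action names are hypothetical.

```python
# Simplified stand-in for PRN-based recognition: score each known script
# by the fraction of its actions matched by the observed action prefix.
SCRIPTS = {
    "FollowLink":  ["Tab", "Tab", "Enter", "WaitForPage"],
    "FindKeyword": ["OpenFind", "TypeText", "Enter"],
}

def match_score(observed, script_actions):
    """Fraction of the script's actions matched, in order, by observations."""
    matched = 0
    for seen, expected in zip(observed, script_actions):
        if seen != expected:
            break
        matched += 1
    return matched / len(script_actions)

def recognize(observed, threshold=0.5):
    """Return the scripts whose match score meets the threshold."""
    scores = {name: match_score(observed, acts)
              for name, acts in SCRIPTS.items()}
    return {name: s for name, s in scores.items() if s >= threshold}

print(recognize(["Tab", "Tab", "Enter"]))  # FollowLink matches 3 of 4 actions
```

When a score crosses the threshold, the engine would prompt the user with the candidate script instead of silently acting, which keeps the user in control of automation.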


The following components have been developed to date.

Macro Recorder

A macro recorder that can record all the user's keystrokes as well as some events (such as waiting for a new window to be brought up) has been implemented by Freedom Scientific. The macros are saved as JAWS script files and playback can be triggered by an assigned hot key combination. The scripts generated by the macro recorder serve as the observations of user behavior used by the Plan Recognition Engine.

Script Generation Interface (SGI)

The SGI, developed by the University of Pittsburgh, now supports both manual and automatic script generation. For example (see Fig. 2), suppose a user presses the Tab key several times to reach a specific link, presses the Enter key to follow the link, and then waits for the new web page to be displayed. The SGI recognizes that the user is most likely performing a "follow-link" task and replaces the original sequence of actions with a single call, "Perform Script FollowLink( )".
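This reduction step can be sketched as a simple rewrite over the recorded action list. The action names and the pattern below are illustrative assumptions, not the SGI's actual internal representation, which operates on JAWS script files.

```python
# Hedged sketch of the SGI's automatic optimization: collapse a run of
# Tab presses followed by Enter and a page wait into one FollowLink call.
# (Action names and the emitted call are illustrative only.)
def optimize(actions):
    out, i = [], 0
    while i < len(actions):
        # Look for: one or more Tabs, then Enter, then WaitForPage.
        j = i
        while j < len(actions) and actions[j] == "Tab":
            j += 1
        if j > i and actions[j:j + 2] == ["Enter", "WaitForPage"]:
            out.append("Perform Script FollowLink( )")
            i = j + 2
        else:
            out.append(actions[i])  # no pattern here; keep the raw action
            i += 1
    return out

raw = ["Tab", "Tab", "Tab", "Enter", "WaitForPage",
       "Tab", "Enter", "WaitForPage"]
print(optimize(raw))
```

In this sketch the eight recorded actions collapse to two follow-link calls, which is the kind of shortening the SGI performs automatically before the user reviews the result.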


As indicated in the design and development sections, there is still much work to be done. Future work will include three main aspects. First, the dynamic plan recognition feature needs to be implemented in the SGI. Second, the components of the system need to be fully integrated with each other. Finally, we would like to evaluate the developed software by carrying out user trials. The hypothesis of the testing is that users of the intelligent JAWS screen reader will complete typical web-browsing tasks in significantly less time than users of the regular JAWS screen reader.

Figure 2. Screenshot of SGI
Figure 2 shows a screenshot of the SGI. The left window shows all the user actions, captured either from the JAWS macro recorder or dynamically. The middle window shows the features associated with each action: for example, the line number at which the action is performed, the text content of the line, and whether or not the line will be spoken. The right window shows the optimized script. Radio buttons in the SGI enable manual modification of a script: users can delete actions, reorder a sequence of actions, and specify whether a line of text should be spoken. In the situation shown in the figure, the user pressed the Tab key several times until he found the desired link, then pressed the Enter key to follow it. He then pressed the Tab key in the new window and ended by pressing the Enter key. The user has manually specified that the text associated with some of the Tab key presses should be read out. The SGI then used PRNs to automatically replace several of the recorded actions with a single function call to FollowLink( ).
Figure 3. A Script Representation and its Plan Recognition Network
Figure 3 shows an example script representation and its PRN. The script has three actions in its body. In the corresponding PRN, the root node represents the script itself and connects to three nodes representing action1 through action3. Each action node connects to its own evidence node. In addition, there is an arrow from the action1 node to the action2 node and an arrow from the action2 node to the action3 node; these arrows encode the action sequence in the script representation.
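The topology just described can be sketched as a list of directed edges. This shows only the graph structure; a real PRN is a belief network in which each node also carries a conditional probability table, and the node names here are placeholders.

```python
# Structural sketch of a PRN for a script with an ordered action list:
# root -> each action, each action -> its evidence node, and
# consecutive actions linked to encode the sequence constraint.
def build_prn(script_name, actions):
    edges = []
    for k, action in enumerate(actions):
        edges.append((script_name, action))            # root -> action
        edges.append((action, action + "_evidence"))   # action -> evidence
        if k > 0:
            edges.append((actions[k - 1], action))     # sequence link
    return edges

prn = build_prn("Script", ["action1", "action2", "action3"])
for src, dst in prn:
    print(src, "->", dst)
```

For a three-action script this yields eight edges: three from the root, three to evidence nodes, and two sequence links, matching the topology in Fig. 3.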


To access desired information effectively while browsing the web, it is not enough to improve the accessibility of a web page by adhering to established guidelines. It is also necessary to examine what the user is doing and to adapt the screen reader so as to better achieve the user's goal.


  1. McNeil J. Americans with Disabilities: 1991-92. U.S. Bureau of the Census, Current Population Reports, P70-33. Washington, DC: U.S. Government Printing Office; 1993.
  2. Earl C, Leventhal J. A survey of Windows screen reader users: recent improvements in accessibility. Journal of Visual Impairment and Blindness. 1999;93.
  3. Gunderson J, Mendelson R. Usability of World Wide Web browsers by persons with visual impairments. In: Proceedings of the RESNA '97 Annual Conference; 1997. p. 330-2.
  4. Okada Y, Yamanaka K, Kaneyoshi A. Screen reader 'CounterVision' with multi-access interface. NEC Research and Development. 1998;39:335-41.
  5. Ebina T, Igi S, Miyake T, Takahashi H. GUI object search method using a tactile display. Electronics and Communications in Japan, Part III: Fundamental Electronic Science. 1999;82:40-9.
  6. Watanabe T, Okada S, Ifukube T. Development of a GUI screen reader for blind persons. Systems and Computers in Japan. 1998;29:18-27.


This study is supported by a Phase II Small Business Innovation Research (SBIR) Grant from the National Science Foundation (Grant #91590).
