RESNA Annual Conference - 2012


Michael G. Melonis, BSEE, University of Colorado Anschutz Medical Campus
Alex Mihailidis, Ph.D P.Eng., University of Toronto
Ryan Keyfitz, BSc., University of Toronto
Marek Grzes, PhD., University of Waterloo
Jesse Hoey, PhD., University of Waterloo
Cathy Bodine, PhD CCC-SLP, University of Colorado Anschutz Medical Campus


Many employment barriers exist for people with cognitive disabilities [1-5]. One of the most significant is the lack of support for learning the tasks associated with a specific job [4-7]. Challenges for many individuals with cognitive disabilities include difficulties with reading instructions, following directions, memory, and attention.

Recent literature indicates that individuals with cognitive disabilities have the lowest collective employment rate, at 42% [8]. Individuals with cognitive disabilities often experience challenges in following multi-step or complex directions. Jobs in which task sequences can vary depending on environmental conditions can confuse workers with cognitive disabilities. In addition, the salaries available for job coaches, and subsequently the number of available job coaches, are declining. Many people with cognitive disabilities also have a variety of literacy challenges. They experience difficulty both in processing written material and in producing written output; examples include comprehension, symbol recognition, and writing.

The incorporation of context-aware prompting technologies into the work environment is one possible solution that can help to reduce the confusion experienced by workers with cognitive disabilities. Context-aware prompting technologies integrate task prompting systems and environmental sensors into a single solution. The task prompting systems provide instructional aids, much like a job coach might, and task sequence support, which identifies the next actions that must be performed on the worksite. Meanwhile, the environmental sensors provide contextual awareness. The sensors can indicate whether a given sequence was actually performed or identify any situational anomalies (e.g., a widget placed in the wrong location).


Figure 1 – A Sample of the Final Assembly Images

The development of the N-CAPS solution requires the “seamless integration” of human activity, work environment processes, vision tracking technology, and probabilistic decision-making modeling. The solution will provide adequate prompts and instructional support to ensure workers with cognitive disabilities can perform their jobs. When the solution is unable to correctly prompt a worker through a given situation, it will request human assistance to resolve the condition.

The assembly job chosen for this development/research project requires constructing Chocolate Crisis First Aid Kits (see Figure 1). These unique kits are sold in gift shops around the world, and their assembly offers a potential real-world employment option for individuals with cognitive disabilities. The job requires the placement of two bottles of chocolates and two boxes of candy bars in a case that resembles a first aid kit. The positions of the bottles and boxes are critical in the final assembly. Once the contents have been correctly placed, the lid is snapped shut and the Chocolate Crisis First Aid Kit is moved to a shipment container.


The functional architecture components of the N-CAPS solution are categorized into the following layers:

  • Data Presentation Layer
  • Data Assimilation Layer
  • Data Collection Layer

Data Presentation Layer

The single most important focus of the system is that appropriate information must be communicated at the appropriate time, using a communication mechanism that works for the person involved [6, 9].

The N-CAPS solution interacts with people with cognitive disabilities through an interactive touch-screen computing device. The device provides task prompts through pictures and video. These prompts adapt to relevant situational events. In addition, an Animated Agent job coach is available to provide additional audio prompting, delivered by a friendly avatar with human-like facial qualities. Using task prompting capabilities and animated avatars (Animated Agent), N-CAPS can instruct users and give them appropriate encouragement and other feedback. If certain exception conditions occur, it may be necessary to notify personnel, such as job coaches, of any situational difficulties. For example, if a given widget tray is empty, N-CAPS would notify an inventory management system to restock the widget tray.

Data Assimilation Layer

The Data Assimilation Layer is the heart of the N-CAPS solution. It is responsible for aggregating the sensor events into observations, prioritizing the events, and determining the next course of action based on these data. It consists of two functional systems: 1) Vision Tracking; and 2) Decision Making.

Figure 2 – Visual Tracking Overlays

The Vision Tracking Component detects and tracks objects within the work environment (illustrated in Figure 2). It relies on a fixed camera mounted above the assembly environment. Using a variety of image processing techniques, including flocking, background subtraction, and optical flow, the tracker is able to determine the location of the user’s hands and the objects that the person is interacting with. Background subtraction first extracts the user’s hands and relevant objects from the background so that erroneous objects are ignored. The flocking algorithm then tracks the extracted hands and tolerates occlusion, such as one hand blocking the other. The hand tracker determines whenever the user’s hands pass through “hot zones” (shown in Figure 2 as rectangular boxes), which identify the various objects that are being used. The optical flow tracks when objects have been removed from their corresponding slots, or when the slots are empty, as indicated in Figure 2 by the small arrows. The primary objects that are tracked include the items in the input inventory slots (first aid case, chocolate bottles, and chocolate boxes), the placement of the input items into the assembly area, and the movement of the worker’s hands. The Vision Tracking Component detects when a worker reaches for an inventory slot, has an object in their hand, and has placed the object. In addition, the system recognizes when an inventory slot is empty or an item is incorrectly placed.
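The hot-zone test described above can be illustrated with a short Python sketch. It is not the N-CAPS tracker: it assumes the hand position has already been extracted as pixel coordinates, and the names `HotZone`, `zone_for_hand`, `zone_events`, and the zone layout in `ZONES` are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class HotZone:
    """Axis-aligned rectangle in image coordinates (pixels)."""
    name: str
    x: int
    y: int
    width: int
    height: int

    def contains(self, px: float, py: float) -> bool:
        # Left/top edges are inclusive, right/bottom edges are exclusive.
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def zone_for_hand(zones: Sequence[HotZone], px: float, py: float) -> Optional[str]:
    """Return the name of the first zone containing the point, or None."""
    for zone in zones:
        if zone.contains(px, py):
            return zone.name
    return None


def zone_events(zones: Sequence[HotZone],
                hand_track: Sequence[Tuple[float, float]]) -> List[str]:
    """Turn a sequence of hand positions into 'hand entered zone' events."""
    events = []
    previous = None
    for px, py in hand_track:
        current = zone_for_hand(zones, px, py)
        if current is not None and current != previous:
            events.append(current)
        previous = current
    return events


# Hypothetical layout; the real zone geometry is not given in the paper.
ZONES = [
    HotZone("bottle_slot", 0, 0, 100, 100),
    HotZone("box_slot", 120, 0, 100, 100),
    HotZone("assembly_area", 0, 150, 220, 120),
]

track = [(50, 50), (60, 60), (110, 50), (150, 50), (100, 200)]
print(zone_events(ZONES, track))
```

Reporting only zone entries, rather than every frame in which a hand is inside a zone, keeps the stream of observations passed to the decision-making component small.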

N-CAPS employs a probabilistic decision model, specifically a Partially Observable Markov Decision Process (POMDP). A POMDP can be used to compute optimal control policies when the environment is not fully observable (e.g., brief occlusion of the user’s hands), when actions have stochastic effects (e.g., uncertain reaction to a prompt), and when multiple, conflicting objectives must be balanced (e.g., task completion vs. user frustration) [10-12].
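Because the state is not fully observable, a POMDP controller acts on a belief, that is, a probability distribution over states that is revised after each action and observation. The sketch below shows the standard discrete belief update (predict with the transition model, weight by the observation likelihood, normalize). It is a generic illustration, not the N-CAPS model: the two states and every probability are made up.

```python
from typing import Dict

Belief = Dict[str, float]


def update_belief(belief: Belief,
                  transition: Dict[str, Dict[str, float]],
                  obs_likelihood: Dict[str, float]) -> Belief:
    """One step of the discrete Bayes filter used in POMDPs.

    belief[s]          : current probability of state s
    transition[s][s2]  : P(s2 | s, a) for the action a that was just taken
    obs_likelihood[s2] : P(o | s2) for the observation o that was received
    """
    states = list(belief)
    # Prediction: push the belief through the transition model.
    predicted = {
        s2: sum(transition[s1][s2] * belief[s1] for s1 in states)
        for s2 in states
    }
    # Correction: weight each state by how well it explains the observation.
    unnormalized = {s: obs_likelihood[s] * predicted[s] for s in states}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observation has zero probability under the model")
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical two-state example; all numbers are invented for illustration.
prior = {"bottle_in_slot": 0.5, "bottle_in_kit": 0.5}
transition_after_prompt = {
    "bottle_in_slot": {"bottle_in_slot": 0.4, "bottle_in_kit": 0.6},
    "bottle_in_kit": {"bottle_in_slot": 0.0, "bottle_in_kit": 1.0},
}
# Likelihood of the tracker reporting "bottle seen in kit" in each state.
tracker_reports_kit = {"bottle_in_slot": 0.2, "bottle_in_kit": 0.9}

posterior = update_belief(prior, transition_after_prompt, tracker_reports_kit)
print(posterior)
```

With these invented numbers, the belief that the bottle is in the kit rises from 0.5 to about 0.95 after the prompt and the tracker report.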

These policies determine when and what type of prompting should take place based on observations from the vision tracking component. Once a decision has been made, the system uses the memory retrieval paradigm [13] to determine the type of prompt to give next. The paradigm defines three methods of memory retrieval:

  1. Recall (Rl) – Remembering something on your own without any cues. No prompt necessary.
  2. Recognition (Rn) – Remembering something with the help of some form of stimulus. Audio or visual prompt provided.
  3. Affordance (Af) – Demonstrating the skill, guiding a person through the steps, and rewarding their completion. Video demonstration will be provided.

The POMDP attempts to estimate the user’s abilities for each of these retrieval methods, and then uses this estimate to determine the type of prompt to provide. If the POMDP decides that providing a prompt is the best action to take (based on the calculated control policies), the module will determine the prompt characteristics and style that are most appropriate for the current scenario and user, such as the action to be prompted, the level of detail, and the form that the prompt should take (e.g., verbal or visual). These characteristics define the various non-linear prompting strategies that will be used by the system to provide assistance to a user. Note that the next course of action, or strategy, could be the sequential next step, a suggestion to fix the last step, a call to the supervisor for human assistance, or congratulations on a job well done.
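The relationship between estimated abilities and prompt type can be illustrated with a deliberately simplified rule. The sketch below is a hand-written threshold rule, not the POMDP policy computed by N-CAPS: it picks the least intrusive prompt whose corresponding ability estimate is high enough, and otherwise requests human assistance. The `Prompt` enum, the `choose_prompt` function, and the 0.7 threshold are hypothetical.

```python
from enum import Enum
from typing import Dict


class Prompt(Enum):
    NONE = "no prompt (recall)"
    CUE = "audio or visual prompt (recognition)"
    VIDEO = "video demonstration (affordance)"
    CALL_SUPERVISOR = "request human assistance"


def choose_prompt(abilities: Dict[str, float], threshold: float = 0.7) -> Prompt:
    """Pick the least intrusive prompt the user is likely to succeed with.

    abilities maps "recall", "recognition", and "affordance" to an estimated
    probability that the user can complete the step with that level of help.
    The threshold is an arbitrary illustrative value.
    """
    ladder = (
        ("recall", Prompt.NONE),
        ("recognition", Prompt.CUE),
        ("affordance", Prompt.VIDEO),
    )
    for ability, prompt in ladder:
        if abilities[ability] >= threshold:
            return prompt
    return Prompt.CALL_SUPERVISOR


# Hypothetical ability estimates for three users.
mild = {"recall": 0.9, "recognition": 0.95, "affordance": 0.99}
moderate = {"recall": 0.4, "recognition": 0.8, "affordance": 0.9}
severe = {"recall": 0.1, "recognition": 0.3, "affordance": 0.75}

for label, user in (("mild", mild), ("moderate", moderate), ("severe", severe)):
    print(label, "->", choose_prompt(user).value)
```

A full POMDP policy differs from this rule in that it also weighs the cost of each prompt and the uncertainty in the ability estimates, rather than applying a fixed cutoff.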


Table 1: Example results for the simulation of step 2 [14].

Policies were generated for each of the required assembly steps. They were simulated by the authors for three different types of clients: those with mild, moderate, and severe cognitive impairment. Table 1 shows the output at sample time steps for step 2 for a user with severe cognitive impairment. The goal of this step is for a user to remove a chocolate bottle (named bottle 1) from an inventory slot (labeled as orange slot), and place the bottle in the first aid kit (labeled as the white bin) [14]. Probabilities of the belief state are represented as the height of the solid black bars in the corresponding columns of each time step (a taller bar indicates a higher probability). In this specific example, the system is more active in its prompting because the user is assumed to have diminished abilities with respect to the different aspects of the step that need to be completed. For example (t=1), the worker has a deteriorating ability to recognize that the inventory slot that holds the required chocolate bottle is empty. As such, the system correctly prompts the worker to recognize that the slot is empty and needs to be filled. In another example (t=5), the system recognizes that the worker has not placed the bottle in its correct location in the white bin, and provides a prompt for the person to recall that the bottle needs to be in that position in order to reach the final goal state. When the worker does not respond to this prompt, the system decides (t=6) to play a different, more detailed prompt (a prompt related to the affordance ability).


The development of N-CAPS is a first step in determining if context-aware technologies have work-specific applications for individuals with cognitive disabilities. In the long term, use of contextual information to customize prompting routines could have a tremendous cost-saving benefit if it decreases dependence on job coaches and increases employment opportunities. The use of context-aware prompting systems could help workers stay on task and manage work-related job responsibilities. New employment opportunities may be available as individuals with cognitive disabilities are able to accomplish more complex tasks with less supervision. This offers the potential for increased employment options and greater success and stability in the workforce. Employment provides all individuals with the potential for increased independence, financial stability, higher self-esteem, and ultimately, improved quality of life.

Information gained from this project will also be invaluable in informing future development of this technology for persons with cognitive disabilities.


  1. Gerber, P., Price, L., Persons with learning disabilities in the workplace: What we know so far in the Americans with Disabilities Act era. Learning Disabilities Research & Practice, 2003. 18(2): p. 132-136.
  2. Schaller, J. and N.K. Yang, Competitive employment for people with autism: correlates of successful closure in competitive and supported employment. Rehabilitation Counseling Bulletin, 2005. 49(1): p. 4-16.
  3. Schartz, H.A., et al., Workplace accommodations: Empirical study of current employees. Mississippi Law Journal, 2006. 75: p. 917-943.
  4. Government Accountability Office, U.S., Vocational Rehabilitation: Better measures and monitoring could improve the performance of the VR program. 2005, GAO. United States Government Accountability Office.
  5. FAQ on Intellectual Disability. 2009 [cited 2009 August];Available from:
  6. Davies, D.K., Stock, S.E., & Wehmeyer, M.L., Enhancing independent task performance for individuals with mental retardation through use of a handheld self-directed visual and audio prompting system. Education and Training in Mental Retardation and Developmental Disabilities, 2002. 37(2): p. 209-218.
  7. Gouvier, W., Sytsma-Jordan, S., Mayville, S., Patterns of discrimination in hiring job applicants with disabilities: The role of disability type, job complexity, and public contact. Rehabilitation Psychology, 2003. 48(3): p. 175-181.
  8. Survey of Income and Program Participation, Disability Status, Employment, Monthly Earnings, and Monthly Family Income Among Individuals 21 to 64 Years Old. 2005, U.S. Census Bureau.
  9. Gillette, Y. and R. DePompei, The potential of electronic organizers as a tool in the cognitive rehabilitation of young people. NeuroRehabilitation, 2004. 19(3): p. 233-243.
  10. Poupart, P. and C. Boutilier, VDCBPI: an approximate scalable algorithm for large scale POMDPs. Proc. of NIPS, 2004.
  11. Zhang, N.L. and W. Zhang, Speeding up the convergence of value iteration in partially observable Markov decision processes. Journal of AI Research, 2001. 14: p. 29-51.
  12. Pineau, J., G. Gordon, and S. Thrun, Point-based value iteration: An anytime algorithm for POMDPs. Proc. of IJCAI., 2003.
  13. Ratcliff, R., A theory of memory retrieval. Psychological Review, 1978. 85(2): p. 59-108.
  14. Grzes, M., et al., Relational Approach to Knowledge Engineering for POMDP-based Assistance Systems with Encoding of a Psychological Model. In ICAPS 2011 Workshop on Knowledge Engineering for Planning and Scheduling (KEPS), 2011.


The authors would like to thank all colleagues and students who contributed to this study. We are grateful to our Research Assistant, Matthew Lanning, who developed the initial prototype. Funding was provided by NIDDR RERC for Advancing Cognitive Technology, H133E090003.