Recent Assistive Technology Research at CARRT

R. Mounir, U. Trivedi, A. Aguirrezabal, D. Ashley, S. Sundarrao, R. Alqasemi, R. Dubey

University of South Florida


Assistive Technology (AT) devices have become an integral part of life for people with disabilities. These devices aim to restore and improve individuals’ functionality so that they can maintain independence and self-sufficiency. This paper introduces four assistive technology devices developed by the Center for Assistive, Rehabilitation and Robotics Technologies (CARRT) at the University of South Florida. The devices are designed to assist people with physical disabilities, including speech impairment, visual impairment and mobility impairment.


Assistive Technology devices are mainly designed to help individuals with disabilities overcome their limitations, which increases their chances of receiving a better education and enhances their social lives. AT devices are capable of maintaining or increasing the functional capabilities of individuals with disabilities.

CARRT has developed and provided numerous assistive technology devices for people with cognitive disabilities, including Autism Spectrum Disorder (ASD) and Traumatic Brain Injury (TBI), using the Virtual Reality for Vocational Rehabilitation (VR4VR) system [2]. CARRT has also developed solutions for individuals with physical disabilities, such as a 9-DOF wheelchair-mounted robotic arm (WMRA) system [3, 4] and reactive Brain-Computer Interface (BCI-P300) control for the robotic arm [5, 6]. Other research has addressed grasping and intention recognition [7], among other topics.

This paper introduces four AT devices: Speech Assistance using artificial neural networks (ANN), Wearable Devices for persons with visual impairment, Mobility Assistance using a potential field path planning technique, and Hands-Free Wheelchair control.

Speech Assistance

This work focuses on enabling individuals with speech impairment to use speech-to-text software to recognize and dictate their speech. Automatic Speech Recognition (ASR) is a challenging problem for researchers because of the wide range of speech variability, including different accents, pronunciations, speeds and volumes. It is very difficult to train an end-to-end speech recognition model on data from speakers with speech impediments, both because sufficiently large datasets are lacking and because a speech disorder pattern is hard to generalize across all users with speech impediments. This work highlights the different deep learning techniques used to achieve ASR and how they can be modified to recognize and dictate speech from individuals with speech impediments.

Wearable Devices

Assistive technologies have become a significant part of life for people with visual impairment. These devices help people with visual impairment avoid collisions with objects and people while walking. To navigate safely from one location to another, it is important to know the dimensions of an obstacle and the distance to it. CARRT has designed stereovision glasses with two high-definition cameras to detect and identify obstacles in real time. CARRT has also developed a modified version of the wearable haptic belt.

Mobility Assistance

Power wheelchair users often face difficulties in maneuvering their wheelchairs to navigate through crowded environments, and occasionally bump into objects or people around them. Users usually need to continuously be aware of all traffic around them to actively avoid all collisions. Without visual aids, this is challenging since many wheelchair users are unable to view everything around them and have a good estimate of how far they are from various objects and people. The objective of this project is to create a sensor ring/arc around the base of the wheelchair that will detect objects within a certain radius of the chair. Once the obstacles are detected, the joystick input motion vector (if pointed towards the obstacle) will be modified to move the chair in a direction that won't collide with the obstacle, while moving in the general direction commanded by the user through the joystick. The goal is to successfully maneuver around a cluttered environment with less cognitive load on the user.

The US Census Bureau reports that 54.4 million people in the U.S. population have disabilities, of which 64% are classified as severe disabilities [1].

Hands-Free Wheelchair

The wireless, hands-free wheelchair control is designed for people with disabilities so that the user can maneuver the wheelchair without manipulating the joystick. The device lets wheelchair users drive hands-free, and it gives amputees the means to drive the wheelchair using their residual limb or another body part. The design was implemented using a custom-made controller that communicates via Bluetooth with an Android smartphone, which acts as the joystick. The smartphone was attached to the user’s upper body, such as the head, chest, forearm, residual arm or another suitable area, to control the wheelchair. The smartphone’s built-in accelerometer was used to detect the gravity vector, and the resulting data was used to calculate pitch and roll. These values were converted into motor commands and sent to the custom wheelchair controller via Bluetooth. The wheelchair controller accepts the Bluetooth motor commands from the Android phone and moves the wheelchair accordingly. To avoid interference with the wheelchair’s original joystick, a switch toggles between the Android controller and the joystick.
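The pitch-and-roll step described above can be illustrated with a minimal Python sketch. It assumes a gravity vector (ax, ay, az) in m/s², an axis convention in which pitch drives forward motion and roll drives turning, and hypothetical `dead_zone` and `max_tilt` parameters. None of these names or values come from the CARRT implementation.

```python
import math


def pitch_roll_from_gravity(ax, ay, az):
    """Return (pitch, roll) in degrees from an accelerometer gravity vector."""
    # Assumed convention: pitch is rotation that moves the x axis out of the
    # horizontal plane, roll is rotation about the x axis.
    pitch = math.degrees(math.atan2(-ax, math.hypot(ay, az)))
    roll = math.degrees(math.atan2(ay, az))
    return pitch, roll


def tilt_to_motor_commands(pitch, roll, dead_zone=5.0, max_tilt=30.0):
    """Map tilt angles to (forward, turn) commands in the range [-1, 1].

    dead_zone and max_tilt are illustrative values, not measured ones.
    """
    def scale(angle):
        if abs(angle) < dead_zone:
            return 0.0  # ignore small, unintentional movements
        clipped = max(-max_tilt, min(max_tilt, angle))
        return clipped / max_tilt

    return scale(pitch), scale(roll)
```

A dead zone of this kind is one simple way to keep small posture changes from moving the chair; whether the original controller uses one is not stated in the paper.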


Figure 1 shows a Long Short-Term Memory (LSTM) cell, which is a modification of the vanilla Recurrent Neural Network architecture. The LSTM cell has selective memory functions; it can forget or remember specific information from a sequence of data. The figure shows how the forget, input and output gates are connected within an LSTM cell.
Figure 1: Long Short-Term Memory (LSTM) cell, a modification of the vanilla Recurrent Neural Network design
The project is split into three consecutive processes: ASR to phonetic transcription, edit distance, and a language model. The ASR stage is the most challenging because of the complexity of the neural network architecture and the preprocessing involved. We compute Mel-Frequency Cepstral Coefficients (MFCC) for each audio file, which yields 13 coefficients per frame. The labels (the text matching the audio) are converted to phonemes using the CMU ARPAbet phonetic dictionary. The network is trained using the MFCC coefficients as inputs and the phoneme IDs as outputs. The network architecture is a Bidirectional Recurrent Deep Neural Network (BRDNN). Each recurrent layer consists of two LSTM cells (Figure 1), one in each direction, with 100 hidden blocks per direction. The network is made deep by stacking two more such layers, for a total depth of three layers. Two fully connected layers with 128 hidden units each are attached to the output of the recurrent network. This architecture achieved a 38.5% Label Error Rate (LER) on the test set.
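The label-conversion step above, from transcript text to phoneme IDs, can be sketched as follows. The two pronunciations are a tiny hand-copied excerpt in CMU dictionary style. Stripping the stress digits and assigning IDs in alphabetical order are assumptions made for illustration, as are all function names.

```python
# Tiny excerpt in CMU Pronouncing Dictionary style (the full dictionary is
# far larger). Digits on vowels are stress markers.
CMU_EXCERPT = {
    "HELLO": ["HH", "AH0", "L", "OW1"],
    "WORLD": ["W", "ER1", "L", "D"],
}


def strip_stress(phoneme):
    """Drop the trailing stress digit, e.g. 'AH0' -> 'AH' (an assumption)."""
    return phoneme.rstrip("012")


def build_phoneme_ids(lexicon):
    """Assign an integer ID to every stress-free phoneme in the lexicon."""
    symbols = sorted({strip_stress(p) for pron in lexicon.values() for p in pron})
    return {symbol: index for index, symbol in enumerate(symbols)}


def transcript_to_ids(text, lexicon, phoneme_ids):
    """Convert a transcript into the phoneme-ID sequence used as the target."""
    ids = []
    for word in text.upper().split():
        for phoneme in lexicon[word]:  # raises KeyError for unknown words
            ids.append(phoneme_ids[strip_stress(phoneme)])
    return ids


PHONEME_IDS = build_phoneme_ids(CMU_EXCERPT)
TARGET = transcript_to_ids("hello world", CMU_EXCERPT, PHONEME_IDS)
```

Each ID sequence produced this way would be paired with the 13-coefficient MFCC frames of the matching audio file during training.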

Figure 2 shows the number of words per sentence found at each edit distance for different accents. Asian and Indian accents, as well as no accents are tested. The results show a significant increase in words per sentence for all accents as the edit distance increases. The increase is more significant for data with Asian and Indian accents than for the data with no accents.
Figure 2: Words/sentence found at each edit distance for different accents
Levenshtein edit distance is used to generate potential words from phonemes. An edit distance of one means that a change of at most one phoneme is allowed when generating the potential words, an edit distance of two means that a change of one or two phonemes is allowed, and so on. These changes can be insertions, deletions or substitutions. The language model uses the potential words to generate the sentences with the most semantic meaning. The language model is another recurrent neural network model, trained on full sentences. The model outputs the probability of a word occurring after a given word or sentence. It is simpler than the main speech recognition model because it is not bidirectional and not as deep. The language model uses beam search decoding to find the best sentences. Figure 2 shows the number of words found per sentence at every edit distance.
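The candidate-generation step described above can be sketched in a few lines of Python. The lexicon is a small hypothetical example with stress-free ARPAbet-style pronunciations, and `candidate_words` is an illustrative name, not the authors' code.

```python
def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def candidate_words(phonemes, lexicon, max_distance):
    """Return every word whose pronunciation is within max_distance edits."""
    return sorted(word for word, pron in lexicon.items()
                  if edit_distance(phonemes, pron) <= max_distance)


# Hypothetical mini-lexicon for illustration only.
LEXICON = {
    "CAT": ["K", "AE", "T"],
    "CUT": ["K", "AH", "T"],
    "CAP": ["K", "AE", "P"],
    "COAT": ["K", "OW", "T"],
    "CATS": ["K", "AE", "T", "S"],
    "DOG": ["D", "AO", "G"],
}

recognized = ["K", "AE", "D"]  # a phoneme string that matches no word exactly
at_one = candidate_words(recognized, LEXICON, 1)
at_two = candidate_words(recognized, LEXICON, 2)
```

In this toy lexicon the candidate list grows from two words at distance one to five at distance two, which mirrors the trend the text reports for Figure 2.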

Subjects with no accent found significantly more words at an edit distance of one than subjects with accents. As the edit distance increases, the number of words found per sentence increases for all data points. We therefore recommend increasing the edit distance for data with speech impediments to obtain better results, given a good language model.
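The beam search decoding used by the language model can be illustrated with a small sketch. For brevity it swaps the recurrent language model for a hypothetical bigram probability table. The table values, the `<s>` start token and the function names are all assumptions.

```python
import math

# Hypothetical bigram probabilities standing in for the RNN language model.
BIGRAMS = {
    ("<s>", "THE"): 0.9, ("<s>", "THEY"): 0.1,
    ("THE", "CAT"): 0.6, ("THE", "CAP"): 0.3,
    ("THEY", "CAT"): 0.05, ("THEY", "CAP"): 0.05,
    ("CAT", "SAT"): 0.7, ("CAT", "SET"): 0.1,
    ("CAP", "SAT"): 0.1, ("CAP", "SET"): 0.2,
}


def log_prob(previous, word):
    """Log-probability of word following previous; unseen pairs get a floor."""
    return math.log(BIGRAMS.get((previous, word), 1e-6))


def beam_search(candidates_per_position, beam_width=2):
    """Keep the beam_width best partial sentences at every word position."""
    beams = [([], 0.0)]  # (words so far, cumulative log-probability)
    for candidates in candidates_per_position:
        expanded = []
        for words, score in beams:
            previous = words[-1] if words else "<s>"
            for word in candidates:
                expanded.append((words + [word], score + log_prob(previous, word)))
        expanded.sort(key=lambda beam: beam[1], reverse=True)
        beams = expanded[:beam_width]
    return beams


# Each inner list holds the potential words for one position in the sentence.
best = beam_search([["THE", "THEY"], ["CAT", "CAP"], ["SAT", "SET"]])
```

A wider beam explores more alternatives at higher cost, which matters more as a larger edit distance produces more potential words per position.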

Wearable Devices

Figure 3 shows a picture of the vibrotactile belt and another picture of the stereovision system.
Figure 3: Vibrotactile belt and Stereovision System
The wearable haptic belt uses high-performance ultrasound sensors and an accelerometer to measure the user’s walking speed and other navigation-related information. A microcontroller uses the acquired data to control the vibrotactile stimulation. The stereovision feedback system detects different types of obstacles using an object recognition algorithm and conveys the best approach for avoiding a collision through audio feedback. The results gathered from these technologies showed that the stereovision system has several advantages over the vibrotactile belt.

Vibrotactile Belt

CARRT developed a modified feedback system that provides feedback by altering the frequency, time lapse and amplitude of vibration to overcome limitations such as spatial masking, temporal effects and spatiotemporal interaction. The vibrotactile assistive waist belt consists of the following components: a 3D-printed module containing the ultrasonic sensors, circuit board and microcontroller; five MB1000 LV-MaxSonar®-EZ0™ ultrasonic sensors; an Arduino Nano (ATmega328) microcontroller; an ADXL345 3-axis accelerometer; a DRV2605L motor controller with a TCA9548A I2C multiplexer; four haptic motors; and a 9 V battery (see Figure 3).
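As a rough illustration of how vibration amplitude and time lapse might be varied with obstacle distance, consider the sketch below. The linear mapping, the distance limits and the amplitude and pause ranges are all hypothetical. The paper does not give the actual mapping used on the belt.

```python
def vibration_pattern(distance_cm, min_cm=30.0, max_cm=300.0):
    """Map an ultrasonic range reading to a vibration amplitude and pause.

    Closer obstacles give stronger pulses with shorter pauses between them.
    All numeric limits here are illustrative assumptions.
    """
    if distance_cm >= max_cm:
        # Nothing within range: keep the motor off.
        return {"amplitude": 0.0, "pause_ms": None}

    clamped = max(min_cm, distance_cm)
    closeness = (max_cm - clamped) / (max_cm - min_cm)  # 0 = far, 1 = near

    return {
        "amplitude": round(0.2 + 0.8 * closeness, 3),  # fraction of full drive
        "pause_ms": round(1000 - 900 * closeness),     # gap between pulses
    }
```

On the real hardware, the returned values would be translated into commands for the motor controller; that translation is device-specific and is not shown here.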

Stereovision System

Figure 4 shows the sensor input data when detecting a wall at a 45-degree angle in front of the wheelchair. The sensor data is marked with red dots, and three of the data points form a straight line at a 45-degree angle in front of the wheelchair, showing where the wall is located.
Figure 4: Sensor input when detecting a wall at 45 degrees
To perceive depth information, it was necessary to use two cameras separated by a constant distance. The two cameras were mounted on the sides of a 3D-printed goggle frame, as shown in Figure 3. The computer interface converted the images captured by the individual cameras into depth information, which was then converted to a point cloud. The motion of the cluster nearest to the user triggered the warning about incoming or nearby obstacles. To avoid repeated calibrations, the camera attachment was placed as close to the bridge of the goggle frame as possible. The cameras are two 180-degree fisheye lens cameras (model no. ELP-USBFHD01M-L180) mounted on the sides of the goggles such that they focus to infinity. The Robot Operating System (ROS) infrastructure was used for software development.
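The reason a fixed camera separation yields depth can be shown with the standard pinhole stereo relation Z = f · B / d, where f is the focal length in pixels, B the baseline between the cameras and d the disparity in pixels. The sketch below assumes already rectified, undistorted images, which fisheye cameras would need as a preprocessing step. The parameter values and function names are illustrative, not taken from the CARRT system.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in metres of a point seen with the given pixel disparity."""
    if disparity_px <= 0:
        return float("inf")  # no measurable shift: treat as infinitely far
    return focal_length_px * baseline_m / disparity_px


def nearest_obstacle(disparities_px, focal_length_px, baseline_m):
    """Distance to the closest point among a set of disparity measurements."""
    return min(depth_from_disparity(d, focal_length_px, baseline_m)
               for d in disparities_px)


def needs_warning(distance_m, threshold_m=1.5):
    """Hypothetical rule: warn when the nearest point is inside the threshold."""
    return distance_m < threshold_m
```

Because depth is inversely proportional to disparity, nearby obstacles produce large, easily measured shifts, while distant ones are resolved more coarsely.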


The average time taken by the subjects to complete the given task using the vibrotactile belt was within 8 seconds of the time taken with the stereovision system. Based on these results, the two systems performed similarly in terms of basic functionality. The object recognition percentage for the vibrotactile belt is 87% outdoors and 89% indoors. These numbers are higher than the object recognition percentages of the vision system, which are 77% outdoors and 83% indoors. The advantages of the stereovision system include recognition of people, obstacles and street signs, as well as early warning of more distant objects and signs.

Mobility Assistance

Figure 5 shows a picture of the hands-free chair with the Android phone mounted on a hat.
Figure 5: Hands-Free chair
This device uses the potential field algorithm for motion planning [8]. The sensors used for this work were LV-MaxSonar-EZ ultrasonic sensors. A multiplexer/demultiplexer integrated circuit (IC) was used so that more sensors could work at the same time with a single Arduino microcontroller. A custom sensor bracket was designed to hold the sensors at specific angles.

To navigate in the X direction, the goal vector is calculated as:

X = strength_factor × JoyStick_Distance × cos(θ)

where θ is the angle between the goal and the wheelchair position. A similar equation was developed for the repulsive force due to obstacles. Figure 4 shows the sensor input when detecting a wall or obstacle at 45 degrees.

The results for the assigned tasks were measured by time to completion: each user was timed from the start of the task to its finish. Eight different tasks were given. The results show that operating the wheelchair with the assistive device is slightly slower than operating it without the device. With a dedicated real-time operating system and some modifications to the control algorithm, however, the reaction time of the device can be improved so that users can move at normal wheelchair speeds even while the assistive device is actively avoiding objects. The assistive control technology is capable of keeping the wheelchair away from collisions and increasing the safety of the user while operating the chair.

Figure 6 is a bar graph showing the time needed by each subject to complete a specific task using different control methods. The control methods are joystick, Android hand held, Android on a hat, and Android strapped to the upper left arm. Ten subjects’ results are plotted.
Figure 6: A time vs. control method bar graph showing the results from each subject for the obstacle navigation operation
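The goal vector in the Mobility Assistance equation, together with a repulsive term, can be sketched as follows. The X component follows the equation in the text. The Y component (using sin), the linear repulsion falloff, the `influence_radius` and `repulse_gain` parameters and all function names are assumptions made for illustration, since the paper does not give the repulsive equation.

```python
import math


def attractive_vector(joystick_distance, theta, strength_factor=1.0):
    """Goal vector from the joystick; theta is in radians."""
    x = strength_factor * joystick_distance * math.cos(theta)
    y = strength_factor * joystick_distance * math.sin(theta)  # assumed analogue
    return x, y


def repulsive_vector(obstacles, influence_radius=1.5, repulse_gain=1.0):
    """Sum of pushes away from obstacles given as (distance_m, bearing_rad)."""
    rx = ry = 0.0
    for distance, bearing in obstacles:
        if 0.0 < distance < influence_radius:
            # Assumed linear falloff: strongest near the chair, zero at the edge.
            magnitude = repulse_gain * (influence_radius - distance) / influence_radius
            rx -= magnitude * math.cos(bearing)
            ry -= magnitude * math.sin(bearing)
    return rx, ry


def blended_command(joystick_distance, theta, obstacles):
    """Joystick command modified by obstacle repulsion."""
    ax, ay = attractive_vector(joystick_distance, theta)
    rx, ry = repulsive_vector(obstacles)
    return ax + rx, ay + ry
```

With no obstacles in range the command equals the joystick input. An obstacle straight ahead reduces forward motion, while one to the side deflects the chair away from it and preserves the commanded direction along the other axis.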

Hands-Free Wheelchair

The power wheelchair used for the subject testing is a Pride Mobility Jazzy 600. The wheelchair’s existing motor connections were rerouted through the Android Control System (ACS). With the ACS turned off, these signals pass directly through the ACS and return to their original contacts on the manufacturer’s control system. The main components of the ACS are an Android phone, an Arduino BT-V06 microcontroller board, and a Sabertooth 2X60 motor controller. Figure 5 shows the Android phone on a hat for head control, and the custom controller on the back of the wheelchair seat.

The Android phone was placed in different configurations to find the best attachment method. The results show that users avoided obstacles more efficiently when using the joystick (see Figure 6); however, users performed rotation tasks more efficiently with the “Android hand held” control method.



  1.  Brault, M. W. (2012). Americans with disabilities: 2010 (pp. 1-23). Washington, DC: US Department of Commerce, Economics and Statistics Administration, US Census Bureau.
  2.  Bozgeyikli, L., Bozgeyikli, E., Clevenger, M., Raij, A., Alqasemi, R., Sundarrao, S., & Dubey, R. (2015, July). VR4VR: vocational rehabilitation of individuals with disabilities in immersive virtual reality environments. ACM International Conference on PErvasive Technologies Related to Assistive Environments (p. 54). ACM.
  3.  Alqasemi, R., & Dubey, R. (2010). A 9-DoF wheelchair-mounted robotic arm system: design, control, brain-computer interfacing, and testing. In Advances in Robot Manipulators. InTech.
  4.  Alqasemi, R., & Dubey, R. (2009, March). Kinematics, control and redundancy resolution of a 9-DoF wheelchair-mounted robotic arm system for ADL tasks. In Mechatronics and its Applications, 2009. ISMA'09. 6th International Symposium on (pp. 1-7). IEEE.
  5.  Palankar, M., De Laurentis, K. J., Alqasemi, R., Veras, E., Dubey, R., Arbel, Y., & Donchin, E. (2009, February). Control of a 9-DoF wheelchair-mounted robotic arm system using a P300 brain computer interface: Initial experiments. In Robotics and Biomimetics, 2008. IEEE International Conference on (pp. 348-353). IEEE.
  6.  Pathirage, I., Khokar, K., Klay, E., Alqasemi, R., & Dubey, R. (2013, July). A vision based P300 brain computer interface for grasping using a wheelchair-mounted robotic arm. In Advanced Intelligent Mechatronics (AIM), 2013 IEEE/ASME International Conference on (pp. 188-193). IEEE.
  7.  Khokar, K., Alqasemi, R., Sarkar, S., Reed, K., & Dubey, R. (2014, May). A novel telerobotic method for human-in-the-loop assisted grasping based on intention recognition. In Robotics and Automation (ICRA), 2014 IEEE International Conference on (pp. 4762-4769). IEEE.
  8.  Goodrich, M. A. (2002). Potential fields tutorial. Class Notes, 157.

