The Case for Realtime Captioning in Classrooms

By Aaron Steinfeld
At the recent Alexander Graham Bell Association for the Deaf and Hard of Hearing (AG Bell) convention, I spoke up during a session in support of realtime captioning in classrooms. After all, I had recently finished my Ph.D. on this topic and knew of many studies validating its use. As soon as the session was over, parents and educators asked me for a reference to a specific study I had mentioned (Caldwell, 1973). Many also wanted references to any study supporting the use of realtime captioning (RTC).

This essay is an attempt to answer these requests by distilling the relevant material from my dissertation (Steinfeld, 1999). I apologize in advance for the number of references; the benefit of captions has been shown in many research papers.

Realtime Captions

The idea of real-time voice-to-text translation for the deaf is not a new one (Gates, 1971; Houde, 1979; McCoy and Shumway, 1979; Stuckless, 1981; Block and Okrand, 1983; Cutler, 1990). Stuckless (1981, p. 292) refers to the concept as the “computerized near-instant conversion of spoken English into readable print.” He also describes the possibility of using realtime captioning in the classroom. Furthermore, he points out that text displays of this type are not bound by the same temporal characteristics as speech, since there can be a visual buffer of some sort (akin to “instant replay”). This feature should help RTC fill the role of a reference tool, as missed information is often needed shortly after it is originally presented.

As many readers already know, speechreading is lipreading with whatever auditory assistance is available. In a strictly oral environment, the person who uses speechreading to follow a conversation may be using a process similar to reading to perceive content. Instead of reading characters from a page, lipreaders read “visemes” (the building blocks of mouth movements during speech; Jackson, 1988). The important difference is the potential loss of information during speechreading. Only exceptional speechreaders are able to perceive spoken words with near 90 percent accuracy.

Good speechreaders use educated guesses to fill in the words they are unable to perceive. With a text-based supplementary information device, the user can refer to another source of information under difficult speechreading conditions. Thus, there should be less demand for working memory resources during the speechreading process (e.g., less guessing). As a result, RTC should free up additional mental capacity for higher-level cognitive processes.

Supporting Research

The closest approximation of RTC is the use of captions in videotaped educational material. Several studies describe test procedures where captioned presentations were compared to non-captioned presentations.

An early study by Gates (1971) examined the recall of deaf students who watched video presentations (no audio) with all combinations of speaker, signed translation, and captioned formats. The seven different combinations grouped by performance fell into a superior (combinations that used captioning) and an inferior set. The combination of all three at the same time produced the same performance as the speaker with captions combination. Thus, signed translation may not add much beyond captioning.

Other studies have also found that deaf students demonstrate a significant improvement when captions are included. Boyd and Vader (1972) found that deaf students who viewed a captioned program did better on tests than a matched group that viewed a program that was not captioned. Murphy-Berman and Jorgensen (1980) also observed that comprehension increased when captions were included in video presentations to deaf students.

In another study, Nugent (1983) found that both hearing and deaf students performed better on presentations with visuals and captions than they did on presentations with either component alone. Additionally, Nugent stated that neither the deaf nor the hearing groups showed unique abilities to learn from the presentations, implying that both groups have similar cognitive processes. This led to a conclusion that the differences in performance between the deaf and hearing groups were due to inferior entry-level knowledge for the deaf students. For example, Van Biema (1994) reported that due to communication barriers, the deaf community had a substantially reduced awareness of AIDS and HIV. The improvement by the hearing students suggests that they may also benefit from redundant, captioned information.

The use of redundant, captioned information to improve learning in learning-disabled hearing students was also examined by Koskinen et al. (1986). The experiment revealed no statistically significant difference between captioning and no captioning. However, several other findings led the authors to speculate that the captioned presentations may help develop word recognition skills.

In a later study, Neuman and Koskinen (1992) examined bilingual seventh and eighth grade students. They found that captioned presentations led to significant advances in word knowledge. They also found higher scores on sentence anomaly tests (detection of improper word usage) for the captioned group. Markham (1989) and Seriwong (1992) also found that English as a Second Language students exhibited better performance when captions were present.

I will not repeat findings from my dissertation here, as its focus was on how to display RTC as opposed to its merits. However, one finding from my first experiment (Steinfeld, 1998) provided clear reinforcement of the studies listed above. For the hearing subjects, recall accuracy increased 9.8 percent from the traditional presentation (the speaker’s face and voice only) to the RTC conditions. The decrease in perception difficulty was clearly beneficial to the students who were deaf, with a 149.6 percent accuracy increase from the traditional condition to the RTC conditions. The real-world impact of these results is that providing captions will clearly help deaf students. Furthermore, captions will also assist their hearing classmates. This is especially true for rooms with poor acoustics, where the environment gives hearing students perception difficulties similar to those of their deaf and hard-of-hearing counterparts.

Based on these studies, it is apparent that the use of captions in classroom environments provides significant, measurable assistance to a wide spectrum of students. These studies also support the basic hypothesis that a realtime captioned environment can increase the learning potential of not just the deaf and hard of hearing, but all students.

Specific Comments on Reading Levels

One of the more commonly repeated characteristics of deaf students is their lower level of reading performance compared to hearing students of the same age. Conrad (1977) reported that deaf high school students have a reading level about seven years below that of matched hearing students. In later studies, reduced levels of reading were found to negatively affect the ability to comprehend captioned programs (Braverman, 1981; Maxon and Welch, 1992).

However, it is important to point out that a greater degree of hearing loss has not been linked to a decreased level of language skills (Maxon and Welch, 1992). In addition, Caldwell (1973) found that students who were exposed to captions written above their reading levels displayed a significant jump in reading level after a five-week period. Interestingly, there was no decrease in the interest levels of the students over this time period. Thus, continued exposure to captions at more difficult reading levels produced a rapid increase in performance. This finding is especially important as it suggests that long-term use of RTC in a classroom environment has the potential to significantly improve reading levels.

As previously noted, the inclusion of captions in a classroom dramatically increases a deaf or hard-of-hearing person’s ability to comprehend the speaker. In addition, providing captions to hearing people also seems to enhance verbal comprehension. The increased comprehension for both hearing and deaf students will likely lead to a better learning environment and improved information transfer between the teacher and the students.

About the Author:
Aaron Steinfeld, Ph.D., is a researcher at the National Robotics Engineering Consortium at Carnegie Mellon University. The material from this article is drawn from his dissertation, The Benefit of Real-Time Captions in a Classroom Environment. You can reach him at

References

Block, M. H. and Okrand, M. (1983). Real-time closed-captioned television as an educational tool. American Annals of the Deaf, 128(5), 636-641.

Boyd, J. and Vader, E. A. (1972). Captioned television for the deaf. American Annals of the Deaf, 117, 34-37.

Braverman, B. B. (1981). Television captioning strategies: A systematic research and development approach. American Annals of the Deaf, 126, 1031-1036.

Caldwell, D. C. (1973). Use of graded captions with instructional television for deaf learners. American Annals of the Deaf, 118(4), 500-507.

Conrad, R. (1977). The reading ability of deaf school-leavers. British Journal of Educational Psychology, 47, 138-148.

Cutler, W. B. (1990). Beyond the hearing aid: Assistive listening devices and systems. International Journal of Technology and Aging, 3(2), 101-109.

Gates, R. R. (1971). The reception of verbal information by deaf students through a television medium — a comparison of speechreading, manual communication, and reading. Proceedings of the Convention of American Instructors of the Deaf, Little Rock, 513-512.

Houde, R. (1979). Prospects for automatic recognition of speech. American Annals of the Deaf, 124(5), 568-572.

Jackson, P. L. (1988). The theoretical minimal unit for visual speech perception: visemes and coarticulation. Volta Review, 90(5), 99-115.

Koskinen, P.; Wilson, R. M.; Gambrell, L. B.; and Jensema, C. (1986). Using closed captioned television to enhance reading skills of learning disabled students. National Reading Conference Yearbook, 35, 61-65.

Markham, P. L. (1989). The effects of captioned television on the listening comprehension of beginning, intermediate, and advanced ESL students. Educational Technology, 29(10), 38-41.

Maxon, A. B. and Welch, A. J. (1992). The role of language competence on comprehension of television messages by children with hearing impairment. Volta Review, 95, 315-326.

McCoy, E. and Shumway, R. (1979). Real-time captioning — promise for the future. American Annals of the Deaf, 124(5), 681-690.

Murphy-Berman, V. and Jorgensen, J. (1980). Evaluation of a multilevel linguistic approach to captioning television for hearing-impaired children. American Annals of the Deaf, 125, 1072-1081.

Neuman, S. B. and Koskinen, P. (1992). Captioned television as comprehensible input: Effects of incidental word learning from context for language minority students. Reading Research Quarterly, 27(1), 95-106.

Nugent, G. C. (1983). Deaf students’ learning from captioned instruction: The relationship between the visual and caption display. The Journal of Special Education, 17(2), 227-234.

Seriwong, S. (1992). A pilot study on the effects of closed-captioned television on English as a second language students’ listening comprehension (television). Doctoral dissertation, Southern Illinois University at Carbondale. University Microfilms Inc.

Steinfeld, A. (1999). The Benefit of Real-Time Captions in a Classroom Environment. Dissertation, University of Michigan, Ann Arbor. (Currently being prepared for distribution by Bell & Howell Information and Learning, formerly known as UMI.)

Steinfeld, A. (1998). The benefit of real-time captioning in a mainstream classroom as measured by working memory. Volta Review, 100(1), 29-44.

Stuckless, R. E. (1981). Real-time graphic display and language development for the hearing-impaired. Volta Review, 83(5), 291-300.

Van Biema, D. (1994). AIDS. Time, 143(14), 76-77.
