Listen Up: Speech Recognition Technology Spreads in EHR Era

Elizabeth Barton

March 7, 2011

Oncology Live®, January 2011, Volume 12, Issue 1

Medical records have changed significantly in the past decade, with powerful servers and digital archives replacing rooms jammed with filing cabinets and folders stuffed with hand-scribbled notes. Although many physicians’ offices and hospitals still retain paper records, growing pressure from the federal government to convert to electronic health records (EHRs) is making speech recognition transcription software an increasingly popular tool in medical facilities.

Will this technology simplify the transition to fully digitalized health records, including in oncology settings? Experts say it already has revolutionized the creation of patient records in radiology and that it will continue to spread across the healthcare spectrum. Adopting such technology, though, is more nuanced than loading software into a computer, according to healthcare industry observers and recent research.

David Hirschorn, MD, director of Radiology Informatics at Staten Island University Hospital, New York, New York, said the technology is a “fait accompli” in radiology. “It’s a matter of time before it happens in the rest of healthcare,” said Hirschorn, who also conducts research in the field at Massachusetts General Hospital in Boston, during an interview with Oncology Net Guide. “There are drivers that are pushing for speech-based documentation.”

Jan P.H. van Santen, PhD, director of the Center for Spoken Language Understanding (CSLU) at Oregon Health and Science University in Portland, said automated speech recognition (ASR) is widely used for medical dictation these days.

“Several companies have sold very good systems. These systems work better (none are perfect!) the smaller the vocabulary is and the more technical and longer the words are,” he said in an email. “Radiology, including radiation oncology, has always been among the bigger users of ASR, in part for these reasons and also, of course, because it has stricter reporting requirements than, for example, psychiatry.”

Yet van Santen is already looking beyond the use of speech technology for straightforward dictation to exciting new applications.

“One is where the physician talks into the system while interacting with the patient. This is much more difficult because the vocabulary will be larger and more unpredictable, and there is going to be background noise (eg, the patient talking),” he said in the email. “But it is enormously important now that EHRs almost force the physician to be typing away while at the same time interacting with the patient; not good, in particular in oncology, where intense and empathic physician-patient communication is so important.”

He also said CSLU is studying ways in which speech technology can be used to analyze speech patterns that signal disease. “For example, throat cancer may cause specific acoustic markers that are different from those resulting from regular hoarseness; certain physicians can hear the difference, but most can't,” he said.

“Another example is brain metastases. Some of these may be too small to be picked up by imaging, yet by being lodged in a functionally critical area may manifest themselves behaviorally, and one of these behavioral manifestations can be speech (about 35% of brain cancer cases present with one or more of clearly audible dysphasia, aphasia, or slurred speech). I underline here ‘clearly audible,’ because we are convinced that computer analysis of speech will be capturing many cases that are not audible to the human ear.”

Advocates See Many Advantages

Advocates of speech recognition technology say it has many advantages. It makes hands-free computing an option for clinical documentation, test result management, speech-enabled EHRs, and diagnostic reporting. It is also helpful for fully documenting patient feedback and extemporaneous explanations since a patient is generally more comfortable speaking responses than typing them, and the information is generally easier to understand if taken from a patient’s natural speech rather than a typed response. These full-text responses can be attached to a patient’s medical report and forwarded to subsequent practitioners.

In radiology practices, dramatic savings have been reported. At Staten Island University Hospital, the adoption of speech recognition technology cut annual reporting costs from $300,000 to $50,000, Hirschorn said during a presentation at the Society for Imaging Informatics in Medicine conference in June. He said the radiology department was able to cut 7 full-time transcription jobs.

In addition, the turnaround time for generating reports dropped from 2 weeks to less than 10 minutes.

The cost of traditional dictation can easily exceed $12,000 per physician per year, while voice recognition software costs approximately $1000 to $1500 per physician, according to Bruce Kleaveland, president of Kleaveland Consulting, a healthcare technology management consulting firm in Seattle, Washington. [Physicians Practice]
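The figures above can be set side by side with a few lines of arithmetic. The Python sketch below is only an illustration: the function name `percent_reduction` is hypothetical, and the per-physician comparison assumes the dictation and software figures cover comparable periods, which the sources do not state.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the percentage drop from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# Annual reporting costs at Staten Island University Hospital, as cited by Hirschorn.
hospital_savings = 300_000 - 50_000
hospital_reduction = percent_reduction(300_000, 50_000)

# Per-physician figures cited by Kleaveland.
# Assumption: both figures describe comparable periods.
dictation_cost = 12_000
software_low, software_high = 1_000, 1_500
per_physician_savings = (
    dictation_cost - software_high,  # smallest difference
    dictation_cost - software_low,   # largest difference
)

print(f"Hospital savings: ${hospital_savings:,} ({hospital_reduction:.1f}% lower)")
print(
    "Per-physician difference: "
    f"${per_physician_savings[0]:,} to ${per_physician_savings[1]:,}"
)
```

Because Kleaveland describes dictation costs as easily exceeding $12,000, the per-physician difference computed here is best read as a floor rather than a precise estimate.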

The ability to better analyze information is another advantage likely to become more significant for radiation oncologists, Hirschorn said in an interview. As physicians grow accustomed to using the system, their word choice tends to grow more standardized, resulting in a searchable database where therapy trends can be analyzed. “Outcomes research is definitely rising on the horizon of importance,” he said.

Human Factors Still Count

Although the benefits might seem evident, there are human factors that influence the performance and adoption of the technology.

Researchers at the University of North Carolina at Chapel Hill recently studied the impact of the technology on 30 faculty members in the Department of Radiology. Krishnaraj et al found that the average turnaround time for radiology report processing decreased from 28 to 12.7 hours with the use of speech recognition technology. [AJR. 2010;195:194-197] The volume of verified reports also increased by 5%.

At the same time, however, the researchers found that the work habits of the faculty members played a significant role in the results. They ranked the faculty members by report turnaround time before and after the technology was adopted and classified the members by work types.

“Faculty members who had type 1 work habits, that is, reviewed, revised, and finalized reports at the time of image review, benefited the most from use of voice recognition,” the researchers said. “… Improvement in report turnaround time does not correlate with workload but does correlate with work habits, suggesting human behavior may play a role in determining the outcome of adopting a productivity-enhancing technology.”

Kleaveland observed that physicians learning such systems will have to change the way they produce records since there’s no medical transcriptionist upon whom they can rely. “It requires a bit of patience during the training phase, demands a much more controlled environment in terms of microphone placement and ambient noise compared with traditional dictation, and requires the physician to correct mistakes made by the program,” he said.

Likewise, Hirschorn noted that communication about the benefits of speech recognition technology and training in how to use it are important factors in its acceptance and adoption among physicians. “There’s definitely a hurdle to learn how to use it and change the way you work,” he said.

Substantial Growth Anticipated

Although Alexander Graham Bell began experimenting with ways to translate sounds into images in the mid-1870s, it was not until 1952 that Bell Labs developed the first effective over-the-phone speech recognizer. Technology that could transcribe spoken words into text did not hit commercial markets until the 1980s. Initially, the primary users were disabled people who could not use a keyboard or a mouse, according to Datamonitor, an independent market analysis firm. By the late 1990s, though, use of the technology began growing, with the healthcare industry as the primary customer.

In a 2008 report, Datamonitor estimated that sales of speech technology services and licenses in healthcare settings would grow from $171.4 million in 2007 to $601.2 million in 2013, with radiologists and pathologists as its prime audiences.
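The Datamonitor forecast can be restated as an implied yearly growth rate. The sketch below uses the standard compound annual growth rate formula; the function name is hypothetical, and treating 2007 to 2013 as 6 compounding periods is an assumption about how the forecast should be read.

```python
def compound_annual_growth_rate(start: float, end: float, years: int) -> float:
    """Return the constant yearly growth rate that turns `start` into `end`."""
    if start <= 0 or years <= 0:
        raise ValueError("start and years must be positive")
    return (end / start) ** (1 / years) - 1


# Datamonitor estimate, in millions of dollars: $171.4 (2007) to $601.2 (2013).
cagr = compound_annual_growth_rate(171.4, 601.2, 2013 - 2007)
print(f"Implied annual growth: {cagr:.1%}")
```

Under these assumptions, the forecast works out to growth of a little over 23% a year.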

Today, Nuance Communications alone reports $449.3 million in revenue for 2010 from its Dragon Medical and other speech recognition products. The Burlington, Massachusetts-based company has become the leading provider of the software, having acquired several other companies. Officials have said the company’s healthcare products are used in more than 5,000 organizations.

The company sees more potential in the US government’s drive for EHRs. In October, Nuance announced that it would work with IBM to develop natural language technologies as a core component of those records.

There are 2 main types of speech transcription technology. With front-end technology, the user speaks into a microphone and the computer converts the spoken words to text in real time; because the text appears on the workstation screen as the user speaks, the user makes corrections as needed. With back-end technology, the user dictates into a digital recording device, the recording is converted to a draft text, and the draft is routed along with the original sound file to a transcriptionist, who makes the necessary edits and finalizes the report. This hybrid system allows the transcriptionist to perform basic editing or correction rather than spend time entering the full body of the dictated text. [http://blog.dragonelearning.com/]
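The difference between the 2 workflows described above comes down to who corrects the draft and when. The Python sketch below is purely illustrative and does not represent any vendor's software: every name in it (`Report`, `recognize`, `front_end`, `back_end_draft`, `back_end_finalize`) is hypothetical, and `recognize` is a stand-in that simply decodes bytes as text so the example runs without a real speech engine.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Report:
    text: str
    audio: bytes
    finalized: bool = False


def recognize(audio: bytes) -> str:
    """Hypothetical stand-in for a speech engine: decodes the bytes as text."""
    return audio.decode("utf-8")


def front_end(audio: bytes, physician_edit: Callable[[str], str]) -> Report:
    """Front-end: the speaker sees the draft right away and corrects it."""
    draft = recognize(audio)
    return Report(text=physician_edit(draft), audio=audio, finalized=True)


def back_end_draft(audio: bytes) -> Report:
    """Back-end, step 1: an unfinalized draft is kept with the sound file."""
    return Report(text=recognize(audio), audio=audio)


def back_end_finalize(
    report: Report, transcriptionist_edit: Callable[[str], str]
) -> Report:
    """Back-end, step 2: a transcriptionist edits the draft and finalizes it."""
    return Report(
        text=transcriptionist_edit(report.text),
        audio=report.audio,
        finalized=True,
    )


def fix(draft: str) -> str:
    """Hypothetical correction of one misrecognized phrase."""
    return draft.replace("no mass scene", "no mass seen")


dictation = b"chest film: no mass scene"

immediate = front_end(dictation, fix)
pending = back_end_draft(dictation)
reviewed = back_end_finalize(pending, fix)

print(immediate.text, immediate.finalized)
print(pending.text, pending.finalized)
print(reviewed.text, reviewed.finalized)
```

In both paths the same correction is applied; what changes is whether the physician makes it at the time of dictation or a transcriptionist makes it later with the audio at hand.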

Faster computer processing speeds and increased drive storage have supported increased vocabulary capacity and paved the way for a marked improvement in accuracy. One of the most popular speech recognition programs, Dragon NaturallySpeaking’s Medical Version 10.1, costs $1550 and includes a 30,000-word medical vocabulary and more specialized terminology. There are 15,000 general medical terms and 15,000 medical terms specific to the user’s specialty. There are currently 58 specialties available.

According to tech reviewers Edward C. Baig and Lamont Wood, Nuance Communications’ Dragon NaturallySpeaking has come a long way since its initial release in 1997. In early versions, the “training” process that familiarized the user’s computer with his or her voice took approximately 45 minutes, and the software was accurate only 75% of the time. Prior to 1998, the technology required the user to pause after every spoken word.

[http://www.usatoday.com/printedition/money/20100729/baig29_st.art.htm]

[http://www.computerworld.com/s/article/9019220/Hands_Off_Using_Dragon_NaturallySpeaking]

By contrast, the most recent version can be trained to an individual user in mere minutes, with accuracy topping 95%. Company officials have attributed the dramatic increase in accuracy to improved algorithms and more powerful, faster computers.

[http://www.computerworld.com/s/article/9019224/Is_speech_recognition_finally_good_enough_]

Another example of the products now on the market is a GE Healthcare system that weaves voice recognition technology into the radiology workflow. Centricity RISi/PACS Postprocessing with SpeechMagic™ handles all functions from order entry to the image report and distribution, with a front-end voice recognition system that the company says eliminates the need for a transcriptionist. The system is designed for use in hospitals.

Medical Transcriptionists Still in Picture

Will the growth of speech recognition technology mean the end of the medical transcription profession? The Association for Healthcare Documentation Integrity, a professional society for medical transcriptionists, sees a continuing need for qualified transcriptionists.

Likewise, the Bureau of Labor Statistics (BLS), a division of the US Department of Labor, is forecasting 11% growth in employment for the medical transcription field from 2008 through 2018. The BLS said demand will rise as the nation’s elderly population grows and consumes more medical services and as the need for electronic documentation rises.

“Growing numbers of medical transcriptionists will be needed to amend patients’ records, edit documents from speech recognition systems, and identify discrepancies in medical reports,” the BLS said in its Occupational Outlook Handbook, 2010-2011 Edition.

There were approximately 105,200 medical transcriptionists in the United States in 2008, with about 36% working in hospitals and another 23% based in physician offices, the BLS said.

Technology, meanwhile, is moving onward. In December, Nuance announced its latest product for the medical market—a smartphone app that enables doctors to create medical records on the go by dictating to a secure, speech-enabled recording system.