Automated speech recognition (ASR) systems, which use machine-learning algorithms to convert spoken language to text, have become increasingly widespread and are now used by millions of individuals worldwide. Examples include virtual assistants built into mobile devices, home appliances, and in-car systems; digital dictation for completing medical records; automatic translation; automated subtitling for video content; and hands-free computing. Over the last several years, the quality of these systems has improved dramatically, owing both to advances in deep learning and to the collection of large-scale datasets used to train them. Some concern exists, however, that these tools do not work equally well for all subgroups of the population.
As described in an article published in the April 7, 2020 issue of the journal Proceedings of the National Academy of Sciences of the United States of America, researchers examined the ability of five state-of-the-art ASR systems, developed by Amazon, Apple, Google, IBM, and Microsoft, to transcribe structured interviews conducted with 42 white speakers and 73 black speakers. In total, the corpus spans five U.S. cities and consists of 19.8 hours of audio matched on the age and gender of the speaker. The study indicates that all five ASR systems exhibited substantial racial disparities, with an average word error rate (WER) of 0.35 for black speakers compared with 0.19 for white speakers. The investigators trace these disparities to the underlying acoustic models used by the ASR systems, because the race gap was equally large on a subset of identical phrases spoken by black and white individuals in the corpus. They conclude by proposing strategies, such as using more diverse training datasets that include African American Vernacular English, to reduce these performance differences and ensure that speech recognition technology is inclusive.
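For readers unfamiliar with the metric, word error rate is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn a system's transcript into the reference transcript, divided by the number of words in the reference. The Python sketch below illustrates that conventional definition only; it is not the study's scoring code, the function name `word_error_rate` is hypothetical, and the example sentences are invented for illustration. It also assumes simple lowercasing and whitespace splitting, whereas the text normalization the researchers applied is not described here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumption: lowercase and split on whitespace; real evaluations
    # usually apply more careful normalization before scoring.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # reference word missing from transcript
                curr[j - 1] + 1,                  # extra word in transcript
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Invented example: one substituted word and one dropped word
# against a six-word reference.
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Under this definition, a WER of 0.35 corresponds to roughly 35 word errors per 100 reference words, compared with about 19 per 100 at a WER of 0.19.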