Automated speech recognition (ASR) systems, which use machine-learning algorithms to convert spoken language to text, have become increasingly widespread and are now used by millions of individuals worldwide. Examples include virtual assistants built into mobile devices, home appliances, and in-car systems; digital dictation for completing medical records; automatic translation; automated subtitling and closed captioning for video content; and hands-free computing. Over the last several years, the quality of these systems has improved dramatically, owing both to advances in deep learning and to the collection of large-scale datasets used to train them. Some concern exists, however, that these tools do not work equally well for all subgroups of the population.

As described in an article published in the April 7, 2020 issue of the journal Proceedings of the National Academy of Sciences of the United States of America, researchers examined how well five state-of-the-art ASR systems, developed by Amazon, Apple, Google, IBM, and Microsoft, transcribed structured interviews conducted with 42 white speakers and 73 black speakers. In total, the corpus spans five U.S. cities and consists of 19.8 hours of audio matched on the age and gender of the speaker. All five ASR systems exhibited substantial racial disparities, with an average word error rate (WER) of 0.35 for black speakers compared with 0.19 for white speakers. The investigators trace these disparities to the underlying acoustic models used by the ASR systems, because the race gap was equally large on a subset of identical phrases spoken by black and white individuals in the corpus. They conclude by proposing strategies, such as using more diverse training datasets that include African American Vernacular English, to reduce these performance differences and ensure that speech recognition technology is inclusive.
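For readers unfamiliar with the metric, word error rate is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the machine transcript into the human reference transcript, divided by the number of words in the reference. The Python sketch below illustrates that standard definition only. It is not the study's code; the function name `word_error_rate`, the lowercase-and-split-on-whitespace normalization, and the sample sentences are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: text is normalized only by lowercasing and splitting on
    whitespace; real evaluations usually apply more careful normalization.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Invented sentences, not drawn from the study's corpus.
example = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
print(round(example, 2))
```

In this invented example, one word out of six is wrong, giving a WER of about 0.17. On the same scale, the reported averages of 0.35 and 0.19 correspond to roughly 35 and 19 errors per 100 reference words.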

