Dates
Wednesday, May 08, 2024 - 01:00pm to Wednesday, May 08, 2024 - 02:30pm
Location
NCS 115
Event Description

Abstract: Speech is a promising biomarker for cognitive impairment and dementing illness.
Compared to traditional biomarkers, digital biomarkers are often less invasive and cheaper to measure, and they require less instruction. They also allow continuous and longitudinal data acquisition while offering more objectivity. However, the scarcity of large speech datasets annotated with age-related information makes this application challenging. This dissertation studies how well audio and language features predict health outcomes among older veterans, with the goal of improving diagnostic and intervention strategies.
First, a large audio dataset is curated from the Veterans History Project and matched with death databases for age and mortality information. Deep learning techniques are then applied to extract signals from this self-curated audio dataset, and these signals prove to be valuable indicators of aging and overall health.
Language analysis is also applied to the transcripts derived from the audio recordings, with the goal of uncovering nuanced linguistic patterns and potential indicators of mental and physical health conditions that might not be immediately evident. To enrich the analysis and improve its accuracy, we incorporate data from the National Death Index with specific causes of death, which enables a more direct correlation between the extracted features and the prevalence of certain diseases. Integrating these data sources enhances our capacity to identify potential disease indicators.
In addition to vocal aging, this dissertation discusses the evaluation of large language models on word definitions. The emergence of large language models brings new opportunities to industry and new directions for research. We conduct an exploratory study of how well word definitions from classical dictionaries align with definitions from these newer computational artifacts. Distance correlation metrics are applied to compare word embeddings and sentence embeddings of both dictionary and generated definitions in different dimensional spaces.
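For readers unfamiliar with distance correlation, the following is a minimal sketch of the standard sample statistic (pairwise Euclidean distances, double centering, then a normalized average product). It is an illustration only, not the dissertation's implementation; the function names and the toy vectors are assumptions made for this example. It shows why the measure suits this comparison: the two sets of embeddings need the same number of items but not the same dimensionality.

```python
import math


def centered_distances(points):
    """Pairwise Euclidean distance matrix, double-centered."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def mean_product(a, b):
    """Average of the element-wise product of two n x n matrices."""
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)


def distance_correlation(xs, ys):
    """Sample distance correlation between two paired sets of vectors.

    xs and ys must contain the same number of items, but the vectors in xs
    may have a different dimensionality from the vectors in ys.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must contain the same number of items")
    a = centered_distances(xs)
    b = centered_distances(ys)
    dcov2 = mean_product(a, b)
    dvar2 = mean_product(a, a) * mean_product(b, b)
    if dvar2 <= 0:
        return 0.0
    return math.sqrt(max(dcov2, 0.0) / math.sqrt(dvar2))


# Hypothetical toy "embeddings": four items in 2-D and the same four in 3-D.
dictionary_vecs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 3.0)]
generated_vecs = [(0.0, 0.0, 5.0), (2.0, 0.0, 5.0), (0.0, 2.0, 5.0), (4.0, 6.0, 5.0)]
score = distance_correlation(dictionary_vecs, generated_vecs)
```

In this toy case the second set is a scaled copy of the first placed in a higher-dimensional space, so the score is 1 up to floating-point error. Real embeddings would give values between 0 and 1.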
Event Title
Ph.D. Thesis Defense: 'A Study of Aging through Speech and Language Analysis,' Yunting Yin