Expanding Speech Audio Databases for Varied Representations
wanna get your hands on the new speech dataset for studying trustworthy speech? lemme break it down for ya!
researchers from the University of Essex recently dropped this bad boy, packin' 1,152 audio clips from 96 speakers. each speaker recorded sentences in two tones: neutral and deliberately trustworthy. the clips come labeled with demographic details and vocal features like pitch and clarity, too.
these researchers didn't play around: they made sure to capture speakers of different ages and ethnic backgrounds to fill a gap in voice perception research. it's about time we had more inclusive speech-based AI models, ain't it?
if you wanna get your mitts on this dataset, start by checkin' out large, reputable data repositories or specialized search tools. Google Dataset Search is your go-to for this, with coverage across thousands of academic and public repositories [2][5].
the University of Essex Library's data portal's another good spot to look [2]. if it ain't available via open access, check if they got a subscription to it by searchin' their holdings or consultin' their support team.
open-access portals and national archives like DataPortals.org, ICPSR, Harvard Dataverse, and the UK Data Archive are solid bets [1][2]. if you find the dataset mentioned in recent publications, it might be available directly from the publisher or repository [1].
for the specific dataset mentioned in PubMed, check the supplementary materials or data availability section of the publication [1].
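if you'd rather script the search step than type it by hand, here's a minimal sketch that builds a Google Dataset Search link from a few keywords. heads up: the base URL and the `query` parameter name are assumptions about how the search page takes its input, and `build_search_url` is a hypothetical helper, not part of any official API. the keywords are just examples, so swap in whatever you're hunting for.

```python
from urllib.parse import urlencode

# Assumption: the Dataset Search page reads its search terms from a
# `query` parameter on this URL. Adjust if the site behaves differently.
DATASET_SEARCH_BASE = "https://datasetsearch.research.google.com/search"


def build_search_url(keywords):
    """Join the keywords into one query string and URL-encode it."""
    return DATASET_SEARCH_BASE + "?" + urlencode({"query": " ".join(keywords)})


# Example keywords taken from the description of the dataset above.
url = build_search_url(["trustworthy speech", "dataset", "University of Essex"])
print(url)
```

paste the printed link into a browser, then follow the result through to the hosting repository to check the access terms.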
this dataset's got some key features, too:
- it's seriously diverse, with 1,152 utterances from 96 speakers, covering white, black, and South Asian backgrounds, and split between younger (ages 18–45, N=60) and older (ages 60+, N=36) adults [1]
- each speaker recorded both neutral (natural) speech and speech specifically intended to convey trustworthiness [1]
- it's packed with acoustic and voice quality features, perfect for linear and non-linear classification models [1]
- methods using this dataset achieved accuracies of around 70% in distinguishing between neutral and trustworthy speech [1]
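a quick sanity check on those numbers, as a rough sketch. the speaker and utterance counts come straight from the figures above; the even split of clips per speaker and per speaking style is an assumption (the text doesn't spell it out), and so is the idea that the two classes are balanced when reading the ~70% accuracy against chance.

```python
# Figures quoted in the feature list above [1].
TOTAL_UTTERANCES = 1152
YOUNGER_SPEAKERS = 60   # ages 18-45
OLDER_SPEAKERS = 36     # ages 60+
STYLES = ("neutral", "trustworthy")

total_speakers = YOUNGER_SPEAKERS + OLDER_SPEAKERS

# Assumption: every speaker contributed the same number of clips,
# split evenly between the two speaking styles.
clips_per_speaker = TOTAL_UTTERANCES // total_speakers
clips_per_style = clips_per_speaker // len(STYLES)

# Assumption: balanced classes, so random guessing scores 1 / number of styles.
chance_accuracy = 1 / len(STYLES)

print(total_speakers)      # 96
print(clips_per_speaker)   # 12
print(clips_per_style)     # 6
print(chance_accuracy)     # 0.5
```

so under those assumptions, each speaker gave about 6 clips per style, and the reported ~70% accuracy sits roughly 20 points above a coin flip.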
the dataset improves on previous collections by havin' a broader range of ages and ethnic backgrounds, makin' research findings more generalizable and applicable to diverse populations [1]. all participants were untrained, too, so it's more naturalistic and varied [1]. best part? it's open-access, lowerin' barriers for researchers worldwide and supportin' broader, more inclusive research in voice perception and trustworthiness [1].
so go on, get ye to a data repository and find this dataset! it's a game-changer for speech research, especially if you're interested in trustworthy speech and AI models. good luck, fellow researcher!
To delve into the new speech dataset focusing on trustworthy speech, you can start by exploring large, reputable data repositories or utilizing specialized search tools like Google Dataset Search. The University of Essex Library's data portal is another potential source, especially if the dataset is accessible through their holdings.
This dataset stands out for its inclusivity, containing spoken utterances from a diverse range of ethnic backgrounds, including white, black, and South Asian, as well as speakers of various ages. Researchers interested in trustworthy speech and AI models will find this open-access dataset beneficial, as it offers acoustic and voice quality features that support both linear and non-linear classification models.