About MuHSiC
The Multilingual Hispanic Speech in California (MuHSiC) corpus is a collection of audio recordings of Spanish-English bilinguals living in California. It is a robust and linguistically rich oral corpus of bilingual speech samples drawn from naturalistic conversations among speakers of diverse ages, social profiles, and regional origins. The conversations cover a variety of topics, from (im)migration stories and traditional folktales to culinary recipes and instructions for speaking Spanglish. Speech samples were recorded in high-quality, uncompressed digital formats, so the corpus can be used to analyze a wide range of linguistic features. Data collection was carried out throughout California and coordinated from three hubs: Berkeley, Los Angeles, and Santa Cruz.
